In the current era of rapid digital development, demand for data processing and storage keeps growing. The traditional model of separating computation from storage is increasingly unable to deliver high efficiency and low energy consumption, so the industry urgently needs a new class of solutions. Compute-in-memory is one of the most prominent of these.
01
Advantages of Compute-in-Memory Architecture
Compute-in-memory technology helps to solve the "memory wall" and "power wall" problems under the traditional von Neumann architecture. The von Neumann architecture requires data to be constantly "read and written" between the storage unit and the processing unit. This back-and-forth data transfer between the two consumes a significant amount of transfer power. According to Intel's research, when semiconductor processes reach 7nm, the data transfer power consumption is as high as 35pJ/bit, accounting for 63.7% of the total power consumption. The power loss caused by data transfer is becoming increasingly severe, limiting the speed and efficiency of chip development, forming the "power wall" problem.
The "memory wall" refers to the situation where the performance of the memory cannot keep up with the performance of the CPU, causing the CPU to spend a lot of time waiting for the memory to complete read and write operations, thereby reducing the overall performance of the system. The "memory wall" has become a major obstacle in data computation applications. In particular, the biggest challenge for deep learning acceleration is the frequent movement of data between the computing unit and the storage unit.
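To make the power-wall figures above concrete, the sketch below estimates the energy spent purely on moving data, using the 35 pJ/bit transfer cost cited for 7nm. The workload size (100 MB per inference pass) is an illustrative assumption, not a figure from the article.

```python
# Rough estimate of data-movement energy for a memory-bound workload.
# 35 pJ/bit is the transfer cost cited above (Intel, 7 nm process);
# the 100 MB workload below is a hypothetical example.

TRANSFER_PJ_PER_BIT = 35  # pJ per bit moved between memory and processor

def transfer_energy_joules(num_bytes: float) -> float:
    """Energy spent just moving `num_bytes` across the memory interface."""
    return num_bytes * 8 * TRANSFER_PJ_PER_BIT * 1e-12

# Example: one inference pass that streams 100 MB of weights/activations.
bytes_moved = 100 * 1024 * 1024
e_move = transfer_energy_joules(bytes_moved)
print(f"data movement: {e_move * 1e3:.2f} mJ")  # prints: data movement: 29.36 mJ
```

Even before any arithmetic is performed, tens of millijoules go to transport alone, which is the overhead compute-in-memory aims to eliminate.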
The advantage of compute-in-memory is that it breaks through the memory wall, eliminating unnecessary data-transfer latency and power consumption, and uses the storage units themselves to provide computing power, raising computational efficiency by hundreds to thousands of times while reducing costs.
Compute-in-memory belongs to the non-von Neumann architectures. In specific domains it can provide greater computing power (over 1000 TOPS) and higher energy efficiency (exceeding 10-100 TOPS/W), significantly surpassing existing ASIC compute chips. Beyond AI computing, the technology can also be applied to in-memory computation chips and neuromorphic chips, making it a likely mainstream architecture for future big-data computing chips.
02
Classification of In-Memory Computing Technology
Currently, there is no unified classification for in-memory computing technology. The mainstream method of classification is based on the proximity of the computing units to the storage units, which can be roughly divided into three categories: Processing-Near-Memory (PNM), Processing-In-Memory (PIM), and Computing-In-Memory (CIM).
Processing-Near-Memory is a relatively mature technological path. It utilizes advanced packaging techniques to encapsulate the computing logic chips and memory together, achieving high I/O density by reducing the path between memory and processing units, which in turn leads to high memory bandwidth and lower access overhead. PNM is mainly implemented through technologies such as 2.5D and 3D stacking and is widely used in various CPUs and GPUs.
Processing-In-Memory focuses primarily on embedding the computation process as much as possible within the memory itself. This approach aims to reduce the frequency of processor access to memory, as most of the computation has already been completed within the memory. This design helps to eliminate the issues caused by the von Neumann bottleneck, enhancing data processing speed and efficiency.
Computing-In-Memory is also a technology that combines computation and storage. It has two main approaches. The first approach is through circuit innovation, enabling the memory itself to have computing capabilities. This usually requires modifications to storage devices such as SRAM or MRAM to implement computational functions in places like the decoder during data readout. This method typically has a high energy efficiency ratio, but computational accuracy may be limited.
The other approach is to integrate additional computing units within the memory to support high-precision computation. This approach is mainly targeted at memories with high access overhead for the main processor, such as DRAM. However, the DRAM process is not very friendly to computational logic circuits, so the challenge of integrating computing units is significant.
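The first CIM approach above, where the memory array itself performs the computation, can be illustrated with a toy model of a resistive crossbar: each stored weight is a cell conductance, input activations are applied as row voltages, and each column current is a dot product (Ohm's law per cell, Kirchhoff's current law per column). The array sizes and values below are illustrative assumptions, not any vendor's design.

```python
import numpy as np

# Toy model of an analog crossbar multiply-accumulate (CIM approach 1).
# G[i][j]: conductance of the cell at row i, column j (the stored weight).
# V[i]:    voltage applied to row i (the input activation).
# Column j's output current is sum_i V[i] * G[i][j] -- the dot product
# forms physically inside the array, with no data movement to a processor.

rng = np.random.default_rng(0)
G = rng.uniform(0.0, 1.0, size=(4, 3))  # conductances = stored weights
V = rng.uniform(0.0, 1.0, size=4)       # row voltages  = activations

I = V @ G  # column currents: the MAC happens "in place" in the array

# A conventional digital processor would compute the same values explicitly:
expected = [sum(V[i] * G[i, j] for i in range(4)) for j in range(3)]
assert np.allclose(I, expected)
```

This is why such designs achieve a high energy-efficiency ratio: an entire matrix-vector product costs one read operation, though analog non-idealities limit the achievable precision, as the article notes.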
Computing-In-Memory is also what most domestic startups refer to as in-memory computing.
It is worth noting that companies have placed their bets on different tracks in this field. Some focus on optimizing the collaboration between storage and computation, striving for a qualitative leap in big-data processing, while others emphasize architectural flexibility and scalability to adapt to constantly changing market demands. The storage media underlying in-memory computing are also diverse: volatile memories represented by SRAM and DRAM, and non-volatile memories represented by Flash. Overall, each medium has its own strengths and weaknesses.

03

Major Manufacturers, Each with Their Bets
Looking at the development of in-memory computing, since 2017, major companies such as NVIDIA, Microsoft, and Samsung have proposed prototypes of in-memory computing. In the same year, domestic in-memory computing chip companies began to emerge.
What major manufacturers want from in-memory computing architectures is practicality and fast deployment, so near-memory computing, the path closest to engineering implementation, has become their first choice. Ecosystem-rich manufacturers such as Tesla and Samsung, as well as traditional chip makers like Intel and IBM, are all investing in near-memory computing.
Research progress of major international manufacturers
On the research path of in-memory computing, Samsung has chosen multiple technical routes for attempts. At the beginning of 2021, Samsung released a new type of memory based on HBM, which integrates an AI processor capable of achieving up to 1.2 TFLOPS of computing power. The new HBM-PIM chip introduces an AI engine into each memory bank, thereby transferring processing operations to HBM, which can reduce the burden of moving data between memory and processors. Samsung stated that the new HBM-PIM chip can provide twice the system performance while reducing energy consumption by more than 70%.
In January 2022, Samsung Electronics also brought new research results. The company published the world's first in-memory computing research based on MRAM (magnetic random access memory) in the top academic journal Nature. It is reported that the research team of Samsung Electronics built a new MRAM array structure and ran AI algorithms such as handwritten digit recognition and face detection on an MRAM array chip based on 28nm CMOS process, with accuracy rates of 98% and 93%, respectively.
In February 2022, SK Hynix also announced the development of the next-generation intelligent memory chip technology PIM. SK Hynix also developed the company's first product based on PIM technology - the GDDR6-AiM sample. GDDR6-AiM is a product that adds computing functions to GDDR6 memory with a data transfer speed of 16Gbps. Compared with traditional DRAM, a system that combines GDDR6-AiM with CPU and GPU can increase the calculation speed to up to 16 times in specific computing environments. GDDR6-AiM is expected to have a wide range of applications in fields such as machine learning, high-performance computing, big data computing, and storage. Subsequently, in October 2022, SK Hynix announced the launch of a computational memory solution (CMS) based on CXL.
TSMC is also conducting research on in-memory computing. At the International Solid-State Circuits Conference (ISSCC 2021) in early 2021, the company's researchers proposed an in-memory computing solution based on a digitally enhanced SRAM design that can support larger neural networks. In January 2024, TSMC and the Industrial Technology Research Institute announced the successful development of a spin-orbit torque magnetic memory (SOT-MRAM) array chip, marking a major breakthrough in next-generation MRAM technology. The product not only adopts an advanced computing architecture, but also consumes only 1% of the power of STT-MRAM built on comparable technology. The collaboration has brought SOT-MRAM to a working speed of 10ns, further improving in-memory computing performance.
Intel is also a major promoter of MRAM technology, using a 22nm process based on FinFET technology. At the end of 2018, Intel publicly presented its MRAM research results for the first time and launched an STT-MRAM based on a 22nm FinFET process, which the company described as the first FinFET-based MRAM product, adding that it already had mass-production capability.

Research Progress of Major Domestic Manufacturers
Domestic startups are focusing on in-memory computing that does not require consideration of advanced process technologies. Among them, startups such as Zhi Cun Technology, Yi Zhu Science and Technology, and Jiu Tian Rui Xin are betting on PIM (Processing-in-Memory), CIM (Computing-in-Memory), and other "storage" and "computation" closer integrated storage-computing technologies. Companies like Yi Zhu Science and Technology and Qian Xin Technology are focusing on AI high-computing scenarios such as large model computing and autonomous driving; while Shan Yi, Xin Yi Technology, Ping Xin Technology, and Zhi Cun Technology are focusing on edge computing scenarios with low computing power, such as the Internet of Things, wearable devices, and smart homes.
So, how is the research and mass production progress of various companies currently? What are the differences in their technological paths? What is the overall trend for the future of integrated storage-computing technology?
Cloud and Edge High-Computing Power Enterprises
Yi Zhu Science and Technology
Founded in June 2020, Yi Zhu Science and Technology is committed to designing AI high-computing power chips with an integrated storage-computing architecture. It is the first to combine the memristor ReRAM with the integrated storage-computing architecture, providing a new path for the development of AI high-computing power chips with higher cost-effectiveness, higher energy efficiency, and greater computing power development space through a fully digital chip design approach, based on the current industrial structure. In 2023, Yi Zhu Science and Technology was the first to propose the "integrated storage-computing hyper-architecture" as a new technological development path, adding new momentum to the further development of China's AI computing power chips.
At present, Yi Zhu Science and Technology has successfully demonstrated a high-precision, low-power integrated storage-computing AI high-computing power POC chip based on the memristor ReRAM. The energy efficiency performance, based on traditional process technology, has been verified by a third-party organization and exceeds the average performance of traditional architecture AI chips by more than 10 times.
Qian Xin Technology

Qian Xin Technology, established in 2019, specializes in high-performance computing and storage integrated chips and computing solutions for artificial intelligence and scientific computing. In 2019, it was the first to propose a reconfigurable computing and storage integrated product architecture, which can increase computing throughput by 10-40 times compared to traditional AI chips. Currently, its reconfigurable computing and storage integrated chip prototypes have been tested or deployed in fields such as cloud computing, autonomous driving perception, image classification, and license plate recognition; its high-performance computing and storage integrated chip prototype was also the first in the country to pass internal testing at major internet companies.
Houmo Intelligence
Houmo Intelligence was founded in 2020, and in May 2023, Houmo officially launched the computing and storage integrated smart driving chip Houmo Hongtu H30, with a physical computing power of 256 TOPS and a typical power consumption of 35W. According to the results of Houmo Laboratory and MLPerf public testing, in the performance and power consumption comparison of ResNet50, the H30, which uses a 12nm process, has more than double the performance and more than 50% less power consumption compared to similar chips.
According to Houmo Intelligence's co-founder and Vice President of R&D, Chen Liang, Hongtu H30 has achieved six major technological breakthroughs with its innovative computing and storage integrated architecture, namely high computing power, full precision, low power consumption, automotive grade, mass production capability, and universality. Hongtu H30 is based on SRAM storage media and uses a digital computing and storage integrated architecture, with extremely low memory access power consumption and ultra-high computing density. Under the Int8 data precision condition, its AI core IPU has an energy efficiency ratio of up to 15 Tops/W, which is more than 7 times that of traditional architecture chips. At the same time, Houmo Intelligence's second-generation product, Hongtu H50, is being developed with full force and is expected to be launched in 2024, supporting customers' mass production models in 2025.
Edge and Small Computing Power Enterprises
Zhicun Technology
Zhicun Technology's solution is to redesign the storage device, using the physical characteristics of Flash memory storage cells, to transform and redesign the storage array and peripheral circuits to accommodate more data, while also storing the operators in the storage device, enabling each unit to perform analog calculations and directly output the calculation results, achieving the goal of computing and storage integration.
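One practical consequence of storing weights in Flash cells, as described above, is that each cell can only be programmed to a limited number of distinguishable states, so the analog dot product is computed with quantized weights. The sketch below illustrates this trade-off under illustrative assumptions (16 conductance levels, weights in [0, 1)); it is a generic model, not Zhicun's actual cell design.

```python
import numpy as np

# Illustrative model of analog compute with quantized Flash-cell weights:
# each ideal weight is mapped to the nearest of a fixed set of conductance
# levels before the in-array dot product, trading precision for efficiency.

def quantize(w: np.ndarray, levels: int = 16) -> np.ndarray:
    """Snap weights in [0, 1) to `levels` evenly spaced conductance states."""
    step = 1.0 / levels
    return np.round(w / step) * step

rng = np.random.default_rng(1)
w = rng.uniform(0.0, 1.0, size=64)  # ideal weights
x = rng.uniform(0.0, 1.0, size=64)  # input activations

exact = float(x @ w)                 # full-precision digital result
analog = float(x @ quantize(w))      # what the quantized array computes
rel_err = abs(analog - exact) / exact
print(f"relative error with 16 levels: {rel_err:.3%}")
```

Per-cell errors are bounded by half a quantization step and largely cancel across a long dot product, which is why analog compute-in-memory can hold useful accuracy at very low energy per operation.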
Zhicun Technology's computing and storage integrated chips have been designed into multiple wearable products, with annual sales expected to reach the millions. In 2020, Zhicun Technology launched the computing and storage integrated accelerator WTM1001, and in 2022 it launched the WTM2101, the world's first large-scale mass-produced computing-in-memory chip. The WTM2101 has been adopted by several internationally renowned companies for smart-voice and AI health-monitoring scenarios; compared with traditional chips, it offers significant advantages in computing power and power consumption, helping industry users strengthen on-device AI capabilities and accelerate adoption.
At present, Zhicun Technology's self-developed edge-side computing power chip WTM-8 series is about to be mass-produced. This series of chips can provide at least 24 Tops of computing power, with power consumption only 5% of similar market solutions, helping mobile devices to achieve higher performance in image processing and spatial computing. Around 2025, Zhicun Technology will launch the WTM-C series of products, which can be used for edge servers, etc. With the advancement of technology in integration scale and process, it is expected that computing-in-memory products will see an average annual increase of 5-10 times in computing power in the next few years.
Jiutian Ruixin

Jiutian Ruixin specializes in the research and development of neuromorphic sensing-computing integrated chips, providing new solutions for efficient, low-power operation of artificial intelligence systems. Its technology is widely used in fields with strong demands for low power consumption and low latency, such as AIoT, with AI chips for both audio and vision applications. Drawing on years of world-leading research and practical experience in the vision field, as well as R&D cooperation with and strategic investment from top global image sensor companies, Jiutian Ruixin has designed the ADA20X, an ultra-high-energy-efficiency (20 TOPS/W) SRAM-based sensing-computing integrated architecture chip that is broadly applicable in vision scenarios.
04
Compute-in-Memory Technology Is on the Verge of Large-Scale Application

With the continuous growth of AI computing power demand, compute-in-memory technology is approaching the point of large-scale mass production. As the technology matures and is commercially deployed at scale, its market is expected to grow explosively.

According to the QYResearch research team's "Global Compute-in-Memory Technology Market Report 2023-2029", the global market for compute-in-memory technology is projected to reach 30.63 billion US dollars by 2029, with a compound annual growth rate (CAGR) of 154.7% in the coming years. Behind this high growth rate is the technology's widespread application and deep integration in fields such as data processing, artificial intelligence, and the Internet of Things.

With the rapid development of big data, cloud computing, and artificial intelligence, compute-in-memory, as a key technology for efficient data storage and computation, is becoming increasingly important. Alongside this huge market opportunity, the challenges must also be acknowledged: compute-in-memory is a highly complex, comprehensive innovation, and the industry is not yet mature, with gaps across the industry chain such as insufficient upstream support and mismatched downstream applications. At the same time, overcoming these challenges can itself become a comprehensive competitive barrier for future innovators in the field.

In the future, as the technology advances and applications continue to expand, compute-in-memory will play an important role in more fields, injecting new momentum into global economic development and driving innovation and upgrading across the related industry chain.