Largest Computer Chip Accelerates Deep Learning

Artificial intelligence relies on algorithms that learn and adapt automatically, a process known as machine learning. Machine learning algorithms spot patterns in data and act on them, so they improve automatically from experience. Deep learning, a type of machine learning, uses neural networks: systems loosely modelled on the workings of the human brain. The key difference is that deep learning can optimize itself with little human intervention, whereas machine learning in general requires some manual input.
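The idea of "improving from experience" can be sketched in a few lines of Python. This toy example (not from any particular framework) trains a single artificial neuron to learn the rule y = 2x purely from example data, adjusting its one weight a little after every mistake:

```python
# Minimal sketch: a single artificial neuron learning y = 2x from examples.
# Illustrative only -- real deep learning stacks millions of such units.

def train(examples, epochs=200, lr=0.05):
    w = 0.0  # the weight the neuron adjusts as it "learns"
    for _ in range(epochs):
        for x, y in examples:
            pred = w * x          # the neuron's current guess
            error = pred - y      # how wrong the guess was
            w -= lr * error * x   # gradient-descent update: nudge w to reduce the error
    return w

examples = [(1, 2), (2, 4), (3, 6)]
print(round(train(examples), 2))  # converges to 2.0
```

A real neural network repeats this same update across vast numbers of weights, which is exactly the workload that demands so much processing power during training.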

However, the training procedure is extremely intensive, requiring huge amounts of processing power and energy. GPU designers such as Nvidia have become ever more important in the artificial intelligence industry, as GPUs have proved well suited to the mathematics behind deep learning. As a result, training normally involves multiple GPUs bundled together to speed up the process.
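The mathematics GPUs are so well suited to is, at its core, matrix arithmetic. A hedged sketch (plain Python, standing in for what a GPU does across thousands of cores at once): a neural-network layer's output is a matrix-vector product, and each output element can be computed independently, which is why hardware with many parallel cores accelerates it so well.

```python
# Sketch: the core operation GPUs accelerate.
# Each row of the weight matrix produces one output value independently,
# so all rows can be computed in parallel on separate cores.

def layer(weights, inputs):
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

weights = [[1, 0],
           [0, 1],
           [2, 3]]
print(layer(weights, [4, 5]))  # [4, 5, 23]
```

Bundling GPUs together, as described above, spreads these independent row computations across even more cores, at the cost of moving data between chips.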

As you can imagine, chips are generally very small, fitting into devices as compact as a smartphone. Cerebras, a California-based start-up, aims to take the chips used for training in the opposite direction. Its Cerebras Wafer-Scale Engine (WSE) measures 215 mm on a side, making it the world's largest chip ever built, more than 56 times the size of any other. The WSE packs 1.2 trillion transistors and 400,000 AI-optimized cores; by comparison, the largest GPU has 21.1 billion transistors.

Cerebras Wafer-Scale Engine in comparison to the largest AI chip.

To build this enormous “chip”, Cerebras worked with TSMC, a contract chip manufacturer whose clients include Apple and Nvidia. Normally, chips are made from a circular silicon wafer: many small chips are printed onto the wafer in a grid, and the wafer is then cut up to produce around 100 separate chips. For Cerebras’s design, a new process was needed in which a single giant chip is patterned across the wafer, and the largest possible square is then cut out to produce the WSE.

So why is it so good? Andrew Feldman, CEO of Cerebras, claims the WSE can match the processing speed of hundreds of GPUs combined while consuming far less energy and space. It also contains more on-chip memory than any other chip in the world, with over 18 gigabytes of memory circuits. Feldman adds that data can move around the chip roughly 1,000 times faster than it can between conventional GPUs connected together.

Feldman has not yet discussed the price of the chip, or of the complete server built around it, but Jim McGregor, founder of Tirias Research, speculates that a system might cost millions of dollars. Moreover, data centers may need to be modified to accommodate the servers, adding to the cost. The investment is therefore likely to suit larger companies such as Amazon or Facebook.