MIT Researchers Deliver Faster AI Creation
When a programmer creates an artificial neural network, what they are really doing is designing a network that trains itself to map given inputs to given outputs. However, a human still has to design the structure of the network: for example, how many neurons it has, how many layers it uses, and which activation functions it applies.
To get around this, researchers designed algorithms, called NAS (Neural Architecture Search), that can find network architectures well suited to a given problem. Programs that design AI - sounds scary, doesn't it? There really is no need to fear, though: these algorithms required a massive amount of computing power and time. To give you a sense of how much, it took a bank of GPUs at Google 48,000 GPU hours to work through their state-of-the-art algorithm and produce a single convolutional neural network for image recognition.
However, in a paper recently published at an international conference, a team of 3 researchers from MIT describes an algorithm that cuts this time to just 200 GPU hours (roughly 1/240th of what Google needed with their best algorithm). Song Han, a co-author of the paper, says the team's aim is to "democratise AI" and create "push-button solutions" for experts and non-experts alike.
This speed-up is achieved mainly through two techniques. The first, called "path-level binarization", avoids storing all candidate paths in memory at once; instead, the algorithm samples and processes one candidate path at a time. The second technique is "path-level pruning". Usually, when a NAS algorithm encounters unnecessary neurons, it deletes them one by one; the researchers realised they could speed up the entire process by deleting whole candidate paths of the architecture at once, without losing any quality.
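As a rough illustration of the memory saving, the sketch below contrasts a naive search step, which evaluates and holds every candidate path's output at once, with a binarized step that samples a single active path per step. The candidate operations here are toy stand-ins invented for illustration, not the paper's actual search space or implementation.

```python
import random

# Hypothetical candidate operations for one layer of the search space.
# In a real search these would be, e.g., convolutions with different
# kernel sizes; simple arithmetic functions stand in for them here.
CANDIDATE_OPS = {
    "op_3x3": lambda x: x * 2,
    "op_5x5": lambda x: x * 3,
    "op_skip": lambda x: x,
}

def full_supernet_step(x):
    """Naive approach: evaluate every candidate path and keep all of
    their outputs in memory at once (averaged here for simplicity)."""
    outputs = [op(x) for op in CANDIDATE_OPS.values()]  # all paths live at once
    return sum(outputs) / len(outputs)

def binarized_step(x, rng=random):
    """Path-level binarization (sketch): sample a single active path,
    so only one candidate operation is instantiated and processed."""
    name = rng.choice(list(CANDIDATE_OPS))
    return CANDIDATE_OPS[name](x), name
```

The memory footprint of the binarized step grows with one path rather than with the whole set of candidates, which is the core of the saving.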
Another benefit of this innovation is that the resulting convolutional neural networks run about 1.8x faster on certain hardware than counterparts generated by other NAS algorithms. This is done by discarding parts of the network that have no noticeable effect on the final result, so each time you use the network, less computation is needed to produce an output. It also means the network trains, and gains accuracy, faster.
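As a generic illustration of discarding parts of a network without much effect on the result, the sketch below zeroes out the smallest-magnitude weights. This is ordinary magnitude pruning, shown only as a stand-in; the MIT work prunes whole candidate paths of the architecture rather than individual weights.

```python
def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of a flat weight list.

    Weights near zero contribute little to the output, so removing
    them cuts computation with minimal impact on accuracy.
    """
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    cutoff = sorted(abs(w) for w in weights)[k - 1]
    return [w if abs(w) > cutoff else 0.0 for w in weights]
```

For example, pruning `[0.1, -2.0, 0.05, 3.0]` at 50% sparsity keeps only the two largest-magnitude weights, `-2.0` and `3.0`.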
The possible consequences of this are exciting: small startups can create an optimal neural network for a given task far more efficiently than before, cutting both the time needed and the cost. However, there is a flip side: it could lead to people being laid off by firms. For example, designing a neural network for a project that previously required a team of 6 people might now be handled by one person using this algorithm.