Neural networks have become so associated with machine learning that we computer scientists sometimes forget the origin of the name. The silicon neural networks we work with were inspired by the biological neural networks in our very own brains. And while we may have made remarkable strides towards matching the ability of the computational engines in our skulls, we still have a long way to go: as M. Mitchell Waldrop describes in this Nature article, the brain “can carry out computations that challenge the world's largest supercomputers […] in a package that is smaller than a shoebox, consumes less power than a household light bulb, and contains nothing remotely like a central processor.”
In recent years, some artificial intelligence researchers have gone back to the brain for further inspiration. They seek something that can be transferred to artificial neural networks, something that will allow us to unlock more of the raw efficiency and performance tantalizingly promised by biology. One idea that has stood out is sparse coding. Peter Kloppenburg and Martin Paul Nawrot’s 2014 paper summarizes sparse coding as when “a specific stimulus [in the brain’s sensory system] activates only a few spikes in a small number of neurons.” In other words, although the brain’s neural network is densely connected, most stimuli only activate (“spike”) a small number of neurons, thus leading to a sparse representation of a stimulus in the brain.
Sparse coding has spent the past several years as a cutting-edge, but not quite practical, approach to improving AI technology. ThirdAI’s new sparse coding implementation finally delivers the promise of a brain-inspired neural network that isn’t just different from its more traditional peers, but is actually better than them.
If sparse coding is why the brain is so efficient, why aren’t we already using it to train our neural networks? As it turns out, naïve implementations of sparse coding actually slow down, rather than speed up, training. The problem is that in order to port sparse coding from brain to computer, we need to teach our computers the brain’s trick of knowing exactly which few neurons to spike. The intuitive way to identify the highest activation neurons that we want to spike is to have our computer first compute all the activations of the many candidate neurons, and then sort them, selecting the best from there. In practice, however, these computational tasks are more expensive than just doing a dense matrix multiplication against all neurons, so modern neural networks do not use sparse coding. The brain, however, figures out the highest activation neurons to spike as if by magic, without communicating with and comparing all candidate neurons.
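To see why the intuitive approach fails, here is a minimal sketch (in NumPy, with made-up layer sizes) of naïve top-k neuron selection. Note that the very first line already performs the full dense matrix multiplication over every candidate neuron, so the “sparse” output saves nothing over a dense layer:

```python
import numpy as np

def naive_sparse_forward(x, W, b, k):
    """Naive 'sparse coding': compute ALL activations, then keep the top k.
    This defeats the purpose -- the full matmul below is already the
    dominant cost of a dense layer, and the selection step adds overhead."""
    activations = W @ x + b                         # dense matmul over every neuron
    top_k = np.argpartition(activations, -k)[-k:]   # indices of the k largest
    sparse_out = np.zeros_like(activations)
    sparse_out[top_k] = activations[top_k]          # spike only the top-k neurons
    return sparse_out, top_k
```

Selecting the winners this way costs a full dense pass plus a partial sort, which is exactly why naïve sparse coding trains slower than a plain dense network.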
But replicating that biological magic trick is no longer out of reach. With clever use of data structures and memory lookups rather than expensive brute-force computations, we can achieve the efficiency promised by sparse coding in the brain. Essentially, instead of exhaustively evaluating and then sorting neuron activations, we can reorganize the neurons themselves in computer memory so that neurons with similar activation patterns are stored close together. As neurons get updated, they adjust their locations. When training, we receive the activations from the previous layer and query memory for a cluster of neurons that are similar to the current activation pattern. We can then spike the returned neurons, all without performing an exhaustive search! See this talk for much more detail on how this can be achieved.
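One well-known way to realize this idea is locality-sensitive hashing. The sketch below (illustrative only, not ThirdAI's actual BOLT implementation) uses random-hyperplane SimHash to bucket neurons by their weight vectors, so that a single hash of the incoming activation pattern retrieves candidate high-activation neurons without scanning them all:

```python
import numpy as np

class SimHashNeuronIndex:
    """Bucket neurons by a random-hyperplane (SimHash) signature so that
    neurons likely to have a high activation for an input can be fetched
    with one hash lookup instead of a full dense pass.
    Illustrative sketch under assumed shapes: weights is (num_neurons, dim)."""

    def __init__(self, weights, num_planes=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((num_planes, weights.shape[1]))
        self.buckets = {}
        for i, w in enumerate(weights):
            # Neurons whose weight vectors point in similar directions
            # land in the same bucket with high probability.
            self.buckets.setdefault(self._hash(w), []).append(i)

    def _hash(self, v):
        # Sign pattern of v against the random hyperplanes -> integer code.
        bits = (self.planes @ v) > 0
        return sum(1 << i for i, b in enumerate(bits) if b)

    def query(self, x):
        # Neurons that hash like x tend to have a large dot product with x.
        return self.buckets.get(self._hash(x), [])
```

Because a neuron's bucket depends only on its own weight vector, updated neurons can be re-hashed into new buckets as training proceeds, matching the "neurons adjust their locations" intuition above.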
The first documented idea of using hash tables for neural networks was described in this award-winning talk at NIPS 2014 (9 Dec 2014; see slide 5).
Even with more advanced memory-based frameworks, most information retrieval algorithms for retrieving similar neurons simply have too much overhead compared to optimized dense matrix multiplication algorithms. For every sample and for every layer, a sparse coding algorithm must query the neuron similarity index and return a list of neurons to spike, a performance cost that quickly adds up. At ThirdAI, we’ve built an efficient associative memory step that makes our neural networks faster than the state of the art.
Instead of regular memory access followed by many arithmetic operations, our BOLT Engine manipulates how the parameters of the neural network are organized and accessed in memory. We can get away with just a small number of arithmetic operations (in the form of hash computations), followed by a few irregular memory accesses, followed by a tiny amount of machine learning arithmetic. As a result, we use orders of magnitude fewer FLOPS on a standard CPU to achieve the same accuracy as traditional algorithms on a GPU. We’d love for you to try our techniques out, and experience for yourself the miraculous efficiency of sparse coding. After all, you’re already using it in your brain to read this right now.
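The payoff of this hash-then-compute pipeline can be sketched in a few lines. Once a lookup has returned a small set of active neurons (however it was produced), the remaining arithmetic scales with the number of active neurons rather than the layer width. This is an illustrative sketch, not BOLT's internals:

```python
import numpy as np

def sparse_forward(x, W, b, active):
    """Compute activations only for the 'active' neurons returned by a
    hash-table lookup. The arithmetic cost scales with len(active),
    not with the total number of neurons in the layer."""
    out = np.zeros(W.shape[0])
    out[active] = W[active] @ x + b[active]   # few irregular memory accesses
    return out

# Rough multiply-add count for one layer with n neurons of input dim d:
#   dense:   ~n * d
#   sparse:  ~len(active) * d, plus a handful of hash computations
```

With, say, 1% of neurons active, the layer's arithmetic shrinks by roughly 100x, which is where the orders-of-magnitude FLOP savings come from.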