Ahoy there 🚢,
Matt Squire here, CTO and co-founder of Fuzzy Labs, and this is the 2nd edition of MLOps.WTF, a fortnightly newsletter where I discuss topics in Machine Learning, AI, and MLOps.
Super-intelligent aliens, electronic brains, and the end of GPUs
I believe there's a non-trivial chance that the Hungarian mathematician and physicist John von Neumann was actually a super-intelligent alien sent here to study our species. I just can't convince any historians to take me seriously.
Among his many revolutionary ideas was the blueprint for computer design that is still used today, largely unchanged for 80 years. In the von Neumann architecture, there's a random-access memory in which programs and data are stored, and a processor that executes instructions. These instructions represent different kinds of operation, like adding two numbers, branching, or reading from and writing to memory.
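To make that concrete, here's a toy fetch-decode-execute loop in Python. The three-instruction "ISA" is made up purely for illustration, but the shape is the point: program and data share one memory, and a single processor steps through it one instruction at a time.

```python
# A toy von Neumann machine: one memory holds both the program and the data,
# and a single processor loop fetches and executes instructions one at a time.
# (Hypothetical three-instruction ISA, purely for illustration.)

memory = [
    ("LOAD", 100),   # load the value at address 100 into the accumulator
    ("ADD", 101),    # add the value at address 101
    ("STORE", 102),  # write the accumulator back to address 102
    ("HALT", None),
] + [0] * 96 + [2, 3, 0]  # data lives in the same memory: addresses 100-102

acc, pc = 0, 0  # accumulator and program counter
while True:
    op, addr = memory[pc]   # fetch
    pc += 1
    if op == "LOAD":        # decode + execute
        acc = memory[addr]
    elif op == "ADD":
        acc += memory[addr]
    elif op == "STORE":
        memory[addr] = acc
    elif op == "HALT":
        break

print(memory[102])  # 5
```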
Perhaps you're thinking that's an obvious idea. But that's only because this is the model we're all used to now: no matter what programming language we use, it all reduces down to simple instruction sequences that get loaded into memory and executed by a processor.
This architecture is universal, which means it can run any program you can think of. From games, to advanced text editors (I obviously mean Emacs), to large language models, it's all possible. And we can build enough abstraction layers that when you run a Hugging Face pipeline, you don't have to think about the insane machinations necessary to make that work on the hardware.
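As an example, the handful of lines below (the model choice is arbitrary) hides tokenisation, tensor operations, and device placement behind a single call:

```python
from transformers import pipeline

# One high-level call; tokenisation, batching, and the underlying
# tensor operations on CPU or GPU are all handled for us.
generator = pipeline("text-generation", model="distilgpt2")
print(generator("The von Neumann architecture is", max_new_tokens=20)[0]["generated_text"])
```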
Yet despite its flexibility, it isn't always the most efficient. If you want to search a database index, it's perfect, but if you want to train a neural network, it turns out to be a very poor choice. The problem lies in the strong separation of state (memory) and computation, making training and inference strongly IO-bound. In nature, there's no such bottleneck: every neuron contains its own state and processing capability, so instead of one big processor with one block of memory, we have billions of small processors, each with its own internal memory.
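A rough back-of-envelope sketch makes the bottleneck visible: generating a single token with a large model touches every weight once, so the time is dominated by moving bytes between memory and processor rather than by the arithmetic itself. The hardware numbers below are illustrative assumptions, not a benchmark of any particular chip.

```python
# Back-of-envelope: why single-token inference is memory-bound
# on a von Neumann machine. Illustrative numbers only.
params = 7e9            # parameters in a 7B model
bytes_per_param = 2     # fp16 weights
flops_per_param = 2     # one multiply + one add per weight in a matvec

bytes_moved = params * bytes_per_param   # ~14 GB fetched per token
flops = params * flops_per_param         # ~14 GFLOPs per token

mem_bandwidth = 1e12    # assume 1 TB/s of memory bandwidth
compute = 100e12        # assume 100 TFLOP/s of fp16 compute

print(f"time limited by memory:  {bytes_moved / mem_bandwidth * 1e3:.1f} ms/token")
print(f"time limited by compute: {flops / compute * 1e3:.3f} ms/token")
```

Under these assumptions the memory side is around a hundred times slower than the compute side, which is the von Neumann bottleneck in a nutshell.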
Essentially we've borrowed an information processing paradigm from nature, and forced it onto an entirely unsuitable architecture.
This all makes sense when you consider energy use. While ChatGPT needs a nation state's worth of power, a bowl of porridge in the morning is enough for your brain to outperform it at most tasks.
For this reason, many people have asked whether the neural architecture found in nature can be replicated in silicon, potentially unlocking more speed and scale, with lower energy consumption. This so-called neuromorphic hardware might just be what renders GPUs obsolete for AI.
But wait, don't rip that RTX 4090 out of your PC yet! Despite some promising results, we're still some way from true commercial viability.
One offering comes from Intel, whose Loihi chip provides 128 "neuromorphic cores". Each core originally hosted 1024 neurons; Loihi 2 increased that to 8192. Intel recently demonstrated how far Loihi can scale with Hala Point, which combines 1152 Loihi 2 chips to simulate a total of 1.15 billion neurons.
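Those figures roughly hang together if you multiply them out (taking 8192 neurons as the per-core maximum):

```python
cores_per_chip = 128
neurons_per_core = 8192          # Loihi 2 maximum
chips_in_hala_point = 1152

total = chips_in_hala_point * cores_per_chip * neurons_per_core
print(f"{total / 1e9:.2f} billion neurons")  # ~1.21 billion at the theoretical max,
                                             # in line with the quoted 1.15 billion
```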
Another notable effort is SpiNNaker, a research project from the University of Manchester that aims to simulate brain-like architectures, with applications in robotics, edge computing, and neuroscience research. In its current form it consists of ~1 million ARM processor cores, taking up an entire room filled with server racks.
While the larger-scale demonstrations like Hala Point and SpiNNaker aren't really practical outside of the lab, at the lower end of the scale, something like a Loihi chip is certainly viable for edge applications, where models are small and reducing power consumption is important.
Now, since this is an MLOps newsletter, it would be remiss of me not to discuss the tooling ecosystem. The Open Neuromorphic Project aims to bring together collaborators across academia and industry to build open source tools, platforms, and standards for running models on neuromorphic hardware. They maintain a list of frameworks for model training and deployment.
To finish off, there is one important detail that I've missed out: these processors are designed to run a particular flavour of network, known as a spiking neural network, which happens to be a very different approach to the one we're all familiar with, meaning there's no straightforward way to just run Mistral on this strange new hardware.
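To give a flavour of what "spiking" means, here's a minimal leaky integrate-and-fire neuron, the textbook spiking model rather than any particular chip's implementation: the membrane potential leaks a little each timestep, integrates incoming current, and emits a binary spike whenever it crosses a threshold.

```python
import numpy as np

# Minimal leaky integrate-and-fire (LIF) neuron: the textbook spiking model.
# State (the membrane potential) lives with the neuron, and communication
# happens through discrete spikes rather than continuous activations.
def lif(input_current, beta=0.9, threshold=1.0):
    potential, spikes = 0.0, []
    for current in input_current:
        potential = beta * potential + current   # leak, then integrate
        if potential >= threshold:               # fire...
            spikes.append(1)
            potential = 0.0                      # ...and reset
        else:
            spikes.append(0)
    return spikes

rng = np.random.default_rng(0)
print(lif(rng.uniform(0, 0.5, size=20)))  # a sparse train of 0s and 1s
```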
And while there is some work to bridge transformer models with spiking networks, what's really on offer with neuromorphic computing is an entirely new way of looking at machine learning, one where our models can continuously learn and adapt, and where the lines between inference and training become fuzzy.
What's clear is that the paradigm followed by the current generation of large language models is power hungry and data hungry in a way that limits our ability to continue improving through scale alone. Whether neuromorphic computing offers an alternative paradigm free from these constraints remains to be seen.
And finally
Assorted things of interest
You may remember the furore a couple of years ago, when an AI-generated image won an art competition at the Colorado State Fair. Well, last week, justice was restored, when Miles Astray won an AI photo competition with a genuine photo of a seemingly headless flamingo. Of course, after coming clean, Astray's photo was disqualified from the competition. While he has no intention of appealing the decision, I imagine that if he did, he wouldn't have a leg to stand on.
If you liked Martin Scorsese's 3.5 hour long masterpiece The Irishman, but were disappointed that it teaches you nothing about the current state of AI, then Andrej Karpathy has you covered. In his new 4 hour epic "Let's reproduce GPT-2", he covers the end-to-end process of building an LLM from scratch. Grab your popcorn and warm up your GPU 🍿
Thanks for reading!
Matt
About Matt
Matt Squire is a human being (we swear), programmer, and tech nerd who likes AI and MLOps. Matt enjoys unusual programming languages, dabbling with hardware, and computer science esoterica. He's the CTO and co-founder of Fuzzy Labs, an MLOps company based in the UK.
Each edition of the MLOps.WTF newsletter is a deep dive into a certain topic relating to productionising machine learning. If you'd like to suggest a topic, drop us an email!