FRANCESCO GADALETA
ARE LARGE LANGUAGE MODELS THE ULTIMATE DATABASE?
In a recent article, software engineer François Chollet (the mind behind the Python library Keras) put forward a bold interpretation of large language models (LLMs): he claimed that modern LLMs act like databases of programs. Chollet’s interpretation might on the surface sound strange, but I believe there are many ways in which his analogy is an astute one. In this article, I’ll explore whether LLMs really are a new breed of database – and dive deep into the intricate structure of LLMs, revealing how these powerful algorithms exploit concepts from the past.
FRANCESCO GADALETA is a seasoned professional in the field of technology, AI and data science. He is the founder of Amethix Technologies, a firm specialising in advanced data and robotics solutions. Francesco also shares his insights and knowledge as the host of the podcast Data Science at Home. His illustrious career includes a significant tenure as the Chief Data Officer at Abe AI, which was later acquired by Envestnet Yodlee Inc. Francesco was a pivotal member of the Advanced Analytics Team at Johnson & Johnson. His professional interests are diverse, spanning applied mathematics, advanced machine learning, computer programming, robotics, and the study of decentralised and distributed systems. Francesco’s expertise spans domains including healthcare, defence, pharma, energy, and finance.
THE ORIGINS OF LLMS
Before we explore Chollet’s analogy, let’s consider his take on the development of LLMs’ core methodology. Chollet sees this development as stemming from a key advancement in the field of natural language processing (NLP) made a decade ago by Tomáš Mikolov and his colleagues at Google. Mikolov introduced the Word2vec algorithm, which solved the problem of how to represent and compute with text numerically. The Word2vec
algorithm works by translating words, phrases and paragraphs into vectors called ‘word vectors’ and then operating on those vectors. Today, of course, this is a familiar concept, but back in 2013 it was highly innovative (though I should point out that the concept of embeddings was already well known to academics and researchers from papers published before 2013). Back then, these word vectors could do things like arithmetic.
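To make that idea concrete, here is a minimal sketch of word-vector arithmetic in Python. It uses the gensim library and its downloadable pretrained Google News vectors – tools of my choosing, not ones named by Chollet or Mikolov – simply to illustrate the kind of operation Word2vec made possible.

import gensim.downloader as api

# Load pretrained 300-dimensional Word2vec vectors trained on the
# Google News corpus (a large download on first use).
vectors = api.load("word2vec-google-news-300")

# The classic analogy: king - man + woman ≈ queen.
# most_similar() sums the 'positive' vectors, subtracts the
# 'negative' ones, and returns the nearest words by cosine similarity.
result = vectors.most_similar(positive=["king", "woman"],
                              negative=["man"], topn=1)
print(result)  # e.g. [('queen', 0.71)]

The point is that meaning becomes geometry: relationships between words turn into directions in vector space that can be added and subtracted.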