Ever since the launch of OpenAI’s sensational chatbot ChatGPT, conversations about artificial intelligence have become common everywhere from living rooms to boardrooms. When computers were invented, they were machines that executed instructions given by programmers. Now, computers have gained the ability to learn, think and hold conversations. Not only that, they can perform several creative and intellectual tasks once limited to humans. This is what we call generative AI. The ability of generative AI models to “converse” with humans and predict the next word or sentence is due to something known as the large language model, or LLM. It is to be noted that while not all generative AI tools are built on LLMs, all LLMs are a form of generative AI, which is itself a broad and ever-expanding category of AI. To grasp the science behind ChatGPT’s capabilities, it is crucial to understand what an LLM is.

What is an LLM?

According to Google, LLMs are large, general-purpose language models that can be pre-trained and then fine-tuned for specific purposes. In simple words, these models are trained to solve common language problems across industries, such as text classification, question answering, text generation and document summarisation. LLMs can also be tailored to solve specific problems in a variety of domains such as finance, retail and entertainment, often using relatively small field-specific datasets. The term can be understood through the model’s primary features. First, the ‘large’ points to two things: the enormous size of the training data, and the parameter count. In machine learning, parameters are essentially the memories and knowledge a model acquires during training, and they define its skill at solving a specific problem. (Parameters should not be confused with hyperparameters, which are settings chosen before training begins, such as the learning rate.) The second most important thing to understand about LLMs is that they are general-purpose.
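The idea that parameters are values learned from data, while settings such as the learning rate are chosen beforehand, can be illustrated with a deliberately tiny one-parameter model trained by gradient descent. This is a minimal sketch with invented data; real LLMs learn billions of parameters, but the principle is the same.

```python
# A one-parameter model: prediction = w * x, trained to fit y = 2x.
# The value of w is a "parameter": knowledge the model extracts from
# data during training. The learning rate, by contrast, is a setting
# we choose before training begins (a hyperparameter).
data = [(1, 2), (2, 4), (3, 6)]  # invented (x, y) pairs where y = 2x

w = 0.0               # parameter: starts uninformed, learned from data
learning_rate = 0.05  # chosen by us, not learned

for _ in range(200):  # repeatedly adjust w to reduce prediction error
    for x, y in data:
        error = w * x - y
        w -= learning_rate * error * x  # gradient step for squared error

print(round(w, 2))  # the trained parameter settles near 2.0
```

After training, the model has "learned" that outputs are roughly twice the inputs, and that knowledge lives entirely in the parameter `w`.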
This means the model can solve general problems grounded in the commonalities of human language, regardless of the specific task or resource constraints. In essence, an LLM is like a very capable computer program that can comprehend and create human-like text. It is trained on massive datasets, from which it learns the patterns, structures and relationships within language. An LLM can also be seen as a tool that helps computers understand and produce human language.

How many types of LLMs are there?

There are various ways to categorise LLMs, depending on which aspect of the models is considered. On the basis of architecture, there are three broad types: autoregressive, transformer-based and encoder-decoder. GPT-3 is an example of an autoregressive model, as such models predict the next word in a sequence based on the previous words. LaMDA and Gemini (formerly Bard) are described as transformer-based, as they use a specific type of neural network architecture for language processing. Then there are encoder-decoder models, which encode input text into an internal representation and then decode it into another language or format. Based on training data, there are three types of LLMs: pre-trained and fine-tuned models; multilingual models, which can understand and generate text in multiple languages; and domain-specific models, which are trained on data from particular fields such as law, finance or healthcare. LLMs also vary by size: larger models usually require more computational resources, but they offer better performance. They can further be categorised as open-source or closed-source based on availability, as some are freely available while others are proprietary. LLaMA 2, BLOOM, Google BERT, Falcon 180B and OPT-175B are some open-source LLMs, while Claude 2, Bard and GPT-4 are some proprietary LLMs.

How do LLMs work?

At the core of it is a technique known as “deep learning”.
It involves training artificial neural networks, mathematical models loosely inspired by the structure and function of the human brain. For LLMs, the neural network learns to predict the probability of a word, or sequence of words, given the previous words in a sentence. As mentioned earlier, this is done by analysing the patterns and relationships between words in the training dataset. Once trained, an LLM can predict the most likely next word or sequence of words based on inputs, also known as prompts. An LLM’s learning can be compared to how a baby learns to speak: you don’t give a baby an instruction manual; the child learns to understand language by listening to people speak.

What can LLMs do?

LLMs come with an array of applications across domains. They generate text and are capable of producing human-like content for purposes ranging from stories and articles to poetry and songs. They can strike up a conversation or function as virtual assistants. Thanks to their rigorous training on expansive datasets, they show proficiency in language-understanding tasks, including sentiment analysis, language translation and summarisation of dense texts. In conversational settings, LLMs engage with users, providing information, answering questions and maintaining context over multiple exchanges. Additionally, they play a crucial role in content creation and personalisation, aiding marketing strategies, offering personalised product recommendations and tailoring content to specific target audiences.

What are the advantages of LLMs?

Perhaps the biggest advantage of LLMs is their versatility: a single model can be used for a wide variety of tasks. Since they are trained on large datasets, they are capable of generalising patterns that can later be applied to different problems or tasks.
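The mechanism described earlier, learning next-word probabilities from patterns in text and then repeatedly predicting the most likely next word, can be sketched with simple bigram counts. The corpus below is invented for illustration; a real LLM replaces the counting with a neural network over the entire preceding context, but the predict-append-feed-back loop has the same shape.

```python
from collections import Counter, defaultdict

# Tiny invented "training corpus".
corpus = "the cat sat on the mat and the cat sat by the fire".split()

# Training: count which word follows which. These counts play the
# role of the model's learned parameters.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word_probs(word):
    """Probability of each candidate next word, given the previous word."""
    counts = follows[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def generate(prompt, steps):
    """Autoregressive generation: predict, append, feed the output back in."""
    words = prompt.split()
    for _ in range(steps):
        probs = next_word_probs(words[-1])
        if not probs:  # no continuation seen during training
            break
        words.append(max(probs, key=probs.get))  # most likely next word
    return " ".join(words)

print(next_word_probs("the"))  # 'cat' is twice as likely as 'mat' or 'fire'
print(generate("the", 2))      # "the cat sat"
```

Prompting this toy model with "the" yields "the cat sat", because those are the statistically likeliest continuations in its training text; an LLM does the same at vastly larger scale.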
When it comes to data, LLMs can reportedly perform well even with limited amounts of domain- or industry-specific data, because they can leverage the knowledge learned from general language training data. Another important aspect is their ability to continuously improve: as more data and parameters are added, their performance gets better. LLMs are continuously developing and expanding into new dimensions. The above information has been compiled based on popular definitions and an understanding of the underlying technology that fuels these AI models. Watch this space to learn more about LLMs and AI as they continue to evolve.