ChatGPT can engage us in engrossing conversations, coming up with interesting answers to a variety of questions. What you see and interact with is the chat interface, much like WhatsApp or any other messenger, and the one you are conversing with is the AI. In this case, the AI is GPT-3.5 or GPT-4, the large language model (LLM) that powers the chatbot. But have you ever wondered how the chatbot 'talks' to you? How does it remember the earlier prompts you shared during the conversation?

The amount of conversation a chatbot can remember depends on something called the context window. While we can read everything we input on the chat interface, it doesn't quite work the same way for the chatbot. The text it can 'see' or 'read' at a given moment is its context window. By this simple definition, it would seem that the larger the context window, the better, because the AI model can then see more information and give better responses. But that is not always the case. We explain why.

What are context windows?

As stated earlier, the amount of conversation that an AI can read and write at any given time is called the context window, and it is measured in units called tokens. During OpenAI's Dev Day in 2023, Sam Altman announced GPT-4 Turbo with a massive context window of 128K tokens, which translates to around 300 pages of a book.

In LLMs, tokens are the basic units of data these models process, and the context window is the maximum number of tokens a model can consider at once when generating text. For text, a token can be a word, a part of a word, or even a character. This largely depends on the tokenisation process (the process of converting text into a format that can be used as input to a machine learning model) employed. A rule of thumb is that one token corresponds to approximately four characters of English text, or around three-fourths of a word, which makes 100 tokens roughly equal to 75 words. So, 32,000 tokens would be equivalent to about 1,28,000 characters (a rough estimate).

To show how it tokenises text, OpenAI offers a tool where one can input text and see how it translates into tokens. We used the sentence 'The capital of India is New Delhi.' Here we have 7 words and 34 characters, and the tool translated this into 8 tokens. The tokenisation process varies across LLMs.

Why are context windows important?

According to Google DeepMind researchers, context windows are crucial because they help AI models recall information during a session. It is context windows that help LLMs capture the contextual nuances of language, enabling these models to understand and generate human-like responses.

How do context windows work?

In simple words, a context window operates like a sliding window over the input text, taking in a limited span of words at a time. The size of the context window is a key parameter, as it determines the scope of contextual information the AI system can assimilate. Think of it as reading a book through a moving frame: the window slides over the text, analysing a few words at a time. Each word is represented as a code capturing its meaning, and the programme considers the words within the window to understand how they relate to one another. The two sketches below illustrate tokenisation and the sliding window in code.
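As a quick illustration of the first step, here is a minimal sketch of tokenisation using OpenAI's open-source tiktoken library, applied to the same sentence we tried in the tool. The token boundaries are those of the GPT-4 tokeniser; other LLMs would split the same text differently.

```python
# pip install tiktoken
import tiktoken

# Load the tokeniser that GPT-4 uses (the cl100k_base encoding).
enc = tiktoken.encoding_for_model("gpt-4")

text = "The capital of India is New Delhi."
token_ids = enc.encode(text)

# How many tokens the 34-character sentence becomes, and what each one is.
print(len(token_ids), "tokens")
print([enc.decode([t]) for t in token_ids])
```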
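And here is a toy version of the sliding window itself, with ordinary words standing in for token IDs. The window size of four is an arbitrary choice for illustration; a real model's window spans thousands of tokens.

```python
# A toy sliding window over a tokenised sentence. At each step, the
# words inside the window are all the model would be able to 'see'.
tokens = "The capital of India is New Delhi .".split()

WINDOW_SIZE = 4  # illustrative; real context windows hold thousands of tokens
for start in range(len(tokens) - WINDOW_SIZE + 1):
    print(tokens[start:start + WINDOW_SIZE])
```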
What's in the size?

Months after Sam Altman announced the 128K-token window for GPT-4 Turbo, Google announced its AI model Gemini 1.5 Pro with a context window of up to 1 million tokens.

While larger windows can mean better performance or accuracy, the benefits may hit a stagnation point, and too big a window may mean irrelevant information gets included. The advantages of a bigger context window are that it allows models to reference more information, understand the flow of a narrative, maintain coherence in longer passages, and generate contextually rich responses. On the other hand, the most apparent disadvantage of a large window is the massive computational power it requires during training and inference, along with escalating hardware requirements and costs. With large context windows, AI models may also end up repeating or contradicting themselves. Greater computational power also spells an increased carbon footprint, a looming concern in sustainable AI development. Besides, training models with large context windows translates to significant usage of memory bandwidth and storage, which could mean that only large corporations can afford to invest in the costly infrastructure.
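To put the hardware problem in rough numbers, here is a sketch that estimates the memory consumed by just one component of inference, the attention key-value cache, as the context grows. The model dimensions below are illustrative assumptions, loosely in the range of a large open-weight model; the actual specifications of GPT-4 Turbo and Gemini 1.5 Pro are not public.

```python
def kv_cache_bytes(context_tokens: int,
                   n_layers: int = 80,       # assumed layer count, for illustration
                   n_heads: int = 64,        # assumed attention heads
                   head_dim: int = 128,      # assumed dimension per head
                   bytes_per_value: int = 2  # 16-bit precision
                   ) -> int:
    """Rough size of the attention key-value cache: two tensors (keys
    and values) per layer, each holding one vector per head per token."""
    return 2 * n_layers * n_heads * head_dim * context_tokens * bytes_per_value

for ctx in (4_000, 32_000, 128_000, 1_000_000):
    gib = kv_cache_bytes(ctx) / 2**30
    print(f"{ctx:>9,} tokens -> ~{gib:,.0f} GiB of KV cache")
```

Under these assumptions, memory grows in lockstep with the window, from about 10 GiB at 4,000 tokens to over 2,000 GiB at a million. Real deployments shrink the bill with techniques such as grouped-query attention and quantisation, but the linear growth is exactly the pressure on hardware and cost described above.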