How OpenAI’s Project Strawberry promises to be AI’s next big breakthrough

An illustration of a high-tech AI lab, where a big strawberry is being analysed by advanced AI systems. (Image source: DALL-E text-to-image model created by OpenAI)

OpenAI, the world’s premier artificial intelligence research organisation, will likely release its most powerful AI model this fall (September-November), and could integrate it into ChatGPT-5, the next version of ChatGPT, the chatbot and virtual assistant that it launched in late 2022.

The secretive project, on which OpenAI has long been working, was earlier known as Project Q* (Q-star), and is now codenamed Project Strawberry. It is expected to feature autonomous Internet research and dramatically improve AI reasoning capabilities, and has been billed as OpenAI’s push to create Artificial General Intelligence (AGI), that is, AI with capabilities similar to those of the human brain.

On August 7, OpenAI CEO Sam Altman posted an image of strawberries growing in two pots to his X account. The tweet was seen as confirmation that OpenAI is working on the new and powerful large language model (LLM).

OpenAI was reported to have demonstrated a version of the new model to national security officials, seemingly a statement of its commitment to transparency at a time when the rapid development of AI has raised serious security concerns among national governments.

A wizard at math

The influential California-based tech industry business publication The Information reported on August 26 that Project Strawberry would be better at math and programming than any existing chatbot, quoting “two people who have been involved in the effort”.

Integration with ChatGPT will make the latter the most powerful AI chatbot there is, the report said. ChatGPT has sometimes struggled with math, and experts think the errors could be due to the absence of adequate mathematical information in the training data.

The Information reported that a demo by Project Strawberry staff had shown the new AI model to be capable of advanced reasoning, which allowed it to solve puzzles including ‘Connections’, a particularly difficult word puzzle published by The New York Times.

Need for training

The Information said that Project Strawberry aims to raise more capital, which OpenAI needs for its next-frontier model, codenamed Orion.

Generating high-quality training data for Orion is believed to be one of Project Strawberry’s key applications. This is significant because most of the training data on the Internet has already been used, and little information remains that sits outside paywalls and authentication and is free to use for training AI models. Indeed, OpenAI has lately been making deals with publications to use their content for training.

Project Orion, which is being designed to outperform GPT-4, could use a combination of Project Strawberry and high-quality synthetic data that would likely reduce errors and hallucinations compared to its predecessors and other AI models.

Creating synthetic data

Altman has said that in order to try out different ways to train AI models, OpenAI has been testing how to generate large amounts of synthetic data. Generative AI models create synthetic data based on real-world data samples. The algorithms learn the patterns, correlations, and statistical properties of the sample data; once trained, the model can produce new synthetic data that closely matches the statistical properties of the original.
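To make the idea concrete, here is a minimal sketch in Python of the learn-then-sample loop described above. It is an illustration only and says nothing about OpenAI’s actual methods, which have not been made public. It assumes a toy "real" dataset of height and weight pairs (invented for this example) and uses a simple two-variable Gaussian as a stand-in for a generative model. The function names `fit_gaussian_2d` and `sample_synthetic` are hypothetical.

```python
import math
import random
import statistics


def fit_gaussian_2d(samples):
    """Estimate means, standard deviations and correlation of (x, y) pairs."""
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    std_x, std_y = statistics.stdev(xs), statistics.stdev(ys)
    # Sample covariance divided by both standard deviations gives Pearson's r.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples) / (len(samples) - 1)
    rho = cov / (std_x * std_y)
    return mean_x, mean_y, std_x, std_y, rho


def sample_synthetic(params, n, rng):
    """Draw n new (x, y) pairs from the fitted bivariate Gaussian."""
    mean_x, mean_y, std_x, std_y, rho = params
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean_x + std_x * z1
        # Mixing z1 into y reproduces the learned correlation.
        y = mean_y + std_y * (rho * z1 + math.sqrt(max(0.0, 1 - rho**2)) * z2)
        rows.append((x, y))
    return rows


rng = random.Random(0)
# Hypothetical "real" sample: heights in cm and loosely related weights in kg.
real = []
for _ in range(2000):
    h = rng.gauss(170, 10)
    real.append((h, 0.9 * h - 90 + rng.gauss(0, 5)))

params = fit_gaussian_2d(real)                   # "training": learn the statistics
synthetic = sample_synthetic(params, 5000, rng)  # "generation": draw new rows
refit = fit_gaussian_2d(synthetic)
print("real     :", [round(p, 2) for p in params])
print("synthetic:", [round(p, 2) for p in refit])
```

The two printed rows of statistics should be close but not the same, and none of the synthetic rows is a copy of a real one. Large generative models learn far richer patterns than a mean and a correlation, but the principle of fitting to real samples and then drawing fresh ones is the same.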

The large datasets that AI models rely on can be prone to biases and errors, or can contain incomplete or inaccurate information. High-quality synthetic data produced by Project Strawberry could fill gaps in real-world datasets and provide a more complete, inclusive, and balanced training set.

Many believe that the use of synthetic data can help make future AI models more neutral and fair, and reduce noise and irrelevant information, thereby improving both the efficiency of training and the accuracy of the model.

Big Strawberry leap

Based on what is known, Project Strawberry’s improved reasoning and logic, and its ability to plan and carry out research, could allow the model to autonomously conduct experiments, analyse data, and come up with new hypotheses. This could potentially lead to scientific breakthroughs, including the discovery of new drugs. The models could also offer personalised education, creating educational content and interactive lessons.

Bijin Jose, an Assistant Editor at Indian Express Online in New Delhi, is a technology journalist with a portfolio spanning various prestigious publications. Starting as a citizen journalist with The Times of India in 2013, he transitioned through roles at India Today Digital and The Economic Times, before finding his niche at The Indian Express. With a BA in English from Maharaja Sayajirao University, Vadodara, and an MA in English Literature, Bijin's expertise extends from crime reporting to cultural features. With a keen interest in closely covering developments in artificial intelligence, Bijin provides nuanced perspectives on its implications for society and beyond.
