This is an archive article published on September 29, 2023

ChatGPT gets image recognition: 6 wild things people are using it for

From breaking down complex diagrams with remarkable skill to producing code from images alone, here are some examples of how people are using ChatGPT's new vision feature.

OpenAI's ChatGPT is now more human-like than ever. (Image: Zohaib Ahmed/The Indian Express)

When ChatGPT first came out, people were flabbergasted at its remarkably human-like understanding of queries and the way it responded to them. The AI chatbot became an overnight sensation and was all over social media. In fact, global Google searches for the term “artificial intelligence” reached an all-time high, demonstrating the intense consumer interest in the technology.

But people move on quickly. And yet, just when the hype seemed to be fizzling out, OpenAI dropped a couple of update bombs, introducing the ability to ‘see,’ ‘hear,’ and browse the web. The vision feature is particularly impressive, as ChatGPT can now analyse images with a level of detail that almost seems beyond human capabilities.

Naturally, people started to talk about ChatGPT again, and below we have compiled some of the best examples of how people have used the new image recognition feature.

Understanding complex diagrams

Diagrams are used to better represent complex information, but what happens when the diagrams themselves are too convoluted? ChatGPT’s new image capabilities come to the rescue, breaking them down in language that even a toddler could grasp. For instance, one Twitter user was able to get the AI chatbot to explain an image packed with a flow diagram composed of hundreds of elements.

Helping you learn

It works the other way around too. If you need additional context or notes for a simple diagram or flowchart – or simply want to figure out what it’s even about – ChatGPT does an excellent job of that as well.

Identifying image sources

One Twitter user uploaded a screengrab from the movie Gladiator and asked ChatGPT to identify its source and what the person in the scene is saying. The chatbot answered as if it had actually watched the movie, not only responding to the original query but also topping it off with additional context.

It remains to be seen if the feature works for random shots from movies as well or if it’s limited to popular scenes. But regardless, the tool can come in super handy for reverse image searches, especially when combined with its ability to browse the web.

Interpreting memes and concepts

You either get it or you don’t. Understanding viral memes is sometimes impossible if you are missing the context. Or maybe the post is just too nonsensical or cliche for you to find the humour. If you can’t for the life of you figure out why a meme has received hundreds of thousands of likes, ChatGPT can help.

ChatGPT can also help you understand ‘deep’ images with hidden meanings.

Translation

Yes, tools like Google Lens and Microsoft’s Visual Lens exist, but things can sometimes get lost in translation. ChatGPT can come in handy as a substitute when an attempt to translate text on a hoarding, road sign, shop board, or anywhere else returns gibberish.

Writing code based on images

But perhaps the most impressive application of the feature is its ability to figure out the code for websites and other projects – from screenshots alone – and replicate it accordingly. For example, a user uploaded a screenshot of a SaaS dashboard and ChatGPT produced the complete code for it. Upon checking whether the code worked, the developer was astonished to see that it indeed got most things right.

Of course, it’s not even been a full week since ChatGPT gained the ability to see, hear, and speak, so it’s fair to assume that these use cases only scratch the surface. People are continuing to experiment with different types of inputs and there’s probably a host of cool new applications waiting to be discovered.

ChatGPT’s new image and voice capabilities are still undergoing rollout and are currently exclusive to Plus and Enterprise users.

Zohaib is a tech enthusiast and a journalist who covers the latest trends and innovations at The Indian Express's Tech Desk. A graduate in Computer Applications, he firmly believes that technology exists to serve us and not the other way around. He is fascinated by artificial intelligence and all kinds of gizmos, and enjoys writing about how they impact our lives and society. After a day's work, he winds down by putting on the latest sci-fi flick. • Experience: 3 years • Education: Bachelor in Computer Applications • Previous experience: Android Police, Gizmochina
