Testing InstructPix2Pix: An AI tool that edits photos with ChatGPT-style text inputs

We test out InstructPix2Pix -- the AI image editor that brings your photo editing ideas to life with just text instructions -- and show how you can use it yourself.

InstructPix2Pix is based on GPT-3 and Stable Diffusion (Image: InstructPix2Pix)

If you have ever looked at AI image generators like Dall-E and Midjourney and thought, hey, I wish I could upload my photos to them and have them edited with just text descriptions, then you are in luck. A new AI image editor tool called InstructPix2Pix allows you to do just that.

Available on the AI tool website Hugging Face, which also hosted the viral bot Dall-E Mini, InstructPix2Pix asks for an input image and prompt instructions and outputs a touched-up image with the changes you requested.

The InstructPix2Pix web app interface on Hugging Face (Express photo)

To obtain the training data for the AI tool, its creators combined the knowledge of two pretrained models, the language model GPT-3 and the text-to-image model Stable Diffusion, to generate a large dataset of image editing examples. This dataset was then used to train InstructPix2Pix. But unlike Stable Diffusion, which generates images from text alone, InstructPix2Pix is a diffusion model that edits an existing image.

The model was first introduced on November 17, 2022, in a paper by Tim Brooks, Aleksander Holynski, and Alexei A. Efros, about two weeks before the launch of ChatGPT.

How to use InstructPix2Pix to edit photos

The easiest way to access InstructPix2Pix, as already mentioned, is through its Hugging Face web app. You can get straight to it from https://huggingface.co/spaces/timbrooks/instruct-pix2pix.

The next few steps are easy:

1. Tap on the Input Image section and upload the image you wish to edit
2. Add your Edit Instruction – the changes you want InstructPix2Pix to make to the uploaded image – in the text field on the same page.
3. Hit the Generate button and wait for your output

Note that the process takes its sweet time, sometimes even around ten minutes, so patience is of the essence here. But considering that making those edits manually would probably take a lot longer, the wait is worth it.
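If the queue on the web app tests your patience, the same three steps can in principle be run on your own machine. The following is a minimal sketch, not the web app's actual code: it assumes the `diffusers`, `torch` and `Pillow` packages are installed and that the model is published on Hugging Face under `timbrooks/instruct-pix2pix`. The helper names `build_edit_kwargs` and `edit_image` are made up for illustration, and the step count of 20 is an arbitrary choice to keep runs short.

```python
from typing import Any, Dict

MODEL_ID = "timbrooks/instruct-pix2pix"  # assumed model repository name


def build_edit_kwargs(instruction: str, text_cfg: float = 7.5,
                      image_cfg: float = 1.5, steps: int = 20) -> Dict[str, Any]:
    """Map an edit instruction and two guidance weights onto diffusers' argument names."""
    if not instruction.strip():
        raise ValueError("the edit instruction must not be empty")
    return {
        "prompt": instruction,               # step 2: the edit instruction
        "guidance_scale": text_cfg,          # how strongly to follow the text
        "image_guidance_scale": image_cfg,   # how closely to stick to the input image
        "num_inference_steps": steps,        # assumption: kept low for speed
    }


def edit_image(input_path: str, output_path: str, instruction: str, **settings) -> None:
    """Load the model, apply one edit instruction, and save the result."""
    # Heavy imports live here so the helper above works without them installed.
    import torch
    from PIL import Image
    from diffusers import StableDiffusionInstructPix2PixPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
        MODEL_ID, torch_dtype=dtype
    ).to(device)

    image = Image.open(input_path).convert("RGB")      # step 1: the input image
    result = pipe(image=image, **build_edit_kwargs(instruction, **settings))
    result.images[0].save(output_path)                 # step 3: the generated output
```

With a hypothetical file named `skyline.jpg`, a call would look like `edit_image("skyline.jpg", "skyline_edited.jpg", "replace the buildings with mountains")`. Expect the first run to download several gigabytes of model weights, and expect it to be slow without a GPU.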

Testing InstructPix2Pix out

The first go with InstructPix2Pix was disappointing. Uploading a scenic picture of Noida’s skyline with the prompt “replace the buildings with mountains” funnily returned the same image, except it was highly distorted.

The original image of the Noida skyline (Image: Zohaib Ahmed/Indian Express)
The distorted output from the first run (Express photo)

That was not the result I was hoping for, so I tried adjusting the Text CFG and Image CFG weights. By default, they’re set to 7.5 and 1.5, respectively. I changed them to 8.5 and 1, and ran the program again.

The usable output after adjusting values (Express photo)

This time, the output was actually usable. The buildings had disappeared almost completely and were replaced by a neat mountain range as asked for. If you don’t look too hard, the only thing that’d look off is the yellow-black road divider, which for some reason has two rows lined on top of each other now.
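Why would nudging two numbers make such a difference? In the InstructPix2Pix paper, the weight that defaults to 7.5 scales how strongly the output follows the text instruction, while the one that defaults to 1.5 scales how closely it sticks to the uploaded photo. At every denoising step the model makes three predictions (with no conditioning, with the image only, and with both image and text) and blends them using those two weights. The toy sketch below reproduces that blend on plain numbers; the function name and the sample values are invented for illustration, and the real model works on large tensors rather than short lists.

```python
from typing import List, Sequence


def combine_guidance(uncond: Sequence[float], image_only: Sequence[float],
                     image_and_text: Sequence[float],
                     text_cfg: float = 7.5, image_cfg: float = 1.5) -> List[float]:
    """Blend three noise predictions using the paper's two guidance scales.

    result = uncond
             + image_cfg * (image_only - uncond)
             + text_cfg  * (image_and_text - image_only)
    """
    return [
        u + image_cfg * (i - u) + text_cfg * (t - i)
        for u, i, t in zip(uncond, image_only, image_and_text)
    ]


# Made-up one-element "predictions", chosen only to make the arithmetic easy.
uncond, image_only, image_and_text = [0.0], [1.0], [2.0]

default_mix = combine_guidance(uncond, image_only, image_and_text)
adjusted_mix = combine_guidance(uncond, image_only, image_and_text,
                                text_cfg=8.5, image_cfg=1.0)
print(default_mix, adjusted_mix)  # [9.0] [9.5]
```

Raising the first weight from 7.5 to 8.5 and lowering the second from 1.5 to 1 shifts the blend towards the instruction and away from the original photo, which is consistent with the buildings finally giving way to mountains.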

For take two, I tried out an image of a white cat, feeding the prompt “change the cat’s colour to black” to the tool.

The original white cat image (Image: Pixabay)
The AI-edited result with black fur instead of white (Express photo)

InstructPix2Pix did a good job with this one even without changing the CFG values. The tool replaced the cat’s white fur with black with near-perfect accuracy, while retaining the white whiskers. Black cats often have dark whiskers, though, so this may be a detail the tool missed rather than one it understood. The only other thing off with the image is the distorted pupils.

Conclusion

In its current state, InstructPix2Pix is a great AI tool to mess around with. It isn’t capable of producing images as convincing as Midjourney’s Pope in a Balenciaga jacket photo that fooled the internet. But it’s still a glimpse into the future of photo editing, where complex editing procedures may one day be replaced by simple text-based prompts.

Zohaib is a tech enthusiast and a journalist who covers the latest trends and innovations at The Indian Express's Tech Desk. A graduate in Computer Applications, he firmly believes that technology exists to serve us and not the other way around. He is fascinated by artificial intelligence and all kinds of gizmos, and enjoys writing about how they impact our lives and society. After a day's work, he winds down by putting on the latest sci-fi flick. • Experience: 3 years • Education: Bachelor in Computer Applications • Previous experience: Android Police, Gizmochina • Social: Instagram, Twitter, LinkedIn
