Here's how you can use GPT-4 Vision to transform a screenshot into source code.
OpenAI’s release of GPT-4 Vision has sparked a wave of groundbreaking use cases across the internet. But one tool caught my attention for its unbelievable capabilities—using GPT-4 Vision to generate entire codebases from a single screenshot. This tool, called screenshot-to-code, is an absolute game-changer in the world of web programming.
For those unfamiliar, GPT-4 Vision represents a massive leap forward in AI’s ability to understand and describe images. Frankly, its image comprehension skills blow my mind. It can turn a picture of two cute puppies into a descriptive paragraph.
In this example, I simply uploaded an image of the puppies into ChatGPT and asked the chatbot to describe the image. The latest version of ChatGPT is smart enough to auto-switch into the GPT-4 Vision model to perform the task.
Now, developers have harnessed this technology’s potential to translate UI components into full-fledged code.
In my view, screenshot-to-code is the most impressive demonstration of GPT-4 Vision’s capabilities to date. With just a single screenshot, this web app can churn out complete HTML and Tailwind CSS to recreate website and app designs. The tool is also able to generate similar-looking images using Dall-E 3.
As a developer myself, this tool utterly astonishes me. It automates what used to be an intensive, manual process into a few clicks.
Here’s an example screenshot of Taylor Swift’s Instagram page:
In seconds, screenshot-to-code can model the page design with stunning accuracy. This is an unbelievable resource for mocking up designs during development.
Screenshot-to-code is completely free to use as long as you have an OpenAI key. Head over to this website, click on the settings icon, add your OpenAI key, and hit the save button.
I tried the tool myself on a screenshot of Zeniteq’s article page.
Here’s the final result, as interpreted by the tool.
Was it able to replicate the screenshot? Well, not really. But the results were pretty close. You can make further modifications to the code by asking the AI in the highlighted section below.
You can also manually update the source code by clicking on the “code” toggle on the upper right corner.
The download button downloads the generated HTML and Tailwind CSS code into an index.html file. This is what the index.html file looks like in the example above.
It looks pretty decent, in my opinion.
There are several benefits to using screenshot-to-code.
These are the drawbacks of using screenshot-to-code:
Overall, I’m impressed with this kind of use case for GPT-4 Vision. Even in its early stages, it’s amazing what it can do. And I think it’s only going to get better. Who knows how well it will perform in just a few months?
I’m also intrigued by the possibility of front-end designers getting replaced. While I don’t think that’s going to happen anytime soon, I do think it’s something we need to keep an eye on.
‍
Software engineer, writer, solopreneur