November 26, 2023

Turn A Screenshot Into Source Code Using OpenAI's GPT-4 Vision

Here's how you can use GPT-4 Vision to transform a screenshot into source code.

by

Jim Clyde Monge

OpenAI’s release of GPT-4 Vision has sparked a wave of groundbreaking use cases across the internet. But one tool caught my attention for its unbelievable capabilities—using GPT-4 Vision to generate entire codebases from a single screenshot. This tool, called screenshot-to-code, is an absolute game-changer in the world of web programming.

What is GPT-4 Vision?

For those unfamiliar, GPT-4 Vision represents a massive leap forward in AI’s ability to understand and describe images. Frankly, its image comprehension skills blow my mind. It can turn a picture of two cute puppies into a descriptive paragraph.

In this example, I simply uploaded an image of the puppies into ChatGPT and asked the chatbot to describe the image. The latest version of ChatGPT is smart enough to auto-switch into the GPT-4 Vision model to perform the task.

Here’s an example: the GPT-4 vision describes the input image of two adorable puppies — Image by Jim Clyde Monge

Now, developers have harnessed this technology’s potential to translate UI components into full-fledged code.

What is screenshot-to-code?

In my view, screenshot-to-code is the most impressive demonstration of GPT-4 Vision’s capabilities to date. With just a single screenshot, this web app can churn out complete HTML and Tailwind CSS to recreate website and app designs. The tool is also able to generate similar-looking images using Dall-E 3.

As a developer myself, this tool utterly astonishes me. It automates what used to be an intensive, manual process into a few clicks.

Here’s an example screenshot of Taylor Swift’s Instagram page:

Screenshot-to-code example Taylor Swift’s Instagram page — Screenshot-to-code example

In seconds, screenshot-to-code can model the page design with stunning accuracy. This is an unbelievable resource for mocking up designs during development.

Try it yourself

Screenshot-to-code is completely free to use as long as you have an OpenAI key. Head over to this website, click on the settings icon, add your OpenAI key, and hit the save button.

Screenshot to image settings page — Image by Jim Clyde Monge

I tried the tool myself on a screenshot of Zeniteq’s article page.

Zeniteq’s AI-related articles page — Image by Jim Clyde Monge

Here’s the final result, as interpreted by the tool.

Zeniteq’s AI-related articles page code translated by Screenshot to code tool — Image by Jim Clyde Monge

Was it able to replicate the screenshot? Well, not really. But the results were pretty close. You can make further modifications to the code by asking the AI in the highlighted section below.

screenshot to code dashboard — Image by Jim Clyde Monge

You can also manually update the source code by clicking on the “code” toggle on the upper right corner.

screenshot to code, edit code page — Image by Jim Clyde Monge

The download button downloads the generated HTML and Tailwind CSS code into an index.html file. This is what the index.html file looks like in the example above.

screenshot to code example generated index html page — Image by Jim Clyde Monge

It looks pretty decent, in my opinion.

The Benefits

There are several benefits to using screenshot-to-code.

It can save programmers a significant amount of time. By automating the task of code generation, screenshot-to-code can free up programmers to focus on other tasks, such as designing and testing software.
It can save you money. Website templates typically cost between $50 and $200. This can save you a significant amount of money, especially if you need to create multiple websites.
It’s super easy to use. Screenshot-to-code tools are typically very easy to use. Simply upload a screenshot of the website you want to create, and the tool will generate the code for you.

Drawbacks

These are the drawbacks of using screenshot-to-code:

It can be inaccurate. The tool is still in its early stages of development, so it is not always accurate. The tool may generate incorrect code, or it may not generate all of the code that is needed.
The tool is not a replacement for a human programmer. It can be used to generate code, but it is still up to the programmer to ensure that the code is correct and efficient.

Final Thoughts

Overall, I’m impressed with this kind of use case for GPT-4 Vision. Even in its early stages, it’s amazing what it can do. And I think it’s only going to get better. Who knows how well it will perform in just a few months?

I’m also intrigued by the possibility of front-end designers getting replaced. While I don’t think that’s going to happen anytime soon, I do think it’s something we need to keep an eye on.

‍

Stay ahead. Stay updated.