This feature lets ChatGPT control programming tools like Xcode, VS Code, and even terminal apps.
The next big trend in AI is here—AI chatbots controlling your PC.
Just three weeks ago, Anthropic introduced a feature on Claude called Computer Use, allowing it to control desktops with text-based commands. Soon after, Microsoft revealed OmniParser, a research preview for an AI agent that can parse your screen, likely a hint at an upcoming feature to control a user’s desktop, most likely through Copilot.
While Apple and Google have been quiet about similar tools, it’s safe to assume they’re working on something behind the scenes.
Today, OpenAI released a new feature called “Work with Apps” on the ChatGPT desktop app for Mac. This feature lets ChatGPT control programming tools like Xcode, VS Code, and even terminal apps.
As a developer and an avid user of AI programming assistants, I think this is huge.
Let’s get this out of the way: this feature isn’t an AI agent yet.
However, OpenAI describes it as a “key building block” for creating agentic systems. The main challenge for AI agents right now is learning to interpret the entire computer screen, not just text-based prompts or their own outputs.
Here’s what ChatGPT’s new feature can do:
Here’s what the new feature looks like on the ChatGPT desktop app for Mac.
The new desktop app control button is placed beside the internet search toggle, and clicking on it shows a list of compatible apps that you can enable/disable.
Before you can use the new desktop control feature, make sure to give ChatGPT permission to control your computer in the Accessibility settings.
Once ChatGPT is enabled in the Accessibility settings, you should see that Xcode is now enabled on ChatGPT’s dashboard.
For this example, I asked ChatGPT to create a navigation menu for my product, Flux Labs AI. It’s a tool for generating high-quality images of products and portraits using trained image models.
Let’s try asking ChatGPT to write code that shows a navigation menu at the bottom of the screen.
Prompt: Can you create navigation menu at the bottom of the screen?
1. AI Tools
2. Creations
3. Discover
4. Account
ChatGPT peeks at the code in my open Xcode project and writes a suggestion for me.
Here’s what ChatGPT suggested:
I then have to copy the code from ChatGPT and paste it into Xcode. Here’s what it looks like:
It’s worth noting that ChatGPT can’t directly modify code in Xcode. You’ll need to copy and paste everything manually, which isn’t ideal but gets the job done.
Nevertheless, I like the icons and the layout it created for me. I can even click on each tab, and it switches the pages for me. Pretty cool.
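For readers who haven’t used SwiftUI, a bottom navigation menu like this is typically built with a `TabView`. The block below is my own minimal sketch of that pattern, not the exact code ChatGPT produced. The names `AppTab` and `MainTabView`, the SF Symbol icons, and the placeholder pages are all assumptions for illustration.

```swift
import SwiftUI

// The four tabs from the prompt. The SF Symbol names are illustrative picks,
// not the icons ChatGPT actually chose.
enum AppTab: String, CaseIterable, Identifiable {
    case aiTools = "AI Tools"
    case creations = "Creations"
    case discover = "Discover"
    case account = "Account"

    var id: String { rawValue }

    var systemImage: String {
        switch self {
        case .aiTools: return "wand.and.stars"
        case .creations: return "photo.on.rectangle"
        case .discover: return "safari"
        case .account: return "person.crop.circle"
        }
    }
}

struct MainTabView: View {
    // Tracks which tab is currently selected.
    @State private var selection: AppTab = .aiTools

    var body: some View {
        TabView(selection: $selection) {
            ForEach(AppTab.allCases) { tab in
                // Placeholder page; a real app would show its own view per tab.
                Text(tab.rawValue)
                    .tabItem { Label(tab.rawValue, systemImage: tab.systemImage) }
                    .tag(tab)
            }
        }
    }
}
```

Keeping the tabs in one enum means adding a fifth tab later is a single new case instead of another copy of the `tabItem` boilerplate.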
Next, I asked ChatGPT to create a dashboard for the user account page. I uploaded a screenshot of an example page and prompted ChatGPT to replicate it.
Prompt: In the Accounts page, can you create a nice looking dashboard with the information similar to the attached image?
It generated the code and even provided a step-by-step guide on where to place it. For someone unfamiliar with Swift, this was a lifesaver.
ChatGPT gave me the step-by-step guide on how and where to add the code.
Here’s what the dashboard looks like:
The dashboard wasn’t perfect but worked well enough. With more tweaks, it could easily become production-ready.
Finally, pair ChatGPT with the terminal and ask it to commit the code changes and push to the repository.
ChatGPT will provide the git commands, and all you have to do is execute them in the terminal.
OpenAI’s Work with Apps relies heavily on the macOS accessibility API, the same one that powers Apple’s VoiceOver screen reader. This lets ChatGPT read text from apps and process it.
Right now, the screen reader only works with text. It can’t handle other stuff on screen like images, object layouts, or videos.
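To make the idea concrete, here is a rough sketch of how any Mac app can read text through the accessibility API. I don’t know how OpenAI actually implements Work with Apps, so treat this as an illustration only. The function name `focusedText` is hypothetical, and the code assumes the running process has already been granted permission in the Accessibility settings.

```swift
import AppKit
import ApplicationServices

/// Returns the text value of the focused UI element in the frontmost app,
/// or nil if permission is missing or the element exposes no text.
/// Illustrative only; this is not OpenAI's implementation.
func focusedText() -> String? {
    // The process must be trusted in the Accessibility settings.
    guard AXIsProcessTrusted(),
          let app = NSWorkspace.shared.frontmostApplication else { return nil }

    let appElement = AXUIElementCreateApplication(app.processIdentifier)

    // Ask the app which element currently has keyboard focus (e.g. an editor).
    var focused: CFTypeRef?
    guard AXUIElementCopyAttributeValue(
        appElement, kAXFocusedUIElementAttribute as CFString, &focused
    ) == .success, let element = focused else { return nil }

    // Read that element's value. For a text area this is its text content.
    var value: CFTypeRef?
    guard AXUIElementCopyAttributeValue(
        element as! AXUIElement, kAXValueAttribute as CFString, &value
    ) == .success else { return nil }

    return value as? String
}

if let text = focusedText() {
    print("Focused element contains \(text.count) characters")
}
```

Because the value comes back as plain text, anything that isn’t exposed as text (images, layout, video) simply isn’t visible to this approach.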
Here’s how it processes code:
Note that this process uses a lot of input tokens.
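For a rough sense of scale, a common rule of thumb is about four characters per token for English text. The sketch below applies that ratio to a stand-in source file. It is only a heuristic (a real tokenizer will give different numbers, especially for code), and the function name `estimatedTokens` is my own.

```swift
// Rough heuristic only: about 4 characters per token for English text.
// A real tokenizer will produce different counts, especially for code.
func estimatedTokens(for text: String, charactersPerToken: Int = 4) -> Int {
    precondition(charactersPerToken > 0, "charactersPerToken must be positive")
    // Round up so a short non-empty string still counts as one token.
    return (text.count + charactersPerToken - 1) / charactersPerToken
}

// Stand-in for a 300-line file that would be read from the editor.
let sourceFile = String(repeating: "let x = 1\n", count: 300)
print(estimatedTokens(for: sourceFile))
```

Every message that includes the editor contents pays that cost again, which is why long files add up quickly.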
According to a Bloomberg report a few days ago, the speed of innovation in the AI landscape is slowing down.
The three biggest AI tech companies—OpenAI, Google, and Anthropic—are now seeing diminishing returns from their costly efforts to build more advanced artificial intelligence systems.
AI tools that can access our computers open up a plethora of possible use cases. I can’t even imagine the best and worst things these tools could do on a user’s PC.
Off the top of my head, here are some possible effects:
According to Harvard Business Review research, generative AI is already making a big impact in the labor market.
We find that the introduction of ChatGPT and image-generating tools led to nearly immediate decreases in posts for online gig workers across job types, but particularly for automation-prone jobs. After the introduction of ChatGPT, there was a 21% decrease in the weekly number of posts in automation-prone jobs compared to manual-intensive jobs. Writing jobs were affected the most (30.37% decrease), followed by software, app, and web development (20.62%) and engineering (10.42%).
Looking at this trend, I believe that people who choose to adapt and learn to use AI tools will thrive, and those who resist are likely to lose their jobs.
After spending a few hours with ChatGPT’s new feature, it became clear to me that ChatGPT lags behind Cursor AI.
For example, I wanted to add a new model called “Article” to my database schema file. Here’s the full prompt:
Prompt: I have this base prisma schema. I have a feature called AI Article generator where user can generate an article by text prompt and the AI will generate an article based on the prompt. I want to save the article and the text prompt so that when user checks the history, the articles along with the prompt will be retrieved from the database. Can you update the schema to accommodate this request?
Cursor AI proposes the code changes, and all you have to do is decide whether to accept them. See the highlighted code in green above? That’s the newly added code block in the database schema.
In ChatGPT, you would have to copy that code change from the chatbot into the IDE by hand, which makes it easy to paste it in the wrong place or miss a line.
Honestly, ChatGPT controlling Mac apps is something I didn’t see coming this soon.
It’s exciting to think about how far this could go in the future. Imagine ChatGPT creating files, writing code, running tests, and even pushing changes to GitHub.
Will everyone love it? Definitely not. The security concerns alone will make a lot of people skeptical. But as a developer, I think this has so much potential.
Is it better than Cursor AI? No.
Cursor is still ahead with more features, and it can even directly edit code. Plus, Cursor offers 500 prompts per month for free. That’s more than enough for smaller projects. ChatGPT’s free GPT-4 messaging credits feel really limited in comparison.
So far, it’s unclear how OpenAI will expand this feature to apps that don’t work with Apple’s screen reader. Competitors like Anthropic are taking a different approach by analyzing screenshots to understand what’s on the screen.
Software engineer, writer, solopreneur