It’s time to move past the command-line era of AI.
We keep hearing that AI is for everyone. “No coding required.” “Your new coworker.” “An assistant for the masses.” And yet, to get consistently useful results from today’s best models, you still need to know the secret handshakes: the right phrases, the magic tags, the unspoken etiquette of LLMs.
The power is there. The intelligence is real. But the interface? Still stuck in the past.
And that disconnect became painfully clear one day, when a friend of mine tried to do something absurdly simple…
When ChatGPT needed a magic spell
My friend — let’s call him John — decided to use ChatGPT to count the names in a big chunk of text. He pasted the text into ChatGPT and asked for a simple count. GPT-4o responded with a confident-sounding number that was laughably off. Confusion gave way to frustration as John tried new phrasing. Another round of nonsense numbers. By the fifth attempt, he looked ready to fling his keyboard out the window.
I ambled over and suggested a trick: wrap the block of text in <text> tags, just as if we were feeding the model some neat snippet of XML. John pressed Enter and, with that tiny tweak, ChatGPT nailed the correct count. Same intelligence, but now it was cooperating.
John was both relieved and annoyed. Why did a pair of angle brackets suddenly turn ChatGPT into a model of precision? The short answer: prompt engineering. That term might sound like a fancy discipline, but in practice, it can look like rummaging around for cryptic phrases, tags, or formatting hacks that will coax an LLM into giving the right answer. It is reminiscent of a fantasy novel where you have infinite magical power, yet must utter each syllable of the summoning chant or risk conjuring the wrong monster.
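If you're curious what that trick looks like in practice, here is a minimal sketch using the OpenAI Python SDK. The function name, model choice, and exact wording are illustrative assumptions on my part, not what John actually typed; the point is simply that the pasted text gets an explicit <text>…</text> boundary separating it from the instruction.

```python
# A minimal sketch of the tag-wrapping trick, via the OpenAI Python SDK.
# Helper name, model, and prompt wording are illustrative, not from the story.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def count_names(passage: str) -> str:
    # Wrapping the pasted text in <text> tags gives the model a clear
    # boundary between the instruction and the data it should analyze.
    prompt = (
        "Count how many distinct person names appear in the text below.\n"
        f"<text>\n{passage}\n</text>"
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

The absurd part, of course, is that none of this should be the user's job.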
In an era when we supposedly have next-level AI, it’s hilarious that we still rely on cryptic prompts to ensure good results. If these models were truly intuitive, they would parse our intentions without requiring special incantations. Instead, we’re forced to memorize half-hidden hints, like travelers collecting obscure tourist phrases before visiting a foreign land. That is a design problem, not an intelligence problem.
Sound familiar?
If this rings a bell, it’s because we’ve seen a similar dynamic before in the early days of personal computing. As usability expert Jakob Nielsen explains, the very first user-interface paradigm was “batch processing,” where you had to submit an entire batch of instructions upfront (often as punched cards) and wait hours or even days for the results. Starting in the mid-1960s, this gave way to “command-based” interfaces — the lineage that later produced Unix shells and DOS — where you’d type in a single command, see an immediate result, and then type another command. Back then, command lines ruled the UI landscape, demanding meticulous keystrokes that only power users could navigate.
Then the graphical user interface arrived, with clickable icons and drag-and-drop metaphors that pulled computing out of the realm of black-screen mysticism and into the daylight. You no longer had to type COPY A:DOC.DAT C: to move a file; you just dragged it into a folder. That wasn’t a minor convenience; it was a revolution that let everyday people, not just specialists, feel in control of the machine for the first time.
We saw the same leap again and again. WYSIWYG editors let people format documents without memorizing tags or markup. Mosaic put the internet behind glass and made it something your mom could use. The smartphone touchscreen transformed phones from keypad-laden gadgets into intuitive portals you could navigate with a swipe and a tap. In every case, the breakthrough wasn’t just better tech under the hood — it was better ways to interact with it.
That’s what made ChatGPT the spark that set off the LLM explosion. Sure, the underlying tech was impressive, but GPT-style models had been kicking around for years. What ChatGPT nailed was the interface. A simple chatbox. Something anyone could type into. No setup. No API key. No notebook to clone. It didn’t just show people that LLMs could be powerful…it made that power feel accessible. That was the GUI moment for large language models.
Yet, as many people (Nielsen included) have noted, we’re still in the early days. Despite ChatGPT’s accessibility, it still expects precise prompts and overly careful phrasing. Most people are left relying on Reddit threads, YouTube hacks, or half-remembered prompt formats to get consistent results. We’re still stuck in a “command line” moment, wearing wizard hats just to ask a machine for help.
The tech is here. The intelligence is here. What’s missing is the interface leap, the one that takes LLMs from impressive to indispensable. Once we design a way to speak to these models that feels natural, forgiving, and fluid, we’ll stop talking about prompt engineering and start talking about AI as a true collaborator. That’s when the real revolution begins.
Examples of emerging UX in AI
Cursor = AI moves into the workspace
Most people still interact with LLMs the same way they did in 2022: open a chat window, write a prompt, copy the answer, paste it back into whatever tool they were actually using. It works, but it’s awkward. The AI sits off to the side like a consultant you have to brief every five minutes. Every interaction starts from zero. Every result requires translation.
Cursor flips that model inside out. Instead of forcing you to visit the AI, it brings the AI into your workspace. Cursor is a code editor, an IDE (integrated development environment): the digital workbench where developers write, test, and fix code. But unlike traditional editors, Cursor was built from the ground up to work hand-in-hand with an AI assistant. You can highlight a piece of code and say, “Make this run faster,” or “Explain what this function does,” and the model responds in place. No switching tabs. No copy-paste gymnastics.
The magic isn’t in the model. Cursor uses off-the-shelf LLMs under the hood. The real innovation is how it lets humans talk to the machine: directly, intuitively, with no rituals or spellbooks. Cursor doesn’t ask users to understand the quirks of the LLM. It absorbs the context of your project and adapts to your workflow, not the other way around.
It’s a textbook example of AI becoming more powerful not by getting smarter, but by becoming easier to work with. This is what a UX breakthrough looks like: intelligence that feels embedded, responsive, and natural, not something you summon from a command line with just the right phrasing.
So what happens when we go one step further, and make the interaction even looser?
Vibe coding = casual command line
“Vibe coding,” a term popularized by Andrej Karpathy, takes that seamless interaction and loosens it even more. Forget structured prompts. Forget formal requests. Vibe coding is all about giving the LLM rough direction in natural language (“Fix this bug,” “Add a field for nickname,” “Reduce the sidebar padding”) and trusting it to figure out the details.
At its best, it’s fluid and exhilarating. The LLM acts like a talented engineer who has read your codebase and can respond to shorthand with useful action. That sense of flow — of staying in the creative zone without stopping to translate your thoughts into machine-friendly commands — is powerful. It lets you focus on what you want to build, not how to phrase the ask.
But here’s the catch: you still have to know how to talk to the machine.
Vibe coding isn’t magic. It’s an interaction style built on top of a chat interface. If you don’t already understand how LLMs behave (what they’re good at, where they stumble, how to steer them with subtle rewordings), the magic breaks down. You’re still typing into a text box, hoping your intent comes through. It’s just that now, the prompts are written in lowercase and good vibes.
So while vibe coding lowers the friction for seasoned users, it still relies on unspoken rules and learned behavior. The interface is friendlier, but it’s not yet accessible. It shows that AI can feel conversational…but still speak a language only some people understand.
If Cursor showed us what it looks like to embed AI into tools we already use, and vibe coding loosened the interaction into something more human, then what’s next? What happens when you don’t even need to steer, when AI takes the wheel?
Manus = the interface is the intelligence
When Manus hit the scene, it made waves. Headlines hailed it as the “first true AI agent.” Twitter lit up. Demos showed the tool writing code, running it, fixing bugs, and trying again, all without needing constant human input. But here’s the quiet truth: Manus runs on Claude 3.7 Sonnet, a model you can already access elsewhere. The model wasn’t new.
What was new, and genuinely exciting, was the interface and interactions.
Instead of prompting line by line, Manus lets users speak in goals: “Analyze this dataset and generate a chart,” or “Build a basic login page.” Then it takes over. It writes the code. It runs the code. If something breaks, it investigates. It doesn’t ask you to prompt again. It acts like it knows what you meant.
This is delegation as design. You’re no longer babysitting the model. You’re handing off intent and expecting results. That’s not about intelligence. That’s about trust in the interface. You don’t need to wrangle it like a chatbot or vibe with it like a coder who knows the right lingo. You just ask and Manus handles the “how.”
And that’s the point. The Manus moment wasn’t about a smarter model. It was about making an existing model feel smarter through better interaction. Its leap forward wasn’t technical … it was experiential. It didn’t beat other tools by out-reasoning them. It beat them by understanding the human better.
This is the future of AI: tools that don’t just process our input, but interface with our intent.
All three examples — Cursor, vibe coding, and Manus — prove the same point: it’s not the size of the model that changes the game, it’s the shape of the interaction. The moment AI starts understanding people, instead of people learning how to talk to AI, everything shifts.
The missing link between power and people
If we return to John’s struggle with <text> tags, it’s clear his predicament wasn’t just a glitch. It revealed how these advanced models still force us to speak their language rather than meeting us halfway. Even though powerful LLMs can write code and pass exams, many AI interactions still feel like sending Morse code to a starship. The real shortfall isn’t intelligence; it’s an outdated, command-line-like user experience that demands ritual rather than collaboration.
Tools like Cursor, Manus, and vibe coding show glimpses of a new reality. By embedding AI directly into our workflows and allowing more natural, goal-oriented conversations, they move us closer to what Jakob Nielsen calls a “complete reversal of control,” where people state what they want and let the computer figure out how to get there. We’ve watched technology make leaps like this before, from command lines to GUIs, from styluses to touchscreens, and each time, the real revolution was about removing friction between people and possibility.
AI will be no different. The next major step isn’t just bigger models or higher accuracy: it’s designing interfaces that make AI feel like a partner rather than a tool. When that happens, we’ll stop talking about prompts and simply use AI the way we use search bars or touchscreens. Intuitively and effortlessly.
This shift is not only the future of AI, but also the future of computing itself.