Letting the Machines Learn
Every time I present on AI product design, I'm asked about AI and intellectual property. Specifically: aren't you worried about AI models "stealing" your work? I always answer that if I accused AI models of theft, I'd have to accuse myself as well. Let me explain…
I've spent 30 years writing three books and over two thousand articles on digital product design and strategy. But during those same 30 years? I've consumed exponentially more. Countless books, articles, tweets. Thousands of conversations. Products I've used, solutions I've analyzed. All of it shaped what I know and how I write.
If you asked me to trace the next sentence I type back to its sources, to properly attribute the influences that led to those specific words, I couldn't do it. The synthesis happens at a level I can't fully decompose.
AI models are doing what we do. Reading, viewing, learning, synthesizing. The only difference is scale. They process vastly more information than any human could. When they generate text, they're drawing from that accumulated knowledge. Sound familiar?
So when an AI model produces something influenced by my writings, how is that different from a designer who read my book and applies those principles? I put my books out there for people to buy and learn from. My articles? Free for anyone to read. Why should machines be excluded from that learning opportunity?
"But won't AI companies unfairly profit from training on your content?"
From AI model companies, for $20 per month, I get an assistant that's read more than I ever could, available instantly, capable of helping with everything from code reviews to strategic analysis. That same $20 couldn't buy me two hours of entry-level human assistance.
The benefit I receive from these models, trained on the collective knowledge of millions of contributors, including my microscopic contribution, dwarfs any hypothetical loss from my content being training data. In fact, I'm humbled that my thoughts could even be part of a knowledge base used by billions of people.
So let machines learn, just like humans do. For me, the value I get back from well-trained AI models far exceeds what my contribution puts in.
Unstructured Input in AI Apps Instead of Web Forms
Web forms exist to put information from people into databases. The input fields and formatting rules in online forms are there to make sure the information fits the structure a database needs. But unstructured input in AI-enabled applications means machines, instead of humans, can do this work.
17 years ago, I wrote a book on Web Form Design that started with "Forms suck." Fast forward to today and the sentiment still holds true. No one likes filling in forms but forms remain ubiquitous because they force people to provide information in the way it's stored within the database of an application. You know the drill: First Name, Last Name, Address Line 2, State abbreviation, and so on.
With Web forms, the burden is on people to adapt to databases. Today's AI models, however, can flip this requirement. That is, they allow people to provide information in whatever form they like and use AI to do the work necessary to put that information into the right structure for a database.
How does this work? Instead of a Web form enforcing the database's input requirements, a dynamic context system can handle it. One way of doing this is with AgentDB's templating system, which provides instructions to AI models for reading and writing information to a database.
With AgentDB connected to an AI model (via an MCP server), a person can simply say "add this" and provide an image, PDF, audio, video, you name it. The model will use AgentDB's template to decide what information to extract from this unstructured input and how to format it for the database. In the case where something is missing or incomplete, the model can ask for clarification or use tools (like search) to find possible answers.
In the example above, I upload a screenshot from Instagram announcing a concert and ask the AI model to add it to my concert tracker. The AgentDB template tells the model it needs Show, Date, Venue, City, Time, and Ticket Price for each database entry. So the AI model pulls this information from the unstructured input (screenshot) and, if complete, turns it into the structured format a database needs.
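To make the idea concrete, here's a minimal sketch of the structure the concert tracker needs. The field names come from the example above; the TypeScript shape and the helper function are illustrative assumptions, not AgentDB's actual template format.

```typescript
// Hypothetical shape of a concert-tracker entry; AgentDB's real template
// format may differ. The required fields mirror the example above.
interface ConcertEntry {
  show: string;        // e.g. "The National"
  date: string;        // normalized to ISO 8601 from "Sat, Oct 12"
  venue: string;
  city: string;
  time: string;        // e.g. "20:00"
  ticketPrice: number; // parsed from "$45" in the screenshot
}

// Given unstructured input (screenshot text, a PDF, an audio transcript),
// the model's job is to produce a complete ConcertEntry. This helper lists
// the fields it still needs to ask about or look up with tools.
function missingFields(entry: Partial<ConcertEntry>): string[] {
  const required: (keyof ConcertEntry)[] = [
    "show", "date", "venue", "city", "time", "ticketPrice",
  ];
  return required.filter((field) => entry[field] === undefined);
}
```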
Of course, the unstructured input can also be a photo, a link to a Web page, a Word document, a PDF file, or even just audio where you say what you want to add. In each case the combination of AI model and AgentDB will fill in the database for you.
No Web form required. And no form is the best kind of Web Form Design.
World Knowledge Improves AI Apps
Applications built on top of large-scale AI models benefit from the AI model's built-in capabilities without requiring app developers to write additional code. Essentially if the AI model can do it, an application built on top of it can do it as well. To illustrate, let's look at the impact of a model's World knowledge on an app.
For years, software applications consisted of running code and a database. As a result, their capabilities were defined by coded features and what was inside the database. When the running code is replaced by a large language model (LLM), however, the information encoded in the model's weights instantly becomes part of the capabilities of the application.
With AI apps, end users are no longer constrained by the code developers had the time and foresight to write. All the World knowledge (and other capabilities) in an AI model are now part of the application's logic. Since that sounds abstract, let's look at a concrete example.
I created an AI app with AgentDB by uploading a database of NBA statistics spanning 77 years and 13.6 million play-by-play records. When I add the MCP link AgentDB makes for me to Anthropic's Claude, I have an application consisting of a database optimized for AI model use, and an AI model (Claude) to use as the application's brain. Here's a video tutorial on how to do this yourself.
In the past, a developer would need to write code to render the user interface for an application front-end to this database. That code would determine what kind of questions people could get answers to. Usually this meant a bunch of UI input elements to search and filter games by date, team, player, etc. The NBA's stats page is a great example of this kind of interface.
But no matter how much code developers write, they can't cover all the ways people might want to interact with information about the NBA's 77 years. For instance, a question like "What were the last 5 plays in the Malice in the Palace game?" requires either running code that can translate "Malice in the Palace" to a specific date and game or an extra field in the database for game nicknames.
When a large language model is an application's compute, however, no extra code needs to be written. The association between Malice in the Palace and November 19, 2004 is present in an AI model's weights and it can translate the natural language question into a form the associated database can answer.
An AI model can use its World knowledge to translate people's questions into the kind of multi-step queries needed to answer what seem like simple questions. Consider the question: "Who was the tallest player drafted in Ant-Man's NBA draft class?" We need to figure out which player Ant-Man refers to, what year he was drafted, who else was drafted that year, get all their heights, and then compare them. Not a simple query to write by hand, but with AI acting as an application's brain... it's quick and easy.
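For illustration, the generated query might look something like the sketch below. The table and column names are assumptions, not the actual schema of the NBA stats database; the point is the multi-step shape the model can produce once its world knowledge resolves "Ant-Man" to Anthony Edwards and his draft class.

```typescript
// Illustrative only: schema names below are assumed, not the real NBA
// dataset. The model first resolves the nickname, then writes SQL like:
const tallestInAntMansDraftClass = `
  SELECT p.player_name, p.height_inches
  FROM players AS p
  JOIN draft_picks AS d ON d.player_id = p.player_id
  WHERE d.draft_year = (
    SELECT d2.draft_year
    FROM draft_picks AS d2
    JOIN players AS p2 ON p2.player_id = d2.player_id
    WHERE p2.player_name = 'Anthony Edwards'
  )
  ORDER BY p.height_inches DESC
  LIMIT 1;
`;
```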
World knowledge, of course, isn't the only capability built into large language models. There's multi-language support, vision (for image parsing), tool use, and more emerging. All of these are also application capabilities when you build apps on top of AI models.
Chat is: the Future or a Terrible UI
As the proliferation of AI-powered chat interfaces in software continues, people increasingly take one of two sides: chat is the future of all UI, or chat is a terrible UI. Turns out there are reasons to believe both; here's a bunch of them.
Back in 2013, I proposed a variant of Jamie Zawinski's popular Law of Software Envelopment reframed as:
Every mobile app attempts to expand until it includes chat. Those applications which do not are replaced by ones which can.
Today every major mobile app has some form of chat function, whether social network, e-commerce, ride-share, and so on. So chat is already pervasive and thereby familiar, which made it a great interface to usher in the age of AI. But is it AI's final form?
“Chat is the future of software.”
- People already know how to use chat interfaces. This familiarity means people can jump right in and start using powerful AI systems.
- An empty text box is great at capturing user intent: people can simply tell chat apps what they want to get done. “Just look at Google.”
- Natural language allows people to communicate what they want like they would in the real World, no need to learn a UI.
- The best interface is… no interface, an invisible interface, etc.
- Conversational interfaces can shift topics and goals, providing a way to compose information and actions that's just right for specific needs.
- Voice input means people don’t have to type but can still simply chat with powerful systems.
- Chat user interfaces for AI models are a fundamental shift from forcing humans to learn computers to computers understanding human language.
“Chat is a terrible UI.”
- Chat interfaces face the classic "invisible UI" problem: without clear affordances, people don't know what they can do, nor how to get the best results from them.
- Walls of text are suboptimal for communicating and displaying complex information and relationships, unlike images, tables, charts, and more.
- Scrolling through conversation threads to find and extract relevant information is painful, especially as chat conversations run long.
- Context gets lost in back and forth interactions which slow everything down. Typing everything you want to do is cumbersome.
- Language is a terrible way to describe visual, spatial, and temporal things.
- Voice-based interfaces make it even harder to communicate information better suited to images and user interfaces.
- We’re very early in the evolution of AI-powered software and lots of different and useful interfaces for interacting with AI will emerge.
It's also worth noting that chat isn't the only way to integrate AI in software products and increasingly agent-based applications outperform chat-only solutions. So expect things to keep changing.
Platform Shifts Redefine Apps
With each major technology platform shift, people underestimate how much "what an application is and how it's built" changes. From mainframes to PCs, to Web, to Mobile and now AI, computing platform changes redefined software and created new opportunities and constraints for application design and development.
These shifts not only impacted how applications work but also where they run, what they look like, how they're built, delivered, and experienced by people.
Mainframe era: Applications lived on massive shared computers in climate-controlled rooms, with people typing text-only commands into terminals that were basically windows into a distant brain. All the intelligence sat somewhere else, and you just got text back.
PC era: Software became physical products you'd buy in boxes, install from floppy disks or CDs, and run entirely on your own machine. Suddenly computing power lived under your desk, and applications could use rich graphical interfaces instead of just green text on black screens.
Web era: Applications moved into browsers accessed through URLs, shifting from installed software to services that updated automatically. No more version numbers or install wizards, just type an address and you're using the latest version built out of cross-platform Web standards UI components.
Mobile era: Applications shrank into task-focused apps downloaded from curated stores, designed for fingers not mice, and aware of your location and orientation. Computing became something in your pocket that could make use of the environment around you through cameras, GPS, and on-device sensors.
AI era: Instead of screens and buttons, applications are conversations where AI models understand intent, execute complex tasks, and adapt to context without explicit programming for every scenario. And we're just getting started.
While it's true that AI applications sound a lot like the mainframe applications of old, those apps required exact syntax and returned predetermined responses. AI applications understand natural language and generate solutions on the fly. They don't just process commands, they reason through problems and build UI as needed.
During each of these platform shifts, companies react the same way. They attempt to port the application models they had without thinking through and embracing what's different. Early Web sites were posters and brochures. Early mobile apps were ported Websites. Just like early TV shows were just radio shows with cameras pointed at them.
But at the start of a technology platform shift, how applications will change isn't clear. It takes time for new forms to develop. As they do most companies will end up rebuilding their apps like they did for the Web, mobile, and more. Companies that embrace new capabilities and modes of building early on can gain a foothold and grow. That's why technology shifts are accompanied by a surge of new start-ups. Change is opportunity.
Five Paths to Solving Robotics
In his AI Speaker Series presentation at Sutter Hill Ventures, Google DeepMind's Ted Xiao outlined five worldviews on how to achieve useful, ubiquitous robotics and dug into his team's work integrating frontier models like Gemini directly into robotic systems. Here are my notes from his talk:
We're at a unique moment in robotics where there's no consensus on the path forward. Unlike other AI breakthroughs where approaches quickly consolidated, robotics remains wide open with multiple reasonable paths showing early signs of success. Ted presented five worldviews, each with smart researchers and builders pursuing them with conviction:
Industry Incumbent
These researchers believe general-purpose robotics is the wrong goal. Purpose-built solutions actually work today - from industrial automation to appliances we don't even call robots anymore. When robotics succeeds, we just call them tools. The path forward: directly optimize for specific use cases using decades of control theory and hardware expertise.
Humanoid Company
These researchers see hardware as the primary bottleneck. Once platforms stabilize, researchers excel at extracting performance - drones went from fragile research prototypes to consumer products, quadrupeds became robust commercial platforms. Humanoid form factors matter because the world is built for humans, and human-like robots can better leverage internet-scale human data.
Robot Foundation Model Startup
These researchers focus on robot data and algorithms as the key. Generality is non-negotiable - transformative technologies are general by nature. The core challenge: building an "internet of robotics data" either vertically (solve one domain completely, then expand) or horizontally (achieve robotics' GPT-2 moment first, then improve).
Bitter Lesson Believer
These researchers argue frontier models are the only existence proof of technology that can model internet-scale data with human-level performance. You can't solve robotics without incorporating these "magical artifacts" into the exploration process. Frontier model trends and compute lead robotics by about two years.
AGI Bro
These researchers take the most radical position: just solve AGI and ask it to solve robotics. The Platonic Representation Hypothesis suggests that as AI models improve across domains, their internal representations converge. Perfect language understanding might inherently include physical understanding.
Gemini Robotics
Ted's team at Google DeepMind pursued the Bitter Lesson approach, building robotics capabilities directly into Gemini rather than treating frontier models as black boxes.
Their Gemini Robotics system first enhanced embodied reasoning - teaching the model to understand the physical world better through 2D bounding boxes in cluttered scenes, 3D understanding with depth and orientation, pointing for granular precision, and grasp angles for manipulation. The system then learned low-level control with diverse robot actions, operating at 50Hz control frequency with quarter-second end-to-end latency. This unlocked three key advances:
- Interactivity: The robot responds to dynamic scenes, following objects as they move and adapting to human interference
- Dexterity: Beyond rigid objects, it can fold clothes, wrap headphone wires, and manipulate shoelaces
- Generalization: Handles visual distribution shifts (new lighting, distractors), semantic variations (typos, different languages), and spatial changes (different sized objects requiring different strategies)
When deployed at a conference with completely novel conditions - crowds, different lighting, new table - the system maintained reasonable behavior for arbitrary user requests, showing sparks of that GPT-2 moment where it attempts something sensible regardless of input.
Dark Horses and Emerging Paradigms
Several emerging paradigms could completely upend current approaches:
- Video World Models learning physics without robots through action-conditioned video generation
- Robot-Free Data from simulation or humans with head-mounted cameras
- Thinking Models applying frontier models' reasoning capabilities to robotics
- Locomotion-Manipulation Unity bridging RL-based locomotion with foundation model manipulation
There's no consensus on which path will win. Each approach has reasonable arguments and early signs of success. The lack of agreement isn't a weakness - it's what makes this the most exciting time in robotics history.
Rethinking Applications for AI
With every new technology platform, the concept of an application shifts. Consider the difference between compiled apps during the PC era, online applications during the Web, and app stores during mobile. Now with AI it's happening again.
Before getting into the impact AI is having on applications, it's worth noting we still have downloadable desktop applications, Web applications, mobile app stores and everything in between. Technology platform shifts don't wipe out the past and they also don't happen overnight. So AI-driven changes, while happening fast, are going to be happening for a long time.
The basic components of an application have also stayed consistent for a long time. An application at its highest level is just running code and a database. The database stores the information an application manipulates and the running code allows you to manipulate it through input and output controls (user interface, auth, etc.).
As AI coding agents have gotten more capable, they've increasingly been able to handle more of the running code aspect of an application. Not only can they generate code, they can review it, fix it, and maintain it. So it's not hard to see how AI agents can be a self-sustaining loop.
As AI coding agents take on more and more of the running code aspect of an application, they increasingly need to create, update, and work with databases. Today's databases, however, were made for people to use, not agents. So we built a database system for AI applications called AgentDB designed for agents, not people.
AgentDB allows agents to manifest new databases by just referencing a unique ID, instead of filling out a series of forms like people do when creating a database. It also provides agents with templates that let them start using databases immediately and consistently across use cases. These templates are dynamic, so as agents learn new or better ways to use a database, that information is passed on to all subsequent agent use.
With these two changes, the concept of an application is already shifting. But what if the idea of needing "running code" is also changing? By fronting an AgentDB database and template system with a remote Model Context Protocol (MCP) server: all you need is a URL plus an AI model to have an app.
In this video, I demonstrate uploading a CSV file of a credit card statement to AgentDB. The system creates a database and template, then encapsulates both with a remote MCP server URL that you can add to any AI application that supports remote MCP, like Claude, Cursor, Augment Code, etc. The end result is an instant chat app.
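For developers wiring this up in their own application instead of pasting the URL into Claude or Cursor, connecting to such a remote MCP server might look roughly like the sketch below. It assumes the MCP TypeScript SDK's client API and uses a made-up AgentDB URL; the real endpoint comes from your AgentDB account.

```typescript
// Minimal sketch, assuming the MCP TypeScript SDK (@modelcontextprotocol/sdk).
// The AgentDB URL below is a placeholder, not a real endpoint.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

const AGENTDB_MCP_URL = "https://example.agentdb.dev/mcp/<your-database-id>";

async function main() {
  const client = new Client({ name: "statement-chat", version: "0.1.0" });
  await client.connect(new SSEClientTransport(new URL(AGENTDB_MCP_URL)));

  // Discover the tools the AgentDB template exposes (queries, writes, etc.)
  // so the AI model can decide which one to call for a given request.
  const { tools } = await client.listTools();
  console.log(tools.map((tool) => tool.name));
}

main().catch(console.error);
```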
Through natural language instructions, you can read and write data immediately and consistently and ask for any variant of user interface you want. Most credit card websites are painfully limiting but now I can create the specific visualizations, categories, queries, and features I want. No waiting around for the credit card site to implement new code.
You also don't need a CSV file to make an app. Just tell an AI model connected to AgentDB what you want. It can use AgentDB to create a database, populate it, and then ensure anything you add to it includes the right information. Tracking the date, location, and cost of concert tickets? AgentDB will enforce all that info is there and if you add a new bit of data to track, it can update all your records (see video below).
You can try making your own chat app from a database or CSV file at the demo page on AgentDB to get a feel for it. There are definitely some rough edges, especially when trying to add a remote MCP server to some AI applications (in fact, this whole step should go away), but it's still pretty compelling.
As I mentioned at the start, we don't fully know how the AI platform shift will transform applications yet. Clearly, though, there are big changes coming.
Dynamic Context for AI Agents
For AI applications, context is king. So context management, and thereby context engineering, is critical to getting accurate answers to questions, keeping AI agents on task, and more. But context is also hard earned and fragile, which is why we launched templates in AgentDB.
When an AI agent decides it needs to make use of a database, it needs to go through a multi-step process of understanding. It usually takes 3-7 calls before an agent understands enough about a database's structure to accomplish something meaningful with it. That's a lot of time and tokens spent on understanding. Worse still, this discovery tax gets paid repeatedly. Every new agent session starts from zero, relearning the same database semantics that previous agents already figured out.
Templates in AgentDB tackle this by giving AI agents the context they need upfront, rather than forcing them to discover it through trial and error. Templates provide two key pieces of information about a database upfront: a semantic description and structural definition.
The semantic description explains why the database exists and how it should be used. It includes mappings for enumerated values and other domain-specific knowledge. Think of it as the database's user manual written for AI agents. The structural component uses migration schemas to define the database layout. This gives agents immediate understanding of tables, relationships, and data types without needing to query the system architecture.
With AgentDB templates, agent requests like "give me a list of my to-dos" (to-do database) or "create a new opportunity for this customer" (CRM database) work immediately.
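Here's a minimal sketch of those two template parts for the to-do example. The property names and overall shape are assumptions for illustration, not AgentDB's actual template format.

```typescript
// Illustrative template with the two parts described above; AgentDB's real
// format may differ.
const todoTemplate = {
  // Semantic description: the database's "user manual" written for agents.
  description:
    "Personal to-do list. status is one of 'open' | 'done' | 'blocked'. " +
    "Use due_date (ISO 8601) when answering scheduling questions.",

  // Structural definition: migration schemas that declare tables and types,
  // so agents don't need several discovery queries before doing useful work.
  migrations: [
    `CREATE TABLE todos (
       id INTEGER PRIMARY KEY,
       title TEXT NOT NULL,
       status TEXT NOT NULL DEFAULT 'open',
       due_date TEXT
     );`,
  ],
};
```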
Once you've defined a template, it works for any database that follows that pattern. So one template can provide the context an AI agent needs for any number of databases with the same intent. Like a to-do list database for every user, to keep with the earlier example.
But static instructions for AI agents only go so far. These are thinking machines after all. So AgentDB templates can evolve with use. For example, a template can be dynamically updated with specific queries that worked well. This creates a feedback loop where templates become more effective over time, learning from real-world usage to provide better guidance to future AI interactions.
AgentDB templates are provided to AI agents as an MCP server which also supports raw SQL access. So AI agents can make use of a database effectively right away and still experiment through querying. AgentDB templates are another example of designing software for AI systems rather than humans because they're different "users".
Prompt Building User Interfaces
Perhaps the biggest problem facing AI products today is: people don't know all the things these products can do nor how to get the best results out of them. Not surprising when you consider most AI product interfaces are just empty text fields asking "what do you want to do?". Prompt building user interfaces can help answer that question and more.
We've been exploring ways to help people understand what's possible and how to accomplish it in Bench. Bench is AI for everyday work tasks. As such, it can do a lot: search the Web, browse the Web as you (with a browser extension), generate reports, make PowerPoint, use your email, and many more of the things that make up people's daily work tasks. The problem is... that's a lot.
To give people a better sense of what Bench can do, we started with suggested prompts (aka instructions) that accomplished specific work tasks. To make these as relevant as possible, we added an initial screen to the Bench start experience asking people to specify their primary roles at work: Engineering, Design, Sales, etc. If they did, the suggested prompts would be reflective of the kinds of things they might do at work. For example Sales folks would see suggestions like: research a prospect, prep for a sales meeting, summarize customer feedback, and so on.
The problem with these kinds of high level suggestions is they are exactly that: too high level. Though relevant to a role, they're not relevant to someone's current work tasks. Sales teams are researching prospects but doing it in a way that's specific to the product they're selling and the prospect they're researching. Generic prompt suggestions aren't that useful.
To account for this, we attempted to personalize the role-based suggestions by researching people's companies in the background while they signed up. This additional information allowed us to make suggestions more specific to the industry and company people worked for. This definitely made suggested prompts more specific, but it also made them less useful. Researching someone's company gives you some context but not nearly the amount its employees have. Because of this, personalized suggested prompts felt "off". So we went back to more generic suggestions but made them more atomic.
Instead of encompassing a complete work task, atomic suggestions just focused on part of it: where the information for a work task was coming from (look at my Gmail, search my Notion) and what the output of a work task should be (create a Word Doc, make a chart). These suggestions gave people a better sense of Bench's capabilities. It can read my calendar, it can make Google sheets. Almost immediately, though, it felt like these atomic suggestions should be combine-able.
To enable this, we made a prompt rewriter that would change based on what atomic suggestions people chose. If they picked Use Salesforce and Create Google Doc, the rewriter would merge these into a single instruction that made sense: "Use [variable] from Salesforce to create a Google Doc". This turned the process of writing complex prompts into just clicking suggestions. The way these suggestions were laid out, however, didn't make clear they could be combined like this. They looked and felt like discrete prompts.
Enter the task builder. In the latest version of Bench, atomic suggestions have been expanded and laid out more like the building blocks of a prompt. People can select what they want to do, use, make, or any combination of the three. The prompt rewriter then stitches together a machine-written prompt along with some optional input fields people can fill in to provide more details about the work task they want to get done.
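As a rough sketch of the rewriting idea (not Bench's actual implementation), combining atomic selections into a single instruction could look like this, with the bracketed slot standing in for the optional detail fields:

```typescript
// Simplified illustration of merging atomic suggestions into one prompt.
// The names and wording are assumptions, not Bench's real rewriter.
interface TaskSelection {
  use?: string;  // data source, e.g. "Salesforce"
  make?: string; // output, e.g. "Google Doc"
  do?: string;   // action, e.g. "summarize this quarter's pipeline"
}

function rewritePrompt(sel: TaskSelection): string {
  const clauses: string[] = [];
  if (sel.use) clauses.push(`Use [details] from ${sel.use}`);
  if (sel.make) clauses.push(`to create a ${sel.make}`);
  if (sel.do) clauses.push(`that helps ${sel.do}`);
  return clauses.join(" ");
}

// rewritePrompt({ use: "Salesforce", make: "Google Doc" })
// -> "Use [details] from Salesforce to create a Google Doc"
```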
This prompt builder UI does a few things for people using Bench. It:
- makes what the product can do clearer
- provides a way to surface new functionality as it's added to the product
- rewrites people's prompts in a way that gets them to better outcomes
- clarifies what people can add to a prompt to make their tasks more effective
While that's a decent amount of good outcomes, design is never done and AI capabilities keep improving. As a result, I'm sure we're not done with Bench's task builder UI, nor with solutions for discoverability and prompting in AI products overall. In other words... more to come.
AI Has Flipped Software Development
For years, it's been faster to create mockups and prototypes of software than to ship it to production. As a result, software design teams could stay "ahead" of engineering. Now AI coding agents make development 10x faster, flipping the traditional software development process on its head.
In my thirty years of working on software, the design teams I was part of were typically operating "out ahead" of our software development counterparts. Unburdened by existing codebases, technical debt, performance, and infrastructure limitations, designers could work quickly in mockups, wireframes, and even prototypes to help envision what we could or should build before time and effort was invested into actually building it.
While some software engineering teams could ship in days, in most (especially larger) organizations, building new features or redesigning apps could take months if not quarters or years. So there was plenty of time for designers to explore and iterate. This was also reflected in the ratio of designers to developers in most companies: an average of one designer for every twenty engineers.
When designs did move to the production engineering phase, there'd (hopefully) be a bunch of back and forth to resolve unanswered questions, new issues that came up, or changing requirements. A lot of this burden fell on engineering as they encountered edge cases, things missing in specs, cross-device capability differences, and more. What it added up to though, was that the process to build and launch something often took longer than the process to design it.
AI coding tools change this dynamic. Across several of our companies, software development teams are now "out ahead" of design. To be more specific, collaborating with AI agents (like Augment Code) allows software developers to move from concept to working code 10x faster. This means new features become code at a fast and furious pace.
When software is coded this way, however, it (currently at least) lacks UX refinement and thoughtful integration into the structure and purpose of a product. This is the work that designers used to do upfront but now need to "clean up" afterward. It's like the development process got flipped around. Designers used to draw up features with mockups and prototypes, then engineers would have to clean them up to ship them. Now engineers can code features so fast that designers are the ones going back and cleaning them up.
So scary time to be a designer? No. Awesome time to be a designer. Instead of waiting for months, you can start playing with working features and ideas within hours. This allows everyone, whether designer or engineer, an opportunity to learn what works and what doesn't. At its core, rapid iteration improves software, and the build, use/test, learn, repeat loop just flipped; it didn't go away.
In his Designing Perplexity talk at Sutter Hill Ventures, Henry Modisett described this new state as "prototype to productize" rather than "design to build". Sounds right to me.
Designing Software for AI Agents
From making apps and browsing the Web to creating files, today's AI agents can take on an increasing number of computing tasks on their own. But the software underlying these capabilities wasn't made for agents. It was designed and built for people to use. As such there's an opportunity, and perhaps an increasing need, to rethink these systems for agent use.
When building agent-based AI applications, you'll likely butt up against a number of situations where existing software isn't optimized for what thinking machines can do. For instance, Web search. Nearly every agent-based AI application makes use of information on the Web to get things done. But Web Search APIs weren't written with agents in mind.
They provide a limited number of search results and a condensed snippet format that lines up more with how people use Web search interfaces. We get a page of ten blue links and scan them to decide which one to click. But AI agents aren't people. Not only can they make sense of many more search results at once, but their performance usually improves with larger document summaries and contents. People on the other hand, are unlikely to read through all search results before making a decision. So search APIs could certainly be rethought for agents.
Similarly, when agents are developing applications or collecting data, they can make use of databases. But once again databases were designed and built for people to use not AI agents. And once again they can be rethought for agents, which is what we did with our most recent launch: AgentDB.
Agents can (and do) produce 1000x more databases than people every day, so the process of spinning up and managing any database for an agent needs to be as easy and maintenance-free as possible. Most of the databases AI agents create will be short-lived after serving their initial purpose. But some databases will be used again and others still will be used regularly.
With this kind of volume costs can become an issue, so keeping that many databases available needs to be as cost effective as possible. Last but not least, the content of databases needs to work well as context for AI models so agents can use this data as part of their tasks.
AgentDB is a database system designed around these considerations. With AgentDB, creating a database only requires a Universally Unique Identifier (UUID). There's no setup or configuration step. So whenever an AI agent decides it needs a database, it has one simply by creating a UUID. No forms or set-up wizards involved.
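Conceptually, that's as lightweight as the sketch below: generating an identifier is the whole "create a database" step. The URL pattern is hypothetical; only the reference-an-ID-and-it-exists idea comes from AgentDB.

```typescript
// Conceptual sketch: the database exists as soon as an agent references an ID.
// The endpoint pattern below is made up for illustration.
import { randomUUID } from "node:crypto";

const databaseId = randomUUID();

// No create-database form or setup wizard; the agent can start reading and
// writing against this reference immediately.
const databaseRef = `https://example.agentdb.dev/v1/databases/${databaseId}`;
console.log(databaseRef);
```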
Databases in AgentDB are stored as files, not hosted services requiring compute and maintenance. If an AI agent needs to query a database or append to it, it can. But if it never needs to access it again, the database is just a file. That means you're only paying for the cost of storage to keep it around. And because AgentDB databases are just files, they scale, meaning they can easily keep up with the scale of AI agents.
To make data within each AgentDB database easily accessible as context for AI models, every AgentDB account is also an MCP server. This makes the data portable across AI applications as long as they support MCP server connections (which most do).
Altogether this example illustrates how even the most fundamental software infrastructure systems, like databases, can be rethought for the age of AI. The AgentDB database system doesn't look like a hosted database as a service solution because it's not designed and built for database admins and back-end developers. It's built for today's thinking machines.
And as agents take on more computing tasks for people, it won't be the only software made with agents as first class users.
Context Management UI in AI Products
They say context is king and that's certainly true in AI products where the content, tools, and instructions applications provide to AI models shape their behavior and subsequent results. But if context is so critical, how do we allow people to understand and manage it when interacting with AI-driven software?
In AI products, there's a lot of stuff that could be in context (provided to an AI model as part of its instructions) at any given point, but not everything will be in context all the time because AI models have context limits. So when getting results from AI products, people aren't sure if or how much they should trust them. Was the right information used to answer my question? Did the model hallucinate or use the wrong information?
When I launched my personal AI two years ago, context was much simpler than it is today. In Ask LukeW, when people ask a question about digital product design, the system searches through my writings, finds and puts the most relevant bits into context for AI models to use and reference, then cites them in the results people see. This is pretty transparent in the interface: the articles, videos, audio, and PDFs used are shown on the right with citations within each response to where these files were used the most.
The most complicated things get in Ask LukeW is when someone opens one of these cited articles, videos, or PDFs to view its full contents. In this case, a small "context chip" is added to the question bar to make clear questions can be asked of just this file. In other words, the file is the primary thing in context. If someone wants to ask a question of the whole corpus of my writings and talks again, they can simply click on the X that removes this context constraint and the chip disappears from the question bar. You can try this out yourself here.
Context chips are pretty common in AI products today because they're a relatively easy way to both give people a sense of what's influencing an AI model's replies and to add or remove it. When what's in context expands, however, they don't scale very well. For example, Augment Code uses context chips for retrieval systems, active files, selected text, and more.
Using a context chip to display everything influencing an AI model's response begins to break down when many things (especially different things) are in context. Displaying them all eats up valuable space in the UI and requires that their names or identifiers are truncated to fit. That kind of defeats the purpose of "showing you what's in context". Also, when AI products do automatic context retrieval like Augment Code's context retrieval engine: does that always show up as a chip? Or should people not worry about it and trust the system is finding and putting the right things into context?
With AI products using agents these issues are compounded because each tool call an agent makes can retrieve context in different ways or multiple times. So showing every bit of context found or created by tools as a context chip quickly breaks down. To account for this in earlier versions of Bench, we showed the context from tools used by agents as it was being created. But this turned out to be a jarring experience as the context would show up then go away when the next tool's context arrived (as you can see in the video).
Since then, we've moved to showing an agent's process of creating something as condensed steps with links to the context in each step. So people can click on any given steps to see the context a tool either found or created. But that context isn't being automatically flashed in front of them as it's made. This lets people focus on the output and only dig into the process when they want to understand what led to the output.
This approach becomes even more relevant with agent orchestration. When agents can make use of agents themselves, you end up with nested amounts of context. Told you things were a lot simpler two years ago! In these cases, Bench just shows the collective context combined from multiple tool calls in one link. This allows people to examine what cumulative context was created by sub agents. But importantly this combined context is treated the same way - whether it comes from a single tool or a subagent that uses multiple tools.
While making context understood and manageable feels like the right thing to provide transparency and control, increasingly people seem to focus more on the output of AI products and less on the process that created them. Only when things don't seem "right" do they dig into the kinds of process timelines and context links that Bench provides. So if people become even more confident using AI products, we might see context management UIs with even less presence.
What Do You Want To AI?
Alongside an increasing sameness of features and user interfaces, AI applications have also converged on their approach to primary calls to action: "What Do You Want To ___?" But is there a better way... especially for more domain specific applications?
Looking across AI products today, most feature an open-ended text field with an equally open-ended call to action:
- What do you want to know?
- What can I help with?
- What do you want to create?
- What do you want to build?
- What will you imagine?
- Ask anything...
- Ask a question...
- Ask [AI tool]...
So many questions. I've even turned them into a running joke. When a financial company integrates their AI: "What do you want to bank?" or "What do you want to accountant?" Silly, I know, but it illustrates the issue. People often don't know what AI products can do or how best to instruct or prompt them. Open-ended questions just exacerbate the issue.
It may be a small detail, but instead of asking, how about instructing? Reve's image creation call to action says: "Describe an image or drop one here...". Bench's AI-powered workspace starts with: "Describe the task you want Bench to do...". Both calls to action are still open ended enough to capture the kind of broad intent AI models can handle. But perhaps there's something to having a bit more guidance beyond "What Do You Want To AI?"
More on Generative Publishing
One of the most common questions people ask my personal AI, Ask LukeW, is "how did you build this?" While I've written a lot about the high-level architecture and product design details of the service, I never published a more technical overview. Putting one together highlighted enough interesting generative publishing ideas that I decided to share a bit about the process.
First of all, Ask LukeW makes use of the thousands of articles I've written over the years to answer people's questions about digital product design. Yes, that's a lot of writing, but it's not enough to capture all the things I've learned over the past 30 years. Which means people sometimes ask Ask LukeW questions that I could answer but haven't written about.
In the admin system I built for Ask LukeW, I can not only see the questions that don't get answered well but also add content to answer them better in the future. Over the last two years, I've added about 500 answers, significantly expanding the corpus Ask LukeW can respond from. So the next time similar questions get asked, people aren't left without answers.
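As a rough illustration of that loop, here's a minimal TypeScript sketch: flag a question that wasn't answered well, write an answer, and add it to the corpus retrieval draws from. The type and function names are made up for the example and aren't the real admin system.

```typescript
// Hypothetical sketch of the admin loop: review poorly answered questions and
// save new answers back into the corpus so similar future questions get covered.

interface UnansweredQuestion {
  id: string;
  text: string;
  askedCount: number; // how often similar questions come up
}

interface SavedAnswer {
  question: string;
  answer: string;
  addedAt: Date;
}

const corpus: SavedAnswer[] = [];

function addSavedAnswer(question: UnansweredQuestion, answer: string): void {
  // Once added, retrieval can surface this answer for similar future questions.
  corpus.push({ question: question.text, answer, addedAt: new Date() });
}
```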
That process is an interesting part of generative publishing that I've written about before, but it's also how I know people regularly ask how I built Ask LukeW. They want technical details: what frameworks, what models, what services. I never wrote this up because I'm not that technical and several great engineers helped me build Ask LukeW. As a result, I didn't think I'd do a great job detailing the technical side of things.
But one day it occurred to me that I could use Augment Code, our AI-for-code company, which has a deep contextual understanding of codebases, to help me write up how Ask LukeW works. I opened the codebase in VS Code and asked Augment the questions people ask me: "how does the feature work?" "what is the codebase?" "what is the tech stack?" and got great, detailed responses.
Augment, however, doesn't answer questions the way I do. So I took Augment's detailed technical replies and dropped them into another one of our companies, Bench. A while back I had Bench read a lot of my blog posts and create a prompt that writes articles the way I would. I've saved this prompt in Bench's agent library and can apply it anytime I want something written in my voice.
Once Bench had rewritten Augment's technical details of how Ask LukeW works the way I'd explain them, I took the results and added them as saved answers to the Ask LukeW corpus. Now anytime someone asks these kinds of questions, they get much more detailed technical answers. In fact, this worked so well that I also asked Augment to write up the overall tech stack for my Website and went through the same process.
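Pulled together, the workflow looks roughly like the sketch below. The askAugment, rewriteWithBench, and addSavedAnswer functions are placeholders standing in for steps I actually do by hand across separate tools; none of them are real APIs.

```typescript
// Hypothetical sketch of the generative publishing pipeline described above:
// code-aware answers from one tool, rewritten in my voice by another, then saved
// back into the corpus my personal AI answers from.

async function publishTechnicalAnswer(question: string): Promise<void> {
  // 1. Ask a code-aware assistant the technical question against the codebase.
  const technicalDraft = await askAugment(question);

  // 2. Rewrite the draft with a saved "write like me" prompt.
  const styledAnswer = await rewriteWithBench(technicalDraft, "write-like-luke");

  // 3. Save the result so future similar questions get this answer.
  await addSavedAnswer(question, styledAnswer);
}

// Placeholder signatures standing in for manual steps done in separate tools.
declare function askAugment(question: string): Promise<string>;
declare function rewriteWithBench(draft: string, promptId: string): Promise<string>;
declare function addSavedAnswer(question: string, answer: string): Promise<void>;
```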
I, for one, found this a really enlightening look at where generative publishing is now. I can see what kinds of information I should be publishing by looking at the questions people ask my personal AI but don't get good answers to. I can use an AI coding tool to turn code into prose. I can use an agentic workspace to rewrite that prose the way I would because I taught it to write like me. And finally, I can feed that content back into my overall corpus so it's available for any similar questions people ask in the future.
That doesn't look like the publishing of old to me. Of course, it's split between multiple tools, requires me to know what each one can do, and has a host of other issues. We're still early but it's exciting.