Internet News

Portfolio Presentations & Design Work

LukeW - Wed, 03/13/2024 - 2:00pm

Portfolio presentations are an opportunity for designers to showcase their design process and problem-solving skills to potential employers, peer groups, and more. Over the years, there's been a clear trend in the portfolio presentations I see: much more focus on doing "work" vs. "design work." Here's what that means and how we're trying to account for it:

In large organizations, getting design done requires a lot more than flow diagrams, screen designs, and prototypes. There's a long list of meetings, processes, collaborations, and sign-offs to surmount before a design gets shipped. Because this kind of work takes so much time and effort, designers begin to view it as their primary job. But being great at navigating an organization doesn't necessarily mean being great at design.

This carries over to portfolio presentations as well. In an hour-long presentation, most of the time goes to describing organizational challenges or processes, and little is left for design skills. Couple this with the prevalence of design systems and UI toolkits, and it becomes hard to know how a designer designs and why.

To account for this situation, I wrote a preface for designers coming to interview with us. Several of them suggested I publish it to be more widely useful. So here's the relevant part (below) and I hope it's helpful.

While we understand the need to walk through background and work history, we’ve all read your resume before you come in. So you can keep your introduction brief and perhaps focus on relevant parts of your background that don’t show up on LinkedIn.

These days it's especially hard to get a clear sense of how designers make decisions and bring ideas to life due to the scale of tech companies (so many processes and stakeholders) and the prevalence of design systems and UI toolkits. We’re building companies from the ground up so getting to see your core design skills is critical for us. In many organizations, especially larger ones, a big part of getting design done requires cross-team coordination, resource management, getting buy-in, and more. While this certainly demonstrates your ability to get things done it’s more of a reflection on your ability to operate within an organization, not your product design sense.

We often find designers over-index on that kind of “work” and end up without enough time and depth on “design work.” So try to strike the right balance. Understanding the context behind a design is critical to evaluating it but connecting the two is where we learn the most about how you work as a designer. When presenting your portfolio, focus on the concrete things that you've personally accomplished and the way you accomplished them. Go deep on a couple of examples to provide insight into how you make design decisions. Walk through the 'why' at a big picture level, and then the 'how' at a detailed level.

To communicate your product design skills, answer questions like: why did you decide on a specific design solution? What iterations did you go through to get to it? Basically connect the pixel-level process to your understanding of business, product, and user goals. How did your unique contributions as a designer, not just as an employee or team member, make the kind of impact you intended?

Generation IT

LukeW - Tue, 03/12/2024 - 2:00pm

If you get the call when both your parents and children need help with their computers, you're part of Generation IT (thanks Sam). Jokes aside, as technology gains more layers, there are increasing concerns about who will maintain the hardware and software systems we all increasingly depend on.

In his talk on Life Post-Moore's Law, Mark Horowitz pointed out that in the not-too-distant past, groups of students could build a whole microprocessor. Today that process involves huge teams and hundreds of millions of dollars, making it inaccessible to the next generation of hardware designers. The implication, according to Mark, is:

"Because chip design and designers are a smaller group of people that are getting more gray hair, we're going to end up in a universe where we all are dependent on technology that none of us understand how it works."

Many other areas of technology have similarly gotten much more complex and added many layers of abstraction. While today's kids grow up with smartphones and the Internet, they don't need to know the inner workings of these systems; they use them at the highest level of abstraction. Contrast this with the generation that built all these systems and thereby knows all the layers in the stack.

Did you make Web pages by writing HTML, CSS, and JavaScript or through one of the many frameworks available today? Did you crank out mockups in Photoshop pixel by pixel or move the components of a design system around in Figma? Did you write code on SPARCstations or have ChatGPT do it for you?

Of course, these tools have made technology "easier" to access and use for many people. I'm not suggesting we all go back to punched-card interfaces. But there's a lot of learning that happens when you do things the hard way. Getting down to the fundamentals and then building back up again provides an understanding that you can't otherwise get.

"Art does not begin with imitation, but with discipline."—Sun Ra, 1956

People in Generation IT have likely had to explain how to connect a printer to both the generation before and after them. If that's our lot in life, ok. But if the next generation doesn't end up with the interest or ability to maintain our increasing archive of printer drivers... things are going to start breaking a lot.

Pinball User Interface

LukeW - Sun, 03/10/2024 - 2:00pm

Using software can be hard. All those form fields, menu items, interactive widgets and more... continually changing. So why make it harder for people by strewing all these user interface elements around a screen? This happens enough in applications that it needs a name. So let's call it Pinball UI.

Pinball UI happens when the various user interface elements on an application screen could be arranged meaningfully... but are not. An intentional layout of user interface elements can help people make their way through a process (like filling in a form) or make the relationships between various bits of content and actions clear. Designers use visual relationships to make these functional relationships clear to users. In Pinball UI, they do not.

Let's illustrate with an example. The "Complete Your Payment" form on PayPal is a pretty critical screen. You've found something you want to buy and are ready to pay. But instead of doing so quickly and effectively... you're playing pinball. The various user interface elements required to complete the payment process are tossed about the screen and people have to go hunting for what's next.

In this proposed redesign of the PayPal form, the steps required to complete a payment follow a clear path to completion. The headers, form elements, content, and primary action are all clearly aligned in a simple sequence. No darting around the screen to figure out what's next or how things are related.

This example from PayPal was featured in my 2008 book, Web Form Design, alongside user research and eye-tracking data highlighting why clear paths to completion reduce errors, speed up annoying processes (like filling in Web forms), and give people more confidence in their actions. So now it's 2024 and Pinball UI is a thing of the past, right?

Looking at Google's new Sign Up form, maybe not. Although there's a lot less UI on Google's form than on PayPal's payment form, the Pinball UI is still there. Our eyes bounce around the screen to make progress. With just a few UI elements on the screen, this could be easily fixed:

Maybe there are other constraints driving Google's layout decisions that aren't apparent to those of us looking at it from the outside... or maybe they just like pinball.

The Most Important Startup Skill

LukeW - Fri, 03/01/2024 - 2:00pm

Lots of things make startups hard... building teams, shipping products, finding customers, earning revenue... to name just a few. But when there's an endless list of things to do what takes priority? I'd argue it's getting good at learning.

When you start a company, you may have many ideas but you don't have any proven answers. You don't know who your customer is going to be, what product you're going to make, which features are going to matter, what your customer will pay, and so on. Therefore, the most important thing to get good at is finding answers to all these questions.

How do you do that? You get really good at learning.

Luckily there's lots of ways to learn rapidly. Get in front of potential customers early and often. Iterate on design, prototypes, and product continuously. Collect data by using your products, watching others use them, collecting quantitative and qualitative data, and acting on what you see.

To get everyone excited about learning, let them share and celebrate their discoveries. My favorite way of doing this is with a regular "what did we learn this week?" meeting that allows designers, researchers, engineers, and more to highlight what they found out that week about our customers, products, technologies, etc.

This simple process only takes an hour (do it over lunch) and goes a long way to improving how well a company learns.

Life Post-Moore's Law

LukeW - Wed, 02/28/2024 - 2:00pm

In his AI Speaker Series presentation at Sutter Hill Ventures, Mark Horowitz discussed the current state of hardware chip design and scaling along with significant challenges as Moore's law comes to an end. Here's my notes from his talk:

  • We don't recognize how pervasive the notion that computing will get cheaper in the future is. Everybody's building more complicated models that take longer to compute, and the expectation is that that's OK, computers will be able to compute them.
  • The driver of this expectation is Moore's law, but most people don't understand that Moore's law is really about cost per function. That cost scaling is not what it used to be.
  • When transistor costs scale, making the same product in a new technology is cheaper. That meant you always moved all products to the most advanced technology.
  • But that's not happening anymore, so Moore's law has ended. Cost per transistor is now on a linear scale; it's supposed to be on a log scale.
  • If you plot the cost per bit over the past 60 years, DRAM prices are relatively flat. The hard drive has really bottomed out in terms of cost per bit. The only thing that's still scaling is SSDs.
  • So scaling today is just a marketing label. People are expecting better performance, but basic technology scaling is not the way to get it.
  • What we need to do is increase efficiency and the only way we know how to increase efficiency is to increase customization for a particular end application. We need to tailor certain things for certain markets.
  • We used to be able to build one universal thing, and now we need to build these different little products without bankrupting ourselves.
  • Chiplets are not the answer. They are interesting and useful technology but won't solve the base problem.
  • We need to do application optimization but who's going to do this optimization?
  • Groups of students in the past could build a whole microprocessor.
  • The systems that are competitive today require investments of hundreds of millions of dollars. Most of the cost is the firmware, the basic interfaces because that's where all the complexity is.
  • Because of this complexity, the number of companies in the silicon space is decreasing. And student interest in hardware is decreasing because it's so opaque to them and the level of complexity that they need to make contributions is very high.
  • Given the situation, we need innovation now more than ever before. And we have nobody to do it.

  • To get great improvements in application optimization, we need radical thinking. In 95% or 99% of cases it doesn't work, but in the remaining few percent there's something interesting.
  • If every experiment costs you $100 million, you're not going to find the few percent of ideas that are actually good.
  • So can we make this exciting again and bring in new people? And make it cheap so that people can actually do it?
  • The good news is this happened before. In the 70s there were only custom chip designs. A bunch of crazy people in the 80s had this idea not to help the custom designers, but to enable another group of people who were interested in hardware design to basically build chips. Those were the logic designers, the people who used our chips and put them on boards.
  • To do that they had to create a whole different level of tooling that interfaced with people not thinking about chip architecture. These tools created really crappy chips but in 10 years it enabled a vibrant design community and the tools improved.
  • Now nobody does custom.
  • So we need a new group of people to throw spaghetti against the wall, because some of it might actually be useful.
  • We need to do hardware-software co-design with performance engineers in the application space. They have no knowledge of hardware, but they know about locality, parallelism, and metrics.
  • These users need a system to interact with: an open interface to a proprietary platform so people can make money. We need to figure out how to map an application to hardware automatically.
  • Application designers need feedback at the level of their software about where the bottlenecks are.
  • All this stuff is hard but it doesn't seem like it is impossible. And if we're going to make forward progress, we really do fundamentally need to change the way we think about design.
  • Because chip design and designers are a smaller group of people that are getting more gray hair, we're going to end up in a universe where we all are dependent on technology that none of us understand how it works.

ConveyUX: Three Conversations in Design

LukeW - Tue, 02/27/2024 - 2:00pm

In his Three Conversations in Design presentation at Convey UX, Andrew Hogan shared trends in user experience jobs, scaling, and the impact of AI on designers. Here are my notes from his talk:

  • Between 2008 and 2018, there was a decrease in the number of industrial design jobs but a sharp increase in user experience design jobs.
  • With AI, are we at that same moment with digital design jobs? There are 118,000 people employed in digital design in the US (a new category added in 2022). Projections have looked good, but they also did for industrial design when it began to decline. UX jobs are down relative to their peak. The main driver is technology unemployment claims.
  • So what’s going to happen now? Industries like utilities, finance, and government are increasing while tech is decreasing. So digital transformation across many industries will keep designers busy.
  • Scaling design teams requires new roles and expansion of roles. 5% of Fortune 1000 companies have chief design officer roles; in banking, that’s 50%.
  • These design organizations have been growing significantly and scaling roles in design systems, content, design ops, accessibility, and more. But hiring a lot isn't the solution. Adding lots of designers creates common challenges: collaboration across teams, career progression, and understanding impact.
  • The relationship between a PM and a designer is as predictive of how designers feel about their job as their relationship with their direct manager.
  • AI is increasingly part of the design process. 65% of designers use AI in their design process and regularly find it speeds things up.
  • But most of design is about communication: what are we doing for who and how? Jambot in Figma brings generative AI features into the canvas to summarize, rewrite, code, etc. But the multi-player aspect still needs to get worked out.
  • Practitioners with a sense of what good is can evaluate generative output but what about new practitioners that don’t yet know how to assess the quality of AI output? Designing AI is becoming a job. Walmart, New York Times, and more are specifically hiring designers to add AI into their products.
  • What’s the impact of AI on design and design on AI? Will we feel a spike in AI design interest and jobs?
  • More AI interactions lead to more energy going into designing AI. So these systems will get better. We don't know what's next, but it will definitely be interesting.

ConveyUX: Good UX is Good Business

LukeW - Mon, 02/26/2024 - 2:00pm

In her Good UX is Good Business talk at Convey UX, Amy Lanfear outlined the journey the Microsoft Security team's joint UX group underwent to make their business impact clear to executives. Here are my notes from her talk:

  • In January 2022, a brand new division was formed at Microsoft for all the security products, and they decided to centralize UX.
  • The ultimate goal for the engineering executives was to drive business impact. So how can the UX team show their value toward that end?
  • To communicate that impact, the UX team had to speak a language that engineering executives understood: data.
  • How do you start to apply data in everything that you're doing? OKRs. They're a metrics-based methodology for tracking goals that helps drive accountability.
  • The team articulated objective statements for culture, quality, operational excellence, and a few others.
  • For each, they asked why does it really matter? Why do we think it's important? Perhaps usability, accessibility, productivity, customer loyalty, etc.
  • Executives might not look at the world the same way. Their drivers are about revenue and driving the business forward. So how do you connect these two outlooks?
  • They set a hypothesis that products with high-performing UX metrics are going to build stronger business results. To prove it, they set about measuring their OKRs.
  • One objective was to deliver quality experiences that are easy to use, human-centered, and accessible, grounded in three metrics.
  • Accessibility should be a C grade or higher. For ease of use, the SEQ score should be six or higher on a seven-point scale. And so on...
  • The problem was each UX team measured things differently. Some measured NPS. Some measured task completion. It was just all over the map.
  • The Common UX Measurement Framework was started to make sure all product teams measure like things in like ways.
  • Then, across designers, researchers, technical PMs, software developers, and data scientists, they got to shared goals and a common scorecard that was reviewed quarterly.
  • In the review, the why questions that only humans can ask and answer are discussed. Shining a light on winning scenarios in this meeting helps other teams ask: what are you doing?
  • But UX metrics weren't enough. It was their correlation with business metrics that mattered: support incidents, CSAT, NPS, public perception, internal efficiencies.
  • The team just finished a pilot project that showed a 12% reduction in support over three months' time.
  • Now the team is using their data insights to shift left, which means getting things right upstream instead of waiting for problems and then fixing them later downstream.
  • Basically it's about bringing quality assurance closer to the beginning stage of the development cycle.
  • The next step is to get UX metrics into the product launch readiness criteria.
  • Though the things we measure today will be different tomorrow because the experiences will be different, this framework still holds true. The metrics might change because new experiences bring new things to measure, and in different ways.

ConveyUX: Building Expertise into Generated Conversations

LukeW - Mon, 02/26/2024 - 2:00pm

In her Building Expertise into Generated Conversations talk at Convey UX, Susan Hura talked about the impact of Large Language Models on the role and process of conversational designers. Here are my notes from her talk:

  • It is a weird time to be a conversation designer. Two years ago Siri and Google Assistant raised people's consciousness. People started to become willing to engage. But ChatGPT really changed the situation as people realized its capabilities.
  • So how are large language models and generative AI impacting this domain of design?
  • Speech is getting things in my head into your head. We used to think this was a metaphor, but when you measure the brain waves of two speakers, they match across a conversation. So conversation is really critical.
  • AI has been used in various parts of conversation for years. NLG: natural language generation, TTS: text to speech, ASR: automatic speech recognition (sounds to words), NLU: natural language understanding.
  • But we used to have to build custom language models to get to something that looks like understanding. It was a challenging issue. Large language models only need a few shots; you can train a model now with just dozens of examples.
  • Conversation is an overlearned behavior. We learn it very early in life (about a year old) and automatically. No one needs to teach us how to learn to speak.
  • We are designing for one of the most human of behaviors. There are all kinds of social, relationship, and emotional triggers in conversations.
  • The most important elements of conversation are not tied to language; it’s a back and forth. You are in a relationship in a conversation. When you get it right, it establishes trust.
  • We don’t want people to have to think about how to talk to a computer, instead computers need to play by the rules of conversation.
LLMs for Conversational Design
  • What we can do today is play to the strengths of generative AI models. They're great at synthesizing huge amounts of information so use them to help analyze large chunks of data.
  • For example, one of the things that's important in conversation design is establishing a conversational style guide. Mature organizations have UX writing guidelines and a voice and tone guide.
  • For smaller organizations, you can use LLMs to draft these resources from an organization's Website by analyzing how they talk to customers.
  • You can also use Generative AI to analyze unstructured user research results especially open-ended questions that take a long time to categorize.
  • Generative AI can also give us raw materials at design time. For example, the first deliverable for conversation design is often a sample dialogue. But once you get more than a small handful of them, it becomes a maintenance nightmare. So use a large language model to generate all the needed variants.
  • These are really quite low risk because they are used at design time. They just help us do our jobs better and faster.
  • Using Generative AI at run time is also possible, for instance to allow flexibility in entity collection. That is, to allow the user to give you any, none, or all pieces of information in whatever way they choose.
  • This used to require a lot of logic in code, but now it can be a single statement (see the sketch after these notes).
  • Generative AI can handle procedural elements of conversation. Could you hang on a second? Could you repeat that? You can build this one-off, but with the right model, you don't have to.
  • Generative AI can also provide a complete real-time FAQ from all the information you want it to draw answers from. With this, you can actually answer questions.
  • Very nonlinear, totally unscripted conversations are only possible with a fully autonomous AI agent.
  • Instead of writing code, just lay out the rules of the road and hand over the entire interactions: what gets said, the order in which it gets said, everything.
  • This is necessary for some use cases that don't seem all that complicated like shopping for a new TV.
  • You need an agent because these conversations are unscripted, nonlinear, and the user changes their mind: they back up, they go forward, they skip steps.
  • Prompt engineering needs to tell the bot, who are you? What is your role in the conversation that you're about to have with an end user? etc.
  • Evaluate anything that's generated. There aren't established guardrails yet but things like RAG and large context windows can help.
  • Know when Generative AI is not the right solution, like when you need security, privacy, and compliance.
  • Be aware of latency. Maybe you get away with it taking 3 or 5 seconds to get back to you. But even that isn't ideal for natural conversations.
  • When large language models fail, they fail in ways that do not make intuitive sense to us as humans, because LLMs don't really understand the world.
  • There's a difference between knowing the association between words and images and having that real-world grounding of what they mean.
  • This has implications for how we set up our users to talk to these things. Are we setting up these AI agents as artificial humans?
  • The problem with putting a counterfeit version of a human in front of people is that we can't stop imagining the mind behind the conversation.
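
To make the entity collection point in the notes above concrete, here is a minimal sketch of what "a single statement" can look like in practice. It assumes a hypothetical llm() wrapper around whichever large language model you use and borrows the TV-shopping example from the talk; it is not code from the presentation.

```python
import json

def collect_entities(user_utterance, llm):
    """Let the user supply any, all, or none of the slots in a single turn.
    `llm` is a hypothetical wrapper that sends a prompt to a large language
    model and returns its text response."""
    prompt = (
        "Extract the following fields from the customer's message if present: "
        "screen_size, budget, brand. Respond with JSON only, using null for "
        f"anything missing.\n\nMessage: {user_utterance}"
    )
    return json.loads(llm(prompt))

# Example (hypothetical output):
# collect_entities("Something around 55 inches under $800, no brand preference", llm)
# might return {"screen_size": "55 inches", "budget": "under $800", "brand": None}
```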

AI Models and Headsets

LukeW - Thu, 02/15/2024 - 2:00pm

We've all heard the adage that "the future is here, it just isn't evenly distributed yet." With rapid advancements in multi-modal AI models and headset computing, we're at a stage where the future is clear but isn't implemented (yet). Here are two examples:

Multi-modal AI models can take videos, images, audio, and text as input and apply them as context to provide relevant responses and actions. Coupled with a lightweight headset with a camera, microphone, and speakers, this provides people with new ways to understand and interact with the World around them.

While these capabilities both exist, large AI models can't run locally on glasses ...yet. But the speed and cost of running models keeps decreasing while their abilities keep increasing.

Video generation models can not only go from text to video and image to video, but also video to video. This enables people to modify what they're watching on the fly. Coupled with immersive video on a spatial computing platform, this enables dynamic environments, entertainment, and more.

Again, these capabilities exist separately, but the kind of instant immersive (high resolution) video generation needed for Apple's Vision Pro format isn't here... yet.

There are no Original Ideas. But...

LukeW - Wed, 02/14/2024 - 2:00pm

Mark Twain is believed to have said "There is no such thing as an original idea." The implication is that, as a species, we are constantly building on what came before us: inspired and driven by what we've seen and experienced. Personally, I like to phrase a similar sentiment as "There are no original ideas. But there are original executions."

Anyone who has been around product design has probably experienced idea fetish: the belief that a good idea is all you need to be successful. This was elegantly debunked by Steve Jobs in 1995.

"You know, one of the things that really hurt Apple was after I left John Sculley got a very serious disease. It’s the disease of thinking that a really great idea is 90% of the work. And if you just tell all these other people “here’s this great idea,” then of course they can go off and make it happen. And the problem with that is that there’s just a tremendous amount of craftsmanship in between a great idea and a great product.

Designing a product is keeping five thousand things in your brain and fitting them all together in new and different ways to get what you want. And every day you discover something new that is a new problem or a new opportunity to fit these things together a little differently.

And it’s that process that is the magic."

As Steve points out, designing products means fitting thousands of different things together in different ways. So every execution of an idea is an explosion of possibility and thereby originality. How you execute an idea is always original simply because of the number of variables in play. Hence "There are no original ideas. But there are original executions."

Apple Vision Pro: First Experience

LukeW - Sun, 02/04/2024 - 2:00pm

Like many technology nerds out there, I got to explore Apple's new Vision Pro headset and Spatial Computing operating system first hand this weekend. Instead of a product review (there's plenty out there already), here's my initial thoughts on the platform interactions and user interface potential. So basically a UI nerd's take on things.

Apple's vision of Spatial Computing essentially has two modes: an infinite canvas of windows and their corresponding apps which can be placed and interacted with anywhere within your surroundings and an immersive mode that replaces your physical surroundings with a fully digital environment. Basically a more Augmented Reality (AR)-ish mode and a more Virtual Reality (VR)-ish mode.

Whether viewing a panoramic photo, exploring an environment, or watching videos in cinematic mode, the ability to fully enter a virtual space is really well done. Experiences made for this format are the top of my list to try out. It's where things are possible that make use of and alter the entirety of space around you. That said, the apps and content that make use of this capability are few and far between right now.

But I expect lots of experimentation and some truly spatial computing first interactions to emerge. Kind of like the way Angry Birds fully embraced multi-touch on the iPad and created a unique form of gameplay as a result.

While it's definitely being used to render the spatial OS, the deep understanding Apple Vision Pro's camera and sensor system has of your environment feels under-utilized by apps so far. I say this without a deep understanding of the APIs available to developers, but when I see examples of the data Apple Vision Pro has available (video below), it feels like more is possible.

So that's certainly compelling but what about the app environment, infinite windows, and getting things done in the AR-ish side of Spatial OS? With this I worry Apple's ecosystem might be holding them back vs. moving them forward. The popular consensus is that having a deep catalog of apps and developers is a huge advantage for a new platform. And Apple's design team has made it clear that they're leaning into this existing model as a way to transition users toward something new.

But that also means all the bad stuff comes along with the good. At a micro level, I found it very incongruent with the future of computing to face a barrage of pop-up modal windows during device setup and every time I accessed a new app. I know these are consistent patterns on MacOS, iOS, and iPadOS but that's the point: do they belong in a spatial OS? And frankly given their prominence, frequency, and annoyance... do they belong in any OS?

Similarly, retaining the WIMP paradigm (windows, icons, menus, and pointers) might help bridge the gap for people familiar with iPhones and Macs but making this work with the Vision Pro's eye tracking and hand gestures, while technically very impressive, created a bunch of frustration for me. It's easily the best eye and hand tracking I've experienced but I still ended up making a bunch of mistakes with unintended consequences. Yes, I'm going to re-calibrate to see if it fixes things but my broader point stands.

Is Apple now locked in a habit of porting their ecosystem from screen to screen to screen? And, as a result, tethered to too many constraints, requirements, and paradigms about what an app is and how we should interact with it? Were they burned by skeuomorphic design and no longer want to push the user interface in non-conventional ways?

One approach might be to look outside of WIMP and lean more into a model like OCGM (objects, containers, gestures, manipulations) designed for natural user interfaces (NUI). Another is starting simple, from the ground up. As a counter example to Apple Vision Pro, consider Meta's Ray Ban glasses. They are light, simple, and relatively cheap. For input there's an ultra-wide 12 MP camera and a five-microphone array. The only user interface is your voice and a single hardware button.

When combined with vision and language AI models, this simple set of controls offers up a different way of interacting with reality than the Apple Vision Pro. One without an existing app ecosystem, without windows, menus, and icons. But potentially one with a new way of bringing computing capabilities to the real World around us.

Which direction this all goes... we'll see. But it's great to have these two distinct visions for bringing compute capabilities to our eyes.

Common Visual Treatments

LukeW - Tue, 01/30/2024 - 2:00pm

In the context of a software interface, things that work the same should mostly look the same. This isn't to say consistency always wins in UI design but common visual treatments teach people how to get things done in software. So designers need to be intentional when applying them.

People make sense of what they see in an interface by recognizing the similarities and differences between visual elements. Those big white labels in a menu? Because they all look the same, we assume they work the same as well. When one label opens a dropdown menu, another links to a different page, and a third reveals a video on hover, we no longer know what to expect. Consequently our ability to use the software to accomplish things degrades along with our confidence.

Though Amazon's header has lots of options, there's a common visual representation for the elements in it that reinforces how things work. A white label on the dark blue background is a link to a distinct section of the site. If there's a light gray triangle to the right of the label, you'll get a set of choices that appear when you hover over it. And last, but not least, the white label with a menu icon to the left of it reveals a side panel on top of the current page when you click. Here's a simplified image of this:

Each distinct visual representation (white label, white label with arrow to right, white label with icon to left) is consistently matched with a distinct action (link to a section, reveal choices on hover, open side panel). The end result is that once people interact with a distinct visual element, they know what to expect from all the elements that look the same.

If one of the white labels in Amazon's header that lacked a light gray triangle also revealed a menu but did it on click instead of on hover, people's prior experience wouldn't line up with this behavior and they'd have to reset their understanding of how navigation on Amazon works.

While one such instance doesn't seem like a big deal... I mean it's only a little gray triangle... do this enough times in an interface design and people will increasingly get confused and feel like it's harder and harder to get what they need done.

Discomfort is a Strategic Advantage

LukeW - Fri, 01/19/2024 - 2:00pm

When things are going well, it's natural to feel comfortable. The better things go... the more comfortable you get. This isn't just true for humans; it applies to companies as well. But in both cases, being uncomfortable is a strategic advantage.

I often got confused looks from co-workers when they asked me how a project was going: "things are going really well, I don't like it." Why would you not like when things are going well? My mindset has always been if things are good, there's more opportunity for them to get worse. But if things are bad there's a lot of room for them to get better.

In reflecting on this, there's an underlying belief that being uncomfortable is a better state to be in than being comfortable. Discomfort means you're not satisfied with the current situation. You know it can be better and you're motivated to make it so.

"It’s wild, but comfort can be a poison— John Nack

In the context of product design, this adds up to a mindset that design is never done and there are always things to improve. So you spend time understanding what is broken at a deeper level and keep iterating to improve it. Usually this type of process leads you back to core, critical flows. Fixing what really matters.

When you're comfortable, you instead assume the core product is doing fine and begin to fill time by thinking up what else to do, adding new features, or veering away from what actually matters. Discomfort with the status quo drives urgency and relevance.

"To grow new markets means making yourself uncomfortable. It means you can’t keep doing more of what got you here." -What Steve Jobs taught me about growth

Discomfort is also a prerequisite of doing something new. When you're solving a problem in a different way, it won't be immediately understood by others and you'll get a lot more head shakes than nods of agreement. To get through that, you need to be ok with being uncomfortable. The bigger the change, the longer you'll be uncomfortable.

But how do you motivate yourself and your teams to be uncomfortable? I often find myself quoting the words of the late, great Bill Scott. When explaining how he decided what to do next, he always looked for "butterflies in the stomach and a race in the heart". He wanted to be both uncomfortable (butterflies) and excited (race). Because comfort, while nice, isn't really that exciting.

Video: Using Website Content in AI Interfaces

LukeW - Sun, 01/07/2024 - 2:00pm

In this two minute video from my How AI Ate My Website talk, I outline how to automatically answer people's design questions using a Web site's content and embeddings. I also explain why that approach differs from how broader Large Language Models (LLMs) generate answers. It's a quick look at how to make use of AI models to rethink how people can interact with Web sites.

Transcript

When we have all these cleaned up bits of content, how do we get the right ones to assemble a useful answer to someone's question? Well, all those chunks of content get mapped to a multi-dimensional vector space that puts related bits of information together. So things that are mobile-touch-ish end up in one area, and things that are e-commerce-ish end up closer to another area.

This is a pretty big simplification, but it's a useful way of thinking about what's happening. To get into more details... enter the obligatory system diagram.

The docs that we have, videos, audio files, webpages, get cleaned up and mapped to parts of that embedding index. When someone asks a question, we retrieve the most relevant parts, rank them, sometimes a few times, and put them together for an AI language model to summarize in the shape of an answer.

And sometimes we even get multiple answers and rank the best one before showing it to anybody. Feedback is also a really important part of this, and why kind of starting with something that roughly works and iterating is more important than doing it exactly right the first time.
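
As a rough illustration of that retrieve, rank, and summarize flow (a sketch, not the actual Ask Luke code), here is what the step could look like, assuming hypothetical embed() and llm_answer() model wrappers:

```python
import numpy as np

def cosine(a, b):
    """Similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def answer_question(question, chunks, embed, llm_answer, top_k=5):
    """Retrieve the most relevant content chunks for a question, then ask a
    language model to compose an answer grounded in (and citing) those chunks.
    `chunks` is a list of dicts like {"text": ..., "source": ..., "vector": ...};
    `embed` and `llm_answer` are hypothetical model wrappers."""
    q_vec = embed(question)
    # Rank every chunk by similarity to the question and keep the best few.
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, c["vector"]), reverse=True)
    context = ranked[:top_k]
    prompt = (
        "Answer the question using only the excerpts below and cite their sources.\n\n"
        + "\n\n".join(f"[{c['source']}] {c['text']}" for c in context)
        + f"\n\nQuestion: {question}"
    )
    return llm_answer(prompt), [c["source"] for c in context]
```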

So what's the impact of doing all this versus just using something like ChatGPT to ask questions?

Well for starters, you get very different kinds of answers, much more focused and reflecting a particular point of view versus general world knowledge. As you can see in the difference between a ChatGPT answer on the left to, why do designs look the same, versus the answer you get from Ask Luke.

On the Ask Luke side, you also get citations, which allow us to do a bunch of additional things, like object-specific experiences. On Ask Luke, you ask a question, get an answer, with citations to videos, audio files, webpages, PDFs, etc. Each one has a unified, but document-type specific interface.

The More Features You Add...

LukeW - Tue, 12/19/2023 - 2:00pm

As Dave Fore once said: "features are the currency of software development and marketing." Spend time in any software company and you'll begin to echo that sentiment. But there's consequences...

The first of which is feature-creep: loosely defined as “the tendency to add just another little feature until the whole product is overwhelmed with them.” That pretty much sounds like a bad thing, so why does it keep happening?

Multiple studies have shown that before using a product, people judge its quality based on the number of features it has. It's only after using the product that they realize the usability issues too many features create.

So in order to maximize initial sales, companies build products with many features. But to maximize repeat sales, customer satisfaction, and retention, companies need to prioritize ease-of-use over features. Cue the inevitable redesign cycle that software applications go through... design is never done.

The more you own, the more you maintain.

The other key issue with more features is more maintenance. Every feature that goes out the door is a commitment to bug fixes, customer support, and the resources required to keep the feature running and updated. Too often these costs aren't considered enough when features get launched. And an increasing number of features inevitably begin to bog down what a company can do going forward. Companies get stuck in their self-inflicted feature morass negatively impacting their ability to move quickly to address new customer and market needs, which often matters more than a few incremental features.

Like consumer shopping decisions, product team decisions are weighted toward short-term vs. long-term value. Launching new features within software companies typically gets you the accolades, promotions, and clout. Maintaining old features, much less so.

For both consumers and product teams the upfront allure of more features usually wins out, but in both cases, long-term consequences await. So sail the feature-seas mindfully please.

Video: PDFs & Conversational Interfaces

LukeW - Sun, 12/17/2023 - 2:00pm

This two minute video from my How AI Ate My Website talk, highlights the importance of cleaning up the source materials used for conversational interfaces. It illustrates the issues PDF documents can have on large-language model generated answers and how to address them.

Transcript

PDFs are special in another way, as in painfully special. Let's look at what happened to our answers when we added 370 plus PDFs to our embedding index. On the left is an answer to the question, what is design? Pretty good response and sourced from a bunch of web pages.

When PDFs got added to the index, the response to this question changed a lot and not in a way that I liked. But more importantly, only one PDF was cited as a source instead of multiple web pages.

So what happened?

What happened is a great demonstration of the importance of the document processing, aka cleanup step, I emphasized before. This ugly spreadsheet shows the ugly truth of PDFs. They have a ton of layout markup to achieve their good looks.

But when breaking them down, you can easily end up with a bunch of bad content chunks like the ones here. After scoring all our content embeddings, we were able to get rid of a bunch that were effectively junk and clogging up our answers.

Removing those now gives a much better balance of PDFs, videos, podcasts, and web pages, all of which get cited in the answer to what is design. More importantly, however, the answer itself actually got better.
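
The transcript doesn't spell out how the chunks were scored, but a simple way to picture the cleanup is a filter that drops chunks dominated by PDF layout debris before they reach the embedding index. A hedged sketch with made-up thresholds:

```python
import re

def looks_like_junk(chunk_text, min_words=20, max_symbol_ratio=0.3):
    """Rough heuristic for chunks that are mostly layout markup:
    too short, or dominated by non-alphabetic characters."""
    words = chunk_text.split()
    if len(words) < min_words:
        return True
    symbols = len(re.findall(r"[^A-Za-z\s]", chunk_text))
    return symbols / max(len(chunk_text), 1) > max_symbol_ratio

def clean_chunks(chunks):
    """Keep only chunks worth embedding so junk doesn't clog up answers."""
    return [c for c in chunks if not looks_like_junk(c["text"])]
```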

Video: Suggested Questions in Conversational UI

LukeW - Wed, 12/13/2023 - 2:00pm

If you've ever designed a conversational interface, you've probably found that people often don't know what they could or should ask. In this 2 minute video from my How AI Ate My Website talk, I discuss the importance of suggested questions in the Ask Luke conversational UI on this site and walk through some of the design iterations we tried before landing on our current solution.

Transcript

So now we have an expandable conversational interface that collapses and extends to make finding relevant answers much easier. But there's something missing in this screenshot... and that's suggested questions.

For the purpose of this presentation, I simplified the UI a bit in the past few examples. But on the real site, each answer also includes a series of suggested questions. The first few of these are related to the question you just asked, and additional ones come from the rest of the corpus of content.

Suggested questions are pretty critical because they address the issue of, what should I ask? And it turns out, lots of people have that problem, because a very large percent of all the questions asked kick off with one of these suggestions.

We knew from the start these were important, but it took a bit to get to the design solution you see here. At first, we experimented with an explicit action to trigger suggested questions.

Need an idea for what to ask? Just hit the lightbulb icon.

We then iterated to a more clear, what can I ask, link and icon that works the same way. But in both cases, the burden was on the user to ask for suggested questions.

So we began exploring a series of designs that put suggested questions directly after each answer, automatically. With this approach, there was no work required on the part of the user to show suggested questions.

These iterations continued until we got to suggested questions directly in line in our expandable conversational interface.
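
As a minimal sketch of how suggested questions could be assembled, per the description above: a few related to the answer just given plus a few drawn from the rest of the corpus. The llm() wrapper and the one-question-per-line format are assumptions, not the site's actual implementation.

```python
import random

def suggest_questions(last_answer, corpus_questions, llm, n_related=2, n_corpus=2):
    """Build the follow-up prompts shown under an answer.
    `llm` is a hypothetical model wrapper that returns one question per line;
    `corpus_questions` are pre-generated questions from the rest of the content."""
    prompt = (
        f"Suggest {n_related} short follow-up questions a reader might ask "
        f"after reading this answer:\n\n{last_answer}"
    )
    related = [line.strip("- ").strip() for line in llm(prompt).splitlines() if line.strip()]
    sampled = random.sample(corpus_questions, min(n_corpus, len(corpus_questions)))
    return related[:n_related] + sampled
```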

Video: Embedded Experiences in Conversational UI

LukeW - Tue, 12/05/2023 - 2:00pm

In this 2.5min video from my How AI Ate My Website talk, I walk through how a conversational (chat) interface powered by generative AI can cite the materials it uses to answer people's questions through a unified embedded experience for different document types like videos, audio, Web pages, and more.

Transcript

Now, as I mentioned, answers stem from finding the most relevant parts of documents, stitching them together, and citing those replies. You can see one of these citations in this example.

This also serves as an entry point into a deeper, object-specific experience. What does that mean? Well, when you see these cited sources, you can tap into any one of them to access the content. But instead of just linking out to a separate window or page, which is pretty common, we've tried to create a unified way of exploring each one.

Not only do you get an expanded view into the document, but you also get document-specific interactions, and the ability to ask additional questions scoped just to that open document.

Here's how that looks in this case for an article. You can select a citation to get the full experience, which includes a summary, the topics in the article, and again, the ability to ask questions just of that document. In this case, about evolving e-commerce checkout.

There's more document types than just webpages, though. Videos, podcasts, PDFs, images, and more. On Ask Luke, you ask a question, get an answer, with citations to videos, audio files, webpages, PDFs, etc. Each one has a unified, but document-type specific interface.

The video experience, for example, has an inline player, a scrubber with a real-time transcript, the ability to search that transcript, some auto-generated topics, summaries, and the ability to ask questions just of what's in the video.

When you search within the transcript, you can also jump directly to that part of the video in the inline player. Audio works the same way, just an audio player instead of a video screen. Here you can see the diarization and cleanup work at play, which is how we have the conversation broken down by speakers and their names and the timestamp for the transcript.

Webpages have a reader view, just like videos and audio files. We show a summary, key topics, give people the ability to ask questions scoped to that article, and by now you get the pattern.
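
As a small sketch of the kind of data that makes these document-specific experiences possible, here is one way to model transcript segments so a search hit can jump the inline player to the right timestamp. The field names are assumptions for illustration, not the site's actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TranscriptSegment:
    start_seconds: float   # where this segment begins in the video or audio file
    speaker: str           # filled in by the diarization step
    text: str

def find_jump_point(segments: List[TranscriptSegment], query: str) -> Optional[float]:
    """Return the start time of the first segment matching a transcript search,
    so the inline player can seek straight to that part of the video."""
    q = query.lower()
    for seg in segments:
        if q in seg.text.lower():
            return seg.start_seconds
    return None
```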

Video: Structuring Website Content with AI

LukeW - Sun, 12/03/2023 - 2:00pm

To create useful conversational interfaces for specific sets of content like this Website, we can use a variety of AI models to add structure to videos, audio files, and text. In this 2.5 minute video from my How AI Ate My Website talk, I discuss how, and also illustrate that if you can model a behavior, you can probably train a machine to do it at scale.

Transcript

There's more document types than just web pages. Videos, podcasts, PDFs, images, and more. So let's look at some of these object types and see how we can break them down using AI models in a way that can then be reassembled into the Q&A interface we just saw.

For each video file, we first need to turn the audio into written text. For that, we use a speech-to-text AI model. Next, we need to break that transcript down into speakers. For that, we use a diarization model. Finally, a large language model allows us to make a summary, extract keyword topics, and generate a list of questions each video can answer.
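
A rough sketch of that chain of models, with transcribe(), diarize(), and llm() standing in for whichever speech-to-text, diarization, and language models are actually used (the talk doesn't name them):

```python
def process_video(video_path, transcribe, diarize, llm):
    """Break a video down into the structured pieces the Q&A interface needs.
    The three callables are hypothetical wrappers around AI models."""
    transcript = transcribe(video_path)         # speech-to-text: audio -> written text
    speakers = diarize(video_path, transcript)  # split the transcript by speaker
    summary = llm(f"Summarize this talk:\n{transcript}")
    topics = llm(f"List the keyword topics covered in this talk:\n{transcript}")
    questions = llm(f"List questions this talk can answer:\n{transcript}")
    return {
        "transcript": transcript,
        "speakers": speakers,
        "summary": summary,
        "topics": topics,
        "questions": questions,
    }
```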

We also explored models for identifying objects and faces, but don't use them here. But we did put together a custom model for one thing, keyframe selection. There's also a processing step that I'll get to in a bit, but first let's look at this keyframe selection use case.

We needed to pick out good thumbnails for each video to put into the user interface. Rather than manually viewing each video and selecting a specific keyframe for the thumbnail, we grabbed a bunch automatically, then quickly trained a model by providing examples of good results. Show the speaker, eyes open, no stupid grin.

In this case, you can see it nailed the "which Paris girl are you" backdrop, but left a little dumb grin, so not perfect. But this is a quick example of how you can really think about having AI models do a lot of things for you.

If you can model the behavior, you can probably train a machine to do it at scale. In this case, we took an existing model and just fine-tuned it with a smaller number of examples to create a useful thumbnail picker.
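
To make the fine-tuning idea concrete, here is a hedged transfer-learning sketch in PyTorch/torchvision (the talk doesn't say which model or framework was used): freeze a pretrained image classifier and retrain just its final layer on a small set of good/bad thumbnail examples.

```python
import torch
from torch import nn
from torchvision import models

def build_keyframe_scorer(num_classes=2):
    """Start from a pretrained classifier and retrain only the final layer,
    so a few dozen labeled frames ("show the speaker, eyes open, no grin")
    are enough to get a useful thumbnail picker."""
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for p in model.parameters():
        p.requires_grad = False  # keep the pretrained backbone frozen
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

def train_step(model, frames, labels, optimizer):
    """One training step on a small batch of labeled example frames."""
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(frames), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# e.g. optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```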

In addition to video files, we also have a lot of audio, podcasts, interviews, and so on. Lots of similar AI tasks to video files. But here I wanna discuss the processing step on the right.

There's a lot of cleanup work that goes into making sure our AI generated content is reliable enough to be used in citations and key parts of the product experience. We make sure proper nouns align, aka Luke is Luke. We attach metadata that we have about the files, date, type, location, and break it all down into meaningful chunks that can be then used to assemble our responses.
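
A simple sketch of that chunking step: attach the metadata we already know about a file and split the cleaned-up text into pieces small enough to embed and cite on their own. The chunk size and dictionary shape here are assumptions for illustration.

```python
def chunk_document(text, metadata, max_words=200):
    """Split cleaned-up text into meaningful chunks, carrying along file
    metadata (date, type, location) so each chunk can be cited on its own."""
    words = text.split()
    chunks = []
    for i in range(0, len(words), max_words):
        chunks.append({
            "text": " ".join(words[i:i + max_words]),
            **metadata,  # e.g. {"date": "2023-11-30", "type": "video", "location": url}
        })
    return chunks
```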

Video: Expanding Conversational Interfaces

LukeW - Thu, 11/30/2023 - 2:00pm

In this 4 minute video from my How AI Ate My Website talk, I illustrate how focusing on understanding the problem instead of starting with a solution can guide the design of conversational (AI-powered) interfaces. So they don't all have to look like chatbots.

Transcript

But what if instead we could get closer to the way I'd answer your question in real life? That is, I'd go through all the things I've written or said on the topic, pull them together into a coherent reply, and even cite the sources, so you can go deeper, get more context, or just verify what I said.

In this case, part of my response to this question comes from a video of a presentation just like this one, but called Mind the Gap. If you select that presentation, you're taken to the point in the video where this topic comes up. Note the scrubber under the video player.

The summary, transcript, topics, speaker diarization, and more are all AI generated. More on that later, but essentially, this is what happens when a bunch of AI models effectively eat all the pieces of content that make up my site and spit out a very different interaction model.

Now the first question people have about this is how is this put together? But let's first look at what the experience is, and then dig into how it gets put together. When seeing this, some of you may be thinking, I ask a question, you respond with an answer.

Isn't that just a chatbot? Chatbot patterns are very familiar to all of us, because we spend way too much time in our messaging apps. The most common design layout of these apps is a series of alternating messages. I say something, someone replies, and on it goes. If a message is long, space for it grows in the UI, sometimes even taking up a full screen.

Perhaps unsurprisingly, it turns out this design pattern isn't optimal for iterative conversations with sets of documents, like we're dealing with here. In a recent set of usability studies of LLM-based chat experiences, the Nielsen Norman Group found a bunch of issues with this interaction pattern, in particular with people's need to scroll long conversation threads to find and extract relevant information. As they called out, this behavior is a significant point of friction, which we observed with all study participants.

To account for this, and a few additional considerations, we made use of a different interaction model, instead of the chatbot pattern. Through a series of design explorations, we iterated to something that looks a little bit more like this.

In this approach, previous question and answer pairs are collapsed, with a visible question and part of its answer. This enables quick scanning to find relevant content, so no more scrolling massive walls of text. Each question and answer pair can be expanded to see the full response, which as we saw earlier can run long due to the kinds of questions being asked.

Here's how things look on a large screen. The most recent question and answer is expanded by default, but you can quickly scan prior questions, find what you need, and then expand those as well. Net-net, this interaction works a little bit more like a FAQ pattern than a chatbot pattern, which kind of makes sense when you think about it. The Q&A process is pretty similar to a help FAQ. Have a question, get an answer.

It's a nice example of how starting with the problem space, not the solution, is useful. I bring this up because too often designers start the design process with something like a competitive audit, where they look at what other companies are doing and, whether intentionally or not, end up copying it, instead of letting the problem space guide the solution.

In this case, starting with understanding the problem versus looking at solutions got us to a more of a FAQ thing than a chatbot thing. So now we have an expandable conversational interface that collapses and extends to make finding relevant answers much easier.
