The research you can’t simulate

I ran an experiment. I briefed Claude to act as a UX researcher and conduct an in-depth usability interview, then set up other agents as client personas: different emotional temperaments, varying levels of technical knowledge, and differing attitudes toward the product.

The output looked like solid research: sharp observations, specific suggestions, and well-structured transcripts. But I kept coming back to the same question: how much of it was real? The personas said plausible things and reacted in plausible ways. And that’s exactly the problem: real users don’t always make sense.

FixMix: what I found as a courier

FixMix was a service connecting people who needed things repaired with craftspeople. I took on courier duties on purpose. Not to test the logistics, but to be inside the service flow.

At a cobbler’s workshop, I started unpacking shoes customers had sent for renovation. Many were caked in mud, dirty in ways that had nothing to do with the repair. The cobbler didn’t mention it; he’d stopped noticing years ago. The customers never thought about it either. A friction point invisible from both sides, and nobody would raise it in an interview because nobody registered it as a problem. It only surfaced in our conversation because I complained about it first; only then did the cobbler start complaining too.

Then there was pricing. Every craftsperson said the same thing: they couldn’t give prices before seeing the item, because every job was different. But after watching dozens of jobs come through, most services fell within a narrow range. What they were actually saying was: I don’t want to commit to a price and then discover the job is worse than expected. Risk-aversion, not pricing complexity. The design response was to absorb the risk ourselves rather than trying to change their behaviour.

There was more you pick up just by doing the job. Craftspeople used legacy phones and wanted text messages, not apps or emails. Customers were routinely rude to me as the courier. No interview captures that.

KDB.market: what reactions told me

KDB.market was a children’s clothing marketplace. When I pitched the idea, reactions split cleanly: mothers asked “when do you deploy it?” while people without children asked “is it really an issue?” I wasn’t running a survey, but the split itself was the research finding: it told me instantly where the need lived and where it didn’t.

Then we ran clothes swapping events. Same room, same piles of clothes, two completely different experiences. The children treated it like treasure hunting: digging through stacks, showing things off to each other. Not one of them thought about the clothes as “second-hand.” It was closer to finding a trophy. The parents, meanwhile, were having a glass of wine and conversation. The social value had nothing to do with the clothes.

Same event, two value propositions nobody would have separated in a survey, because you’d never think to ask “what is this event for?” when the answer seems obvious.

What the simulation would have missed

So back to the question from the experiment: how much of it was real?

The AI-generated interviews were useful. They stress-tested the interview guide, surfaced edge cases, and helped me think through persona variation. I’ll keep using them. But would a synthetic persona have told me “we can’t standardise pricing” while the evidence showed you could? Would it have generated dirty shoes as a friction point, or predicted children treating second-hand clothes as trophies?

These findings live in the gap between what people say about their experience and what actually happens when you’re standing there. Formal research asks people to articulate what they think. Guerrilla research catches what they’ve stopped thinking about.


As simulated research gets cheaper, I think this kind of fieldwork becomes more valuable, not less. Being undercover in the service flow catches the things every other method misses: the normalised frictions, the risk dressed up as logic, the reactions nobody would volunteer. It’s research you can’t generate from a prompt, because the findings only exist when you’re physically there.

About the author

Lucas

UX Lead and AI Transformation Consultant
20+ years shaping B2B SaaS and digital products. Focused on AI-powered design, scalable UX, and turning complex business needs into simple, high-impact user experiences.
Find me on LinkedIn
