Welcome back to The Interline Podcast. I’m going to start today’s show with a question. When’s the last time you had a memorable online shopping experience? Not a smooth transactional one, not one with an unexpectedly quick delivery, not one with a wider selection than you were expecting. One that actually felt like it started in one place and ended in another, where you maybe understood more about what you wanted, and felt like you’d had something approaching genuine personalisation?
In fact, I have two questions. There’s that one, and then there’s this one. Have you tried any of the existing AI assistants that are embedded into e-commerce and marketplace front ends? Or have you tried using ChatGPT’s Atlas browser, Perplexity’s Comet, or any of the equivalents to go and find and buy stuff on your behalf? In my case, I have tried all those things, and I’ve come away feeling like they’re all pretty half-baked.
Or that they represent a weird interzone between the established shopping paradigm, which is the endless aisle you try and navigate for yourself, and the future we’re being sold by big tech and AI labs where the process is automated, seamless, and centralised. As it happens, there are companies that are working to navigate this weird middle ground. And while you might think that they themselves are just intermediaries between you and general-purpose AI, the reality is that there’s quite a lot of thought and some deep technical and behavioural work going into figuring out where e-commerce and retail personalisation go from here.
So today I’m bringing on someone who’s been hands-on with that work at a pretty significant scale, Maria Belousova, who is the CTO at Daydream, which is positioned as a chat-based shopping agent that listens and understands.
Maria was previously Chief Technology Officer at Grubhub. And she’s worked for Microsoft in the past. So she’s someone who knows both the enterprise and the consumer side of things. And she’s someone who has some pretty deep and nuanced thoughts on what the fusion of AI and shopping looks like from here. So keep those two questions in your head while I chat to Maria and see if you land on a more concrete vision for what’s coming next by the end.
NB. The transcript below has been lightly edited.
Okay, Maria Belousova, welcome to the Interline Podcast.
Thank you for having me. Excited for the conversation.
Yeah, likewise. I’m very much looking forward to this one. Let’s start with the essentials. At Daydream, you’re building for what you call the way people “actually shop”. What does that mean? And how do you think existing modes of retail have fallen out of lockstep with what people really want?
Yeah, so if you think about, especially in fashion and, you know, in many other verticals where taste and personal preferences are really important, why do we buy things, right? When you wake up in the morning and you realise, hey, I have this event next week that I need to go to, you don’t instantly think about, I need this dress and it’s going to be exactly like this, right? You’re actually thinking, who is going to be attending the event? How do I want to look? How do I want to feel? What is the outcome of this?
And so usually, if you actually chart the journey of the shopper, she would usually go to maybe Instagram, maybe Pinterest, get inspired, and then go to a few retailers and look at what they have. And then finally, when she settles on some idea, she’ll maybe take a copy of the product name or image and go to Google, look at different versions of the product, and then finally buy it.
The journey is generally sort of multi-step and it takes a while for somebody to really understand what they’re buying. And yet if you look at traditional e-commerce, we constrain the shopper to keywords and attributes. When they come to a retailer or a shopping destination, they’re usually expected to already have a category and attributes in mind. And so at Daydream, we think about the shopper and this notion of shopper vocabulary: how do they describe what they’re shopping for? And then we help them through this journey to arrive at the perfect product.
Okay, and I know we’re going to get into shopper vocabulary a bit more as we go. I think what you’re describing there is just a very multi-step, multi-channel approach. And I guess fragmentation is probably the best word I could use to describe the way that online retail currently operates.
Yeah, I agree.
So we’re in a weird spot, I think, for search, discovery, and content in general. I’ve got a publisher’s perspective on this. We know from The Interline’s traffic analysis that the traditional modes of search are in a fairly steep decline. I think it’s about -16% year over year for us in terms of organic search traffic, although luckily traffic overall is still up. But the promise of the AI age of the web is that AI can actually provide better-qualified, higher-intent traffic, because by the time somebody arrives with you, whether you are a publisher or a retailer, they are already several steps deep into a query, into a conversation. Are you seeing that kind of fall-off in organic search traffic mirrored in product discovery the same way it is in content? And that’s more from your brand partners’ perspective in the wider industry rather than from yours.
Yeah. And I think that this is a very nuanced question and a nuanced situation, right? Because the journey is fragmented. And I think the question of where traffic is declining depends on intent. Because ultimately, if you’re looking for a product, you’re not going to be satisfied with an agent answer, right? At the end of the day, you still want to find that product. And so I think that the journey is changing and so the discovery process is changing. But definitely, you know, we’re hearing from our brand retailers that organic search traffic is declining. And there’s also quite a bit of interest and desire from shoppers to shop in a conversational, multimodal way. So we’re seeing many different shifts in patterns.
Cool, that’s good to know. So, functionally speaking, how does Daydream work on the back end? As a shopper, when I interact with the outward-facing chatbot, where is it pulling its product information from and how is it matching my intent to the actual attributes of the product? It might be functional attributes, aesthetic, however I’m judging it. What does it look like to onboard a partner at the collection and SKU level? Do you need to ingest somebody’s whole inventory as a project? Is there some kind of MCP or scraping going on to keep the catalogues up to date? Just tell me how that all comes together in the back end.
So starting from the bottom, we’re working with all of our retailers. They usually give us a feed. Sometimes they whitelist us and allow us to crawl their website. But we spend quite a bit of time ingesting their catalogue into our database and indexing it. Furthermore, we also spend a lot of time understanding each product and the attributes that might not be present in the feed, so that we can better represent the products.
On top of that, we have a pretty sophisticated multimodal search engine, which is capable of searching using lexical searches, hybrid searches, vector searches, and image search. And then on top of that, we have this shopper intent system that is conversational in nature. And essentially, our system of models, ensemble of models, is able to understand the shopper intent in a wide range of cases. Sometimes the shopper is talking about the occasion she’s going to. Sometimes she’s talking about a body type or trend. So think of it as a search engine that is specifically dedicated to fashion. And so the catalogue has all of the products in it from our partners.
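For readers who want a concrete picture of the retrieval layer Maria describes, here is a minimal sketch of hybrid search, blending a lexical score with a vector similarity score. This is purely illustrative and not Daydream’s implementation: the `Product` structure, the toy embeddings, and the `alpha` blending weight are all invented for the example.

```python
import math
from dataclasses import dataclass

@dataclass
class Product:
    name: str
    text: str        # title/description used for lexical matching
    embedding: list  # precomputed vector (toy values here)

def lexical_score(query: str, doc_text: str) -> float:
    """Fraction of query tokens that appear in the document text."""
    q, d = set(query.lower().split()), set(doc_text.lower().split())
    return len(q & d) / max(len(q), 1)

def cosine(a: list, b: list) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query: str, query_vec: list, catalogue: list, alpha: float = 0.5) -> list:
    """Rank product names by a weighted blend of lexical and vector scores."""
    scored = [
        (alpha * lexical_score(query, p.text)
         + (1 - alpha) * cosine(query_vec, p.embedding), p.name)
        for p in catalogue
    ]
    return [name for _, name in sorted(scored, reverse=True)]
```

A production system would use a proper inverted index and an embedding model for the vectors, but the blending idea is the same.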
And from a model perspective – I mean AI models there rather than human models – are you hinging on frontier general-purpose LLMs for everything? Or do you see value in using smaller models for more discrete purposes, search for instance, but still presenting a unified front end to the shopper? I’m always curious, when it comes to consumer-facing applications, which parts of AI are actually manifesting themselves behind the scenes.
So this has actually been a very interesting learning journey for us and for the industry as a whole. I think that everybody was very enamoured with the power of LLMs a couple of years ago, and it was just unreal how well large language models could understand intent in many cases. And yet actually building a product that works, that is grounded in real-world facts and real-world catalogue products is much more complicated.
And so we started with a large language model, a single one, and then eventually evolved. Our current state of the system is actually an ensemble of experts. So a model would be trained to understand, say, the level of formality that is appropriate for a given type of event, or the season and what types of fabrics would be appropriate for a given season. Or colour, for a dress, for example: our shoppers often say less colourful or more colourful, darker colours, etc. And so being able to actually interpret that nuance and understand how to retrieve the right products takes a lot of care.
So usually what we do is we take shoppers’ requests or an entire conversation and then process it through a system of models. And each of the small models opines on what exactly is the situation here, the intent. And then we aggregate the consensus into a comprehensive understanding.
So behind the scenes, we have a number of different versions of models. Some of them are frontier models like OpenAI and Gemini. Some of them are fine-tuned open-source models. And some of them are in-house models that are just deeply focused on specific areas. We also, at Daydream, have a lot of fashion experts who help us create training data for these models. Thinking about occasion-appropriate wear and how to train, say, a model to recognise which dresses would be appropriate for a specific occasion, trends, body types. So we essentially have a lot of our in-house data for training.
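The ensemble-of-experts flow Maria outlines – several small models each opining on one facet of intent, with the opinions merged into a consensus – can be sketched as follows. The rule-based “experts” below stand in for trained models and are invented purely for illustration.

```python
def formality_expert(conversation: str) -> dict:
    """Toy rule: certain occasions imply a level of formality."""
    text = conversation.lower()
    if any(w in text for w in ("wedding", "gala", "black tie")):
        return {"formality": "formal"}
    if "office" in text or "work" in text:
        return {"formality": "business"}
    return {}

def season_expert(conversation: str) -> dict:
    """Toy rule: season hints at appropriate fabrics."""
    text = conversation.lower()
    if any(w in text for w in ("summer", "beach", "vacation")):
        return {"season": "summer", "fabric": "linen"}
    if "winter" in text:
        return {"season": "winter", "fabric": "wool"}
    return {}

def colour_expert(conversation: str) -> dict:
    """Toy rule: relative colour language maps to a palette."""
    text = conversation.lower()
    if "darker" in text or "less colourful" in text:
        return {"palette": "dark"}
    if "more colourful" in text:
        return {"palette": "bright"}
    return {}

EXPERTS = [formality_expert, season_expert, colour_expert]

def understand_intent(conversation: str) -> dict:
    """Run each expert and merge their opinions into one intent dict."""
    intent = {}
    for expert in EXPERTS:
        intent.update(expert(conversation))
    return intent
```

In a real system each expert would be a fine-tuned or in-house model, and the aggregation step would resolve conflicting opinions rather than simply merging dictionaries.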
That’s a really good answer because I think there’s an assumption sometimes amongst the general readership, the general population that applying AI to these kinds of problems is a kind of one-stop general-purpose situation rather than something that has a lot more in the way of moving pieces behind the scenes as you’ve just described.
Yeah, yeah, I agree.
You’ve talked a couple of times now about shopper vocabulary, and I think that’s an important piece of what you’ve just described. It’s also an important piece of the interactivity side of this: bridging people’s intent with objective product attributes. Because if the key to personalisation, to refining people’s intent and getting to the sort of stylist-level recommendations you’re aiming for, is being specific and granular, then you have to have that specificity and granularity either in the existing data set, through attribute tagging of catalogues and so on, or in a model that can distinguish those things through native visual modality and then bridge that understanding with what the consumer is asking for.
You’ve hinted at this a little bit, but maybe go a little deeper on what it looks like to train a model in, call it, the lexicon of fashion: the internal labelling, the internal dialogue, and the shopper vocabulary. Because on one side you have objective considerations, and on the other you’ve got a huge amount of subjectivity as well, because the way people ask for something could be very different from the way merchants and product teams describe it.
Yes. So I think that a good way to start talking about this is to frame the problem. So I’m going to give you a few examples of the queries that you will find in our search logs when people are searching for products. So sometimes a shopper will say, I’m looking for an outfit for dinner with the in-laws. Or she says, I started a new job and I’m looking for an outfit for the office and I’m an MBA student, right? Or she will say, I’m a rectangle and I’m looking for a dress that will make me look more like an hourglass body, right? And last week I was talking at a conference and one of the queries I demoed was I’m looking for a revenge dress for a wedding in Paris in the style of Saltburn.
And so if you look at this query, there are five concepts there: there’s revenge, there’s wedding, there’s Paris, there’s Saltburn. And then there’s obviously dress, but dress could represent a million options. And so being able to understand each of the concepts here is what we call vocabulary. Users typically reference a social setting. They reference a venue, a place, sometimes a season – an intent for how they want to look. And being able to recognise these entities lets us actually convert this to what we call a merchant vocabulary.
So merchants typically talk in categories like dresses, tops, boots, and attributes like ankle boots or suede boots, right? And so we do the work on both sides. Starting with the catalogue, we have this process called enrichment, where we essentially take each product and extract some of the latent attributes that may not be actually specified in text or in structured form through a feed. And we index this knowledge so that later, at runtime, we could actually retrieve the right products. And so on the merchant side, we have a very comprehensive vocabulary of categories and attributes.
And then on the shopper side, we have what we call a knowledge graph. So like a fashion knowledge graph that converts shopper vocabulary, how they reference seasons, venues, places, intent, social setting actually into merchant attributes.
That’s fascinating. And I think as well, the query you referenced is a good reminder of the fact that not all of this is even fashion vocabulary. When you’re bringing in the Saltburn example, it’s cross-media knowledge, it’s culture knowledge.
Exactly. Exactly. And that’s why it’s such a complex and fascinating space.
I agree. And it’s great to hear you getting behind the scenes on some of this. So let’s think about the point of interaction, and about the shopper you’ve just described. When she’s sitting down and interacting with Daydream, what kind of modality are you finding people gravitate towards the most? In my head, I’m thinking that most sessions start with text interaction. But I understand you’ve also leaned into allowing users to iterate visually, adjusting specific parameters like colours, prints, necklines, etc. to arrive at something that fits their vision. What does the typical journey look like from the modality point of view? And do you see it evolving to be more visual over time, more text-based over time, or trending towards an amalgamation of all the different modalities?
I think that it starts with, you know, why is a shopper shopping, right? Sometimes if they have a specific place they’re going to or a vacation they’re taking, they may start describing the situation. But as they see more products, at some point, it’s easier for them to say, well, like this one, only different, right? And so if we go back to our shopper who is shopping for a wedding guest dress, we may start with: here’s some dresses that you may wear to such an event. And she may refine her searches a few times in conversation. She may say more chic, more bodycon, but then over time she’ll see a dress that she kind of likes, but maybe it’s the wrong colour or maybe it’s too long. Maybe it doesn’t have the right sleeve length. And it’s actually really interesting. One of our very popular capabilities on our platform is to take a dress or an item and describe what you would like to change about it.
And what happens behind the scenes is that we use visual intelligence to understand: what is she looking at? What are the attributes of this product? And if she says, OK, like this only sleeveless, right, we essentially say, a garment like this with these attributes, and we do a search using the visual understanding of the product. And over time, what you see is that many sessions have multiple iterations of the conversation – some conversational refinements and some visual refinements – and people love that ability to switch back and forth. However, often people just start with an image: here’s an outfit I liked on Instagram, help me find this. So it really depends on where you start your shopping journey. Do you start with visual inspiration, or do you start with a problem?
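The “like this, only sleeveless” refinement loop reduces to: take the attributes of the product the shopper is looking at, override the one she wants changed, and search again. Here is a toy version, with invented attribute names and a deliberately simple match-counting ranker:

```python
def refine(product_attrs: dict, change: dict) -> dict:
    """'Like this, only <change>': copy the attributes and override some."""
    refined = dict(product_attrs)
    refined.update(change)
    return refined

def attribute_search(target: dict, catalogue: list) -> list:
    """Rank products by how many target attributes they match."""
    def matches(product: dict) -> int:
        return sum(1 for k, v in target.items() if product.get(k) == v)
    return sorted(catalogue, key=matches, reverse=True)
```

In practice the attributes would come from visual understanding of the product image rather than a hand-written dictionary, but the copy-override-search shape of the loop is the same.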
And how far does it extend the other way? Do you handle payments on your side or does that go to the brand or retailer’s checkout? Because presumably that’s something you think could maybe have an agentic component to it in the future either way.
Our goal is to drive our shoppers to our partners, so the checkout happens on our partner’s site. But agentic checkout becomes easier with every month that goes by, and it’s definitely something we’d look at later on.
And that brings me to another question, which is, so far at least, Daydream is web-based. But as we talked about at the beginning, the web’s changing pretty rapidly. And it seems like everybody wants to either try and wall themselves off from the open web or to pursue distribution in an alternative or complementary way through an app or a device. We’re a week or so post-launch of ChatGPT Atlas, the browser, which I have to admit feels like a very weird product to me as a user, but I understand the rationale behind it, which is a play to own as much of the web journey as possible, including shopping. And you have Perplexity Comet, Dia, a bunch of those other things that are making the same bet.
If we’re looking at a near future where those options are open to us, and where we’re basically surrounded by text boxes that are eager to serve up product recommendations (I mean, I can think of five or six that are a click away on my desktop right now), how do you think about what makes Daydream unique in that scenario, and how best to package it up and distribute it so that it becomes people’s preferred method of interaction when it comes to shopping and discovery?
So we start by thinking about what the shopper and the retailer want, right? What is the ideal experience for them? From the shopper perspective, she would like a stylist who can advise her and help her select the right product. And she wants to know that, as she’s doing that, she sees the full range of products available in a given category or a given style, and is able to select the best one. And obviously, we want to be able to showcase the products the merchants have in the best possible way, with the best imagery and the best content.
And so, starting from what Daydream is: it’s a stylist. It’s this AI being that is helping shoppers make decisions, navigate through options, and find just the right product. If you start with that, we could be on the web – and we’re currently working on a native experience for iOS. I think it’s not just that AI is changing how people engage with content; the distribution is also changing. And so it’s just really interesting to see.
Absolutely. And we think about things from a content and publishing point of view the same way. We think about: where are people asking questions? Where is knowledge and authority beneficial to distribute? And that’s where we want to show up from that perspective. So I understand that angle. To bring in some wider context, I think a lot of people were surprised by the success of the Sora app, and specifically by the social component of it and people’s willingness to create cameos of themselves. Now, candidly, I don’t know if that feels like a short-lived experiment or if that’s actually a sea change, but it does seem that maybe some of the cultural resistance around AI sharing is breaking down.
Are you thinking about getting into the sharing space when it comes to people promoting or collaborating on collections and looks they’ve put together or even visualising themselves wearing them?
I think in general, any time you give creators a new way to create new content, they’re going to rise to the challenge. We’ve seen this time and again with YouTube, with TikTok. So it’s not surprising to me that Sora is popular with creators and with social. And shopping is inherently a very social activity. Our shoppers are definitely very interested in being able to share the products they like and the collections they build, and also in being able to follow people whose styles they respect and admire. So we’re definitely thinking about it, and definitely experimenting with social interactions. As far as AI content goes, I think it might be interesting to see what an item would look like on you. But with things like virtual try-on, until you understand how an item actually fits and what it looks like exactly, it may not be that helpful to just put an item on you, right?
Yeah, the virtual try-on is the most kind of back and forth space that we’ve observed over the last five or six years at The Interline. It’s an incredibly high bar to clear. People have incredibly exacting expectations for both objective fit and subjective fit. And to date, I just don’t think it’s a solved problem. And I think it’s fair to keep that on the agenda on that basis.
Agreed.
Now, when I think about everything we’ve just talked about – where does this show up, how do you interact with it – I feel like, from a consumer point of view, memory is the biggest thing in AI in general. I think it’s potentially the next platform lock-in. In my own personal time, I’ve spent a bit of time playing around with graph memory and local models. And if you look at the visualised results you can get after even a few everyday conversations about your calendar, your kid’s school, your preferences, your drive to work, and so on, it’s remarkable to see how much knowledge you can build up in memory about a person in quite a short set of interactions.
Now, for the big AI companies, they’re competing for that memory as a very wide funnel. They’re competing to know as much as they can about people’s lives in very broad brush strokes so that it becomes preferential for them to stay in that ecosystem. For something like Daydream, you’ve got the opportunity to define memory in a much more focused way. And you’ve also got the opportunity, I think, to be more transparent with shoppers about what data points you’re collecting and how you’re using them.
How are you approaching that blend of capturing memory about people from interactions versus acting on the explicit signals that they’re giving you when they start a new conversation?
So, first, I agree with you that memory is, I think, the new step-function improvement in personalisation on the web in general. Being able to explicitly collect information that matters to you – things you like to do, types of places you like to go, how you like to shop – is very important.
To your point, collecting data in the context of fashion is much more specific and useful, right? Because you can interpret something like, I work at a bank – well, that means you probably have a certain dress code, right? Or, I go to these types of events, and therefore I like to dress a certain way. Collecting that information, and then also using it for shopping and for personalising your experience, is very, very relevant and helpful.
We believe that it’s important to be transparent with our shoppers about what we’re collecting and how we’re storing it, and then also giving them the ability to come and change it. So let’s say somebody is shopping for a dress and she told us she’s looking for this type of price range, but maybe this was just a one-off situation where she was willing to pay for a much higher price. But over time, it’s actually not her price range and she wants to bring it down. So we have this style passport at Daydream where we store information about preferred colours, preferred categories, preferred prices. Favourite brands is a very important one. And this is something that we find is a very strong indicator of preferences and style and aesthetic. So users often select their favourite brands at onboarding time. And then over time, they continue adding more of their favourite brands to their passport.
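The style passport Maria describes – explicit, editable preferences that shape retrieval – can be sketched as a small preference store. The fields, the brand boost, and the filtering logic here are all invented for illustration; they are not Daydream’s actual data model.

```python
class StylePassport:
    """Explicit, shopper-editable preferences (an illustrative sketch)."""

    def __init__(self):
        self.preferences = {"brands": set(), "colours": set(), "price_max": None}

    def add_brand(self, brand: str) -> None:
        """Favourite brands accumulate over time, starting at onboarding."""
        self.preferences["brands"].add(brand)

    def set_price_max(self, price: float) -> None:
        # The shopper can revise this at any time,
        # e.g. after a one-off splurge that shouldn't define her range.
        self.preferences["price_max"] = price

    def filter(self, products: list) -> list:
        """Keep products within the stated price range; boost favourite brands."""
        cap = self.preferences["price_max"]
        kept = [p for p in products if cap is None or p["price"] <= cap]
        return sorted(kept,
                      key=lambda p: p["brand"] in self.preferences["brands"],
                      reverse=True)
```

The key design point is that the shopper can inspect and change every stored value, which is exactly what distinguishes this from opaque behavioural tracking.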
And I think what you’re describing there is the fundamental architectural difference between something memory-based like this and traditional product recommendations and signals because it’s the perennial: buy a hammer on Amazon and suddenly get notifications telling you, we thought you might like a new hammer. But actually, no, no, I had a specific problem to solve with this hammer. And it’s solved now. I don’t need that to define our interactions going forward.
Yes.
Flipping the script a bit now. So that was the consumer point of view. From an enterprise point of view, it feels like maybe the next step could be access to some of the valuable data that comes from the kind of conversational interactions shoppers are having. Understanding how a shopper wound up on my storefront rather than my competitor’s. Why did they pick a product from one pricing tier instead of something higher up the ladder? Why did a shopper who had a conversation with Daydream, where my brand or a specific product I make came up, not click through to my e-commerce website? Why did they end up somewhere else? Is this sort of insight something you provide now, or something you might think about doing in the future?
I mean, definitely when we’re talking to our partners, we’re sharing insights, we’re learning over time in general. And all of our customer data is owned by our partners. But when you think about it, information about favourite styles, favourite brands, you need to be able to react to this information at runtime, right? And so it’s not just knowing that the shopper likes these silhouettes. It’s actually being able to then present these styles to her in search. And so I think that we’re looking at, how can we help our partners not only know this, but also react to this and build a much better search and discovery experience for them as well.
Great. Now, I think it’s often said that taste is kind of the last frontier for AI. And what I mean by that is the sense that you can take tasks that have clearly defined objectives and pass-fail states and, it’s easy – maybe not easy to build, but it’s easy to envision a world where eventually all of that falls to automation because it’s a binary yes-no, it’s a binary good-bad. But I think anywhere else where success is determined by distinguishing something subjectively good from something subjectively bad or something subjectively suitable from something subjectively unsuitable, there’s maybe always going to need to be a human holding the reins.
Now, when you talk about delivering AI that’s grounded in human taste, what does that mean? And do you think the taste and curation component is going to scale up or down over time? Are we careering towards more human kind of final gatekeeping or are we risking maybe devolving the business of tastemaking to AI?
So this is a very interesting question. I find it fascinating. I actually see those as two questions: will AI eventually make taste-based decisions for us? And then, what is taste, and why haven’t we yet done e-commerce for taste as well as we’ve done it for books or electronics, right? And so starting with that one: when you think about it, what is e-commerce? How did it evolve? We started with document search and document retrieval, right? Traditional search technologies were initially built for websites and documents. And then we sort of started doing e-commerce for books and electronics.
And so these days, if you do a mental exercise in a room of, say, a hundred people: how many of them will have the same phone? Probably more than half, right? How many of them are likely to use the same staples, like toothpaste? A lot of them, right? And yet everybody in the room is going to be wearing something different, and most likely has very different preferences when it comes to books, when it comes to music. So I’d argue we haven’t really done taste-based discovery in e-commerce yet. And now is the time. And so it goes back to the shopper vocabulary and being able to interpret it. That’s the key, and a traditional search box that is intended for keywords simply doesn’t work in the world of taste.
And then to your second question, will AI make taste-based decisions for us? You know, would you ever ask AI to watch a movie for you? Probably not, right? Watching a movie is fun. For me, shopping for clothes is fun, so I would never delegate something like that to AI. It’s like cake – you wouldn’t ask someone else to eat it for you, right? I’m going to eat it myself. So I think what’s interesting with these agentic workflows is that boring tasks are easy to delegate, because you don’t want to do them yourself. Fun tasks – shopping, entertainment, socialising with friends – we’ll continue doing those online ourselves.
I think that’s a really good way of framing it. And I think it’s also a reminder of the fact that the initial promise of e-commerce, you know the endless aisle, is not fun and hasn’t been fun for quite some time. And I think from that point of view, it’s overdue for disruption.
Okay, finally, to bring us to a close. What’s next for Daydream? Something we haven’t already covered.
Well, we touched on some of these topics throughout our conversation. We’re focused on deeply understanding intent. And right now, we’re focused specifically on occasions, because occasions are such a multi-dimensional, multifaceted concept: holiday occasions, weddings, vacations. And so we’re going very deep on these conversational, multimodal interactions with our shoppers.
But we’re also looking at different platforms. We’re working on a native app, as I mentioned. We’re looking at how our brands would like to be serviced in AI platforms. So, broad and deep.
Okay, that’s a good perspective, I like it. I do have one very final question actually. So The Interline did our AI Report in the spring of this year, and we’ll be doing an AI Report again in the spring of 2026. If I could put you on the spot and say, what’s a topic we should be covering in that that maybe isn’t being covered in detail or isn’t being covered the right way elsewhere, what would you pick?
I think that a very important topic is: how do we interact with content in AI? So you mentioned that you played with Atlas. I did too. And I ran into many sort of dead-end situations where I couldn’t recover as I was browsing. And so I think that when your scenarios and experiences go beyond above-the-fold content on a website, when it’s not just a simple interaction, how should this work? With these multimodal interactions with AI, right now the industry puts a chatbot in the corner and it’s just like, OK, we’re done, this is AI. But I think that over time, AI will become much more ambient – present, but you’re not necessarily talking to it.
Yeah.
And I think that user experience and interaction is one topic where a lot of interesting research and experiments should be done.
Okay, well, we’re going to fold that into our coverage plan for next spring then. Maria, thank you so much for joining me. I really enjoyed this conversation. We’ll hope to bring you back at some point in the future and see how things have progressed at Daydream and see how things are moving on with the whole user interactivity paradigm for AI as well.
It was a pleasure. Thank you, Ben. I really enjoyed our conversation.
Thanks for coming. Brilliant. Thank you.
Okay, that’s my chat with Maria done. This was one of my favourites from the start of this new season, partly because, well, I always like talking to CTOs, but partly because there’s just a tremendous amount of change taking place around us right now when it comes to where AI lives, where the web lives, and where finding and buying products and having product-adjacent experiences actually happens. I hope you got something out of listening, too.
I’m going to be tackling some related and some completely unrelated topics over the next few weeks, so I hope you’ll come back for those as well. For now, though, thanks for listening, and I’ll talk to you again soon.
