
Key Takeaways:
- Microsoft’s ‘Magentic Marketplace’ simulation exposed the current limits of full AI autonomy, showing that in complex, multi-agent virtual economies, agents become indecisive and struggle with coordination when guardrails are removed. This key finding, consistent with the company’s published paper, suggests that autonomy still heavily relies on human-defined structure and narrow, consistent context for dependable execution.
- Major enterprise platforms are focusing on managing, not just deploying, agentic AI. Companies like SAP are developing frameworks with custom agent-builders, centralised monitoring, and governance capabilities anchored in core business data. This focus prioritises predictable, rule-bound execution over free-roaming independence, reflecting the need for reliability in operational sectors like fashion production and logistics.
- The rise of autonomous agents is turning into a legal and jurisdictional battle, as demonstrated by Amazon’s lawsuit against Perplexity AI over its “shopping agent” extracting product data. This conflict moves the conversation beyond technical capability to a question of platform control and who mediates the buying process, highlighting the friction between established, closed commercial ecosystems and emerging, conversational AI interfaces.
Agentic AI is often described as the point where artificial intelligence stops assisting and starts acting. The idea that software will soon be capable of carrying out complex, goal-driven tasks, from finding products to completing transactions, without constant human steering is, after all, a persuasive one. So much so that it sometimes feels like almost every major platform taking part in the AI arms race is currently building toward that kind of hand-off.
If that still sounds theoretical, the first signs of it are already beginning to appear. This week Google released an update to its experimental “AI Mode,” which the company says will let U.S. users test early features such as finding concert tickets or booking beauty appointments. While Google admits it’s still early and “may make mistakes,” the feature hints at what everyday delegation could look like. It’s not quite world-changing, but it’s enough to make the idea feel closer than it did six months ago.
Shopify has been saying much the same from the retail side, reporting that purchases attributed to AI-powered search and recommendation tools are up eleven-fold since January 2025, with AI-driven traffic rising seven times. It’s one of the first measurable signs that agentic systems are already influencing how people shop, even if most of us don’t think of them as “agents” just yet.

All of this makes it rather easy to imagine a world where AI agents are doing the majority of our heavy lifting for us when it comes to e-commerce. That’s the future these companies are designing, after all. But we aren’t quite there yet.
Microsoft’s recent experiment makes that gap visible. Working with Arizona State University, the company built a controlled, AI-to-AI simulation called the Magentic Marketplace, a virtual economy in which autonomous agents acted as both buyers and sellers, trading and negotiating entirely with one another. The design stripped out the human variable to see what machines do when left to organise themselves.
The results, published this week, were mixed. As the number of choices increased, agents became indecisive and inconsistent. When cooperation was required, they struggled to assign roles or maintain context. Some could even be steered toward poor decisions by misleading signals. These aren’t catastrophic failures, but they highlight a persistent gap between what models can do in isolation and how they perform in dynamic environments.
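The choice-overload finding is easy to illustrate with a toy model (to be clear, this is not Microsoft’s simulation; the agent, the noise model and the offer values below are invented for illustration): an agent that evaluates offers imperfectly picks the genuinely best one less often as the menu grows, because the gap between the best option and its nearest rivals shrinks faster than the agent’s judgement improves.

```python
import random

def choose_offer(offers, noise=0.3, seed=None):
    """A toy buyer agent: scores each offer with noisy perceived utility
    and picks the apparent best. The noise models imperfect evaluation."""
    rng = random.Random(seed)
    perceived = [(value + rng.gauss(0, noise), i) for i, value in enumerate(offers)]
    return max(perceived)[1]  # index of the offer the agent believes is best

def success_rate(n_offers, trials=2000, noise=0.3):
    """Fraction of trials in which the agent picks the genuinely best offer."""
    rng = random.Random(0)
    hits = 0
    for t in range(trials):
        offers = [rng.random() for _ in range(n_offers)]  # true offer values
        best = offers.index(max(offers))
        if choose_offer(offers, noise=noise, seed=t) == best:
            hits += 1
    return hits / trials

# With more options, the top candidates cluster together and noisy
# evaluation separates them less reliably, so accuracy falls.
few = success_rate(3)
many = success_rate(30)
```

Running the sketch shows the effect directly: the agent’s hit rate with three offers comfortably exceeds its hit rate with thirty, even though the underlying “model” is unchanged.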

In other words, Microsoft didn’t prove that agentic AI is unworkable. It showed that autonomy still depends on structure. The agents performed best when guardrails were clear and the problem space was narrow, findings consistent with Microsoft’s published paper on the project. For now, “agentic” systems appear to be less free-roaming collaborators and more rule-bound apprentices.
The agentic narrative has been propelled forward on a mix of promise and projection: the notion that soon, agents will operate across marketplaces, platforms and domains without friction. Microsoft’s marketplace demonstrates that we might still be some distance from that level of composure. The experiment showed that performance in static, single-prompt tests doesn’t translate to complex, multi-agent settings. The issue wasn’t intelligence but coordination. Autonomy, in practice, depends on consistent context and reliable data, all of which still require human design.
That’s where enterprise companies like SAP now seem to be focusing some of their attention. If Microsoft exposed the limits of free-roaming autonomy, SAP is developing the framework that could make agentic systems manageable. At its TechEd conference in Berlin this week, SAP, whose enterprise systems underpin finance, logistics and production for much of global retail, announced enhancements to its agentic-AI platform, including custom agent-builders, centralised monitoring and governance capabilities anchored in business data. Based on its messaging, the emphasis for now seems to be on dependable execution rather than full autonomy.
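SAP describes this governance layer only in broad strokes, but a minimal sketch shows what “rule-bound execution” can mean in practice (every name here, the `AgentPolicy` class, the agent and action labels, is hypothetical and not SAP’s actual API): each action an agent attempts is checked against an allow-list, and every attempt, permitted or not, is recorded for central monitoring.

```python
from dataclasses import dataclass, field

@dataclass
class AgentPolicy:
    """Toy governance layer: agents may only invoke pre-approved actions,
    and every attempt is appended to an audit log for monitoring."""
    allowed_actions: set
    audit_log: list = field(default_factory=list)

    def execute(self, agent_id, action, handler):
        permitted = action in self.allowed_actions
        self.audit_log.append((agent_id, action, permitted))
        if not permitted:
            raise PermissionError(f"{agent_id} is not permitted to '{action}'")
        return handler()

# An agent may check stock levels and draft orders, but nothing else.
policy = AgentPolicy(allowed_actions={"check_stock", "draft_order"})
result = policy.execute("procurement-agent", "check_stock", lambda: "12 units")

# An unapproved action is blocked before it runs, but still logged.
blocked = False
try:
    policy.execute("procurement-agent", "issue_payment", lambda: None)
except PermissionError:
    blocked = True
```

The design choice is the point: the agent’s “freedom” is whatever the allow-list says it is, and the audit trail, not the agent, is what makes the system trustworthy to operate.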

It’s a practical approach, and one that echoes how other operational industries already work. Fashion’s production systems, for instance, rely on shared data, approvals, and traceability to keep processes steady as automation expands. In both cases, progress depends less on bold ideas than on systems that behave predictably.
That might sound conservative, but it’s also where most real progress happens. Autonomy, at this stage, seems to work best when someone’s still watching, and before agents can act freely, someone has to decide what freedom actually includes.
A version of that same negotiation is already happening in the market. Earlier this week Amazon filed a lawsuit against Perplexity AI over its new “shopping agent,” alleging that the company violated platform rules by extracting product and pricing data without permission. Perplexity’s defence, that it is merely automating what users already do, points to something broader. The real question is who controls automation once it begins to mediate the buying process.

Amazon’s position is easy to understand. The company has spent decades building a closed commercial ecosystem. A third-party agent that redirects searches or purchases threatens to weaken that control, even if the intention is benign. For Perplexity, the incentive runs the other way: to turn the act of shopping into a conversation, a single interface where search and purchase merge. Between them sits the problem no experiment can fully reproduce: platforms designed for human navigation being challenged by software that doesn’t recognise their boundaries.
The disagreement may settle quietly, but it reveals how the conversation around autonomy is changing. What began as a question of technical capability is becoming a question of jurisdiction. Who sets the limits when intelligence starts to act on behalf of others, and whose business model depends on keeping those limits in place?
Seen together, these developments don’t describe an era of full autonomy so much as the start of a long adjustment. The technology is advancing fast, but the systems around it are still deciding what to tolerate, from retail platforms to production networks. For now, progress feels procedural, machines learning where the limits lie, not yet how to move beyond them. And somewhere inside that slow calibration is the real story. Not how smart the agents become, but how much of the world they’re eventually allowed to touch.
