[Featured image highlighting MetaHuman 5.6, Unreal Engine.]

Key Takeaways:

  • Google’s Veo 3 and Epic’s new MetaHuman tools could reshape media production, offering a new level of control over creative campaigns and character generation and signalling a shift towards more accessible AI capabilities for creative industries.
  • As these systems become easier to use, creative teams, including those in fashion and beauty, may start to treat digital humans as versatile assets for casting and campaign development, prompting a re-evaluation of traditional production approaches and encouraging hybrid content creation models.
  • The growing realism of AI-generated content, underlined by Google’s decision to watermark Veo 3 videos, raises important questions about transparency in creative industries. As synthetic media becomes more prevalent, decisions about disclosure will increasingly be practical, operational ones made inside production workflows.

Over the past few weeks, some of the biggest companies in tech have given us a clearer picture of what the next era of media production could look like, if their tools are adopted, and if they keep evolving in the direction they appear to be going. 

Google stood up at I/O and unveiled Veo 3, a text-to-video tool capable of producing cinematic shots with the kind of framing, motion, and tone that usually take years of creative instinct to develop. Epic, meanwhile, used UnrealFest to showcase advances in real-time animation, digital humans, and AI-driven character logic.

These were loud, confident announcements, not just about what’s possible, but about what’s ready to use. Whether they make a real impact will depend on how (and how quickly) creative industries decide to use them.

We’ll start with UnrealFest, where Epic laid out what could become a blueprint for scalable synthetic performance.

MetaHuman 5.6, Unreal.

Fortnite creators will soon be able to build their own AI-powered NPCs. Characters can speak, react, hold conversations, and deliver pre-set emotional tones. You decide what they say, how they sound, and what kind of personality they project, and you do all of this highly customisable work inside the Unreal Editor, with no custom code required. We’re not talking about background extras; these are playable actors. The fact they demoed it with Darth Vader, who speaks in a convincingly synthetic James Earl Jones baritone, was a flex, sure, but also a nudge. If you can build Vader, one of the most instantly recognisable characters in all of media, who else becomes castable?

And that wasn’t their only expansion. Epic also pushed its MetaHuman framework further, making it easier to bring hyper-realistic digital humans into different environments and platforms. One of the most impressive updates was to MetaHuman Animator, which now lets you animate a digital character in real time using nothing more than a standard webcam. No mocap suits or studio setup required: just a face, a feed, and a remarkably responsive digital performance. Epic demoed it live on stage, and it worked exactly as promised. What once required an entire team of specialists now runs on a laptop.

That’s a huge shift, not just in capability, but in who gets to use it. These powerful systems are no longer locked away behind studio walls; they’re starting to look and feel like everyday tools: available, usable, and maybe even a little familiar.

If Epic’s vision plays out, these digital humans could become everyday creative assets, ready to drop into environments, platforms, or even fashion campaigns. That’s not guaranteed, of course, but the technical groundwork is there. And if it gets picked up, it could start to change how teams approach casting, direction, and visual identity.

You could imagine a campaign being built around one of them without much ceremony. Not in the distant future, but right here and now. That doesn’t mean real models are going away; it just means the decisions get more interesting. Why go through the time and cost of a full location shoot if a convincing, controllable stand-in can be built to order? That’s not to say we’re rooting for the end of talent, quite the opposite. We’re merely highlighting the reality: the equation is changing, and creative leads are bound to at least consider the options now sitting in front of them.

Maybe that’s the real takeaway. Not the realism or the scale or even the novelty, but the way this all gets slotted into infrastructure. Characters as toolkits, with performance becoming something editable, almost modular. The strange part isn’t that it works. What’s strange is how quickly it’s moved from concept to toolbelt. When things like this start landing in the hands of people who aren’t engineers or VFX leads, the conversation doesn’t feel like speculation anymore; it just becomes production. Someone in fashion or beauty, maybe not a big house, maybe someone smaller and faster, is likely going to start treating personas the way we treat presets: pick one, tweak it, build around it.
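To make the “persona as preset” idea concrete, here is a minimal, purely hypothetical sketch in Python of what such a preset might look like if a team modelled characters as editable, modular assets. None of these names correspond to a real Epic, Unreal, or MetaHuman API; they simply illustrate the pick-one, tweak-it, build-around-it pattern.

```python
from dataclasses import dataclass, field

# Purely hypothetical sketch: a "persona as preset" a creative team could pick,
# tweak, and build a campaign around. All fields and names are invented for
# illustration and do not map to any real Epic/Unreal/MetaHuman API.

@dataclass
class PersonaPreset:
    name: str
    voice_profile: str                          # e.g. a licensed or synthetic voice ID
    emotional_tones: list[str] = field(default_factory=list)  # pre-set tones it can deliver
    wardrobe_ref: str = ""                      # styling reference for a given campaign
    disclosure_tag: str = "ai-generated"        # transparency label carried into credits

    def with_wardrobe(self, ref: str) -> "PersonaPreset":
        """Return a copy of the preset restyled for a new campaign."""
        return PersonaPreset(self.name, self.voice_profile,
                             list(self.emotional_tones), ref, self.disclosure_tag)

# Pick one, tweak it, build around it.
base = PersonaPreset("House ambassador 01", "synthetic-baritone-v2",
                     ["calm", "authoritative", "wry"])
ss_campaign = base.with_wardrobe("ss26-look-03")
```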

The same current runs through Google’s Veo 3. It can generate short, cinematic video clips from text prompts, complete with sound, motion, and framing. At its worst, it still veers into the uncanny, like early-gen AI content, but at its best, it’s clean, polished, and good enough to pass in a mid-tier ad campaign. That much wasn’t surprising; these models improve with every release. What’s more interesting, and infinitely more complicated, is how “real” each new version is starting to look. That’s likely why Google quietly added a watermark to generated videos a week after launch, a visible on-screen tag that marks the clip as AI-made.
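For a sense of what prompting a clip actually involves, here is a minimal sketch of how a team might request one programmatically. It assumes access through Google’s google-genai Python SDK and follows Google’s published pattern for earlier Veo models; the model identifier, config fields, and polling loop are assumptions and may not match the shipping Veo 3 API exactly.

```python
import time
from google import genai
from google.genai import types

# Sketch only: assumes Veo is reachable through the google-genai SDK.
# The model name below is an assumption and may differ for Veo 3.
client = genai.Client(api_key="YOUR_API_KEY")

operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",   # assumed identifier
    prompt=(
        "Slow dolly-in on a model in a tailored ivory coat, golden-hour light, "
        "shallow depth of field, 35mm film look"
    ),
    config=types.GenerateVideosConfig(aspect_ratio="16:9"),
)

# Video generation runs as a long-running job, so poll until it finishes.
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

# Download the first generated clip.
clip = operation.response.generated_videos[0]
client.files.download(file=clip.video)
clip.video.save("campaign_test_shot.mp4")
```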

The watermark decision didn’t come out of nowhere. Someone inside Google likely understood what a lot of people outside it were beginning to feel: that if we’re going to watch, share, and remix AI-generated content, we should probably know that’s what it is. Not because the content itself is always dangerous (though sure, some bad actors will do what they do), but because we’re in danger of losing vital context. When everything can look like anything, people want even a tiny signal to help orient themselves.

Fashion and beauty already work in a space where constructed images and stylised storytelling are the norm, so it’s not hard to imagine Veo 3 becoming part of the process. But if that happens, the expectations around its use will matter. Do campaigns need to disclose? Should digital models be tagged? Is the illusion part of the sale, or is it something that needs to be managed more carefully? 

When everything can look like anything, people want even a tiny signal to help orient themselves.

If these sound like ethical questions, the reality is they’re no longer only that. What we’re looking at is a series of practical operations questions, because once synthetic video and image generation become default tools, and it’s hard to argue that isn’t happening, decisions about transparency no longer live in the ethics department. They’ll live in production schedules and approval workflows.
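As an illustration of what living in approval workflows could mean in practice, here is a small, hypothetical Python sketch: every asset entering sign-off carries provenance metadata, and anything synthetic or hybrid is held back until a disclosure tag is attached. The names and fields are invented for illustration, not drawn from any real production system.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch: disclosure as an operational gate in an approval
# workflow rather than an abstract ethics debate. All names are invented.

@dataclass
class CampaignAsset:
    asset_id: str
    source: str                            # "shot", "ai-generated", or "hybrid"
    disclosure_tag: Optional[str] = None   # e.g. "AI-generated imagery"

def ready_for_approval(asset: CampaignAsset) -> bool:
    """Synthetic or hybrid assets need a disclosure tag before sign-off."""
    if asset.source in ("ai-generated", "hybrid"):
        return asset.disclosure_tag is not None
    return True

assets = [
    CampaignAsset("hero-shot-01", "shot"),
    CampaignAsset("lookbook-spread-04", "ai-generated"),  # held back until tagged
]
approved = [a for a in assets if ready_for_approval(a)]
```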

We’re already seeing early models of how creative industries might adapt. Echo Hunter, an AI-generated sci-fi short film that has been making a splash online, offers a glimpse of what hybrid authorship can look like. The film was built using generative visuals, but the creators worked with SAG-AFTRA actors to voice the characters and license their likenesses. The final product isn’t fully synthetic, but it’s not traditional either. It’s a hybrid. The credits list humans. The performances have intent. The images may have come from prompts, but the shape of the story was created by people.

Echo Hunter.

There’s something important there, and not just for filmmakers. The same kind of collaborative model could work well in any creative industry, not least fashion and beauty. Use AI to build the bones of a lookbook or concept shoot, and use humans to guide the tone, lend voice, and, most importantly, shape the identity. Not everything needs to be real, but it does need to be honest about what went into it. Audiences can live with fiction; what we’re all collectively tired of is being misled.

We made a similar point earlier this year when we wrote about AI photoshoots and what they mean for the professionals who usually sit just off camera: the stylists, casting directors, lighting technicians, and all the other quiet operators who bring coherence to fashion image-making. That piece argued that displacement isn’t inevitable; what’s needed is a redefinition of how creative work is credited and used when it intersects with generative systems, and that argument still holds. Echo Hunter just gives it a cleaner visual reference.

All these visual tools, and the work they make possible, are not happening somewhere in the future; they’re already functioning, and in some cases surprisingly well. But whether they spark a new chapter in visual production will depend less on the tech itself and more on how it’s adopted. The next leap isn’t just technical. It’s cultural. The real question now is who picks these tools up, how they’re used, and how far they’re allowed to go.