Virtual influencers like Shudu and Lil Miquela seem poised to play a major role in marketing in the metaverse, and if you work in the industry, then you probably already know how this pitch goes. Absent a physical body, avatars can be anywhere. They are immune to the virus and travel restrictions that stymied many a pandemic photoshoot. They don’t age. They are flawless, or only as artfully flawed as dictated by their creators. They will never embarrass a brand with a public meltdown. And finally, they are cheaper than using human models. So much so, the pitch concludes, that they are coming for human models’ jobs.
Or are they?
Despite what ample media coverage, academics, and industry figureheads would have you believe, “digital” models are a lot less digital than most people think. Thanks to elaborate and inefficient software pipelines, they actually cost more than their human counterparts. They still require physical photoshoots, and they are susceptible to human weaknesses, because beneath the mystical marketing hood of almost every digital model you know today is, ironically, a real human model.
Take the category-defining Lil Miquela. Even savvy industry insiders seem unaware that she’s played by real human actors wearing real physical clothing posing for real photoshoots. During post-production, a multidisciplinary team extensively photoshops the resulting imagery to replace the human model’s head with that of a CGI character made with 3D avatar software like Daz3D. In other words, the primary (and, in many cases, sole) digital component of these “digital” models is a CGI face that’s been superimposed or sometimes deepfaked on an otherwise very real, virus-vulnerable human body, creating what I’ll coin “Phygital Frankensteins.”
Clearly, this process costs more than simply using traditional models, who can breathe a sigh of relief knowing their jobs are safe for now. Real human models are also required to bring these Phygital Frankensteins to life – in, say, a runway show or TikTok video – as part of an elaborate process typically involving one of two approaches: motion capture, or deepfake. Motion capture, or “mo-cap” for short, has historically necessitated expensive suits and studio time to record movement from an actor that’s later retargeted to an avatar (often a cheaper alternative to hand-animation). Human actors can also directly puppet a 3D character to make it gesture and speak or – as seen recently on the reality competition Alter Ego – even sing like they do. Mo-cap production costs have actually fallen recently to the point where, with enough time and dollars (low thousands), individual creators can livestream as avatars (“Vtubing”) or make compelling avatar-based content.
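For readers unfamiliar with how retargeting works, the core idea can be sketched in a few lines: the actor’s captured joint rotations are copied onto the avatar’s skeleton, so the pose transfers even though the limb proportions differ. This is a deliberately minimal illustration; the two-bone planar “arm,” the joint angles, and the function below are all invented for the demo, and production mo-cap pipelines operate on full 3D skeletons with many more joints.

```python
import numpy as np

def forward_kinematics(bone_lengths, joint_angles):
    """Joint positions of a planar 2-bone chain (shoulder -> elbow -> wrist).

    Angles accumulate down the chain, as in a simple articulated skeleton.
    """
    positions = [np.zeros(2)]
    angle = 0.0
    for length, theta in zip(bone_lengths, joint_angles):
        angle += theta
        positions.append(positions[-1] + length * np.array([np.cos(angle), np.sin(angle)]))
    return positions

# The actor performs a pose; the avatar reuses the captured *angles*,
# not the positions, so the pose carries over despite longer limbs.
actor_bones  = [0.30, 0.25]                 # metres: upper arm, forearm
avatar_bones = [0.45, 0.40]                 # a longer-limbed CGI character
captured_angles = [np.pi / 4, -np.pi / 6]   # one frame of "mo-cap" data

actor_pose  = forward_kinematics(actor_bones, captured_angles)
avatar_pose = forward_kinematics(avatar_bones, captured_angles)
```

Because only angles are transferred, both skeletons point their limbs in the same directions; the avatar’s joints simply end up further apart, which is the essence of retargeting captured motion onto a differently proportioned rig.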
While promising news for human models – who in the future may even stand to profit from digitally commodified assets beyond their appearance, like signature walks and poses – this pipeline is not a sustainable means of delivering digital characters. They simply cannot be easily divorced from their human puppeteers, leading to what some investors in VTuber startups call “The Actress Bottleneck.” When the actor behind the character burns out or quits, the fan base also moves on (which, incidentally, has caused significant struggles for VTuber companies). Deepfaking, which uses neural networks to create synthetic media, partially alleviates this bottleneck by enabling the head of any human actor in a still or video to be replaced with a digital one in post-production. Yet – though more automated than using photo editing software – deepfakes require expensive, skilled AI engineering labour, and still depend on the voice and mannerisms of the actor in the original video.
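To make concrete what that post-production head replacement amounts to, here is a deliberately toy compositing sketch (not a real deepfake, which learns the substitution with neural networks): a video frame is just a pixel array, and a rendered CGI face is blended into the detected head region. The function name, the box format, and the stand-in images are all assumptions invented for this demo.

```python
import numpy as np

def composite_face(frame, cgi_face, box, alpha=0.9):
    """Blend a rendered CGI face into `frame` inside `box` = (y, x, h, w).

    `alpha` controls how strongly the CGI render overrides the original
    pixels; real pipelines add colour matching and feathered mask edges.
    """
    y, x, h, w = box
    out = frame.astype(float).copy()
    region = out[y:y + h, x:x + w]
    out[y:y + h, x:x + w] = alpha * cgi_face + (1 - alpha) * region
    return out.astype(frame.dtype)

# One "frame" of footage of the human model, and a flat grey CGI face.
frame = np.zeros((8, 8, 3), dtype=np.uint8)          # black 8x8 frame
cgi_face = np.full((4, 4, 3), 200, dtype=np.uint8)   # stand-in render
swapped = composite_face(frame, cgi_face, box=(2, 2, 4, 4))
```

The hard part of a real deepfake is not this blend but generating a `cgi_face` that matches the actor’s pose, lighting, and expression frame after frame – which is exactly where the expensive AI engineering labour goes.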
If these tactics sound like a losing strategy, that’s because they are. To wit, the company behind Lil Miquela joined Dapper Labs in an all-stock deal to focus on DAO initiatives late last year, indicating that even with over $25 million in venture funding, it was unable to offset high production costs with a sustainable business model. And while research has found younger generations are welcoming of CGI characters, in part because fake avatars like Miquela seem no less authentic than highly filtered real people, I predict 2022 will see an increasing number of orphaned virtual influencer accounts, because the cost-benefit of dressing physical up as digital simply does not compute.
Instead, I believe that the true purview of 3D characters – and the most likely source of a sustainable business model built around them – is not 2D feeds but 3D virtual worlds (i.e., the metaverse), where they can befriend increasingly avatar-enthused, Instagram-weary, younger generations.
Prior to 2022, companies creating photoreal virtual influencers opted for the “fake it ‘til you make it” approach, because the alternative – doing digital for real, without physical reference – was simply prohibitive, both commercially and technologically. The added production costs of applying a digital veneer to a physical model were still cheaper than digitising 3D garments, building 3D sets, and spending months making media with stylised digital avatars. And marketing a TikTok influencer (voiced quietly by a human actor) as synthetic or an “AI robot” is far easier than building an actual AI influencer: an endeavour that requires vast AI expertise and a complex technology stack outside the typical agency wheelhouse.
But all this is about to change, for two reasons. First, truly disruptive software arrived in 2021 in the form of a photoreal avatar creator from Epic Games, with rival engine Unity likely soon to follow suit in light of recent acquisitions. Such software is already being leveraged by large brands, as seen with Dermalogica’s Natalia, a “virtual human designed to train skin professionals.” Second, a shift in market dynamics, and in consumer behaviour around digital ownership and fashion, is paving the way for digital garment monetisation beyond gaming: a projected $50 billion metaverse opportunity that luxury brands are rushing to seize.
Digitising garments with software traditionally used to virtualise production and design, like CLO3D, can now have tremendous upsides in whatever metaverse the future brings. Digital assets marketed on digital models will drive new revenue and enable supply chain optimisations, minimising costs and environmental impact. But so far, early instantiations in recent collections and campaigns appear to be no more than inverted Phygital Frankensteins, featuring digital garments laboriously mapped onto highly posed 2D photos of human influencers: yet another labour- and time-intensive process, and hardly a sustainable one at scale.
While suitable for showcasing digital fashion on the 2D social media feeds dominant in the Web 2 age, such stopgap applications shortchange the real opportunity unlocked by 3D assets that can actually be tried on, worn, and sold in the 3D worlds of the impending Web 3 era. In the metaverse, truly digital models will realise their full technological potential: evolving from mere visual marketing tools into dynamic, interactive, AI-powered spokesmodels.
In the next decade, I believe every brand that enters the metaverse will eventually employ AI-driven avatar ambassadors that can interact natively in 3D worlds. That said, artificial intelligence has a long, unfortunate history of being overhyped. Marketers overpromise, and the technology underdelivers. Digital models or characters with humans under the hood are just another riff on the classic Mechanical Turk sleight-of-hand, wherein human labour is promoted as “AI” or “digital” as the only means to make good on misleading marketing. Amidst metaverse-mania, both the media and regulators need to do a better job of holding startups and purveyors of virtual influencers accountable to truthful, transparent claims about their technology (or lack thereof), lest the hype cycle come crashing down before liftoff.