Would you agree if I said that the age of fake social media influencers is coming?
Well, the truth is, it’s already here.
According to a study from Influencer Marketing Hub, 31.7% of brands think that virtual influencers have an advantage over human influencers because they have “more control over messaging.” A further 29.1% said 24/7 availability was the biggest advantage offered by AI influencers.
AI-powered platforms that let you create images of attractive female influencers and turn them into realistic videos are now highly accessible. Some even offer their services for free.
However, from my personal experience, AI-generated videos of people are still hit-and-miss in terms of realism. Most, if not all, video models still struggle with maintaining motion coherence.
Recently, the University of Hong Kong, in collaboration with ByteDance, released the Goku video generation model. This new model aims to produce some of the most realistic, TikTok-esque videos out there, which makes it a natural fit if you’re aiming to create videos of an AI influencer.
Goku is a family of rectified-flow Transformer models for joint image and video generation. It targets industry-grade performance by combining meticulous data curation, careful model design, and a rectified-flow training formulation.
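If you haven’t run into rectified flow before: it replaces the curved trajectories of standard diffusion with straight-line interpolation between noise and data. As a rough sketch of the standard formulation (my summary of the general technique, not reproduced from the Goku paper):

```latex
% Straight-line interpolation between noise x_0 and data x_1:
x_t = t\,x_1 + (1 - t)\,x_0, \qquad t \in [0, 1]

% The model v_\theta is trained to predict the constant velocity x_1 - x_0:
\mathcal{L}(\theta) = \mathbb{E}_{t,\,x_0,\,x_1}
    \left\| v_\theta(x_t, t) - (x_1 - x_0) \right\|^2
```

At sampling time, generation integrates the learned velocity field from noise toward data; because the target trajectories are straight, this tends to need fewer solver steps than curved diffusion paths.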
Goku supports multiple generation tasks, including text-to-image (T2I), text-to-video (T2V), and image-to-video (I2V) generation.
Goku has another variant called Goku+, which lets you directly create virtual digital human videos. With Goku+, you can turn text into extremely realistic human videos that, according to the authors, outperform current methods.
It even produces videos longer than 20 seconds, with steady hand movements and very expressive facial and body actions.
For image-to-video generation (I2V), Goku uses a widely adopted strategy: the first frame of each video clip serves as the reference image, anchoring the appearance of the generated motion while the text prompt drives what happens.
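To make the idea concrete, here’s a minimal sketch of what first-frame conditioning looks like in code, assuming a clip is already decoded into a frame array. Note that `build_i2v_inputs` is a hypothetical helper of mine; Goku’s actual API is not public.

```python
import numpy as np

def build_i2v_inputs(video_frames: np.ndarray, prompt: str) -> dict:
    """Pack inputs for first-frame-conditioned video generation.

    video_frames: array of shape (T, H, W, C) -- a decoded clip.
    The first frame becomes the reference image that anchors the
    subject's appearance; the text prompt describes the motion.
    """
    reference_image = video_frames[0]  # first frame as the reference
    return {"reference_image": reference_image, "prompt": prompt}

# Example: a dummy 16-frame clip at 64x64 RGB
clip = np.zeros((16, 64, 64, 3), dtype=np.uint8)
inputs = build_i2v_inputs(clip, "a woman smiles and waves at the camera")
print(inputs["reference_image"].shape)  # (64, 64, 3)
```

The point is simply that I2V turns a single still image into the conditioning signal, which is exactly what makes it useful for animating one photo of a product or a virtual persona.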
Take a look at these examples:
The reference images are shown in the leftmost column, and keywords are highlighted in red text.
To ensure that Goku produces high-quality videos, the model is trained on a dataset that is visually appealing, contextually relevant, and diverse.
The data curation pipeline consists of five main stages.
This pipeline ensures that the video clips used for training are of high visual quality: aesthetic models score keyframes, and only photorealistic, visually rich clips are retained.
For instance, videos with resolutions around 480 x 864 are discarded if their aesthetic score is below 4.3, while for resolutions exceeding 720 x 1280, the threshold is raised to 4.56.
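Those resolution-dependent thresholds boil down to a simple filter. Here’s a sketch based only on the two data points reported above; how the pipeline treats in-between resolutions is my simplifying assumption, not something the paper specifies.

```python
def passes_aesthetic_filter(width: int, height: int, score: float) -> bool:
    """Keep a clip only if its aesthetic score clears the
    resolution-dependent threshold: ~480x864 clips need >= 4.3,
    while 720x1280 and above need >= 4.56. Treating everything
    below 720x1280 with the lower threshold is an illustrative
    simplification.
    """
    if width * height >= 720 * 1280:
        threshold = 4.56
    else:
        threshold = 4.3
    return score >= threshold

print(passes_aesthetic_filter(480, 864, 4.2))   # False: below 4.3
print(passes_aesthetic_filter(480, 864, 4.4))   # True
print(passes_aesthetic_filter(720, 1280, 4.4))  # False: below 4.56
```

Raising the bar for higher resolutions makes intuitive sense: large clips are more expensive to train on, so only the most visually appealing ones earn their place in the dataset.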
You can learn more about the technical details in the Goku technical report [2].
The examples below showcase the capability of Goku+ to generate hyper-realistic videos of products and AI influencers:
In the example below, Goku+ demonstrates its capability to generate videos ideal for advertising self-care products. The visual style closely mirrors the dynamic, fast-paced aesthetics found on popular platforms like TikTok.
Did they actually scrape millions of TikTok videos and use them as training data? If so, did they even get permission from the uploaders?
In this example, the model does a great job of creating videos where a person interacts naturally with a product. The videos feel like a friendly explainer or a casual demo where everything just flows effortlessly.
Think about using an AI influencer to do live selling for you. How cool would it be to have someone who never gets tired of talking and answering questions?
This feature is probably one of the most practical: turning a static product image into a lively video clip. Instead of setting up a full video shoot, you just take one image and let Goku+ bring it to life with subtle movements and engaging details.
It’s a huge time-saver, especially for online sellers who need dynamic content fast. However, whether Goku+ can maintain this level of consistency with the reference image remains to be seen.
Right now, Goku is still a research paper, and no publicly accessible demo has been released. I highly recommend keeping an eye on their GitHub repository and Hugging Face to stay up to date with future releases.
Goku+ is seriously impressive on paper—those example videos look fantastic. But these are cherry-picked highlights, and we won’t know the real deal until the public demo drops. Once we see it in action across all types of content, we’ll really get a sense of whether it can deliver consistent, high-quality performance.
Another big question on my mind is how the training data was actually gathered. Did they really scrape TikTok videos for this? If that’s the case, it raises some valid concerns about privacy and permissions. And then there’s the matter of ByteDance’s involvement—what's in it for them?
The possibility that these AI influencers could eventually be integrated into TikTok is pretty wild, and it opens up a whole new debate about the future of digital content and influencer marketing.
What’s your take on AI influencers? I’d love to know your thoughts in the comments section.
References:
[1] Saiyan-World, “Goku,” GitHub repository. Available: https://github.com/Saiyan-World/goku.
[2] “Goku: Flow Based Video Generative Foundation Models,” arXiv preprint arXiv:2502.04896. Available: https://arxiv.org/abs/2502.04896.
Software engineer, writer, solopreneur