Vidu is another AI video generator that challenges Sora, Kling, and Gen-3 Alpha.
Three months after its initial preview, Vidu by Shengshu Technology is now live and accessible to the public. This new AI video generator aims to compete with OpenAI’s popular but unreleased Sora.
In case you missed it, another Chinese AI video tool, Kling, became publicly accessible a few days ago. You can sign up with an email address and get free credits upon every login. Watch my review of Kling here.
Vidu is an AI-powered tool that can generate videos from text descriptions or existing images. Announced on April 27, 2024, Vidu is designed to generate high-definition, 4-second videos in less than 30 seconds. It can render videos in both anime and realistic styles.
The Vidu AI model is built on a proprietary architecture called the Universal Vision Transformer (U-ViT), which combines two approaches used in text-to-video models: diffusion and the Transformer. This architecture enables the creation of high-quality videos with dynamic camera movements, intricate facial expressions, and authentic lighting and shadow effects.
This is what the website’s dashboard looks like:
On sign-up, users get 80 free credits per month. The output quality is good, albeit at a slightly lower resolution on the free version. Each generated video is limited to 4 seconds (the paid version allows 8 seconds).
Head over to the Vidu website and sign up through email. On the top navigation bar, click on the “Create Video” button.
Here’s an example:
Prompt: A Chinese man sitting at a table, eating noodles with chopsticks
Below the output video file, you can choose to upscale or reuse the prompt by clicking on the ‘ConfigCopy’ button. Here’s the final result:
This video is a 4-second, 688 × 384 file. Because of the small size, the generation took less than a minute. Note that other AI video tools that generate 1080p resolution files take at least 2–3 minutes per video. Each generation costs 4 credits.
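To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses figures quoted in this article, and it assumes the 80 free monthly credits are the same unit as the 4 credits charged per generation. The function and constant names are made up for illustration, and upscaling costs are not counted because they are not stated here.

```python
# Figures quoted in the article. Names are illustrative, not part of any Vidu API.
FREE_CREDITS_PER_MONTH = 80   # free allowance on sign-up
CREDITS_PER_GENERATION = 4    # cost of one generation
SECONDS_PER_FREE_CLIP = 4     # clip length on the free tier


def free_tier_budget(credits=FREE_CREDITS_PER_MONTH,
                     cost=CREDITS_PER_GENERATION,
                     clip_seconds=SECONDS_PER_FREE_CLIP):
    """Return (number of clips, total seconds of video) the credits can buy."""
    clips = credits // cost
    return clips, clips * clip_seconds


def pixel_count(width, height):
    """Number of pixels in a single frame."""
    return width * height


clips, seconds = free_tier_budget()
vidu_pixels = pixel_count(688, 384)        # size of the free-tier output above
full_hd_pixels = pixel_count(1920, 1080)   # a 1080p frame

print(f"Free tier: {clips} clips, {seconds} seconds of video per month")
print(f"A 1080p frame has about {full_hd_pixels / vidu_pixels:.2f}x more pixels")
```

Under those assumptions, the free tier works out to 20 clips, or 80 seconds of footage per month, and a 1080p frame carries roughly 7.85 times as many pixels as the 688 × 384 output. Frame size is only one of several things that affect generation time, so treat the comparison as a rough guide.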
The settings page is quite simple. You can change the video style between general and animation. Note that the video style applies to text-to-video only, and the 8-second duration option is exclusive to paying customers.
Let’s try this prompt in animated style:
Prompt: In a softly lit bathroom, a teddy bear styled like an American animated character is taking a bath. The bear, partially submerged in a bubble-filled bathtub, holds a phone to its ear with one paw while scrubbing itself with the other. The ambient lighting is gentle and refreshing, casting a warm and inviting glow over the scene. The bathroom tiles are a soothing pastel color, complementing the cozy and whimsical atmosphere. The teddy bear’s expressive face shows concentration as it multitasks, combining the mundane act of bathing with the casual activity of a phone conversation.
Oh wow. I was very impressed with the quality of the output video. It looks like it came out of an animated film from Studio Ghibli. However, you may notice that the AI model struggles with coherence. In the prompt, the bear is supposed to be holding a phone to its ear with one paw while scrubbing itself with the other, but the generated clip does not fully follow that description.
Now let’s see how the image-to-video feature performs. After you upload the image, specify whether you want it to be used as the first frame or character reference of the video.
Here’s the reference image from Midjourney:
Prompt: Triumphant Marathon Runner Approaching the Finish Line, Eiffel Tower in Festive Atmosphere
This looks super cool. I was surprised to see Vidu add more subjects to the scene, along with legible text on the runner’s bib.
One area where most of the AI video generators struggle is text rendering. Let’s see how Vidu handles this prompt:
Prompt: A wall with a graffiti that says “Vidu is cool”
The text is not accurate, but the letters are legible. Judging by these results, Vidu seems to be better than Kling at generating text in videos.
Here’s a summary of the subscription plans:
Users can also opt for annual subscriptions and get a 20% discount.
And before I end this piece, Kling just launched a subscription plan starting at $5 and up to $46 per month.
The pro tier gives you the following benefits:
For free users like me, the daily login bonus of 66 credits is still in place. Visit klingai.com for more information about the subscription plans.
Overall, Vidu is a great addition to the short list of publicly available AI video generators. In terms of quality, it is ahead of Runway Gen-3 Alpha but a bit behind OpenAI’s Sora. I appreciate that free users get free monthly credits, although it would be better if they were provided daily.
Also, text rendering and adherence to the prompt are still some of the hardest problems to solve in AI video. While Vidu still struggles with both, there’s already a big difference compared to how things were a few years ago.
I am glad that video generation is finally catching up with text and image generation in 2024. In the coming months, we could see more AI video generators released with improved quality and cheaper subscriptions.