China launches text-to-video AI model to rival OpenAI's Sora

TH (Synthesis) April 30, 2024 15:20

A Chinese company has just introduced an artificial intelligence model that can create videos from text input similar to OpenAI's Sora tool.

Photo caption: Image from a demo clip created by the Vidu app from a text prompt

A Chinese startup has just introduced a text-to-video artificial intelligence (AI) tool similar to OpenAI's viral app Sora, although the new model can only create videos of up to 16 seconds, compared with up to 60 seconds for the US company's tool.

Vidu, the country's best hope so far in catching up to Sora, was launched over the weekend by startup Shengshu Technology in a partnership with Beijing's prestigious Tsinghua University, according to the South China Morning Post (SCMP).

The company says the model can create 1080p video based on simple text prompts.

The company says Vidu delivers significant improvements in video quality, mainly in its simulation of the real physical world, multi-lens language, high spatial-temporal consistency, and understanding of Chinese elements. According to Zhu Jun, Vidu “has imagination,” “can simulate the physical world,” and “produces a 16-second video with consistent characters, scenes, and timelines.” He added that the model can also understand “Chinese elements.”

During the model's launch, Shengshu released several demo clips, including one featuring a panda playing a guitar while sitting on the grass and another of a puppy swimming in a lake, both of which show vivid details.

OpenAI's ChatGPT inspired a slew of China-based competitors after its November 2022 launch. By contrast, the impressive videos created by Sora and released in February this year have failed to attract the same level of enthusiasm from China's Big Tech companies and startups.

Industry experts say one of the factors holding back Chinese companies' growth in this area is a lack of computing power.

For Sora to generate a one-minute clip, eight Nvidia A100 graphics processing units (GPUs) must run for more than three hours, according to Li Yangwei, a technical consultant working in the field of intelligent computing in Beijing. “Sora requires a lot of computing power for inference,” he said.

The US government has tightened export restrictions on advanced chips made by Nvidia, including the A100 and H100 GPUs. These chips have become the most sought-after components for training AI systems, but they can no longer be shipped to China.

Photo caption: Image from a video produced by the Vidu model from a text prompt

Beijing-based Shengshu was founded in March 2023, with a core team consisting mainly of members from Tsinghua University's AI Institute, as well as members from Alibaba Group Holding, Tencent Holdings and ByteDance. Alibaba Group, owner of the South China Morning Post, is also working on its own video-generating AI models.

Last month, Shengshu raised hundreds of millions of yuan from investors including Qiming Venture Partners, Zhipu AI and Baidu Ventures.

Shengshu introduced China's first text-to-video AI model about two months after Sora, a similar model from US-based OpenAI, made a big splash around the world.

The two superpowers, the US and China, are competing on many fronts in artificial intelligence, from the technology used to design AI hardware and software to the raw materials that power AI systems.

AI was one of the most talked-about keywords of 2023, but seven years ago China had already proposed an ambitious development program that prioritizes AI as the main driving force for industrial upgrading and economic transformation. Under the program, China aims to reach a “world-leading level” by 2025 and to become a “major AI innovation center of the world” by 2030.
