Kuaishou's Kling AI Model Revolutionizes Video Generation

Kuaishou's Kling AI 3.0 model enhances video generation capabilities, allowing for seamless multimodal creation and improved realism.

The Chinese developer Kuaishou has unveiled the third version of its video generation model, Kling AI.

🚀 Introducing the Kling 3.0 Model: Everyone a Director. It’s Time.

An all-in-one creative engine that enables truly native multimodal creation.

— Superb Consistency: Your characters and elements, always locked in.
— Flexible Video Production: Create 15s clips with precise… pic.twitter.com/CJBILOdMZs
— Kling AI (@Kling_ai) February 4, 2026

“Kling 3.0 operates on a deeply unified training platform, providing truly native multimodal input and output. With seamless audio integration and advanced element consistency control, the model imbues the generated content with a stronger sense of life and coherence,” the announcement states.

The model combines multiple tasks: generating video from text, images, and references; adding or removing content; and modifying and transforming clips.

Maximum video duration has increased to 15 seconds. Other enhancements include more flexible frame management and closer adherence to prompts. Overall realism has improved, with character movements becoming more expressive and dynamic.

The new Multi-Shot feature analyzes prompts to determine scene structure and shot types, then automatically adjusts camera angles and composition.

The model supports a range of editing approaches, from classic dialogue setups to parallel storytelling and scenes with voiceovers.

“No more tedious cutting and editing—one generation is enough to create a cinematic clip and make complex audiovisual forms accessible to all creators,” the announcement adds.

Kling 3.0 is truly "one giant leap for AI video generation"! Check out this amazing mockumentary from Kling AI Creative Partner Simon Meyer! pic.twitter.com/Iyw919s6OJ
— Kling AI (@Kling_ai) February 5, 2026

In addition to standard video generation from images, Kling 3.0 supports multiple images as references and video sources as scene elements.

The model captures the characteristics of characters, objects, and scenes. Regardless of camera movement and plot development, key objects remain stable and consistent throughout the video.

Developers have enhanced native audio: the system synchronizes speech more accurately with facial expressions and allows users to specify individual speakers in dialogue scenes.

The list of supported languages has expanded to include Chinese, English, Japanese, Korean, and Spanish, with improved handling of dialects and accents.

Additionally, the team has upgraded the multimodal model O1 to Video 3.0 Omni.

Users can upload a three-second audio clip of speech to extract a voice, or record a three-to-eight-second video of a character to capture its main characteristics.

Competitors to Sora Are Closing In

OpenAI announced its video generation model Sora in February 2024. The tool generated excitement on social media, but the public release came only in December of that year.

With that release, nearly a year after the announcement, users gained access to video generation from text descriptions, “bringing images to life,” and the enhancement of existing clips.

The Sora iOS app was released in September 2025 and quickly attracted attention, with over 100,000 installs on its first day. The service surpassed one million downloads faster than ChatGPT did, despite being invite-only.

However, the trend soon reversed. By December, downloads had dropped by 32% compared to the previous month. The decline continued in January, when the app was downloaded 1.2 million times.

The decline can be attributed to several factors. First, competition intensified with Google’s Nano Banana model, which strengthened Gemini's position.

Sora also competes with Meta AI and its Vibes feature. In December, market pressure increased from the startup Runway, whose Gen 4.5 model outperformed its counterparts in independent tests.

Second, OpenAI's product faced copyright issues. Users were creating videos featuring popular characters such as SpongeBob and Pikachu, prompting the company to tighten restrictions.

In December, the situation stabilized after a deal with Disney allowed users to generate videos with the studio's characters. However, the deal did not lead to an increase in downloads.

Earlier, in October, deepfakes featuring Sam Altman had flooded Sora.