Kling just announced VIDEO 3.0 - a significant upgrade from their 2.6 and O1 models.
Key improvements:
*Extended duration:*
• Up to 15 seconds of continuous video (vs previous 5-10 seconds)
• Flexible duration ranging from 3-15 seconds
• Better for complex action sequences and scene development
*Unified multimodal approach:*
• Integrates text-to-video, image-to-video, and reference-to-video
• Handles video modification and transformation in the same model
• Native audio generation (synchronized with video)
*Two variants:*
• VIDEO 3.0 (upgraded from 2.6)
• VIDEO 3.0 Omni (upgraded from O1)
*Enhanced capabilities:*
• Improved subject consistency with reference-based generation
• Better prompt adherence and output stability
• More flexibility in storyboarding and shot control
This positions Kling competitively against:
• Runway Gen-4.5 ($95/month)
• Sora 2 (limited access)
• Veo 3.1 (Google)
• Grok Imagine (just topped rankings)
The 15-second duration is particularly interesting: it allows more narrative storytelling than the typical 5-second clip. Combined with native audio, this could change workflows for content creators.
Pricing isn't mentioned in the announcement. Previous Kling models ranged from $10 to $40/month, significantly cheaper than Runway.
Anyone have access to test this yet? Curious how the quality compares to Runway and Sora at this new duration.
What I'm most curious about is the native audio generation - is it just ambient sound/music, or can it generate synchronized speech? If it's the latter with reasonable lip-sync, that could eliminate a lot of post-production work for explainer videos and short-form content.
Also wondering about the API availability. Having this accessible programmatically would open interesting possibilities for automated content pipelines.
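To make the pipeline idea a bit more concrete, here is a rough sketch of just the planning step: splitting a target runtime into clips that fit the announced 3-15 second window. Nothing here calls Kling (no API has been announced), `plan_clips` is a made-up helper name, and the even-split strategy is my own assumption rather than anything from the announcement.

```python
import math


def plan_clips(total_seconds: int, min_len: int = 3, max_len: int = 15) -> list[int]:
    """Split a target runtime into clip lengths within [min_len, max_len].

    Hypothetical helper (not part of any Kling SDK): uses the fewest clips
    possible and spreads the runtime evenly across them. Whole seconds only.
    """
    if total_seconds < min_len:
        raise ValueError(f"runtime must be at least {min_len} seconds")
    n_clips = math.ceil(total_seconds / max_len)
    base, extra = divmod(total_seconds, n_clips)
    if base < min_len:
        # Only reachable with non-default bounds, e.g. min_len=10, total=16.
        raise ValueError("runtime cannot be split within the given bounds")
    # The first `extra` clips get one additional second each.
    return [base + 1] * extra + [base] * (n_clips - extra)


print(plan_clips(40))  # [14, 13, 13]
print(plan_clips(10))  # [10]
```

Each planned duration would then go to whatever generation endpoint ends up existing. Splitting evenly rather than greedily (15 + 15 + 10) keeps shot lengths similar, which is a stylistic choice, not a requirement.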