OpenRouter launches a video generation API, one interface for calling mainstream models such as Sora 2, Veo 3.1, and Seedance

ME News report. On April 16 (UTC+8), according to monitoring by Dongcha Beating, the AI model aggregation platform OpenRouter officially launched a video generation API. The first batch supports text-to-video and image-to-video, with access to Seedance 2.0/1.5, Veo 3.1, Wan 2.7/2.6, and Sora 2 Pro; further expansion is planned.

Video generation APIs are far more fragmented than text model APIs: request formats, parameter names, and billing units all differ from one provider to another, and even different capabilities within the same model family (text-to-video, image-to-video, reference character generation) often map to different endpoints. OpenRouter's approach is to build a unified schema on top that automatically routes each request to the correct endpoint based on its parameters. If an image is included, the request goes to the image-to-video route; if a reference character is specified, it goes to the character consistency endpoint. Developers don't need to worry about the underlying differences.
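The routing rule described above can be sketched as a small dispatch function. This is only an illustration of the idea, not OpenRouter's implementation: the field names `image` and `reference_character` and the route labels are hypothetical, and the precedence when both fields are present is an assumption, since the article does not say.

```python
from typing import Any, Mapping


def choose_route(request: Mapping[str, Any]) -> str:
    """Pick a route from the fields present in a unified request.

    Field names and route labels are hypothetical placeholders.
    """
    # Assumption: a reference character takes precedence over a plain image.
    if request.get("reference_character"):
        return "character-consistency"
    if request.get("image"):
        return "image-to-video"
    # With neither field present, fall back to text-to-video.
    return "text-to-video"


if __name__ == "__main__":
    print(choose_route({"prompt": "a fox running through snow"}))
    print(choose_route({"prompt": "zoom out slowly", "image": "frame.png"}))
```

The caller builds one request shape and never names an endpoint; the dispatch depends only on which optional fields are filled in.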

Parameter normalization also covers the details that are easy to trip over. For example, Veo 3.1 supports 4-, 6-, and 8-second clips, while Wan 2.6 supports 5 or 10 seconds; passing an unsupported duration returns an error outright. OpenRouter provides a model capability query endpoint, /api/v1/videos/models, which returns the resolutions, durations, aspect ratios, pricing, and model-specific parameters that each model supports. Developers or coding agents can query it once before calling to avoid trial and error.
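A minimal sketch of the "check before calling" idea, using only the two duration sets quoted above. The model keys and the table layout are hypothetical; in practice the values would come from the /api/v1/videos/models endpoint, whose response format the article does not describe, so a hard-coded table stands in for it here.

```python
# Durations in seconds, taken from the figures quoted in the article.
# The dictionary keys are hypothetical labels, not confirmed model IDs.
SUPPORTED_DURATIONS = {
    "veo-3.1": (4, 6, 8),
    "wan-2.6": (5, 10),
}


def check_duration(model: str, seconds: int) -> None:
    """Raise before sending a request if the model cannot produce this length."""
    allowed = SUPPORTED_DURATIONS.get(model)
    if allowed is None:
        raise KeyError(f"no capability data for model {model!r}")
    if seconds not in allowed:
        raise ValueError(
            f"{model} supports durations {allowed}, got {seconds} seconds"
        )


if __name__ == "__main__":
    check_duration("veo-3.1", 6)  # supported, passes silently
    try:
        check_duration("wan-2.6", 8)
    except ValueError as err:
        print(err)
```

Validating locally turns a failed generation request into an immediate, descriptive error on the client side.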

Because generating a video takes minutes, the API is asynchronous: submitting a prompt returns a task ID, and the video can be retrieved once the task completes.
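The submit-then-retrieve pattern usually comes down to a polling loop on the client. The sketch below assumes a caller-supplied `fetch_status` function that looks up a task by ID and returns a dictionary with a `status` field; that function, the field name, and the status values `completed` and `failed` are all hypothetical, since the article does not specify the status endpoint or its response format.

```python
import time
from typing import Any, Callable, Mapping


def wait_for_video(
    task_id: str,
    fetch_status: Callable[[str], Mapping[str, Any]],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, Any]:
    """Poll a task until it finishes, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(task_id)
        state = status.get("status")  # hypothetical field name
        if state == "completed":
            return status
        if state == "failed":
            raise RuntimeError(f"task {task_id} failed: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout}s")
        sleep(interval)


if __name__ == "__main__":
    # Stand-in for a real status lookup: finishes on the third poll.
    replies = iter(["pending", "pending", "completed"])
    result = wait_for_video("task-123", lambda _id: {"status": next(replies)},
                            sleep=lambda _s: None)
    print(result["status"])
```

Passing `fetch_status` and `sleep` in as arguments keeps the loop independent of any particular HTTP client and makes it easy to test without waiting.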

OpenRouter has also open-sourced a multimodal workflow demo application that shows the chained process: an LLM generates detailed prompts, an image model generates characters, and a video model generates scenes.
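The chain described above is three steps, each consuming the previous step's output. The sketch below is not taken from the demo application; the function names are hypothetical, and each model call is passed in as a plain callable so that the structure is visible without assuming any particular client library.

```python
from typing import Callable


def run_pipeline(
    idea: str,
    write_prompt: Callable[[str], str],      # LLM step: idea -> detailed prompt
    make_character: Callable[[str], str],    # image step: prompt -> image reference
    make_scene: Callable[[str, str], str],   # video step: prompt + image -> video reference
) -> str:
    """Chain text, image, and video generation in order."""
    prompt = write_prompt(idea)
    character_image = make_character(prompt)
    return make_scene(prompt, character_image)


if __name__ == "__main__":
    # Stub callables standing in for real model calls.
    video = run_pipeline(
        "a lighthouse keeper at dawn",
        write_prompt=lambda idea: f"Detailed shot list for: {idea}",
        make_character=lambda prompt: "character.png",
        make_scene=lambda prompt, image: f"video built from {image}",
    )
    print(video)
```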

This is also the most direct benefit of bringing video generation into a unified routing system: developers can combine text, image, and video models under the same API without integrating each provider's SDK separately.

(Source: BlockBeats)
