Key Capabilities
| Feature | Description |
|---|---|
| Text-to-video | Generate video from a text description |
| Image-to-video (first frame) | Use one image as the first frame |
| Image-to-video (first and last frame) | Use two images as the first and last frames respectively |
| Multimodal reference | Combine images, video, and audio as references (1–9 images, up to 3 videos, up to 3 audio clips) |
| Video editing | Modify elements in an existing video using a reference image |
| Video extension | Extend and concatenate reference videos |
| Audio generation | Automatically generate synchronized voice, sound effects, and background music |
| Web search enhancement | Enhance generation with real-time internet content (text-to-video only) |
| Return last frame | Retrieve the last frame of the generated video |
Output Specifications
| Property | Value |
|---|---|
| Resolution | 720p, 1080p, 2k |
| Aspect ratio | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive |
| Duration | 4–15 seconds |
| Format | mp4 |
Workflow
Examples
Text-to-Video
Multimodal Reference (Image + Video + Audio)
In the prompt you can reference content array items using @image1, @video1, @audio1, numbered by media type starting from 1.
Important: Pass assets in strict order: text, image_url, video_url, audio_url. Reordering them may cause errors. When you pass several assets of the same type, keep them together and do not mix other asset types in between.
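The ordering rule above can be enforced before a request is sent. The following is a minimal sketch, not part of the API: `order_content` is a hypothetical helper, the example.com URLs are placeholders, and placing `role` next to `type` on each item is inferred from the `content[].role` parameter name.

```python
# Hypothetical helper: sorts content items into the required type order.
TYPE_ORDER = ["text", "image_url", "video_url", "audio_url"]


def order_content(items):
    """Return items ordered as text, image_url, video_url, audio_url.

    sorted() is stable, so items of the same type keep their relative
    order and the @image1/@image2 numbering is unchanged.
    """
    rank = {name: index for index, name in enumerate(TYPE_ORDER)}
    for item in items:
        if item.get("type") not in rank:
            raise ValueError(f"unknown content type: {item.get('type')!r}")
    return sorted(items, key=lambda item: rank[item["type"]])


# Placeholder URLs; the items are deliberately listed out of order.
content = order_content([
    {"type": "audio_url", "audio_url": {"url": "https://example.com/music.mp3"},
     "role": "reference_audio"},
    {"type": "image_url", "image_url": {"url": "https://example.com/hero.png"},
     "role": "reference_image"},
    {"type": "text",
     "text": "The character from @image1 walks through the scene in @video1, "
             "with @audio1 as background music"},
    {"type": "video_url", "video_url": {"url": "https://example.com/scene.mp4"},
     "role": "reference_video"},
])
print([item["type"] for item in content])
```

Because the sort is stable, it also groups items of the same type together without changing which image is @image1.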
Video Editing
Web Search Enhancement (Text-to-Video Only)
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | seedance-2.0-ultra |
| content | array | Yes | Input content array (text, image_url, video_url, audio_url) |
| content[].type | string | Yes | text, image_url, video_url, or audio_url |
| content[].text | string | For text type | Text prompt (recommended ≤500 Chinese chars or ≤1000 English words) |
| content[].image_url.url | string | For image type | Image URL, Base64, or asset://<ID> |
| content[].video_url.url | string | For video type | Video URL or asset://<ID> (mp4/mov, ≤50 MB, 2–15 seconds) |
| content[].audio_url.url | string | For audio type | Audio URL, Base64, or asset://<ID> (wav/mp3, ≤15 MB) |
| content[].role | string | Conditionally | first_frame, last_frame, reference_image, reference_video, reference_audio |
| generate_audio | boolean | No | Generate synchronized audio, default true |
| resolution | string | Yes | 720p, 1080p, or 2k. Required for Ultra. |
| ratio | string | No | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive, default adaptive |
| duration | integer | No | 4–15 seconds, default 5 |
| tools | array | No | [{"type": "web_search"}] enables web search (text-to-video only) |
| watermark | boolean | No | Whether to add a watermark, default false |
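As an illustration of how these parameters fit together, the sketch below assembles a request body and checks it against the ranges and defaults in the table. `build_request` is a hypothetical helper name. The endpoint URL and authentication are not given in this section, so the sketch only builds the JSON body and does not send it.

```python
import json

MODEL = "seedance-2.0-ultra"
RESOLUTIONS = {"720p", "1080p", "2k"}
RATIOS = {"16:9", "4:3", "1:1", "3:4", "9:16", "21:9", "adaptive"}


def build_request(content, resolution, ratio="adaptive", duration=5,
                  generate_audio=True, watermark=False, web_search=False):
    """Build a request body dict; defaults mirror the parameter table."""
    if not content:
        raise ValueError("content must contain at least one item")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")
    if ratio not in RATIOS:
        raise ValueError(f"ratio must be one of {sorted(RATIOS)}")
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise ValueError("duration must be an integer")
    if not 4 <= duration <= 15:
        raise ValueError("duration must be between 4 and 15 seconds")
    body = {
        "model": MODEL,
        "content": content,
        "resolution": resolution,
        "ratio": ratio,
        "duration": duration,
        "generate_audio": generate_audio,
        "watermark": watermark,
    }
    if web_search:
        # Web search is documented as text-to-video only.
        if any(item.get("type") != "text" for item in content):
            raise ValueError("web search requires text-only content")
        body["tools"] = [{"type": "web_search"}]
    return body


body = build_request(
    [{"type": "text", "text": "A lighthouse at dusk, waves breaking below"}],
    resolution="1080p",
    web_search=True,
)
print(json.dumps(body, indent=2))
```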
Input Modes
| Mode | Content items | role values |
|---|---|---|
| Text-to-video | 1× text | — |
| Image-to-video (first frame) | text (optional) + 1× image_url | first_frame or omitted |
| Image-to-video (first and last frame) | text (optional) + 2× image_url | first_frame + last_frame |
| Multimodal reference | text (optional) + images/videos/audio | reference_image, reference_video, reference_audio |
| Video editing | text + image_url + video_url | reference_image + reference_video |
| Video extension | text + video_url(s) | reference_video |
Note: First frame, first-and-last-frame, and multimodal reference are three mutually exclusive modes that cannot be combined.
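The mutual exclusivity rule can be checked on the client side. The following sketch is one possible reading of the table above, with assumptions: `detect_mode` is a hypothetical helper, the label "reference" covers multimodal reference, video editing, and video extension (all three use `reference_*` roles), and the asset limits are taken from the Key Capabilities table.

```python
from collections import Counter

FRAME_ROLES = {"first_frame", "last_frame"}
REFERENCE_ROLES = {"reference_image", "reference_video", "reference_audio"}


def detect_mode(content):
    """Classify a content array, raising ValueError for invalid mixes."""
    counts = Counter(item["type"] for item in content)
    roles = {item["role"] for item in content if "role" in item}
    if roles & FRAME_ROLES and roles & REFERENCE_ROLES:
        raise ValueError("frame roles and reference roles cannot be combined")
    media = counts["image_url"] + counts["video_url"] + counts["audio_url"]
    if media == 0:
        if counts["text"] != 1:
            raise ValueError("text-to-video needs exactly one text item")
        return "text-to-video"
    if roles & REFERENCE_ROLES:
        # Limits from the capabilities table: 9 images, 3 videos, 3 audio.
        if (counts["image_url"] > 9 or counts["video_url"] > 3
                or counts["audio_url"] > 3):
            raise ValueError("too many reference assets")
        return "reference"
    # Assumption: without reference roles, only images are allowed.
    if counts["video_url"] or counts["audio_url"]:
        raise ValueError("video and audio items need reference roles")
    if counts["image_url"] == 1 and roles <= {"first_frame"}:
        return "first frame"
    if counts["image_url"] == 2 and roles == FRAME_ROLES:
        return "first and last frame"
    raise ValueError("content does not match any input mode")


print(detect_mode([
    {"type": "text", "text": "Slow zoom out"},
    {"type": "image_url", "image_url": {"url": "https://example.com/a.png"},
     "role": "first_frame"},
    {"type": "image_url", "image_url": {"url": "https://example.com/b.png"},
     "role": "last_frame"},
]))
```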
Referencing Assets in Prompts
In the text prompt you can use @<type><N> placeholders to reference media items in the content array, numbered in order of appearance within each media type:
| Placeholder | Referenced item |
|---|---|
| @image1, @image2, … | 1st, 2nd, … image_url item |
| @video1, @video2, … | 1st, 2nd, … video_url item |
| @audio1, @audio2, … | 1st, 2nd, … audio_url item |
Example prompt: "The character from @image1 walks through the scene in @video1, with @audio1 as background music."
Note: Pass assets in strict order: text, image_url, video_url, audio_url. Reordering them may cause errors. When you pass several assets of the same type, keep them together and do not mix other asset types in between.
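A placeholder that points at a missing item is easy to overlook. The sketch below, using a hypothetical helper `unresolved_placeholders`, scans the text prompt and reports every placeholder whose number exceeds the count of items of that media type. How the service itself treats such placeholders is not stated here.

```python
import re
from collections import Counter

PLACEHOLDER = re.compile(r"@(image|video|audio)(\d+)")


def unresolved_placeholders(content):
    """Return placeholders in the prompt that match no content item."""
    counts = Counter(item["type"] for item in content)
    prompt = " ".join(
        item["text"] for item in content if item["type"] == "text"
    )
    missing = []
    for kind, number in PLACEHOLDER.findall(prompt):
        # Numbering starts at 1 within each media type.
        if not 1 <= int(number) <= counts[f"{kind}_url"]:
            missing.append(f"@{kind}{number}")
    return missing


content = [
    {"type": "text", "text": "@image1 meets @image2 while @audio1 plays"},
    {"type": "image_url", "image_url": {"url": "https://example.com/a.png"},
     "role": "reference_image"},
]
print(unresolved_placeholders(content))
```

With one image and no audio in the array, the call reports `@image2` and `@audio1` as unresolved.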
Resolution and Pixel Values by Aspect Ratio
| Resolution | 16:9 | 4:3 | 1:1 | 3:4 | 9:16 | 21:9 |
|---|---|---|---|---|---|---|
| 720p | 1280x720 | 1112x834 | 960x960 | 834x1112 | 720x1280 | 1470x630 |
| 1080p | 1920x1080 | 1664x1248 | 1440x1440 | 1248x1664 | 1080x1920 | 2208x944 |
| 2k | 2560x1440 | 2224x1668 | 1920x1920 | 1668x2224 | 1440x2560 | 2944x1260 |
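For client code that needs the output dimensions ahead of time, the table can be carried as a lookup. This is a minimal sketch: `output_size` is a hypothetical helper, and `adaptive` is rejected because the table lists no pixel values for it.

```python
# (width, height) in pixels, copied from the table above.
PIXELS = {
    "720p": {"16:9": (1280, 720), "4:3": (1112, 834), "1:1": (960, 960),
             "3:4": (834, 1112), "9:16": (720, 1280), "21:9": (1470, 630)},
    "1080p": {"16:9": (1920, 1080), "4:3": (1664, 1248), "1:1": (1440, 1440),
              "3:4": (1248, 1664), "9:16": (1080, 1920), "21:9": (2208, 944)},
    "2k": {"16:9": (2560, 1440), "4:3": (2224, 1668), "1:1": (1920, 1920),
           "3:4": (1668, 2224), "9:16": (1440, 2560), "21:9": (2944, 1260)},
}


def output_size(resolution, ratio):
    """Return (width, height) for a resolution and fixed aspect ratio."""
    try:
        return PIXELS[resolution][ratio]
    except KeyError:
        raise ValueError(
            f"no pixel size listed for {resolution!r} at {ratio!r}"
        ) from None


width, height = output_size("1080p", "21:9")
print(f"{width}x{height}")
```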
API Reference
View the interactive API documentation for Seedance 2.0 Ultra.

