Seedance 2.0 is ByteDance’s latest video generation model, available through the Starrise AI API. It supports text-to-video, image-to-video, multimodal reference input, video editing, video extension, and synchronized audio generation.

Key Capabilities

| Feature | Description |
| --- | --- |
| Text-to-video | Generate video from a text description |
| Image-to-video (first frame) | Use one image as the first frame |
| Image-to-video (first and last frame) | Use two images as the first and last frames respectively |
| Multimodal reference | Combine images, video, and audio as references (1–9 images, up to 3 videos, up to 3 audio clips) |
| Video editing | Modify elements in an existing video using a reference image |
| Video extension | Extend and concatenate reference videos |
| Audio generation | Automatically generate synchronized voice, sound effects, and background music |
| Web search enhancement | Enhance generation with real-time internet content (text-to-video only) |
| Return last frame | Retrieve the last frame of the generated video |

Output Specifications

| Property | Value |
| --- | --- |
| Resolution | 480p, 720p, 1080p |
| Aspect ratio | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive |
| Duration | 4–15 seconds |
| Format | mp4 |
| Frame rate | 24 fps |

Workflow

1. POST /v1/video/generations  →  task_id
2. Poll GET /v1/video/generations/{task_id}  →  status
3. When status = "succeeded"  →  download video URL (valid for 24 hours)
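
The three steps above can be sketched in Python with only the standard library. This is a minimal sketch under stated assumptions, not a reference client: the host is taken from the curl examples on this page, the `task_id` and `status` field names follow the workflow summary, and the `"failed"` state name is a guess that you should replace with whatever the API actually returns. The helper names `api_call` and `wait_for_task` are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "https://ai.alad.com"  # host taken from the curl examples on this page


def api_call(method, path, api_key, body=None):
    """Send one JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_task(fetch_status, task_id, interval=5.0, timeout=600.0,
                  failure_states=("failed",), sleep=time.sleep):
    """Poll until the task reports "succeeded" and return the final response.

    fetch_status(task_id) must return the decoded polling response.
    The "failed" state name is an assumption, not taken from this page.
    """
    waited = 0.0
    while True:
        result = fetch_status(task_id)
        status = result.get("status")
        if status == "succeeded":
            return result
        if status in failure_states:
            raise RuntimeError(f"task {task_id} ended with status {status!r}")
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Usage (performs real network calls, so it is left commented out):
# task = api_call("POST", "/v1/video/generations", "YOUR_API_KEY", payload)
# done = wait_for_task(
#     lambda t: api_call("GET", f"/v1/video/generations/{t}", "YOUR_API_KEY"),
#     task["task_id"],  # assumed field name
# )
```

Passing the status fetcher in as a callable keeps the polling loop independent of the HTTP layer, so it can be exercised without network access.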

Asset Management Workflow

If you need to use persistent image, video, or audio assets (e.g. a fixed character reference), you can pre-upload them via the Asset Management API and then reference them with asset://<ID> in generation requests.

Step 1: Create an Asset Group

First, create an asset group to obtain a group ID.
curl https://ai.alad.com/volc/asset/CreateAssetGroup \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "volc-asset",
    "Name": "your-custom-name"
  }'
Example response:
{
    "Id": "group-20260427160000-xxxxx"
}

Step 2: Create an Asset in the Group

Using the group ID from Step 1, upload an image asset (e.g., a reference image of a character's face).
curl https://ai.alad.com/volc/asset/CreateAsset \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "volc-asset",
    "GroupId": "group-20260427160000-xxxxx",
    "Name": "character-reference",
    "AssetType": "Image",
    "URL": "https://example.com/example.png"
  }'
Example response:
{
    "Id": "asset-20260427160000-xxxxx"
}
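
The request bodies for Steps 1 and 2, and the asset:// reference used in later steps, could be assembled as in the sketch below. The field names are copied from the curl examples above; the function names are hypothetical, and only the "Image" asset type appears on this page, so any other `AssetType` value is an assumption to check against the API reference.

```python
def create_group_body(name):
    """Body for CreateAssetGroup (Step 1)."""
    return {"model": "volc-asset", "Name": name}


def create_asset_body(group_id, name, url, asset_type="Image"):
    """Body for CreateAsset (Step 2).

    Only "Image" is shown in the examples on this page; other
    AssetType values are not confirmed here.
    """
    return {
        "model": "volc-asset",
        "GroupId": group_id,
        "Name": name,
        "AssetType": asset_type,
        "URL": url,
    }


def asset_uri(create_asset_response):
    """Turn a CreateAsset response ({"Id": ...}) into an asset:// reference."""
    return "asset://" + create_asset_response["Id"]


group_body = create_group_body("your-custom-name")
asset_body = create_asset_body(
    "group-20260427160000-xxxxx",
    "character-reference",
    "https://example.com/example.png",
)
print(asset_uri({"Id": "asset-20260427160000-xxxxx"}))
```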

Step 3: Generate Video Using the Asset

Reference the asset ID from Step 2 to generate a video.
curl https://ai.alad.com/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance-2.0",
    "content": [
      {
        "type": "text",
        "text": "The person in @image1 walks down a sunny street, cinematic quality"
      },
      {
        "type": "image_url",
        "image_url": {
          "url": "asset://asset-20260427160000-xxxxx"
        },
        "role": "reference_image"
      }
    ],
    "resolution": "720p",
    "duration": 5
  }'
The API returns an asynchronous task ID (prefixed with asyn).

Step 4: Poll for Results

Use the task ID to query the generation status.
curl https://ai.alad.com/v1/video/generations/asynxxxx \
  -H "Authorization: Bearer YOUR_API_KEY"
Once the task completes, the response will contain a pre-signed S3 download URL.
Notes
  • Download URLs expire in 12 hours and must be re-fetched after expiry.
  • If task progress reaches 100% but returns an error, it typically means the generated content was blocked by the provider’s content moderation system (e.g., celebrity likenesses or copyrighted content). In this case, try modifying the prompt or replacing the reference image.
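
Because download URLs expire, a client that stores them may want to check their age before reuse. The sketch below assumes you record when each URL was fetched; `url_is_fresh` is a hypothetical helper. Its default lifetime is the 12 hours stated in the note above, and the lifetime is a parameter because the Workflow summary on this page mentions 24 hours.

```python
from datetime import datetime, timedelta, timezone

URL_LIFETIME = timedelta(hours=12)  # from the note above; adjust if the API differs


def url_is_fresh(fetched_at, now=None, lifetime=URL_LIFETIME,
                 margin=timedelta(minutes=5)):
    """Return True while a download URL fetched at `fetched_at` is still usable.

    `margin` leaves a safety buffer so a download does not start just
    before the URL expires.
    """
    now = now or datetime.now(timezone.utc)
    return now < fetched_at + lifetime - margin


fetched = datetime(2026, 4, 27, 16, 0, tzinfo=timezone.utc)
print(url_is_fresh(fetched, now=fetched + timedelta(hours=1)))
```

When the check fails, poll the task again (Step 4) to obtain a new URL.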

Examples

Text-to-Video

curl https://ai.alad.com/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance-2.0",
    "content": [
      {
        "type": "text",
        "text": "A cat playing piano in a sunlit room, cinematic lighting"
      }
    ],
    "generate_audio": true,
    "ratio": "16:9",
    "duration": 8
  }'

Multimodal Reference (Image + Video + Audio)

In the prompt you can reference content array items using @image1, @video1, @audio1, numbered by media type starting from 1.
Important: Pass content items in strict order: text, then image_url, then video_url, then audio_url. Reordering them may cause errors. When you pass several items of one type, keep them together so that no items of another type sit between them.
curl https://ai.alad.com/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance-2.0",
    "content": [
      {
        "type": "text",
        "text": "The character from @image1 dances in the scene from @video1, with @audio1 as background music, festive New Year atmosphere"
      },
      {
        "type": "image_url",
        "image_url": {"url": "https://example.com/character.jpg"},
        "role": "reference_image"
      },
      {
        "type": "video_url",
        "video_url": {"url": "https://example.com/clip.mp4"},
        "role": "reference_video"
      },
      {
        "type": "audio_url",
        "audio_url": {"url": "https://example.com/bgm.mp3"},
        "role": "reference_audio"
      }
    ],
    "generate_audio": true,
    "ratio": "16:9",
    "duration": 11
  }'
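
One way to avoid ordering mistakes is to build the content array from separate lists per media type. The following is a sketch under assumptions: `build_reference_content` is a hypothetical helper, the item shapes mirror the curl example above, and only the upper limits from Key Capabilities (9 images, 3 videos, 3 audio clips) are enforced.

```python
LIMITS = {"image": 9, "video": 3, "audio": 3}  # upper bounds from Key Capabilities


def build_reference_content(prompt, images=(), videos=(), audios=()):
    """Return a content array ordered as text, images, videos, audio."""
    groups = {"image": list(images), "video": list(videos), "audio": list(audios)}
    for kind, urls in groups.items():
        if len(urls) > LIMITS[kind]:
            raise ValueError(
                f"at most {LIMITS[kind]} {kind} references are allowed, got {len(urls)}"
            )
    content = [{"type": "text", "text": prompt}]
    # dicts preserve insertion order, so items come out image, video, audio
    for kind, urls in groups.items():
        for url in urls:
            content.append({
                "type": f"{kind}_url",
                f"{kind}_url": {"url": url},
                "role": f"reference_{kind}",
            })
    return content


content = build_reference_content(
    "The character from @image1 dances in the scene from @video1",
    images=["https://example.com/character.jpg"],
    videos=["https://example.com/clip.mp4"],
)
print([item["type"] for item in content])
```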

Video Editing

curl https://ai.alad.com/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance-2.0",
    "content": [
      {
        "type": "text",
        "text": "Replace the water in @video1 with the perfume from @image1, keep the same camera motion"
      },
      {
        "type": "image_url",
        "image_url": {"url": "https://example.com/perfume.jpg"},
        "role": "reference_image"
      },
      {
        "type": "video_url",
        "video_url": {"url": "https://example.com/original.mp4"},
        "role": "reference_video"
      }
    ],
    "generate_audio": true,
    "ratio": "16:9",
    "duration": 5
  }'

Web Search Enhancement (Text-to-Video Only)

curl https://ai.alad.com/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance-2.0",
    "content": [
      {
        "type": "text",
        "text": "Macro shot of vivid flower petals on a tree, gradually zooming in"
      }
    ],
    "generate_audio": true,
    "ratio": "16:9",
    "duration": 11,
    "tools": [{"type": "web_search"}]
  }'
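
Since web search applies to text-to-video requests only, a client can guard against enabling it on requests that carry media. This is a minimal sketch; `enable_web_search` is a hypothetical helper, and treating "text-to-video" as "every content item has type text" is an assumption.

```python
def enable_web_search(payload):
    """Return a copy of `payload` with web search enabled.

    Raises ValueError if the content array contains anything other than
    text items, since web search is documented as text-to-video only.
    """
    if any(item.get("type") != "text" for item in payload.get("content", [])):
        raise ValueError("web search is only available for text-to-video requests")
    return {**payload, "tools": [{"type": "web_search"}]}


request = enable_web_search({
    "model": "seedance-2.0",
    "content": [{"type": "text", "text": "Macro shot of vivid flower petals"}],
})
print(request["tools"])
```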

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | seedance-2.0 |
| content | array | Yes | Input content array (text, image_url, video_url, audio_url) |
| content[].type | string | Yes | text, image_url, video_url, or audio_url |
| content[].text | string | For text type | Text prompt (recommended ≤500 Chinese chars or ≤1000 English words) |
| content[].image_url.url | string | For image type | Image URL, Base64, or asset://<ID> |
| content[].video_url.url | string | For video type | Video URL or asset://<ID> (mp4/mov, ≤50 MB, 2–15 seconds) |
| content[].audio_url.url | string | For audio type | Audio URL, Base64, or asset://<ID> (wav/mp3, ≤15 MB) |
| content[].role | string | Conditionally | first_frame, last_frame, reference_image, reference_video, reference_audio |
| generate_audio | boolean | No | Generate synchronized audio, default true |
| resolution | string | No | 480p, 720p, or 1080p, default 720p |
| ratio | string | No | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive, default adaptive |
| duration | integer | No | 4–15 seconds, default 5 |
| tools | array | No | [{"type": "web_search"}] enables web search (text-to-video only) |
| watermark | boolean | No | Whether to add a watermark, default false |
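
The constraints in the table above can be checked on the client before a request is sent. The sketch below is illustrative only: `validate_request` is a hypothetical helper, it covers just the rules listed in the table, and the server remains the authority on what is accepted.

```python
RESOLUTIONS = ("480p", "720p", "1080p")
RATIOS = ("16:9", "4:3", "1:1", "3:4", "9:16", "21:9", "adaptive")
CONTENT_TYPES = ("text", "image_url", "video_url", "audio_url")


def validate_request(payload):
    """Return a list of problems found in `payload`; an empty list means none."""
    problems = []
    if payload.get("model") != "seedance-2.0":
        problems.append("model must be 'seedance-2.0'")
    content = payload.get("content")
    if not isinstance(content, list) or not content:
        problems.append("content must be a non-empty array")
    else:
        for i, item in enumerate(content):
            if item.get("type") not in CONTENT_TYPES:
                problems.append(f"content[{i}].type is not a documented type")
    duration = payload.get("duration", 5)  # documented default
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(duration, bool) or not isinstance(duration, int) or not 4 <= duration <= 15:
        problems.append("duration must be an integer from 4 to 15")
    if payload.get("resolution", "720p") not in RESOLUTIONS:
        problems.append("resolution must be 480p, 720p, or 1080p")
    if payload.get("ratio", "adaptive") not in RATIOS:
        problems.append("ratio is not a documented aspect ratio")
    for flag in ("generate_audio", "watermark"):
        if flag in payload and not isinstance(payload[flag], bool):
            problems.append(f"{flag} must be a boolean")
    return problems


print(validate_request({
    "model": "seedance-2.0",
    "content": [{"type": "text", "text": "A cat playing piano"}],
    "duration": 20,
}))
```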

Input Modes

| Mode | Content items | role values |
| --- | --- | --- |
| Text-to-video | 1× text | (none) |
| Image-to-video (first frame) | text (optional) + 1× image_url | first_frame or omitted |
| Image-to-video (first and last frame) | text (optional) + 2× image_url | first_frame + last_frame |
| Multimodal reference | text (optional) + images/videos/audio | reference_image, reference_video, reference_audio |
| Video editing | text + image_url + video_url | reference_image + reference_video |
| Video extension | text + video_url(s) | reference_video |
Note: First frame, first-and-last-frame, and multimodal reference are three mutually exclusive modes that cannot be combined.
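
The mutual exclusivity rule can be checked by looking at the role values in a content array. The sketch below is one possible classification, not the API's own logic: `detect_mode` is a hypothetical helper, and it groups multimodal reference, video editing, and video extension under a single "reference" label because their role values overlap.

```python
FRAME_ROLES = {"first_frame", "last_frame"}


def detect_mode(content):
    """Classify a content array by the roles of its media items.

    Raises ValueError when frame roles and reference roles are mixed,
    or when the items match none of the modes in the table above.
    """
    media = [item for item in content if item["type"] != "text"]
    if not media:
        return "text-to-video"
    roles = [item.get("role") for item in media]
    types = [item["type"] for item in media]
    uses_frames = any(r in FRAME_ROLES for r in roles)
    uses_refs = any(r is not None and r.startswith("reference_") for r in roles)
    if uses_frames and uses_refs:
        raise ValueError("frame roles and reference roles cannot be combined")
    if uses_refs:
        return "reference"
    if types == ["image_url"] and roles[0] in (None, "first_frame"):
        return "first-frame"
    if types == ["image_url", "image_url"] and set(roles) == FRAME_ROLES:
        return "first-and-last-frame"
    raise ValueError("content does not match a documented input mode")


print(detect_mode([
    {"type": "text", "text": "A slow pan across the bay"},
    {"type": "image_url", "image_url": {"url": "https://example.com/a.jpg"}},
]))
```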

Referencing Assets in Prompts

In the text prompt you can use @<type><N> placeholders to reference media items in the content array, numbered in order of appearance within each media type:
| Placeholder | Referenced item |
| --- | --- |
| @image1, @image2, … | 1st, 2nd, … image_url item |
| @video1, @video2, … | 1st, 2nd, … video_url item |
| @audio1, @audio2, … | 1st, 2nd, … audio_url item |
Example: "The character from @image1 walks through the scene in @video1, with @audio1 as background music".
Note: Pass content items in strict order: text, then image_url, then video_url, then audio_url. Reordering them may cause errors. When you pass several items of one type, keep them together so that no items of another type sit between them.
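
Both rules in this section, placeholder numbering and item order, can be checked before sending a request. The sketch below is an illustration with hypothetical helper names; it assumes placeholders always take the form @image, @video, or @audio followed by a number, as in the table above.

```python
import re

PLACEHOLDER = re.compile(r"@(image|video|audio)(\d+)")
ORDER = {"text": 0, "image_url": 1, "video_url": 2, "audio_url": 3}


def unresolved_placeholders(prompt, content):
    """Return placeholders in `prompt` that point at no item in `content`."""
    counts = {"image": 0, "video": 0, "audio": 0}
    for item in content:
        kind = item["type"].removesuffix("_url")  # requires Python 3.9+
        if kind in counts:
            counts[kind] += 1
    return [
        f"@{kind}{n}"
        for kind, n in PLACEHOLDER.findall(prompt)
        if not 1 <= int(n) <= counts[kind]
    ]


def in_required_order(content):
    """True if items appear as text, image_url, video_url, audio_url."""
    ranks = [ORDER[item["type"]] for item in content]
    return ranks == sorted(ranks)


content = [
    {"type": "text", "text": "..."},
    {"type": "image_url", "image_url": {"url": "https://example.com/character.jpg"}},
    {"type": "video_url", "video_url": {"url": "https://example.com/clip.mp4"}},
]
print(unresolved_placeholders("@image1 walks through @video1 with @audio1", content))
print(in_required_order(content))
```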

Resolution and Pixel Values by Aspect Ratio

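| Resolution | 16:9 | 4:3 | 1:1 | 3:4 | 9:16 | 21:9 |
| --- | --- | --- | --- | --- | --- | --- |
| 480p | 864x496 | 752x560 | 640x640 | 560x752 | 496x864 | 992x432 |
| 720p | 1280x720 | 1112x834 | 960x960 | 834x1112 | 720x1280 | 1470x630 |
| 1080p | 1920x1080 | 1664x1248 | 1440x1440 | 1248x1664 | 1080x1920 | 2208x944 |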
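
If a client needs the output dimensions in advance, for example to size a player, the table above can be encoded as a lookup. This is a sketch: `output_size` is a hypothetical helper, and the adaptive ratio is rejected because the table lists no fixed size for it.

```python
# (width, height) in pixels, copied from the table above
PIXELS = {
    "480p": {"16:9": (864, 496), "4:3": (752, 560), "1:1": (640, 640),
             "3:4": (560, 752), "9:16": (496, 864), "21:9": (992, 432)},
    "720p": {"16:9": (1280, 720), "4:3": (1112, 834), "1:1": (960, 960),
             "3:4": (834, 1112), "9:16": (720, 1280), "21:9": (1470, 630)},
    "1080p": {"16:9": (1920, 1080), "4:3": (1664, 1248), "1:1": (1440, 1440),
              "3:4": (1248, 1664), "9:16": (1080, 1920), "21:9": (2208, 944)},
}


def output_size(resolution, ratio):
    """Return (width, height) for a resolution and a fixed aspect ratio."""
    if ratio == "adaptive":
        raise ValueError("the adaptive ratio has no fixed size in the table")
    return PIXELS[resolution][ratio]


print(output_size("720p", "9:16"))
```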

API Reference

View the interactive API documentation for Seedance 2.0.