Generate a song from an image by analyzing it and creating music based on visual content. The process can optionally include custom lyrics, voice conversion, and various musical parameters.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image_file | UploadFile | Optional | Upload the image to analyze. Required if image_url is not provided. |
| image_url | String | Optional | Public or S3 URL to the input image. Required if image_file is not provided. |
| prompt | String | Optional | Additional text to guide the song generation (max 300 characters). |
| lyrics | String | Optional | Custom lyrics to include in the generated audio (max 3000 characters). |
| negative_tags | String | Optional | Tags or themes to avoid in the song. |
| make_instrumental | Boolean | Optional | If true, generates an instrumental version. Defaults to false. |
| vocal_only | Boolean | Optional | If true, generates a vocal-only version. Defaults to false. |
| key | String | Optional | Musical key for the song (e.g., "C major", "A minor"). |
| bpm | Integer | Optional | Tempo in beats per minute. Defaults to 0 (auto-selected). |
| webhook_url | String | Optional | Callback URL for async result delivery. |
| voice_id | String | Optional | Voice ID for converting generated audio. Cannot be used with vocal_only mode. |
💡 Note: You must provide either image_file or image_url; at least one is required.
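The constraints in the parameter table (one image source required, length limits on prompt and lyrics, and voice_id being incompatible with vocal_only) can be checked on the client before a request is sent. The following is a minimal sketch of such a check; the function name `validate_params` and its error messages are hypothetical and not part of the API.

```python
from typing import List, Optional


def validate_params(
    image_file: Optional[str] = None,
    image_url: Optional[str] = None,
    prompt: Optional[str] = None,
    lyrics: Optional[str] = None,
    vocal_only: bool = False,
    voice_id: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the inputs pass these checks.

    Hypothetical helper: it only mirrors the rules stated in the parameter table.
    """
    problems = []
    # At least one image source is required.
    if not image_file and not image_url:
        problems.append("provide image_file or image_url")
    # Documented length limits.
    if prompt is not None and len(prompt) > 300:
        problems.append("prompt exceeds 300 characters")
    if lyrics is not None and len(lyrics) > 3000:
        problems.append("lyrics exceeds 3000 characters")
    # voice_id cannot be combined with vocal_only mode.
    if voice_id and vocal_only:
        problems.append("voice_id cannot be used with vocal_only")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem at once. The server may enforce further rules that this sketch does not cover.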
content-type: multipart/form-data
Replace {path_to_your_audio_file}, api_key, and webhook_url before executing.
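Because the endpoint expects multipart/form-data, text parameters and the image bytes travel as separate parts of one request body. The sketch below shows one way to assemble such a body with only the Python standard library. The helper name `encode_multipart` is hypothetical, and the endpoint URL and authentication header are deliberately left out because they are not given in this section.

```python
import uuid
from typing import Dict, Optional, Tuple


def encode_multipart(
    fields: Dict[str, str],
    image: Optional[Tuple[str, str, bytes]] = None,
) -> Tuple[bytes, str]:
    """Build a multipart/form-data body.

    fields: text parameters such as prompt or bpm, already converted to strings.
    image:  optional (filename, content_type, data) sent as the image_file part.
    Returns (body, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))
    if image is not None:
        filename, content_type, data = image
        file_header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="image_file"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        )
        parts.append(file_header.encode("utf-8") + data + b"\r\n")
    # Closing boundary marks the end of the body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body and header value can be handed to any HTTP client. Most client libraries build this encoding automatically, so writing it by hand is mainly useful for seeing what is sent on the wire.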
Webhook responses include detailed metadata such as task_id, conversion_id, audio files (conversion_path), and lyrics.
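A receiver for the webhook callback might pull out the fields named above as follows. This is only a sketch: it assumes the callback body is a flat JSON object with these keys at the top level, which this section does not confirm, and the function name `summarize_webhook` is hypothetical.

```python
import json
from typing import Any, Dict


def summarize_webhook(raw_body: bytes) -> Dict[str, Any]:
    """Extract the documented metadata fields from a webhook body.

    Assumption: the payload is a flat JSON object. Keys that are absent
    come back as None instead of raising an error.
    """
    payload = json.loads(raw_body)
    wanted = ("task_id", "conversion_id", "conversion_path", "lyrics")
    return {key: payload.get(key) for key in wanted}
```

If the real payload nests these fields, for example inside a list of conversions, the lookup would need to be adjusted to match.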
prompt, replace_start_at, or replace_end_at, or neither audio_file nor audio_url provided.
image_file: Image file to upload and analyze. Supported formats: JPEG, PNG, GIF, BMP, WEBP.
image_url: URL of the image to analyze. Either this or image_file must be provided. Example: "https://mybucket.s3.amazonaws.com/image.png"
prompt: Additional prompt to guide the song generation from the image. Maximum length: 300 characters. Example: "Generate a relaxing acoustic track inspired by this scene."
lyrics: Custom lyrics to include in the generated audio. Maximum length: 3000 characters. Example: "Let the colors of the sunset fill your heart."
negative_tags: Tags or themes to avoid in the song. Example: "no heavy metal, avoid loud drums"
make_instrumental: Generate instrumental output only. Lyrics will be ignored.
vocal_only: Generate vocal-only output.
key: Musical key for the song. Example: "C major"
bpm: Beats per minute for the song tempo. Defaults to 0 (auto-selected).
webhook_url: Optional callback URL for async processing results. Example: "https://example.com/webhook"
voice_id: Voice ID for converting the generated audio. Cannot be used with vocal_only mode.