Image to Song
Generate a song from an image by analyzing it and creating music based on visual content. The process can optionally include custom lyrics, voice conversion, and various musical parameters.
Endpoint
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| image_file | UploadFile | Optional | Image to upload and analyze. Required if image_url is not provided. |
| image_url | String | Optional | Public or S3 URL of the input image. Required if image_file is not provided. |
| prompt | String | Optional | Additional text to guide the song generation (max 300 characters). |
| lyrics | String | Optional | Custom lyrics to include in the generated audio (max 3000 characters). |
| negative_tags | String | Optional | Tags or themes to avoid in the song. |
| make_instrumental | Boolean | Optional | If true, generates an instrumental version. Defaults to false. |
| vocal_only | Boolean | Optional | If true, generates a vocal-only version. Defaults to false. |
| key | String | Optional | Musical key for the song (e.g., "C major", "A minor"). |
| bpm | Integer | Optional | Tempo in beats per minute. Defaults to 0 (auto-selected). |
| webhook_url | String | Optional | Callback URL for async result delivery. |
| voice_id | String | Optional | Voice ID for converting the generated audio. Cannot be used with vocal_only mode. |
Note: You must provide either image_file or image_url; at least one is required.
content-type: multipart/form-data
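The constraints in the parameter table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name `validate_image_to_song_params` and its messages are hypothetical, and it covers only the rules stated above (one image source required, the 300 and 3000 character limits, and voice_id not combined with vocal_only).

```python
from typing import List, Optional


def validate_image_to_song_params(
    image_file: Optional[str] = None,
    image_url: Optional[str] = None,
    prompt: str = "",
    lyrics: str = "",
    vocal_only: bool = False,
    voice_id: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the inputs pass these checks."""
    problems: List[str] = []
    # At least one image source is required.
    if not image_file and not image_url:
        problems.append("provide image_file or image_url")
    # Length limits from the parameter table.
    if len(prompt) > 300:
        problems.append("prompt exceeds 300 characters")
    if len(lyrics) > 3000:
        problems.append("lyrics exceed 3000 characters")
    # voice_id cannot be used with vocal_only mode.
    if voice_id and vocal_only:
        problems.append("voice_id cannot be combined with vocal_only")
    return problems
```

Calling it with only an image_url returns an empty list; the server may still apply further checks that are not documented here.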
Try it Yourself
Visit the image_to_song Endpoint Explorer to experiment: set your payload, hit send, and listen to the generated results live.
Sample Request
cURL
Python
Note: Replace {path_to_your_audio_file}, api_key, and webhook_url before executing.
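A Python request could be sketched along the following lines. This is an illustration under stated assumptions, not the official sample: `API_URL`, `API_KEY`, and the `Authorization` header name are hypothetical placeholders (the endpoint URL and auth scheme are not given in this section), booleans are assumed to be sent as the strings "true" and "false", and the third-party `requests` library is assumed to be installed.

```python
from typing import Any, Dict, Optional

# Hypothetical placeholders: this section does not state the endpoint URL or auth header.
API_URL = "https://api.example.com/image_to_song"
API_KEY = "your_api_key"


def build_form_fields(image_url: Optional[str] = None, **options: Any) -> Dict[str, str]:
    """Convert parameters into string form fields for multipart/form-data."""
    fields: Dict[str, str] = {}
    if image_url:
        fields["image_url"] = image_url
    for name, value in options.items():
        if value is None:
            continue  # omit unset optional parameters
        if isinstance(value, bool):
            fields[name] = "true" if value else "false"  # assumed boolean encoding
        else:
            fields[name] = str(value)
    return fields


def send_image_to_song(image_path: Optional[str] = None, **params: Any) -> Any:
    """POST the request; pass image_path to upload a local file as image_file."""
    import requests  # third-party; imported here so build_form_fields works without it

    fields = build_form_fields(**params)
    headers = {"Authorization": API_KEY}  # assumed header name
    if image_path:
        with open(image_path, "rb") as fh:
            resp = requests.post(API_URL, headers=headers, data=fields,
                                 files={"image_file": fh}, timeout=60)
    else:
        # (None, value) tuples make requests encode plain fields as multipart parts.
        parts = {name: (None, value) for name, value in fields.items()}
        resp = requests.post(API_URL, headers=headers, files=parts, timeout=60)
    resp.raise_for_status()
    return resp
```

For example, `send_image_to_song(image_url="https://mybucket.s3.amazonaws.com/image.png", prompt="Generate a relaxing acoustic track inspired by this scene.", webhook_url="https://example.com/webhook")` would submit a URL-based request once the placeholders are replaced.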
Sample Response
Success (200 OK)
Webhook Delivery
Once generation is complete, webhooks are triggered to deliver the following:
Standard Requests:
- 2 webhooks with conversion details (one per version)
- 2 webhooks with lyrics and timestamp data
- 1 webhook with the album cover image
Webhook payloads include metadata such as task_id, conversion_id, audio files (conversion_path), and lyrics.
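Because a standard request produces several separate webhook deliveries, a receiver may want to track when everything for a task has arrived. The sketch below assumes the receiver has already worked out which kind of delivery each payload is; the labels "conversion", "lyrics", and "album_cover" and the `WebhookTracker` class are names invented for this illustration, not fields of the API.

```python
from collections import Counter
from typing import Dict

# Expected deliveries per standard request, as listed above.
# The kind labels are chosen for this sketch; they are not API field values.
EXPECTED = {"conversion": 2, "lyrics": 2, "album_cover": 1}


class WebhookTracker:
    """Counts webhook deliveries per task_id and reports when a task is complete."""

    def __init__(self) -> None:
        self._seen: Dict[str, Counter] = {}

    def record(self, task_id: str, kind: str) -> None:
        if kind not in EXPECTED:
            raise ValueError(f"unknown webhook kind: {kind}")
        self._seen.setdefault(task_id, Counter())[kind] += 1

    def is_complete(self, task_id: str) -> bool:
        seen = self._seen.get(task_id, Counter())
        # Complete once every expected kind has arrived the expected number of times.
        return all(seen[kind] >= count for kind, count in EXPECTED.items())
```

A webhook handler would call `record(task_id, kind)` for each delivery and start downstream processing once `is_complete(task_id)` returns true.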
Common Errors
- 422 Unprocessable Entity: A required input is missing or invalid, for example neither image_file nor image_url was provided.
- 500 Internal Server Error: An unexpected error occurred during processing.
The response provides a downloadable audio file.
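The two error codes listed under Common Errors call for different handling on the client. The sketch below shows one possible policy, which is an assumption and not documented API behavior: a 500 is retried with exponential backoff, while a 422 is returned immediately because resending the same invalid request would not help. The `send` callable is a hypothetical stand-in for whatever function issues the request and returns its status code.

```python
import time
from typing import Callable


def call_with_retry(send: Callable[[], int], max_attempts: int = 3,
                    base_delay: float = 1.0) -> int:
    """Call send() until it returns something other than 500 or attempts run out.

    Returns the last status code seen.
    """
    status = 0
    for attempt in range(max_attempts):
        status = send()
        if status != 500:
            # 200 succeeded; 422 needs a corrected request rather than a retry.
            return status
        if attempt < max_attempts - 1:
            time.sleep(base_delay * 2 ** attempt)  # wait 1x, 2x, 4x base_delay
    return status
```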
Authorizations
Body
- image_file: Image file to upload and analyze. Supported formats: JPEG, PNG, GIF, BMP, WEBP.
- image_url: URL of the image to analyze. Either this or image_file must be provided. Example: "https://mybucket.s3.amazonaws.com/image.png"
- prompt: Additional prompt to guide the song generation from the image. Maximum length: 300. Example: "Generate a relaxing acoustic track inspired by this scene."
- lyrics: Custom lyrics to include in the generated audio. Maximum length: 3000. Example: "Let the colors of the sunset fill your heart."
- negative_tags: Tags or themes to avoid in the song. Example: "no heavy metal, avoid loud drums"
- make_instrumental: Generate instrumental output only. Lyrics will be ignored.
- vocal_only: Generate vocal-only output.
- key: Musical key for the song. Example: "C major"
- bpm: Beats per minute for the song tempo. Defaults to 0 (auto-selected).
- webhook_url: Optional callback URL for async processing results. Example: "https://example.com/webhook"
- voice_id: Voice ID for converting the generated audio. Cannot be used with vocal_only mode.