POST /v1/image_to_song
Python
import requests

url = "https://api.musicgpt.com/api/public/v1/image_to_song"
headers = {"Authorization": "<API_KEY>"}
data = {
    "image_url": "https://mybucket.s3.amazonaws.com/image.png",
    "prompt": "Generate a relaxing acoustic track inspired by this scene.",
    "lyrics": "Let the colors of the sunset fill your heart.",
    "make_instrumental": False,
    "vocal_only": False,
    "key": "C major",
    "bpm": 120,
    "webhook_url": "https://example.com/webhook",
    "voice_id": "voice_123"
}

# Option 1: Using image URL
response = requests.post(url, headers=headers, data=data)
print(response.json())

# Option 2: Uploading a local image file
# with open("image.png", "rb") as f:
#     files = {"image_file": f}
#     response = requests.post(url, headers=headers, data=data, files=files)
#     print(response.json())
{
  "success": true,
  "message": "Message Published To Queue",
  "task_id": "task_12345",
  "conversion_id_1": "conv_12345",
  "conversion_id_2": "conv_54321",
  "eta": 300,
  "credit_estimate": 150.5
}
Generate a song from an image by analyzing its content and creating music based on visual cues. Users can optionally provide custom lyrics, select a musical key, adjust tempo, or request instrumental/vocal-only outputs.

Endpoint

POST /image_to_song
This endpoint processes an image (uploaded or via URL) to generate a song. The image is analyzed to create a descriptive prompt, which is then used to generate AI-driven music.

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| image_file | UploadFile | Optional | Upload the image to analyze. Required if image_url is not provided. |
| image_url | String | Optional | Public or S3 URL to the input image. Required if image_file is not provided. |
| prompt | String | Optional | Additional text to guide the song generation (max 300 characters). |
| lyrics | String | Optional | Custom lyrics to include in the generated audio (max 3000 characters). |
| negative_tags | String | Optional | Tags or themes to avoid in the song. |
| make_instrumental | Boolean | Optional | If true, generates an instrumental version. Defaults to false. |
| vocal_only | Boolean | Optional | If true, generates a vocal-only version. Defaults to false. |
| key | String | Optional | Musical key for the song (e.g., "C major", "A minor"). |
| bpm | Integer | Optional | Tempo in beats per minute. Defaults to 0 (auto-selected). |
| webhook_url | String | Optional | Callback URL for async result delivery. |
| voice_id | String | Optional | Voice ID for converting generated audio. Cannot be used with vocal_only mode. |
💡 Note: You must provide either image_file or image_url; at least one is required.
content-type: multipart/form-data
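
The constraints in the table above can be checked on the client before sending a request. The following is a minimal sketch, not part of the API: the function name `validate_image_to_song_params` and its messages are hypothetical, and it only encodes the rules stated on this page (one of image_file or image_url, the prompt and lyrics length limits, and the voice_id / vocal_only conflict).

```python
from typing import List, Optional

# Limits taken from the parameter table on this page.
MAX_PROMPT_CHARS = 300
MAX_LYRICS_CHARS = 3000


def validate_image_to_song_params(
    image_file: Optional[str] = None,
    image_url: Optional[str] = None,
    prompt: str = "",
    lyrics: str = "",
    vocal_only: bool = False,
    voice_id: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means no documented rule is violated.

    Hypothetical client-side helper; the server may apply further checks.
    """
    problems: List[str] = []
    if not image_file and not image_url:
        problems.append("provide image_file or image_url")
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if len(lyrics) > MAX_LYRICS_CHARS:
        problems.append(f"lyrics exceed {MAX_LYRICS_CHARS} characters")
    if vocal_only and voice_id:
        problems.append("voice_id cannot be combined with vocal_only")
    return problems


if __name__ == "__main__":
    print(validate_image_to_song_params(
        image_url="https://mybucket.s3.amazonaws.com/image.png",
        prompt="Generate a relaxing acoustic track inspired by this scene.",
    ))
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.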

Try it Yourself

Visit the image_to_song Endpoint Explorer to experiment: set your payload, hit send, and listen to the generated results live.

Sample Request

cURL

curl -X POST "https://api.musicgpt.com/api/public/v1/image_to_song" \
-H "accept: application/json" \
-H "Authorization: <api_key>" \
-F "image_file=@/path/to/image.png" \
-F "prompt=Generate a relaxing acoustic track inspired by this scene." \
-F "lyrics=Let the colors of the sunset fill your heart." \
-F "make_instrumental=false" \
-F "vocal_only=false" \
-F "key=C major" \
-F "bpm=120" \
-F "webhook_url=https://example.com/webhook" \
-F "voice_id=voice_123"

Python

import requests

url = "https://api.musicgpt.com/api/public/v1/image_to_song"
headers = {"Authorization": "<API_KEY>"}
data = {
    "prompt": "Generate a relaxing acoustic track inspired by this scene.",
    "lyrics": "Let the colors of the sunset fill your heart.",
    "make_instrumental": False,
    "vocal_only": False,
    "key": "C major",
    "bpm": 120,
    "webhook_url": "https://example.com/webhook",
    "voice_id": "voice_123"
}

# Option 1: image_url
files = {}
data["image_url"] = "https://mybucket.s3.amazonaws.com/image.png"
response = requests.post(url, headers=headers, data=data, files=files)

# Option 2: File Upload
# with open("image.png", "rb") as f:
#     files = {"image_file": f}
#     response = requests.post(url, headers=headers, data=data, files=files)

print(response.json())
๐Ÿ” Replace {path_to_your_audio_file}, api_key, and webhook_url before executing.

Sample Response

Success (200 OK)

{
  "success": true,
  "message": "Message Published To Queue",
  "task_id": "task-xyz-123",
  "conversion_id_1": "image-abc",
  "conversion_id_2": "image-def",
  "eta": 40,
  "credit_estimate": 45
}
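
Because the song itself arrives later, a client usually needs to keep the identifiers from this response. The sketch below shows one way to do that, assuming the response has exactly the fields shown above; the names `ImageToSongTask` and `parse_task_response` are hypothetical and not part of any SDK.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ImageToSongTask:
    """Identifiers and estimates taken from the initial 200 OK response."""
    task_id: str
    conversion_ids: List[str]
    eta_seconds: int
    credit_estimate: float


def parse_task_response(body: dict) -> ImageToSongTask:
    """Convert the JSON body into a typed record; raise if the task was not accepted."""
    if not body.get("success"):
        raise ValueError(f"request was not accepted: {body.get('message')}")
    return ImageToSongTask(
        task_id=body["task_id"],
        conversion_ids=[body["conversion_id_1"], body["conversion_id_2"]],
        eta_seconds=int(body["eta"]),
        credit_estimate=float(body["credit_estimate"]),
    )


# The sample response from this page.
sample = json.loads("""
{
  "success": true,
  "message": "Message Published To Queue",
  "task_id": "task-xyz-123",
  "conversion_id_1": "image-abc",
  "conversion_id_2": "image-def",
  "eta": 40,
  "credit_estimate": 45
}
""")
task = parse_task_response(sample)
print(task.task_id, task.conversion_ids, task.eta_seconds)
```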

Webhook Delivery

Once the generation is complete, webhooks will be triggered to deliver the following:

Standard requests:

  • 2 webhooks with conversion details (one per version)
  • 2 webhooks with lyrics and timestamp data
  • 1 webhook with the album cover image
Webhook payloads include detailed metadata, including task_id, conversion_id, audio files (conversion_path), and lyrics.
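
Since one task produces several separate webhook deliveries, a receiver typically groups them by task before acting on the result. The sketch below illustrates that bookkeeping only. It assumes that each payload is a JSON object with a top-level `task_id`, and that conversion payloads carry a top-level `conversion_path`; the exact payload layout is not specified on this page, so treat both as assumptions. `WebhookCollector` is a hypothetical name.

```python
from typing import Dict, List

# 2 conversion webhooks + 2 lyrics webhooks + 1 album cover, per the list above.
EXPECTED_WEBHOOKS_PER_TASK = 5


class WebhookCollector:
    """Groups webhook payloads by task_id (assumed to be a top-level key)."""

    def __init__(self) -> None:
        self._by_task: Dict[str, List[dict]] = {}

    def add(self, payload: dict) -> bool:
        """Store one payload; return True once all expected deliveries have arrived."""
        received = self._by_task.setdefault(payload["task_id"], [])
        received.append(payload)
        return len(received) >= EXPECTED_WEBHOOKS_PER_TASK

    def audio_paths(self, task_id: str) -> List[str]:
        """Collect conversion_path values (assumed field location) for a task."""
        return [
            p["conversion_path"]
            for p in self._by_task.get(task_id, [])
            if "conversion_path" in p
        ]
```

This sketch does not deduplicate repeated deliveries; a production receiver would likely key stored payloads by conversion_id or delivery type as well.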

Common Errors

  • 422 Unprocessable Entity: Missing or invalid fields, for example when neither image_file nor image_url is provided.
  • 500 Internal Server Error: An unexpected error occurred during processing.

The initial response only confirms that the task was queued. The generated audio files are delivered in the webhook payloads (conversion_path).
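
A client can branch on the status codes listed above before trying to read the body. The following is a small sketch with a hypothetical helper, `classify_status`; the retry hint for 500 is an assumption of this example, not a documented guarantee of the API.

```python
from typing import Tuple


def classify_status(status_code: int) -> Tuple[str, bool]:
    """Map an HTTP status code to (description, worth_retrying).

    Only the codes documented on this page are handled explicitly.
    """
    if status_code == 200:
        return ("task queued", False)
    if status_code == 422:
        # The request itself is wrong, so resending it unchanged will not help.
        return ("invalid request: check that image_file or image_url is set", False)
    if status_code == 500:
        # Assumption: an unexpected server error may be transient.
        return ("server error during processing", True)
    return (f"unexpected status {status_code}", False)


if __name__ == "__main__":
    for code in (200, 422, 500):
        print(code, classify_status(code))
```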

Authorizations

Authorization
string
header
required

Body

multipart/form-data
image_file
file
required if image_url is not provided

Image file to upload and analyze. Supported formats: JPEG, PNG, GIF, BMP, WEBP.

image_url
string

URL of the image to analyze. Either this or image_file must be provided.

Example:

"https://mybucket.s3.amazonaws.com/image.png"

prompt
string

Additional prompt to guide the song generation from the image.

Maximum string length: 300
Example:

"Generate a relaxing acoustic track inspired by this scene."

lyrics
string

Custom lyrics to include in the generated audio.

Maximum string length: 3000
Example:

"Let the colors of the sunset fill your heart."

negative_tags
string

Tags or themes to avoid in the song.

Example:

"no heavy metal, avoid loud drums"

make_instrumental
boolean
default:false

Generate instrumental output only. Lyrics will be ignored.

vocal_only
boolean
default:false

Generate vocal-only output.

key
string

Musical key for the song.

Example:

"C major"

bpm
integer
default:0

Beats per minute for the song tempo. Defaults to 0 (auto-selected).

webhook_url
string

Optional callback URL for async processing results.

Example:

"https://example.com/webhook"

voice_id
string

Voice ID for converting the generated audio. Cannot be used with vocal_only mode.

Response

Successfully initiated image-to-song task

success
boolean
message
string
task_id
string
conversion_id_1
string
conversion_id_2
string
eta
integer

Estimated processing time in seconds

credit_estimate
number<float>