This endpoint allows users to replace a specific time segment of an audio clip using a textual prompt and optional lyrics. The inpainting operation blends new audio content into the selected range, guided by user-defined style and voice preferences.
To use Inpaint, upload a song and provide its full lyrics. Edit only the part of the lyrics you want to change and leave the rest unchanged. Your edits must correspond exactly to the time range you select (e.g., if you pick 0:11–0:19, only update the lyrics in that range).
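The time range is picked as a clock-style timestamp, while replace_start_at and replace_end_at are given in seconds. A minimal sketch of that conversion, assuming timestamps of the form M:SS or H:MM:SS (`to_seconds` is a hypothetical helper, not part of the API):

```python
def to_seconds(timestamp: str) -> float:
    """Convert "M:SS" or "H:MM:SS" (fractional seconds allowed) to seconds."""
    seconds = 0.0
    for part in timestamp.strip().split(":"):
        # Each colon-separated part shifts the running total by a factor of 60.
        seconds = seconds * 60 + float(part)
    return seconds


# The 0:11 to 0:19 selection from the example above, expressed as request fields.
segment = {
    "replace_start_at": to_seconds("0:11"),
    "replace_end_at": to_seconds("0:19"),
}
```

Here `segment` holds 11.0 and 19.0, which are the values to send alongside lyrics edited only within that range.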
ℹ️ Experimental Feature: This feature is in beta, so we do not guarantee continuous reliability or a bug-free experience.
| Parameter | Type | Required | Description |
|---|---|---|---|
| audio_file | UploadFile | Optional | Upload the input audio file. Required if audio_url is not provided. |
| audio_url | String | Optional | Public/S3/YouTube URL of the input audio. Required if audio_file is not provided. |
| prompt | String | Required | Prompt describing how the replacement should sound. Example: “Replace this part with an opera-style vocal.” |
| replace_start_at | Float | Required | Start time (in seconds) of the segment to replace. |
| replace_end_at | Float | Required | End time (in seconds) of the segment to replace. |
| lyrics | String | Optional | Optional lyrics to use for inpainting. |
| gender | String | Optional | Voice style for vocal generation. One of: male, female, neutral. |
| webhook_url | String | Optional | Callback URL for async response. |
💡 Note: You must provide either audio_file or audio_url; at least one is required.
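The rules in the parameter table can be checked on the client before a request is sent. The sketch below is one way to do that. `validate_inpaint_fields` is a hypothetical helper, and the check that the start time precedes the end time is an assumption that the table does not state explicitly.

```python
ALLOWED_GENDERS = {"male", "female", "neutral"}


def validate_inpaint_fields(fields: dict) -> list:
    """Return a list of problems; an empty list means the fields look usable."""
    errors = []
    # prompt, replace_start_at and replace_end_at are marked Required in the table.
    for name in ("prompt", "replace_start_at", "replace_end_at"):
        if fields.get(name) in (None, ""):
            errors.append(f"missing required field: {name}")
    # At least one audio source must be present.
    if not fields.get("audio_file") and not fields.get("audio_url"):
        errors.append("provide audio_file or audio_url")
    gender = fields.get("gender")
    if gender is not None and gender not in ALLOWED_GENDERS:
        errors.append("gender must be one of: female, male, neutral")
    start, end = fields.get("replace_start_at"), fields.get("replace_end_at")
    # Assumption: the segment must have positive length (start before end).
    if isinstance(start, (int, float)) and isinstance(end, (int, float)) and start >= end:
        errors.append("replace_start_at must be less than replace_end_at")
    return errors


problems = validate_inpaint_fields({
    "audio_url": "https://mybucket.s3.amazonaws.com/song.mp3",
    "prompt": "Replace this part with an opera-style vocal.",
    "replace_start_at": 12.5,
    "replace_end_at": 20,
    "gender": "male",
})
```

With the example values above, `problems` is an empty list. Note that a start time of 0 is treated as present, not missing.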
content-type: multipart/form-data
🔐 Replace {path_to_your_audio_file}, api_key, and webhook_url before executing.
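As a rough illustration of a multipart/form-data request carrying these fields, the sketch below builds (but does not send) a request using only the Python standard library. The endpoint URL, the POST method, and the Authorization header are assumptions, since none of them is specified here. `encode_multipart` and `build_inpaint_request` are hypothetical helper names.

```python
import uuid
import urllib.request

# Hypothetical placeholders: substitute the real endpoint URL and your API key.
API_URL = "https://api.example.com/inpaint"
API_KEY = "api_key"


def encode_multipart(fields: dict, files: dict) -> tuple:
    """Encode text fields and files as multipart/form-data.

    `files` maps a field name to (filename, bytes). Returns (body, content_type).
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        chunks.append(part.encode("utf-8"))
    for name, (filename, data) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        chunks.append(header.encode("utf-8") + data + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_inpaint_request(fields: dict, files: dict = None) -> urllib.request.Request:
    """Build the request object; the method and auth header name are assumptions."""
    body, content_type = encode_multipart(fields, files or {})
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Content-Type": content_type, "Authorization": API_KEY},
    )


request = build_inpaint_request({
    "audio_url": "https://mybucket.s3.amazonaws.com/song.mp3",
    "prompt": "Replace this part with an opera-style vocal.",
    "replace_start_at": 12.5,
    "replace_end_at": 20,
    "gender": "male",
    "webhook_url": "https://example.com/webhook",
})
# urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
```

To upload a local file instead of passing audio_url, read the file's bytes and pass them as `files={"audio_file": ("song.mp3", data)}`.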
Webhook responses carry detailed metadata, including task_id, conversion_id, audio files (conversion_path), lyrics, etc.
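A receiving service might pull out the fields named above as follows. This is only a sketch: it assumes the webhook body is JSON with these names as top-level keys, which is not confirmed here, and `summarize_webhook` is a hypothetical helper.

```python
import json

# Field names taken from the description above; their placement at the top
# level of the payload is an assumption.
WEBHOOK_KEYS = ("task_id", "conversion_id", "conversion_path", "lyrics")


def summarize_webhook(raw_body: bytes) -> dict:
    """Parse a webhook body and keep only the documented fields (None if absent)."""
    payload = json.loads(raw_body)
    return {key: payload.get(key) for key in WEBHOOK_KEYS}


# Made-up payload purely for illustration; real values and extra fields will differ.
sample = json.dumps({
    "task_id": "task-123",
    "conversion_id": "conv-456",
    "conversion_path": "https://example.com/audio.mp3",
    "lyrics": "This is where my story begins",
    "other": "ignored",
}).encode("utf-8")
summary = summarize_webhook(sample)
```

Using `.get` keeps the handler tolerant of payloads that omit a field, at the cost of returning `None` for it.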
Missing prompt, replace_start_at, or replace_end_at, or neither audio_file nor audio_url provided.

| Field | Description | Allowed values / limit | Example |
|---|---|---|---|
| audio_url | URL or S3 path to the input audio. | | "https://mybucket.s3.amazonaws.com/song.mp3" |
| prompt | A description of how the replacement should sound. | | "Replace this part with an opera-style vocal." |
| replace_start_at | Time in seconds to start replacing audio. | | 12.5 |
| replace_end_at | Time in seconds to stop replacing audio. | | 20 |
| audio_file | Uploaded input audio file. | | |
| lyrics | Lyrics to be used for inpainting. | 2000 | "This is where my story begins" |
| gender | Voice style for the inpainted segment. | male, female, neutral | "male" |
| webhook_url | Callback URL for async processing results. | | "https://example.com/webhook" |