Reference
CLI Commands
| Command | Purpose |
|---|---|
| analyze | Probe video, sample frames, run vision analysis, and write analysis artifacts. |
| generate | Analyze if needed and write generated background music as a WAV. |
| render | Generate music and mux or mix it into a video output. |
| eval | Evaluate generated audio against a video and timeline. |
| vision-smoke | Run a small frame-analysis smoke check. |
| music-smoke | Generate a short audio-only smoke sample. |
| sample-pack | Create a review folder with style samples and optional renders. |
| magenta-setup | Download or prepare Magenta resources, model files, and bridge dependencies. |
| magenta-status | Print detected bridge, resources, model, and runtime readiness. |
See the CLI Guide for command examples, option groups, and the recommended run order.
Vision Providers
| Provider | Use case | Default API key behavior |
|---|---|---|
| openai-compatible | LM Studio, local gateways, vLLM, compatible proxies | No key unless --vision-api-key-env is set |
| openai | Hosted OpenAI-compatible API | Uses OPENAI_API_KEY |
| anthropic | Anthropic Messages API | Uses ANTHROPIC_API_KEY |
Hosted providers require an explicit --vision-model.
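
The provider table and the hosted-provider rule can be read as a small lookup. The following is a minimal sketch, not the tool's own implementation: `default_key_env` and `check_vision_args` are hypothetical helper names, treating `openai` and `anthropic` as the hosted providers is an inference from the table, and letting `--vision-api-key-env` override the default for every provider is an assumption.

```python
from typing import Optional

# Default API key environment variable per provider, taken from the table above.
DEFAULT_KEY_ENV = {
    "openai-compatible": None,  # no key unless --vision-api-key-env is set
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}

# Assumption: these two are the "hosted" providers that need --vision-model.
HOSTED_PROVIDERS = {"openai", "anthropic"}


def default_key_env(provider: str, key_env_override: Optional[str] = None) -> Optional[str]:
    """Return the name of the environment variable holding the key, or None."""
    if provider not in DEFAULT_KEY_ENV:
        raise ValueError(f"unknown vision provider: {provider}")
    # Assumption: an explicit override wins for every provider.
    return key_env_override or DEFAULT_KEY_ENV[provider]


def check_vision_args(provider: str, model: Optional[str]) -> None:
    """Raise if a hosted provider is selected without an explicit model."""
    if provider in HOSTED_PROVIDERS and not model:
        raise ValueError(f"--vision-model is required for provider {provider!r}")
```

For example, `default_key_env("openai-compatible")` returns `None`, while `check_vision_args("anthropic", None)` raises a `ValueError`.
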
Common Options
| Option | Values or default | Purpose |
|---|---|---|
| --video PATH | required | Source video file. |
| --prompt TEXT | required | Initial music direction. |
| --frame-interval-seconds N | default: 5 | Sampling cadence for frame analysis. |
| --duration N | default: full video | Limit generation or render duration. |
| --workdir PATH | command-specific | Artifact output directory for manifests, frames, timelines, and reports. |
| --vision-provider NAME | openai-compatible, openai, anthropic | Vision adapter to use. |
| --vision-base-url URL | provider default | OpenAI-compatible gateway or provider root URL. |
| --vision-model MODEL | required for hosted providers | Vision model name. |
| --vision-profile PROFILE | default: balanced; values: fast, balanced, quality | Frame-analysis prompt/detail profile. |
| --magenta-backend BACKEND | default: auto; values: auto, bridge, cli, synth | Music backend selection. |
| --magenta-model MODEL | default: mrt2_small; values include mrt2_small, mrt2_base | Magenta model selection. |
| --magenta-runtime RUNTIME | default: mlx; values: mlx, jax | Magenta runtime selection. |
| --prompt-update-mode MODE | values: continuous, segment-stitch | Prompt update semantics. |
| --audio-mode MODE | values: replace, mix | Render behavior. |
| --music-volume-db DB | default: -3 | Gain applied to generated music in mix mode. |
| --original-volume-db DB | default: -18 | Gain applied to source audio in mix mode. |
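
The two volume options are expressed in decibels. As a rough guide to what the defaults mean, the sketch below applies the standard amplitude conversion, gain = 10^(dB/20). How the tool applies gain internally is not specified here, so `db_to_gain` is a hypothetical helper for illustration only.

```python
def db_to_gain(db: float) -> float:
    """Convert a decibel value to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


# Defaults from the options table, expressed as linear multipliers.
music_gain = db_to_gain(-3.0)      # about 0.708 of full amplitude
original_gain = db_to_gain(-18.0)  # about 0.126 of full amplitude
```

With the defaults, the generated music sits roughly 15 dB above the source audio in the mix.
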
Output Artifacts
| Artifact | Purpose |
|---|---|
| analysis.json | Structured frame observations. |
| timeline.json | Time-aligned prompt timeline and weighted prompt slots. |
| frames_manifest.json | Requested and actual frame extraction metadata. |
| contact_sheet.jpg | Visual frame summary. |
| segments.csv | Segment-level prompt and timing data. |
| eval.md | Human-readable evaluation report when generated. |
| music.wav | Generated 48 kHz stereo music. |
| rendered-video.mov | Rendered video output when using render. |
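
To confirm that a generated `music.wav` matches the 48 kHz stereo description, a quick check with Python's standard `wave` module might look like the sketch below. It assumes the file is PCM-encoded, since `wave` raises `wave.Error` on formats it cannot read. `describe_wav` and `is_48k_stereo` are hypothetical helper names.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read basic format information from a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        return {
            "sample_rate": sample_rate,
            "channels": wav.getnchannels(),
            "duration_seconds": wav.getnframes() / sample_rate,
        }


def is_48k_stereo(info: dict) -> bool:
    """Return True when the file matches the documented output format."""
    return info["sample_rate"] == 48000 and info["channels"] == 2
```

Calling `is_48k_stereo(describe_wav("music.wav"))` from inside the work directory returns `True` for a file in the documented format.
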
Secrets
Do not store API keys in manifests or docs. Use environment variables:
- OPENAI_API_KEY for hosted OpenAI-compatible vision calls
- ANTHROPIC_API_KEY for Anthropic vision calls
- a custom variable named by --vision-api-key-env for gateways or proxies
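
One way to follow this rule in wrapper scripts is to pass around only the variable name and read the value at call time. The sketch below is illustrative: `read_api_key` and `manifest_entry` are hypothetical helpers, and the manifest field names are assumptions, not the tool's actual schema.

```python
import os
from typing import Optional


def read_api_key(env_name: Optional[str]) -> Optional[str]:
    """Read a key from the named environment variable; None means no key is used."""
    if env_name is None:
        return None
    key = os.environ.get(env_name)
    if not key:
        raise RuntimeError(f"environment variable {env_name} is not set")
    return key


def manifest_entry(provider: str, env_name: Optional[str]) -> dict:
    """Record which variable supplied the key, never the key itself."""
    # Hypothetical field names, chosen for this example only.
    return {"vision_provider": provider, "api_key_env": env_name}
```

Anything written to disk then carries at most the variable name, so a shared manifest or report does not leak the secret.
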