Overview
vidbgm is a local-first Rust application for generating adaptive background music from video with Magenta. Media processing and music generation run on your machine, so the core workflow is free to use after setup. Hosted vision providers are optional and may have their own costs.
It has two user surfaces:
- a CLI named vidbgm for analysis, generation, render, Magenta setup, status checks, and evals
- a Tauri desktop app in desktop/ for the Video Music Studio workflow
The core pipeline is:
- Probe the video with ffprobe.
- Sample frames with ffmpeg.
- Analyze frames through a local or hosted vision provider.
- Convert frame observations into short weighted music prompt slots.
- Generate a local 48 kHz stereo WAV with Magenta RealTime or a fallback backend.
- Export audio-only, replace-video-audio, or mixed-original-audio output.
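The probe, sample, and export steps above can be sketched as plain argument builders for ffprobe and ffmpeg. This is a minimal illustration, not vidbgm's actual code: the function names, the ExportMode enum, and the frame output pattern are hypothetical, and the flags shown are standard ffprobe/ffmpeg options chosen for the example rather than the ones the project necessarily uses.

```rust
use std::process::Command;

/// How the generated music is combined with the source video (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExportMode {
    AudioOnly,
    ReplaceVideoAudio,
    MixOriginalAudio,
}

fn to_strings(parts: &[&str]) -> Vec<String> {
    parts.iter().map(|s| s.to_string()).collect()
}

/// ffprobe arguments that print container and stream metadata as JSON.
pub fn ffprobe_args(video: &str) -> Vec<String> {
    to_strings(&["-v", "error", "-print_format", "json", "-show_format", "-show_streams", video])
}

/// ffmpeg arguments that write one frame every `interval_secs` seconds.
pub fn sample_frames_args(video: &str, interval_secs: u32, out_pattern: &str) -> Vec<String> {
    // Guard against a zero interval, which would make the fps filter invalid.
    let fps = format!("fps=1/{}", interval_secs.max(1));
    to_strings(&["-i", video, "-vf", fps.as_str(), out_pattern])
}

/// ffmpeg arguments for the export step. Audio-only export needs no muxing,
/// so it returns None and the generated WAV is used as-is.
pub fn export_args(mode: ExportMode, video: &str, music_wav: &str, output: &str) -> Option<Vec<String>> {
    match mode {
        ExportMode::AudioOnly => None,
        // Keep the video stream untouched and swap in the generated audio.
        ExportMode::ReplaceVideoAudio => Some(to_strings(&[
            "-i", video, "-i", music_wav,
            "-map", "0:v:0", "-map", "1:a:0",
            "-c:v", "copy", "-shortest", output,
        ])),
        // Mix the original audio track with the generated music.
        ExportMode::MixOriginalAudio => Some(to_strings(&[
            "-i", video, "-i", music_wav,
            "-filter_complex", "[0:a][1:a]amix=inputs=2:duration=first[aout]",
            "-map", "0:v:0", "-map", "[aout]",
            "-c:v", "copy", output,
        ])),
    }
}

/// Runs ffprobe and returns its JSON output as text (requires ffprobe on PATH).
pub fn run_ffprobe(video: &str) -> std::io::Result<String> {
    let out = Command::new("ffprobe").args(ffprobe_args(video)).output()?;
    Ok(String::from_utf8_lossy(&out.stdout).into_owned())
}
```

Keeping argument construction separate from process spawning makes each step easy to check without having ffmpeg installed; only run_ffprobe actually launches a process.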
Mental Model
The app does not send long scene prose directly to Magenta. Frame analysis produces structured visual cues. The timeline layer turns those cues into short musical phrases, and the Magenta bridge receives weighted prompt slots such as "Open road melodic techno" or "Driving motorik synth pulse".
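One way to picture a weighted prompt slot is as a short phrase paired with a normalized weight. The sketch below is an assumption about the shape of that data, not the project's real types: VisualCue, PromptSlot, build_slots, and the ranking and truncation rules are all hypothetical, and the step that turns a raw visual cue into a musical phrase is taken as already done.

```rust
/// A scored musical phrase derived from frame analysis (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct VisualCue {
    pub phrase: String,
    pub score: f32,
}

/// A short prompt with a weight, as handed to the music backend (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct PromptSlot {
    pub text: String,
    pub weight: f32,
}

/// Keeps the `max_slots` highest-scoring cues, trims each phrase to
/// `max_words` words, and normalizes the weights so they sum to 1.
pub fn build_slots(cues: &[VisualCue], max_slots: usize, max_words: usize) -> Vec<PromptSlot> {
    // Drop empty phrases and non-positive (or NaN) scores before ranking.
    let mut ranked: Vec<&VisualCue> = cues
        .iter()
        .filter(|c| c.score > 0.0 && !c.phrase.trim().is_empty())
        .collect();
    ranked.sort_by(|a, b| b.score.total_cmp(&a.score));
    ranked.truncate(max_slots);

    let total: f32 = ranked.iter().map(|c| c.score).sum();
    ranked
        .iter()
        .map(|c| PromptSlot {
            text: c
                .phrase
                .split_whitespace()
                .take(max_words)
                .collect::<Vec<_>>()
                .join(" "),
            weight: c.score / total,
        })
        .collect()
}
```

With cues scored 3.0 and 1.0 and max_slots set to 2, this yields weights of 0.75 and 0.25, so the dominant visual cue steers the music without silencing the secondary one.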
CLI examples use sample-video.mov as a placeholder for your own local media. Large media files are ignored by git, so sample videos stay local.
Main Screens

