
LLM Configuration

DBSprout ships with support for an embedded Qwen2.5-1.5B model that runs entirely on your machine.

```sh
# Install LLM support
pip install "dbsprout[llm]"

# Download the model
dbsprout models download

# Generate with the spec engine (uses the embedded model by default)
dbsprout generate --engine spec
```

The embedded model uses llama-cpp-python with GBNF grammar constraints to guarantee valid JSON output. Memory usage is under 2GB.
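Grammar-constrained decoding works by restricting the sampler so the model can only emit tokens that extend a string matching the grammar. As a purely illustrative example (not DBSprout's actual grammar), a minimal GBNF grammar that forces output of the shape `{"name": "..."}` could look like:

```
root   ::= "{" ws "\"name\"" ws ":" ws string ws "}"
string ::= "\"" [a-zA-Z0-9 _-]* "\""
ws     ::= [ \t\n]*
```

Because every sampled token must stay inside this grammar, malformed JSON is impossible by construction rather than caught after the fact.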

For faster or higher-quality spec generation, use a cloud LLM provider:

```sh
# Install cloud support
pip install "dbsprout[cloud]"

# Use OpenAI
export OPENAI_API_KEY="sk-..."
dbsprout generate --engine spec --llm-provider openai

# Use Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
dbsprout generate --engine spec --llm-provider anthropic
```

DBSprout uses LiteLLM under the hood, so any provider supported by LiteLLM works.
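LiteLLM routes requests based on a `provider/model` string convention (bare model names are treated as OpenAI models). A hypothetical sketch of how a `--llm-provider` flag could map onto that convention — the helper name is illustrative, not DBSprout's actual code:

```python
# Hypothetical sketch: map a provider flag plus model name onto LiteLLM's
# "provider/model" identifier convention.
def litellm_model_string(provider: str, model: str) -> str:
    """Build a LiteLLM model identifier, e.g. 'anthropic/claude-...'."""
    # LiteLLM treats unprefixed model names as OpenAI models.
    if provider == "openai":
        return model
    return f"{provider}/{model}"

print(litellm_model_string("openai", "gpt-4o"))    # gpt-4o
print(litellm_model_string("ollama", "llama3"))    # ollama/llama3
```

The resulting string is what gets passed as `model=` to LiteLLM's `completion()` call, which is why any LiteLLM-supported provider works without DBSprout-specific integration code.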

LLM settings can be configured in dbsprout.toml:

```toml
[llm]
provider = "embedded" # or "openai", "anthropic", "ollama", etc.
model = "qwen2.5-1.5b"
temperature = 0.1
max_tokens = 4096
```

LLM usage follows DBSprout’s privacy gradient:

| Tier | What's sent | Use case |
| --- | --- | --- |
| Local | Nothing leaves your machine | Default; uses the embedded model |
| Redacted | Schema structure only, no data | Column names/types sent to the cloud |
| Cloud | Schema + sample data | Best accuracy; requires an API key |
| Training | Full data access | Fine-tuning only |
Set the tier in dbsprout.toml:

```toml
[privacy]
tier = "local" # default
```
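To make the "Redacted" tier concrete, here is a hypothetical sketch of the redaction step: only table names, column names, and column types leave the machine, while sample values are dropped. The schema dict shape is an assumption for illustration:

```python
# Hypothetical sketch of redacted-tier payload construction: keep the
# schema structure (names and types), strip all data/sample values.
def redact_schema(schema: dict) -> dict:
    return {
        table: {col: meta["type"] for col, meta in cols.items()}
        for table, cols in schema.items()
    }

schema = {
    "users": {
        "id":    {"type": "integer", "samples": [1, 2, 3]},
        "email": {"type": "text",    "samples": ["a@example.com"]},
    },
}
print(redact_schema(schema))
# {'users': {'id': 'integer', 'email': 'text'}}
```

Everything under `samples` never appears in the redacted payload, which is what distinguishes this tier from "Cloud".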

The AI-generated spec is cached based on a hash of your schema. If your schema hasn’t changed, the cached spec is reused automatically. Cache location: .dbsprout/cache/.
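Schema-keyed caching of this kind can be sketched as follows — hash a canonical JSON dump of the schema and use the digest as the cache filename. The file layout and key format here are assumptions, not DBSprout's actual implementation:

```python
# Illustrative sketch: derive a stable cache key from the schema so the
# cached spec is reused whenever the schema is unchanged.
import hashlib
import json
from pathlib import Path

def schema_cache_key(schema: dict) -> str:
    # sort_keys makes the dump canonical: the same schema always hashes
    # to the same digest regardless of dict insertion order.
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def cached_spec_path(schema: dict, cache_dir: str = ".dbsprout/cache") -> Path:
    return Path(cache_dir) / f"{schema_cache_key(schema)}.json"
```

Any change to the schema produces a different digest, so a stale cached spec can never be picked up by accident.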

```sh
# Regenerate the spec, bypassing the cache
dbsprout generate --engine spec --no-cache
```