# LLM Configuration
## Embedded Model (Default)

DBSprout ships with support for an embedded Qwen2.5-1.5B model that runs entirely on your machine.
```sh
# Install LLM support
pip install "dbsprout[llm]"
```
```sh
# Download the model
dbsprout models download
```
```sh
# Generate with spec engine (uses embedded model by default)
dbsprout generate --engine spec
```

The embedded model uses llama-cpp-python with GBNF grammar constraints to guarantee valid JSON output. Memory usage is under 2GB.
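A GBNF grammar constrains the token sampler so the model can only emit strings the grammar accepts. As a simplified illustration (not DBSprout's actual grammar file), a grammar forcing output to be a JSON object might look like:

```gbnf
# Every generation must be a JSON object
root   ::= object
object ::= "{" ws (member ("," ws member)*)? ws "}"
member ::= string ws ":" ws value
value  ::= string | number | object | "true" | "false" | "null"
string ::= "\"" [a-zA-Z0-9_ -]* "\""
number ::= "-"? [0-9]+ ("." [0-9]+)?
ws     ::= [ \t\n]*
```

Because invalid tokens are masked out at sampling time, the model cannot produce malformed JSON even at higher temperatures.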
## Cloud Providers

For faster or higher-quality spec generation, use a cloud LLM provider:
```sh
# Install cloud support
pip install "dbsprout[cloud]"
```
```sh
# Use OpenAI
export OPENAI_API_KEY="sk-..."
dbsprout generate --engine spec --llm-provider openai
```
```sh
# Use Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
dbsprout generate --engine spec --llm-provider anthropic
```

DBSprout uses LiteLLM under the hood, so any provider supported by LiteLLM works.
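Each cloud provider reads its API key from the environment variable shown above. As a rough sketch of the pre-flight check involved (the mapping and function name are assumptions for illustration, not DBSprout's internals):

```python
import os

# Hypothetical mapping from --llm-provider values to the environment
# variable each provider's key is read from (mirrors the examples above).
PROVIDER_ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}

def check_provider_key(provider: str) -> str:
    """Return the API key for a provider, or raise a helpful error."""
    env_var = PROVIDER_ENV_KEYS.get(provider)
    if env_var is None:
        raise ValueError(f"Unknown provider: {provider!r}")
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running")
    return key
```

Failing fast with the exact variable name is kinder than letting the request die deep inside an HTTP client.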
## Configuration

LLM settings can be configured in `dbsprout.toml`:
```toml
[llm]
provider = "embedded" # or "openai", "anthropic", "ollama", etc.
model = "qwen2.5-1.5b"
temperature = 0.1
max_tokens = 4096
```

## Privacy Tiers
LLM usage follows DBSprout’s privacy gradient:
| Tier | What’s sent | Use case |
|---|---|---|
| Local | Nothing leaves your machine | Default, uses embedded model |
| Redacted | Schema structure only, no data | Column names/types sent to cloud |
| Cloud | Schema + sample data | Best accuracy, requires API key |
| Training | Full data access | Fine-tuning only |
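To make the tiers concrete, here is a hypothetical sketch of how a request payload might be filtered per tier. The function and field names are illustrative assumptions, not DBSprout's implementation:

```python
from typing import Optional

def build_payload(tier: str, schema: dict, sample_rows: list) -> Optional[dict]:
    """Illustrative only: what each privacy tier might allow in an LLM request."""
    if tier == "local":
        return None  # nothing leaves the machine; the embedded model is used
    payload = {"schema": schema}  # column names/types only (redacted and up)
    if tier in ("cloud", "training"):
        payload["samples"] = sample_rows  # sample data added from cloud tier up
    return payload
```

The key property is monotonicity: each tier strictly widens what the previous one was allowed to send.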
```toml
[privacy]
tier = "local" # default
```

## Spec Caching
The AI-generated spec is cached based on a hash of your schema. If your schema hasn’t changed, the cached spec is reused automatically. Cache location: `.dbsprout/cache/`.
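Hash-based cache keying along these lines can be sketched as follows (a minimal illustration; DBSprout's actual keying and file naming may differ):

```python
import hashlib
import json
from pathlib import Path

def spec_cache_path(schema: dict, cache_dir: str = ".dbsprout/cache") -> Path:
    """Derive a cache file path from a deterministic hash of the schema."""
    # Canonical JSON (sorted keys) so the same schema always hashes the same,
    # regardless of dict insertion order.
    canonical = json.dumps(schema, sort_keys=True).encode()
    digest = hashlib.sha256(canonical).hexdigest()
    return Path(cache_dir) / f"spec-{digest[:16]}.json"
```

Because the key is derived purely from the schema, any schema change produces a new path and forces regeneration, while an unchanged schema hits the existing cache file.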
```sh
# Clear the spec cache
dbsprout generate --engine spec --no-cache
```