
DBSprout

Seed data that actually fits your schema. Point at a database or a schema file — get realistic, constraint-aware rows. Offline, deterministic, no API key.

PyPI version · MIT license · Python 3.10+ · Works offline

Lorem ipsum users. Impossible foreign keys. Distributions nothing like production. Fixtures that rot the moment the schema changes. DBSprout reads your real schema (a live database or a schema file) and grows data that actually fits it, with 100% referential integrity.

Schema-first

No config to start. Point at a database or a schema file.

dbsprout init --db postgresql://localhost/myapp

100% FK integrity

Tables are topologically ordered; FK columns sample from real parent keys; cycles and self-references are resolved automatically.
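
The ordering idea above can be sketched in a few lines of Python. This is a minimal illustration using the standard library's `graphlib`, not DBSprout's actual implementation; the function name `insertion_order` and the `reviews` and `employees` tables are hypothetical.

```python
from graphlib import TopologicalSorter


def insertion_order(fk_deps: dict[str, set[str]]) -> list[str]:
    """Return table names so that every parent table precedes its children.

    fk_deps maps each table to the set of tables its foreign keys reference.
    """
    sorter = TopologicalSorter()
    for table, parents in fk_deps.items():
        # A self-reference (e.g. employees.manager_id -> employees.id) is not
        # an ordering constraint between tables, so drop it before sorting.
        sorter.add(table, *(parents - {table}))
    return list(sorter.static_order())


deps = {
    "authors": set(),
    "books": {"authors"},
    "reviews": {"books", "authors"},  # hypothetical table
    "employees": {"employees"},       # hypothetical self-reference
}
order = insertion_order(deps)
print(order)
```

A genuine multi-table cycle makes `static_order()` raise `graphlib.CycleError`. Resolving one needs an extra step, such as inserting a nullable FK as NULL and filling it in afterwards, which this sketch leaves out.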

Deterministic & offline

The same --seed (default 42) produces identical output. No internet, API key, or account required.
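
Seeded determinism can be illustrated with Python's standard `random` module. This is a sketch under assumptions, not DBSprout's generator: `generate_books`, its columns, and the price range are made up for the example. Only the default seed of 42 comes from the text above.

```python
import random


def generate_books(author_ids: list[int], rows: int, seed: int = 42) -> list[tuple[int, int, float]]:
    """Produce (id, author_id, price) rows from a private, seeded RNG."""
    rng = random.Random(seed)  # own RNG instance; global random state is untouched
    out = []
    for book_id in range(1, rows + 1):
        author_id = rng.choice(author_ids)    # FK value drawn from existing parent keys
        price = round(rng.uniform(5, 30), 2)  # hypothetical price range
        out.append((book_id, author_id, price))
    return out


first = generate_books([1, 2, 3], rows=5)
second = generate_books([1, 2, 3], rows=5)
print(first == second)
```

Because every random draw goes through one RNG built from the seed, two runs with the same inputs walk through the same sequence and produce the same rows.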

-- schema.sql (input)
CREATE TABLE authors (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    country TEXT
);
CREATE TABLE books (
    id INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES authors(id),
    title TEXT NOT NULL,
    price NUMERIC(6,2),
    published_on DATE
);

-- seeds/002_books.sql (generated)
INSERT INTO books (id, author_id, title, price, published_on) VALUES
    (1, 3, 'The Salt Graves', 14.99, '2021-06-02'),
    (2, 1, 'Northwind', 9.50, '2019-11-15'),
    (3, 3, 'A Lantern Year', 22.00, '2023-02-28');
-- author_id values are sampled from real authors PKs

dbsprout init --file schema.sql
dbsprout generate --rows 1000
dbsprout validate
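
What `dbsprout validate` checks internally is not described here. As a rough illustration of a referential-integrity check, the sketch below loads the example schema and seed rows into an in-memory SQLite database and looks for books whose `author_id` has no matching author. The author names and the `orphaned_books` helper are placeholders invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    country TEXT
);
CREATE TABLE books (
    id INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES authors(id),
    title TEXT NOT NULL,
    price NUMERIC(6,2),
    published_on DATE
);
-- Placeholder authors; the names are invented for this sketch.
INSERT INTO authors (id, name) VALUES (1, 'Author A'), (2, 'Author B'), (3, 'Author C');
INSERT INTO books (id, author_id, title, price, published_on) VALUES
    (1, 3, 'The Salt Graves', 14.99, '2021-06-02'),
    (2, 1, 'Northwind', 9.50, '2019-11-15'),
    (3, 3, 'A Lantern Year', 22.00, '2023-02-28');
""")


def orphaned_books(db: sqlite3.Connection) -> list[tuple[int]]:
    """Return ids of books whose author_id matches no row in authors."""
    return db.execute(
        """
        SELECT b.id
        FROM books AS b
        LEFT JOIN authors AS a ON a.id = b.author_id
        WHERE a.id IS NULL
        ORDER BY b.id
        """
    ).fetchall()


orphans = orphaned_books(conn)
print(orphans)
```

An empty result means every foreign key points at an existing parent row. The same anti-join pattern applies to any parent/child pair in a schema.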
Engine       Speed                   Quality                Use when
heuristic    100K+ rows/sec          ~80% semantic          Default: fast fixtures, no model
spec         cached after first run  high semantic          You want column-aware accuracy
statistical  fast                    distribution-faithful  You have a real data sample
finetuned    cached                  highest                You trained a LoRA adapter

pip install dbsprout