An empty test database doesn’t catch real bugs. A database full of copied production data is a compliance violation. Flock gives you the third option: realistic synthetic populations seeded directly into your schema.
Prerequisites
- Pidgeon CLI installed (dotnet tool install -g pidgeon)
- A database with an existing schema (PostgreSQL, MySQL, or SQL Server)
- Database credentials with read/write access
Step 1: Connect to your database
- Tables classified as patient, encounter, clinical, financial, or reference
- Foreign key relationships mapped
- Column types and constraints detected
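One reason foreign key mapping matters for seeding: when constraints are enforced, rows in a referenced table must exist before the rows that point to them. The sketch below is not Flock's implementation. It only illustrates the idea with a hypothetical four-table schema (`patient`, `encounter`, `lab_result`, `claim`) and Python's standard-library topological sorter.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the tables its foreign keys reference.
FOREIGN_KEYS = {
    "patient": set(),
    "encounter": {"patient"},
    "lab_result": {"encounter"},
    "claim": {"encounter", "patient"},
}


def seed_order(foreign_keys):
    """Return table names ordered so that referenced tables come first."""
    return list(TopologicalSorter(foreign_keys).static_order())


order = seed_order(FOREIGN_KEYS)
print(order)  # referenced tables appear before the tables that depend on them
```

Tables with no dependency between them (here `lab_result` and `claim`) may appear in either order.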
Step 2: Learn patterns from existing data (optional)
If your database already has sample data, Flock can learn the statistical distribution patterns:
- Column value distributions (age ranges, gender ratios)
- Referential patterns (which diagnosis codes appear together)
- Temporal patterns (encounter durations, admission-to-discharge intervals)
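To make "column value distribution" concrete, here is a minimal sketch of the general idea: count how often each value occurs in a sample column, then draw synthetic values with the same relative frequencies. This is an illustration only, not how Flock learns patterns internally; the function names and the sample column are hypothetical.

```python
import random
from collections import Counter


def learn_distribution(values):
    """Map each observed value to its relative frequency."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def sample(distribution, n, seed=0):
    """Draw n synthetic values following the learned frequencies."""
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    values = list(distribution)
    weights = [distribution[v] for v in values]
    return rng.choices(values, weights=weights, k=n)


# Hypothetical sample column; real input would come from the database.
observed_gender = ["F", "F", "M", "F", "M", "F", "M", "F"]
dist = learn_distribution(observed_gender)
synthetic = sample(dist, 1000)
print(dist)
print(Counter(synthetic))
```

Only the frequencies carry over to the synthetic column, not the individual source rows.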
Step 3: Generate a synthetic population
Generate 1,000 patients with related records:
- Demographics: Age, gender, race, and geography distributions matching US Census data
- Comorbidities: Realistic disease correlations (diabetes with hypertension, obesity with sleep apnea)
- Temporal coherence: Admissions before discharges, lab orders before results
- Family linkage: Realistic household structures and family relationships
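Temporal coherence is easiest to guarantee by construction: derive each later timestamp from an earlier one plus a positive offset. The sketch below shows that approach for a single encounter. It is a hedged illustration, not Flock's generator; the field names, the one-year window, and the offset ranges are all assumptions.

```python
import random
from datetime import datetime, timedelta


def make_encounter(rng, window_start):
    """Build one encounter whose timestamps are ordered by construction."""
    admitted = window_start + timedelta(hours=rng.uniform(0, 24 * 365))
    length_of_stay = timedelta(hours=rng.uniform(4, 24 * 10))
    discharged = admitted + length_of_stay
    # The lab order falls inside the stay; the result follows the order.
    lab_ordered = admitted + length_of_stay * rng.uniform(0.0, 0.8)
    lab_resulted = lab_ordered + timedelta(minutes=rng.uniform(15, 180))
    return {
        "admitted": admitted,
        "discharged": discharged,
        "lab_ordered": lab_ordered,
        "lab_resulted": lab_resulted,
    }


def is_coherent(enc):
    """Check admission before discharge and lab order before lab result."""
    return (
        enc["admitted"] < enc["discharged"]
        and enc["lab_ordered"] < enc["lab_resulted"]
    )


rng = random.Random(42)
encounters = [make_encounter(rng, datetime(2024, 1, 1)) for _ in range(1000)]
print(all(is_coherent(e) for e in encounters))
```

The same `is_coherent` check can be run against seeded rows as a sanity test after generation.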
Step 4: Preview with dry-run
Before writing anything, preview the generated SQL.
Step 5: Seed the database
Step 6: Verify the results
Check that data was seeded correctly.
Alternative output formats
Flock can generate data in multiple formats beyond SQL.
Clean up synthetic data
Remove all Flock-generated records when you’re done.
Next steps
- Explore output formats for HL7 streams and FHIR bundles
- Generate test messages from seeded population data
- Monitor interfaces processing your synthetic data

