Six building blocks — each independent, all composable.
Pull data from CSV, SQL databases, BigQuery, Snowflake, or in-memory DataFrames through a unified load() interface.
Structure your pipeline as Bronze → Silver → Gold layers. Each layer transforms data and writes to configurable sinks with full lineage tracking.
Define dimension and fact tables, then query across them with a declarative StarSchema.query() call; the joins are resolved for you, with no hand-written SQL required.
Register tables and named join relationships. The associative model lets you slice and explore data without writing join logic each time.
Compose reports from text, table, and chart sections. Render to HTML or Excel. Every report captures a lineage manifest at render time.
Mount interactive Plotly/Dash dashboards directly inside the TraceBi web server — no separate process needed.
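To make the star-schema block above more concrete, here is a minimal sketch of what a declarative cross-table query can look like. It is not TraceBi's implementation or API: MiniStarSchema, add_dimension, and the sample rows are hypothetical names invented for illustration, and only the Python standard library is used.

```python
from dataclasses import dataclass, field


@dataclass
class MiniStarSchema:
    """Toy star schema (hypothetical, not TraceBi's StarSchema)."""
    fact: list
    # dimension name -> (foreign-key column in the fact table, rows indexed by key)
    dimensions: dict = field(default_factory=dict)

    def add_dimension(self, name, fk, rows, key):
        # Index the dimension rows by their key so each lookup is O(1).
        self.dimensions[name] = (fk, {row[key]: row for row in rows})

    def query(self, columns, filters=None):
        """Join every fact row to its dimensions, filter, then project columns."""
        filters = filters or {}
        result = []
        for fact_row in self.fact:
            joined = dict(fact_row)
            for fk, lookup in self.dimensions.values():
                joined.update(lookup.get(fact_row[fk], {}))
            if all(joined.get(col) == val for col, val in filters.items()):
                result.append({col: joined.get(col) for col in columns})
        return result


# Illustrative data only.
schema = MiniStarSchema(fact=[
    {"order_id": 1, "customer_id": 10, "amount": 120.0},
    {"order_id": 2, "customer_id": 11, "amount": 80.0},
    {"order_id": 3, "customer_id": 10, "amount": 45.5},
])
schema.add_dimension(
    "customer", fk="customer_id", key="customer_id",
    rows=[
        {"customer_id": 10, "region": "EMEA"},
        {"customer_id": 11, "region": "APAC"},
    ],
)

# The caller names columns and filters; the join logic stays inside query().
emea = schema.query(columns=["order_id", "region", "amount"],
                    filters={"region": "EMEA"})
print(emea)
```

The point of the design is that callers state what they want (columns and filters) while the schema object owns the join relationships, which is the same idea the named-join registry builds on.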
Two ways to work with TraceBi — pick what fits your workflow.
Install and use TraceBi from a notebook or script. Build your medallion pipeline, query your star schema, and render reports to HTML or Excel — all with pure Python.
gold = GoldLayer(schema=schema).query(filters)
report = Report("Sales", sections=[
    TableSection("Revenue", gold["revenue"]),
])
report.render(HTMLRenderer("out.html"))
Register your connectors, models, reports, and pipelines in web/demo_app.py, then run the server to get a live dashboard with API docs, report preview, and lineage diagrams.
registry.add_connector("orders", CSVConnector(...))
registry.add_pipeline("sales", runner)
--reload --port 8000
Every DataSet carries a chain of LineageNode records that describe exactly which connector and transformation produced each result, and when. Click any report to visualise the full DAG.
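As a rough illustration of the lineage idea, the sketch below shows one way a dataset can carry its own history. It assumes nothing about TraceBi's real classes: LineageStep and TracedData are hypothetical stand-ins for LineageNode and DataSet, and the source label "csv:orders" is made up.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LineageStep:
    """One hop in the chain (hypothetical stand-in for a LineageNode)."""
    source: str       # connector or upstream step that supplied the data
    operation: str    # name of the transformation applied
    at: datetime      # when the step ran (UTC)


@dataclass(frozen=True)
class TracedData:
    """Rows plus the ordered steps that produced them (hypothetical)."""
    rows: tuple
    lineage: tuple = ()

    def apply(self, operation, fn, source="in-memory"):
        # Return a new object so earlier results keep their own, shorter history.
        step = LineageStep(source, operation, datetime.now(timezone.utc))
        return TracedData(tuple(fn(self.rows)), self.lineage + (step,))


raw = TracedData(rows=()).apply(
    "load",
    lambda _: [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 0}],
    source="csv:orders",
)
clean = raw.apply("drop_zero_qty", lambda rows: [r for r in rows if r["qty"] > 0])

# Walking the chain answers "where did this result come from?"
history = [(step.source, step.operation) for step in clean.lineage]
print(history)
```

Because each transformation returns a new object with one more step appended, any result can be traced back through its full chain, and a branching pipeline naturally forms a DAG of such chains.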
Every UI feature is backed by a documented REST API. Automate report generation, trigger pipelines, or integrate TraceBi data into other tools — all without touching the UI.