indentia.ai

Data Hub & lineage

One hub for every shape of data — with the receipts attached.

Structured rows, unstructured documents and IoT telemetry rarely live together. In the Indentia Data Hub they do, joined by what they mean, not by where they came from. Every record arrives with full lineage, so every answer comes with a paper trail.


How it combines

Three data shapes, one entity-keyed hub.

Reactive & proactive

Acts when the data does. Looks when it doesn't.

Some sources emit events the moment something changes — a chat message arrives, a sensor crosses a threshold, a record gets updated. The hub reacts immediately. Other sources — old databases, file shares, archive systems — never tell you anything. The hub scans those on a schedule, detects deltas, and pulls only what's new. One model, two behaviours, no gaps.

  • Reactive: webhooks, CDC streams, NATS events, IoT telemetry. New data is visible within seconds.
  • Proactive: scheduled crawlers with delta detection. Only changed rows, files or objects come through.
  • Cross-source joins: a customer's contract (structured), their support emails (unstructured) and their device telemetry (IoT) all link to the same entity.
  • One query language: SPARQL over the unified graph. Lineage and data live side by side.
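As a rough illustration of the cross-source join described above, here is a minimal Python sketch of grouping records of different shapes under one entity key. The `Record` type, the `entity_id` field and the sample sources are hypothetical names chosen for the example; they are not the hub's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    entity_id: str  # hypothetical shared key, e.g. a customer identifier
    shape: str      # "structured", "unstructured" or "iot"
    source: str     # where the record came from (illustrative names only)
    payload: dict


def group_by_entity(records):
    """Group records of any shape under the entity they describe."""
    hub = defaultdict(lambda: defaultdict(list))
    for rec in records:
        hub[rec.entity_id][rec.shape].append(rec)
    return hub


# Illustrative sample data: three shapes that all describe the same customer.
records = [
    Record("customer:42", "structured", "crm.contracts", {"contract": "C-1001"}),
    Record("customer:42", "unstructured", "mail.support", {"subject": "Pump alarm"}),
    Record("customer:42", "iot", "telemetry.pump-7", {"pressure_bar": 6.2}),
    Record("customer:77", "structured", "crm.contracts", {"contract": "C-1002"}),
]

hub = group_by_entity(records)
for shape, recs in hub["customer:42"].items():
    print(shape, [r.source for r in recs])
```

The point of the sketch is only that the join key is the entity, not the file, table or topic the record arrived from.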

Capabilities

What the hub does for the data.

One hub for all data shapes

Structured tables, unstructured documents and IoT telemetry land in the same hub. Joined by entity (an order, a sensor, a person, a contract) — not by file location.

Lineage on every record

Every record carries an OpenLineage chain back to its source: which file, which sensor, which transformation, which approval. Answer the regulator with a query, not a forensic exercise.

Reactive ingestion

New events trigger pipelines automatically. A document drop, an IoT signal or a row change each fans out to the consumers that care, with backpressure so that a slow consumer is not overwhelmed.
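One common way to get fan-out with backpressure is a bounded queue per consumer: when a consumer falls behind, its queue fills and the producer has to wait. The sketch below uses Python's standard `asyncio.Queue` to show the idea. The consumer names, event kinds and the `fan_out` helper are assumptions made for illustration and say nothing about how the hub is implemented.

```python
import asyncio

STOP = object()  # sentinel telling a consumer to shut down


async def consume(name, queue, delivered):
    """Drain one consumer's queue until the STOP sentinel arrives."""
    while True:
        event = await queue.get()
        if event is STOP:
            return
        await asyncio.sleep(0)  # stand-in for real processing work
        delivered.append((name, event))


async def fan_out(events, subscriptions, max_pending=2):
    """Route each event to the consumers subscribed to its kind."""
    queues = {name: asyncio.Queue(maxsize=max_pending) for name in subscriptions}
    delivered = []
    workers = [
        asyncio.create_task(consume(name, q, delivered))
        for name, q in queues.items()
    ]
    for kind, payload in events:
        for name, kinds in subscriptions.items():
            if kind in kinds:
                # put() waits while the queue is full: this is the backpressure.
                await queues[name].put((kind, payload))
    for q in queues.values():
        await q.put(STOP)
    await asyncio.gather(*workers)
    return delivered


# Illustrative events and subscriptions (hypothetical names).
events = [
    ("document", "contract.pdf"),
    ("iot", {"sensor": "s-1", "value": 81}),
    ("row", {"table": "orders", "id": 7}),
]
subscriptions = {"search": {"document", "row"}, "alerts": {"iot"}}

delivered = asyncio.run(fan_out(events, subscriptions))
print(delivered)
```

A smaller `max_pending` makes the producer wait sooner; a larger one trades memory for slack.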

Proactive scanning

For sources that don't emit events, the hub scans on a schedule — with delta detection so unchanged data doesn't get re-processed.
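Delta detection can be done in several ways (modification timestamps, change sequence numbers, content hashes). The sketch below assumes the content-hash variant: each scan fingerprints every item and passes on only those whose fingerprint differs from the previous scan. The `scan` and `fingerprint` helpers and the sample rows are hypothetical.

```python
import hashlib
import json


def fingerprint(item):
    """Stable SHA-256 digest of a JSON-serialisable item."""
    canonical = json.dumps(item, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def scan(source, seen):
    """Return only new or changed items and remember their fingerprints.

    `source` maps item keys to their current content; `seen` maps item keys
    to the fingerprint recorded by the previous scan and is updated in place.
    """
    changed = {}
    for key, item in source.items():
        digest = fingerprint(item)
        if seen.get(key) != digest:
            changed[key] = item
            seen[key] = digest
    return changed


seen = {}
source = {"row-1": {"status": "open"}, "row-2": {"status": "closed"}}

first = scan(source, seen)             # first scan: everything is new
source["row-2"]["status"] = "reopened"
second = scan(source, seen)            # only row-2 changed
third = scan(source, seen)             # nothing changed

print(sorted(first), sorted(second), sorted(third))
```

This sketch does not handle deletions; detecting those would need a comparison of the key sets between scans.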

Data contracts

Each producer publishes a contract: schema, freshness, SLA. Breakages are caught at the boundary, not deep inside a downstream notebook.
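To make "caught at the boundary" concrete, here is a minimal sketch of a contract check that runs before a record is accepted. It covers the schema and freshness parts only; the `Contract` type, the `violations` function and the field names are assumptions for illustration, and a real contract would likely also express SLA terms.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Contract:
    schema: dict        # field name -> required Python type
    max_age: timedelta  # freshness bound for incoming records


def violations(contract, record, produced_at, now):
    """Return a list of human-readable contract violations (empty if none)."""
    problems = []
    for field, expected in contract.schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    if now - produced_at > contract.max_age:
        problems.append("stale: older than the contract's freshness bound")
    return problems


# Hypothetical contract for an "orders" producer.
contract = Contract(schema={"order_id": str, "amount": float},
                    max_age=timedelta(hours=1))
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

good = violations(contract, {"order_id": "A-1", "amount": 19.5},
                  produced_at=now - timedelta(minutes=5), now=now)
bad = violations(contract, {"order_id": 17},
                 produced_at=now - timedelta(hours=3), now=now)

print(good)
print(bad)
```

Rejecting or quarantining a record when the list is non-empty keeps the breakage at the producer boundary instead of surfacing later in a downstream consumer.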

Lineage and data, same store

Lineage is RDF in the same knowledge graph as the data itself. Query "show me every report that depended on this dataset" with one SPARQL statement.
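The "every report that depended on this dataset" question is a transitive walk over derivation edges. In SPARQL 1.1 that kind of walk is usually written with a property path such as `?report prov:wasDerivedFrom+ ?dataset`. The Python sketch below mimics the same traversal over plain triples so it runs without a graph store. The predicate choice (`prov:wasDerivedFrom` from W3C PROV-O), the `ex:` types and all identifiers are assumptions for illustration, not the hub's actual vocabulary.

```python
from collections import defaultdict, deque

DERIVED_FROM = "prov:wasDerivedFrom"  # assumed lineage predicate
TYPE = "rdf:type"

# Illustrative triples: (subject, predicate, object).
triples = [
    ("report:q3-revenue", TYPE, "ex:Report"),
    ("report:churn", TYPE, "ex:Report"),
    ("dataset:orders-clean", TYPE, "ex:Dataset"),
    ("dataset:orders-raw", TYPE, "ex:Dataset"),
    ("dataset:orders-clean", DERIVED_FROM, "dataset:orders-raw"),
    ("report:q3-revenue", DERIVED_FROM, "dataset:orders-clean"),
    ("report:churn", DERIVED_FROM, "dataset:support-mail"),
]


def dependents(triples, source, wanted_type):
    """Find everything of `wanted_type` derived, directly or not, from `source`."""
    children = defaultdict(set)  # node -> nodes derived from it
    types = defaultdict(set)     # node -> its rdf:type values
    for s, p, o in triples:
        if p == DERIVED_FROM:
            children[o].add(s)
        elif p == TYPE:
            types[s].add(o)

    found, visited, queue = set(), {source}, deque([source])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in visited:
                visited.add(child)
                queue.append(child)
                if wanted_type in types[child]:
                    found.add(child)
    return found


reports = dependents(triples, "dataset:orders-raw", "ex:Report")
print(reports)
```

Because lineage and data sit in the same graph, the same walk can also be filtered by any data attribute of the nodes it passes through.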

Available to

Once it's in the hub, it's everywhere it needs to be.

Search

Hybrid retrieval that joins structured rows with unstructured paragraphs and live signals.

Agents

Multi-step agents reason across all three shapes — with lineage attached to every claim.

Analytics & BI

Lineage-aware datasets feed dashboards, notebooks and forecasting models.

Audit & compliance

Trace any output backward to every source that touched it.

Lineage in practice

Trace any answer back to every source that shaped it.

A regulator asks, "Where did this number come from?" A controller asks, "Which contracts referenced this clause version?" A model owner asks, "Which datasets trained this classifier?" With lineage co-located in the knowledge graph, those become one-line queries, not month-long forensic projects.
