Module 07 — Ingest

Every feed.
Every format.

Suppliers, sellers, distributors. Prices, stock, attributes — from FTP, CSV, XML, EDI, API, email or scraping. Normalised into one catalog before your team starts the day.

Two hundred sources. Two hundred formats. Each one breaks differently. Every week.

The solution

One pipeline. All formats.

Encoding detection. Delimiter detection. ID column detection. Spec-file scoring. Identity resolution by GTIN. Format change detection that catches a source renaming a column at 03:47.
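
As a rough illustration of format change detection, the sketch below compares the header row of the latest file with the one seen on the previous run. The function name and the sample column names are hypothetical; the platform's actual check is not described here.

```python
def diff_headers(previous: list[str], current: list[str]) -> dict[str, list[str]]:
    """Report columns that disappeared from, or appeared in, the latest file."""
    prev, curr = set(previous), set(current)
    return {
        "missing": sorted(prev - curr),  # expected columns the source no longer sends
        "added": sorted(curr - prev),    # columns never seen from this source before
    }


# A source renames "price" to "unit_price" overnight (illustrative data).
report = diff_headers(["ean", "price", "stock"], ["ean", "unit_price", "stock"])
```

A rename shows up as one missing column paired with one added column, which is enough to hold the file for review instead of loading an empty price field.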

Capabilities

What the platform absorbs.

01

Identity resolution

GTIN as canonical key. Sources shipping mismatched SKUs get rejected before they pollute the catalog.
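
A GTIN can be screened before it is used as a key, because its last digit is a check digit. The sketch below applies the standard GS1 mod-10 check; the function name is an assumption, and how the platform actually rejects records is not specified in this page.

```python
def is_valid_gtin(code: str) -> bool:
    """Validate a GTIN-8, GTIN-12, GTIN-13 or GTIN-14 using the GS1 check digit."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in code]
    body, check = digits[:-1], digits[-1]
    # Weights alternate 3, 1, 3, 1... starting from the digit next to the check digit.
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


ok = is_valid_gtin("4006381333931")   # well-formed GTIN-13
bad = is_valid_gtin("4006381333932")  # same body, wrong check digit
```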

02

Change detection

Two SHA-256 fingerprints. A description edit doesn't invalidate the cached image. A new price triggers a re-index.
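
One way to picture the two fingerprints: hash the media fields and the indexable fields separately, so a change in one group leaves the other hash untouched. The field groupings below are assumptions made for illustration, not the platform's actual split.

```python
import hashlib
import json

# Hypothetical field groups; the real grouping is not documented here.
MEDIA_FIELDS = ("image_url",)
INDEX_FIELDS = ("title", "description", "price", "stock")


def fingerprint(record: dict, fields: tuple[str, ...]) -> str:
    """SHA-256 over a stable JSON rendering of the selected fields."""
    subset = {name: record.get(name) for name in fields}
    payload = json.dumps(subset, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


old = {"title": "Mug", "description": "Blue mug", "price": "9.99",
       "stock": 4, "image_url": "https://example.com/mug.jpg"}
edited = dict(old, description="Navy mug")   # description edit only
repriced = dict(old, price="10.49")          # price change only
```

Here the media fingerprint of `edited` equals that of `old`, so a cached image would stay valid, while the index fingerprint of `repriced` differs from that of `old`.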

03

Validation rules

Per-field regex, options lists, range checks. Bad data flagged and pushed back, not silently healed.
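
A minimal sketch of per-field rules, assuming a hypothetical `Rule` type with an optional pattern, options list and numeric range. It only reports problems and never rewrites the value, which matches the "flagged, not healed" idea above.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    pattern: Optional[str] = None        # regex the whole value must match
    options: Optional[frozenset] = None  # closed list of allowed values
    min_value: Optional[float] = None    # inclusive lower bound
    max_value: Optional[float] = None    # inclusive upper bound


def validate(record: dict, rules: dict) -> list:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field, rule in rules.items():
        value = record.get(field)
        if value is None or value == "":
            problems.append(f"{field}: missing")
            continue
        if rule.pattern is not None and not re.fullmatch(rule.pattern, str(value)):
            problems.append(f"{field}: does not match {rule.pattern}")
        if rule.options is not None and value not in rule.options:
            problems.append(f"{field}: {value!r} not in allowed options")
        if rule.min_value is not None or rule.max_value is not None:
            try:
                number = float(value)
            except (TypeError, ValueError):
                problems.append(f"{field}: {value!r} is not numeric")
                continue
            if rule.min_value is not None and number < rule.min_value:
                problems.append(f"{field}: below {rule.min_value}")
            if rule.max_value is not None and number > rule.max_value:
                problems.append(f"{field}: above {rule.max_value}")
    return problems


# Illustrative rules and records; field names are made up for the example.
RULES = {
    "sku": Rule(pattern=r"[A-Z]{3}-\d{4}"),
    "currency": Rule(options=frozenset({"EUR", "USD"})),
    "price": Rule(min_value=0.0),
}
good = validate({"sku": "ABC-1234", "currency": "EUR", "price": "9.99"}, RULES)
bad = validate({"sku": "abc-1", "currency": "GBP", "price": "-5"}, RULES)
```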

04

Spec-file detection

Wide pivots, narrow Key/Value, multilingual exports — auto-detected by a transparent scoring rubric.
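
To show what a transparent scoring rubric could look like, the sketch below awards points for a handful of signals and returns the reasons alongside the scores. The signals, weights and column-name sets are invented for illustration, and the first column is assumed to hold the product ID.

```python
# Hypothetical header names that hint at a narrow Key/Value layout.
KEY_NAMES = {"key", "attribute", "name", "property"}
VALUE_NAMES = {"value", "val", "wert"}


def score_layout(header: list[str], rows: list[list[str]]):
    """Score a spec file as 'wide' or 'narrow' and explain every point awarded."""
    names = {h.strip().lower() for h in header}
    ids = [row[0] for row in rows if row]  # assumes the ID sits in the first column
    repeated_ids = len(set(ids)) < len(ids)
    signals = [
        ("narrow", 2, bool(names & KEY_NAMES) and bool(names & VALUE_NAMES),
         "has key and value columns"),
        ("narrow", 1, len(header) <= 4, "four columns or fewer"),
        ("narrow", 1, repeated_ids, "IDs repeat across rows"),
        ("wide", 2, bool(ids) and not repeated_ids, "one row per ID"),
        ("wide", 1, len(header) > 4, "more than four columns"),
    ]
    scores = {"wide": 0, "narrow": 0}
    reasons = []
    for layout, points, hit, why in signals:
        if hit:
            scores[layout] += points
            reasons.append(f"{layout} +{points}: {why}")
    return scores, reasons


narrow_scores, narrow_reasons = score_layout(
    ["id", "key", "value"],
    [["1", "color", "red"], ["1", "size", "M"]],
)
wide_scores, _ = score_layout(
    ["id", "color", "size", "weight", "material"],
    [["1", "red", "M", "200g", "cotton"], ["2", "blue", "L", "220g", "wool"]],
)
```

Returning the reasons with the scores is what keeps the decision auditable: anyone can see which signals fired for a given file.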

05

SLA per data type

Stock and price refresh on a schedule. Descriptions update on change. Each data type gets the cadence it needs.
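
A per-type cadence can be expressed as a small table: an interval for scheduled data, and nothing for data that only updates on change. The intervals and names below are placeholders chosen for the example, not the platform's actual SLAs.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical cadences: a timedelta means "refresh on schedule",
# None means "update only when the source content changed".
CADENCE: dict[str, Optional[timedelta]] = {
    "stock": timedelta(minutes=15),
    "price": timedelta(hours=1),
    "description": None,
}


def is_due(data_type: str, last_run: datetime, now: datetime, changed: bool = False) -> bool:
    """Decide whether a data type should be refreshed right now."""
    interval = CADENCE[data_type]
    if interval is None:
        return changed
    return now - last_run >= interval


t0 = datetime(2024, 1, 1, 3, 0)
later = t0 + timedelta(minutes=20)
stock_due = is_due("stock", t0, later)   # 20 minutes elapsed, 15-minute cadence
price_due = is_due("price", t0, later)   # 20 minutes elapsed, 1-hour cadence
```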

How it works

Detect. Normalise. Upsert.

01 — DETECT

Read the file

Encoding, delimiter, ID column, type. Excel hyperlinks decoded. Specification files scored and merged correctly.
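
The first two checks can be approximated with the Python standard library: try a short list of encodings in order, then let `csv.Sniffer` pick the delimiter. The candidate encodings, the delimiter set and the function names are assumptions; a production detector would likely use more signals.

```python
import csv

# Encodings tried in order; latin-1 is the fallback because it decodes any byte.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252")


def detect_encoding(raw: bytes) -> str:
    """Return the first candidate encoding that decodes the bytes without error."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            raw.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    return "latin-1"


def detect_delimiter(sample: str) -> str:
    """Guess the delimiter from a text sample, limited to common feed delimiters."""
    return csv.Sniffer().sniff(sample, delimiters=";,\t|").delimiter


raw = "ean;groesse;stock\n1;Größe M;12\n2;Größe L;0\n".encode("cp1252")
encoding = detect_encoding(raw)
delimiter = detect_delimiter(raw.decode(encoding))
```

`csv.Sniffer.sniff` raises `csv.Error` when it cannot decide, so a caller would want to catch that and fall back to a per-source default.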

02 — NORMALISE

Map to canonical schema

Supplier fields aligned to the platform schema. Synonyms applied. Conflicts reconciled by source priority.
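
A minimal sketch of this step, assuming a hypothetical synonym table and a hypothetical source ranking: field names are mapped to canonical names first, then rows for the same product are merged so that the highest-priority source wins each conflict.

```python
# Hypothetical synonym table: supplier column name -> canonical field name.
SYNONYMS = {"ean": "gtin", "barcode": "gtin", "preis": "price",
            "unit_price": "price", "qty": "stock"}

# Hypothetical ranking: a lower number means a more trusted source.
SOURCE_PRIORITY = {"manufacturer": 0, "distributor": 1, "scraper": 2}


def to_canonical(row: dict) -> dict:
    """Rename supplier fields to canonical names; unknown names pass through lowercased."""
    return {SYNONYMS.get(k.strip().lower(), k.strip().lower()): v for k, v in row.items()}


def reconcile(candidates: list) -> dict:
    """Merge (source, row) pairs; on conflict the highest-priority source wins."""
    merged = {}
    # Apply the least trusted source first so more trusted values overwrite it.
    ordered = sorted(candidates, key=lambda c: SOURCE_PRIORITY[c[0]], reverse=True)
    for _source, row in ordered:
        merged.update({k: v for k, v in to_canonical(row).items() if v not in (None, "")})
    return merged


product = reconcile([
    ("scraper", {"EAN": "4006381333931", "Preis": "10.99", "qty": "3"}),
    ("manufacturer", {"gtin": "4006381333931", "price": "9.99"}),
])
```

The manufacturer's price replaces the scraped one, while the scraped stock figure survives because no higher-priority source supplied that field.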

03 — UPSERT

Differential update

Fingerprint-gated writes. New / modified / unchanged status drives downstream re-indexing.
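
Fingerprint gating can be sketched as a lookup before every write: compare the incoming fingerprint with the stored one and write only when they differ. The in-memory dict stands in for whatever store the platform actually uses, and the function name is hypothetical.

```python
import hashlib


def upsert(store: dict, key: str, payload: str) -> str:
    """Write only when the payload's fingerprint differs; return the resulting status."""
    incoming = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    previous = store.get(key)
    if previous == incoming:
        return "unchanged"   # nothing written, nothing to re-index
    store[key] = incoming
    return "new" if previous is None else "modified"


store: dict = {}
first = upsert(store, "4006381333931", "price=9.99;stock=4")
second = upsert(store, "4006381333931", "price=9.99;stock=4")
third = upsert(store, "4006381333931", "price=10.49;stock=4")
```

Only the "new" and "modified" statuses would need to be passed on to downstream re-indexing; "unchanged" records cost a single hash comparison.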

Proof

Production scale.

Records ingested
Sources in production
Spec-file detection on test set

Connects to

Feeds the rest of the platform.

Onboard a source.

Send us one feed in any format — supplier, seller, distributor. We onboard it, validate it, and keep it running.

Talk to us