Clozers X Ardent
Scaling data pipelines to country scale with AI
Summary
Clozers partnered with Ardent AI to re-architect ingestion so each batch lands in under 2 minutes at p95, even as data volume grows. Before this work, adding more counties increased batch size and wait times, which starved downstream analysis and degraded product quality for new markets. Ardent parallelized the pipeline, introduced a durable queue where needed, and orchestrated the full flow in Airflow. Result: stable ingest latency and an unblocked path to market expansion.
Company
Clozers builds an AI-driven comping and analysis platform for real-estate investors, focused on fast, accurate comps and interactive analysis.
Ardent AI is an AI Data Engineer that builds, maintains, and scales production data pipelines.
Problem
Growth was blocked: more counties meant larger batches, longer ingestion, and delayed analytics
Tail latency was unpredictable, which caused SLA misses during spikes
Operability suffered, with retry storms and one-off hotfixes as volume increased
Constraints
Keep p95 < 2 minutes during normal operations and spikes
No data loss; duplicate-safe writes
No breaking changes to downstream analytics contracts
Stay inside the team’s Airflow-based scheduling, logging, and alerts
Approach
1) Make ingestion embarrassingly parallel
Split every batch into fixed-size chunks by tenant and logical key (account, region, time window)
Use Airflow dynamic task mapping so parallelism scales with batch size
Enforce fairness with pools and per-tenant concurrency limits, as in the sketch below
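A minimal sketch of this fan-out, assuming Airflow 2.4+ TaskFlow with dynamic task mapping. CHUNK_SIZE, plan_chunks, ingest_chunk, and the ingest_workers pool are illustrative names, not Clozers' actual code:

```python
from datetime import datetime

from airflow.decorators import dag, task

CHUNK_SIZE = 5_000  # rows per chunk; an assumed, tunable knob

@dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False)
def chunked_ingest():
    @task
    def plan_chunks(manifest: list[dict]) -> list[dict]:
        """Split the batch into fixed-size chunk descriptors keyed by
        tenant and logical key (account, region, time window)."""
        chunks = []
        for item in manifest:
            for offset in range(0, item["row_count"], CHUNK_SIZE):
                chunks.append({
                    "tenant": item["tenant"],
                    "key": item["key"],
                    "offset": offset,
                    "limit": CHUNK_SIZE,
                })
        return chunks

    # One mapped task instance per chunk; the shared pool and per-task cap
    # bound total concurrency (per-tenant pools would add stricter fairness).
    @task(pool="ingest_workers", max_active_tis_per_dag=32)
    def ingest_chunk(chunk: dict) -> None:
        ...  # fetch rows [offset, offset + limit) for this tenant/key and load them

    demo_manifest = [
        {"tenant": "acme", "key": "county:maricopa/2024-06", "row_count": 12_000},
    ]
    ingest_chunk.expand(chunk=plan_chunks(manifest=demo_manifest))

chunked_ingest()
```

Because plan_chunks emits one descriptor per chunk, Airflow creates one mapped task instance per chunk, so parallelism grows with batch size while the pool caps how much runs at once.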
2) Decouple producers and workers with a durable queue
Writers enqueue chunk descriptors instead of pushing directly to workers
Stateless workers pull and process chunks in parallel, bounded by pool size
Backpressure shifts to queue depth, which keeps wall-clock time stable as volume grows (see the sketch below)
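A producer/worker sketch of the same idea, assuming Amazon SQS as the durable queue (the write-up does not name the technology); QUEUE_URL and the chunk-processing stub are placeholders:

```python
import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/ingest-chunks"  # placeholder

def enqueue_chunk(chunk: dict) -> None:
    """Producer: push a small chunk descriptor, not the data itself."""
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(chunk))

def process_chunk(chunk: dict) -> None:
    ...  # fetch and upsert the rows this descriptor points at (see step 3)

def worker_loop() -> None:
    """Stateless worker: long-poll, process, delete on success. An unacked
    message reappears after the visibility timeout, so a crash loses no work."""
    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
        )
        for msg in resp.get("Messages", []):
            process_chunk(json.loads(msg["Body"]))
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```

With this shape, adding workers is the only scaling lever, and queue depth, not batch size, becomes the metric to watch and alert on.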
3) Idempotent, effectively-once delivery
Each chunk carries an idempotency key
Writes are upserts with deterministic merge logic
A dead-letter queue isolates poison chunks for automated triage (sketched below)
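One way the upsert-with-deterministic-merge and DLQ pieces can look, assuming a Postgres sink and psycopg; the listings table, its columns, and the send_to_dlq helper are hypothetical:

```python
import psycopg

# Deterministic merge: replaying the same chunk rewrites the same rows to the
# same values, so retries and duplicate deliveries are harmless.
UPSERT = """
INSERT INTO listings (listing_id, tenant, payload, chunk_key, updated_at)
VALUES (%(listing_id)s, %(tenant)s, %(payload)s, %(chunk_key)s, %(updated_at)s)
ON CONFLICT (listing_id) DO UPDATE
SET payload    = EXCLUDED.payload,
    chunk_key  = EXCLUDED.chunk_key,
    updated_at = EXCLUDED.updated_at
WHERE EXCLUDED.updated_at >= listings.updated_at  -- newest record wins, deterministically
"""

def send_to_dlq(chunk: dict) -> None:
    ...  # hypothetical helper: park the descriptor on the dead-letter queue for triage

def merge_chunk(conn: psycopg.Connection, chunk: dict, rows: list[dict]) -> None:
    try:
        with conn.cursor() as cur:
            cur.executemany(UPSERT, rows)  # each row dict carries the chunk's idempotency key
        conn.commit()
    except Exception:
        conn.rollback()
        send_to_dlq(chunk)  # poison chunk: isolate it instead of retrying forever
```

Keying the upsert on the natural primary key and timestamping the merge makes replay order irrelevant, which is what turns at-least-once delivery into effectively-once writes.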
4) Orchestrate and observe in Airflow
DAG layout: prepare → fan-out → merge, with retries for transient errors
Post-merge audits verify row counts, null rates, and schema contracts before success
Metrics and logs wired into existing dashboards and on-call alerts; a condensed sketch of the full DAG follows
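Putting it together, a condensed sketch of the prepare → fan-out → merge → audit shape, again in Airflow's TaskFlow style; the stub bodies, retry settings, and audit thresholds are illustrative:

```python
from datetime import datetime, timedelta

from airflow.decorators import dag, task

@dag(
    schedule="@hourly",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={"retries": 3, "retry_delay": timedelta(minutes=1)},  # transient errors
)
def ingest_pipeline():
    @task
    def prepare() -> list[dict]:
        # Split the incoming batch into chunk descriptors (step 1).
        return [{"tenant": "acme", "offset": i * 5_000, "limit": 5_000} for i in range(3)]

    @task(pool="ingest_workers")
    def ingest(chunk: dict) -> dict:
        # Idempotent load of one chunk (step 3); returns per-chunk stats.
        return {"rows": chunk["limit"], "nulls": 0}

    @task
    def merge(stats: list[dict]) -> dict:
        # Combine per-chunk stats into one run summary.
        total = sum(s["rows"] for s in stats)
        return {"rows": total, "null_rate": sum(s["nulls"] for s in stats) / max(total, 1)}

    @task
    def audit(summary: dict) -> None:
        # Post-merge gate: the run only succeeds if the contract holds;
        # a failure here surfaces in the existing dashboards and pages on-call.
        assert summary["rows"] > 0, "merge produced no rows"
        assert summary["null_rate"] <= 0.01, "null rate over threshold"

    audit(merge(ingest.expand(chunk=prepare())))

ingest_pipeline()
```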
Results
Latency: p95 < 2 minutes across observed batch sizes during rollout
Stability: tails no longer grow with input size; behavior is governed by queue depth and pool limits
Safety: idempotent merges and a DLQ remove the risk of double-writes and silent drops
Growth unlocked: Clozers can add counties without degrading product quality, since ingestion no longer blocks downstream analysis
Why Ardent
AI Data Engineer that proposes chunking strategies, generates operators, and wires end-to-end DAGs you can review and lock
Production fit with Airflow, retries, and observability already in place
Fast iteration: tuning chunk sizes, pool limits, and merge logic is quick and reversible
Takeaways for technical leaders
If ingest time rises with batch size, fix the chunk size and decouple with a queue; parallelism then scales while wall-clock time stays bounded (see the back-of-the-envelope calculation after this list)
Idempotency plus a dead-letter queue turns flaky retries into safe retries
Treat Airflow as orchestration, not a data plane; keep workers stateless and horizontally scalable
An AI Data Engineer like Ardent accelerates the critical plumbing (planners, idempotent merges, audits, and runbooks), which is how you hit a sub-2-minute p95 and keep adding markets
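To make the wall-clock claim concrete, a back-of-the-envelope calculation; the chunk size, per-chunk latency, and worker counts below are assumed for illustration, not Clozers' measurements:

```python
import math

CHUNK_ROWS = 5_000     # fixed chunk size (assumed)
SECS_PER_CHUNK = 15    # per-chunk processing time (assumed)

def wall_clock_secs(batch_rows: int, workers: int) -> int:
    """Chunks run in waves of `workers`; wall time = waves x per-chunk time."""
    chunks = math.ceil(batch_rows / CHUNK_ROWS)
    return math.ceil(chunks / workers) * SECS_PER_CHUNK

# Fixed worker count: wall time grows with batch size and blows the budget.
print(wall_clock_secs(1_000_000, workers=32))   # 105 s
print(wall_clock_secs(4_000_000, workers=32))   # 375 s

# Parallelism that scales with batch size keeps wall time flat.
print(wall_clock_secs(1_000_000, workers=200))  # 15 s
print(wall_clock_secs(4_000_000, workers=800))  # 15 s
```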