CASE STUDY

Privacy-Preserving Data Platform for Disability Services

Sunnyfield — Disability Services

Identity model runtime: days → ~6h
Source systems unified: 8
Environments: dev / rc / prod

Challenge

Sunnyfield runs on eight separate enterprise systems spanning HR, payroll, rostering, learning, finance and care management. Data arrived through an unpredictable mix of SQL, SFTP, API and Excel with constant schema drift, and strict privacy obligations around participant and workforce PII meant any analytics layer had to be airtight. Leadership had no reliable single view of the workforce or participants, and no scalable way to build out BI or dashboards on top.

Approach

We designed and built a two-layer production platform. The first layer is a Python and Prefect orchestrator that extracts from every source, applies deterministic PII hashing and generates master entity IDs. The second is a dbt analytics layer with a four-stage dimensional model (raw, interim, staging, analytics) that uses SCD Type 2 temporal modelling and comprehensive data quality tests. The platform runs on Azure Synapse and SQL Server across three isolated environments (dev, rc and prod), with row-level security (RLS) enforced at the warehouse and column-level encryption on the identity store. Power BI consumes the warehouse directly, with RLS preserved end-to-end.
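
To illustrate what deterministic PII hashing and master entity IDs can look like, here is a minimal Python sketch. It is not the platform's actual code: the case study does not say which scheme was used, so keyed HMAC-SHA256 is an assumption, and `SECRET_KEY`, `normalise`, `hash_pii`, `master_entity_id` and the choice of email plus date of birth as matching fields are all hypothetical.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical key for illustration only; a real deployment would load
# this from a secrets store rather than hard-coding it.
SECRET_KEY = b"example-key-not-for-production"


def normalise(value: str) -> str:
    """Canonicalise a value so trivial formatting differences hash alike."""
    return unicodedata.normalize("NFKC", value).strip().casefold()


def hash_pii(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a deterministic keyed hash (HMAC-SHA256) of a PII value."""
    message = normalise(value).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def master_entity_id(email: str, date_of_birth: str, key: bytes = SECRET_KEY) -> str:
    """Derive one stable ID for a person from normalised identifying fields.

    The fields used here (email, date of birth) are an assumption.
    """
    token = "|".join(normalise(part) for part in (email, date_of_birth))
    digest = hmac.new(key, token.encode("utf-8"), hashlib.sha256).hexdigest()
    return "ent_" + digest[:16]


# The same person, as recorded in two source systems, resolves to one ID.
hr_id = master_entity_id(" Jane.Doe@Example.org ", "1990-04-12")
payroll_id = master_entity_id("jane.doe@example.org", "1990-04-12")
print(hr_id == payroll_id)
```

Because the hash is keyed and deterministic, the same input always yields the same token, so records can be joined across systems without the analytics layer holding the raw value.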

Results

Identity resolution runtime dropped from days to around six hours, unblocking downstream analytics. The pipeline is now resilient to real-world source drift such as filename changes, header typos and format shifts, and has run in steady state for months. Ongoing data-engineering burden dropped to a fraction of one FTE. The platform now underpins executive dashboards and a second product line built on top of it.
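
One way to tolerate header typos of the kind described above is to fuzzy-match incoming column names against a known list. The sketch below uses Python's standard `difflib` for this. It is an illustration under assumptions, not the platform's implementation: `CANONICAL_HEADERS`, `canonicalise_header` and the 0.8 similarity cutoff are hypothetical.

```python
import difflib

# Hypothetical canonical column names for one source file.
CANONICAL_HEADERS = ["employee_id", "first_name", "last_name", "start_date"]


def canonicalise_header(raw, canonical=CANONICAL_HEADERS, cutoff=0.8):
    """Map a raw header to its canonical name, or None if nothing is close.

    Whitespace, case and hyphens are normalised first; remaining
    differences (such as typos) are handled by fuzzy matching.
    """
    cleaned = "_".join(raw.strip().lower().replace("-", " ").split())
    if cleaned in canonical:
        return cleaned
    matches = difflib.get_close_matches(cleaned, canonical, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for header in ["Employe ID", "First  Name ", "Favourite Colour"]:
    print(repr(header), "->", canonicalise_header(header))
```

Returning `None` for unrecognised headers, rather than guessing, lets the pipeline flag genuinely new columns for review instead of silently mis-mapping them.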
