However, this results in our fifth problem: similar-yet-different data sets. Why are there multiples? Which one should I use? Is this data set still maintained, or is it a zombie data set that is still regularly updated but with no one overseeing it? The problem comes to a head when you have important computations that disagree with each other because they rely on data sets that should be identical but aren't. Providing conflicting reports, dashboards, or metrics to customers will result in a loss of trust and, in a worst-case scenario, loss of business and even legal action.
Even if you sort out all of these problems (reducing latency, reducing costs, removing duplicate pipelines and data sets, and eliminating break-fix work), you still haven't provided anything that operations can use. They're still on their own, upstream of your ETLs, because all of the cleaning, structuring, remodeling, and distribution work is only really useful for those in the data analytics space.
Shift left for a headless data architecture
Building a headless data architecture requires a rethink of how we circulate, share, and manage data in our organizations: a shift left. We extract the ETL->bronze->silver work from downstream and put it upstream inside our data products, much closer to the source.
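
To make the shift concrete, here is a minimal Python sketch of the idea, not a definitive implementation. It assumes a source system that emits raw order records as dictionaries. The names `OrderRecord`, `to_data_product`, and `publish` are hypothetical and exist only for illustration. The in-memory list stands in for whatever stream or table would actually back the data product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class OrderRecord:
    """Cleaned, structured record exposed by the (hypothetical) data product."""
    order_id: str
    amount_cents: int
    created_at: datetime


def to_data_product(raw: dict) -> OrderRecord:
    """Clean and structure a raw source record once, close to the source."""
    return OrderRecord(
        order_id=str(raw["id"]).strip(),
        amount_cents=round(float(raw["amount"]) * 100),
        created_at=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
    )


def publish(raw_records: list[dict]) -> list[OrderRecord]:
    """Stand-in for writing to a stream or table that all consumers share."""
    return [to_data_product(r) for r in raw_records]


# Hypothetical raw record as it might leave the source system.
product = publish([{"id": " A-1 ", "amount": "19.99", "ts": 0}])

# Operational and analytical consumers both read the same cleaned records,
# so their computations cannot drift apart because of duplicate pipelines.
operational_total = sum(r.amount_cents for r in product)
analytical_total = sum(r.amount_cents for r in product)
```

Because the cleanup happens once, inside the data product, neither consumer needs its own copy of the cleaning logic, and the two totals agree by construction.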