Episode 3 & 4 | Data Destination & Data Governance | Data Journey
What are data destinations? In a very abstract sense, a data destination is just another input along the series of process elements in a data pipeline. However, when an element is called out as the destination, it is usually seen as the final destination, such as a database, data lake, or data warehouse. And yet, any element within the data pipeline has aspects of a final destination (and the scaling challenges that come with it).
A classic data pipeline commonly has no more than three elements. The first is the source, where a puller and/or sender element initiates the data movement, and the second is the final destination data platform.
At scale, a third element, a data lake, is introduced as a buffer between the source and the final destination. In just about all cases, the data lake is part of the story at scale, even when it is not in the middle. Often the data lake serves as a backup in case any part of the pipeline fails. In this scenario, after the pipeline has been restored, the data lake is used to replay the stream of events. As mentioned in the first two episodes, a clear theme is the importance of cloud object storage in the data journey.
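For readers who think in code, here is a minimal sketch of the replay idea described above. It is an illustration under stated assumptions, not how any particular product works: the `DataLake`, `Destination`, and `Pipeline` classes are hypothetical in-memory stand-ins, and a real pipeline would use cloud object storage and a real database or warehouse instead.

```python
from dataclasses import dataclass, field


@dataclass
class DataLake:
    """Hypothetical stand-in for cloud object storage: an append-only event log."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> None:
        self.events.append(event)

    def read_from(self, offset: int) -> list:
        # Return a copy of every event stored at or after the given offset.
        return self.events[offset:]


@dataclass
class Destination:
    """Hypothetical final destination, e.g. a database or data warehouse."""
    rows: list = field(default_factory=list)
    available: bool = True

    def write(self, event: dict) -> None:
        if not self.available:
            raise ConnectionError("destination unavailable")
        self.rows.append(event)


class Pipeline:
    """Source -> data lake -> destination, with the lake acting as a buffer."""

    def __init__(self, lake: DataLake, destination: Destination) -> None:
        self.lake = lake
        self.destination = destination
        self.next_offset = 0  # first lake offset not yet delivered

    def send(self, event: dict) -> None:
        # Land every event in the lake first, then try to deliver it.
        self.lake.append(event)
        self.replay()

    def replay(self) -> int:
        """Deliver undelivered events in order; return how many were delivered."""
        delivered = 0
        for event in self.lake.read_from(self.next_offset):
            try:
                self.destination.write(event)
            except ConnectionError:
                break  # stop at the first failure so ordering is preserved
            self.next_offset += 1
            delivered += 1
        return delivered


if __name__ == "__main__":
    pipeline = Pipeline(DataLake(), Destination())
    pipeline.send({"id": 1})
    pipeline.destination.available = False  # simulate an outage
    pipeline.send({"id": 2})
    pipeline.destination.available = True   # pipeline restored
    print("replayed:", pipeline.replay())
```

Because every event lands in the lake before delivery is attempted, nothing is lost while the destination is down, and a single `replay()` call after recovery catches the destination up in the original order.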
So how are data destinations related to data governance? Thomas Hazel, CTO & Founder of ChaosSearch, walks us through it in this video.
#reinventing #KnowBetter #RoadtoreInvent #DataJourney #DevOps #SRE #DataEngineers #DataScientists #datapipelines #schema
Data Journey: https://lnkd.in/dTYymbvy