Dl/Best Practices: Difference between revisions

From stonehomewiki
= Data Ingestion =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview">Always save a copy of raw data</div>
<div class="mw-collapsible-content">
When you ingest data, save a copy of the raw data for the following reasons:
* Your ingestion pipeline may have bugs. With the raw data saved, you can fix the bugs and re-populate the data.
* Raw data may not meet your data quality requirements, and you may choose to ignore it. If you do, keeping the raw data lets you check what kind of data quality problem it has, and sometimes you can ask the data producer to fix it.
</div>
</div>
<p></p>
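The practice of saving raw data before processing it can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function names (<code>save_raw</code>, <code>parse_record</code>, <code>ingest</code>, <code>replay</code>), the local directory used as raw storage, and the rule that a valid record is a JSON object with an <code>id</code> field are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path


def save_raw(payload: bytes, raw_dir: Path) -> Path:
    """Write the payload unchanged, named by its SHA-256 digest."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    path = raw_dir / (hashlib.sha256(payload).hexdigest() + ".raw")
    path.write_bytes(payload)
    return path


def parse_record(payload: bytes) -> dict:
    """Hypothetical parser: expects a JSON object with an 'id' field."""
    record = json.loads(payload)
    if not isinstance(record, dict) or "id" not in record:
        raise ValueError("record has no 'id' field")
    return record


def ingest(payloads, raw_dir: Path):
    """Save every payload first, then parse. Return (accepted, rejected)."""
    accepted, rejected = [], []
    for payload in payloads:
        raw_path = save_raw(payload, raw_dir)  # always happens before parsing
        try:
            accepted.append(parse_record(payload))
        except ValueError as err:
            # The record is ignored, but its raw copy stays on disk,
            # so the quality problem can be inspected or reported later.
            rejected.append((raw_path, str(err)))
    return accepted, rejected


def replay(raw_dir: Path, parser=parse_record):
    """Re-populate from saved raw files, e.g. after fixing a parser bug."""
    records = []
    for path in sorted(raw_dir.glob("*.raw")):
        try:
            records.append(parser(path.read_bytes()))
        except ValueError:
            continue  # still invalid under this parser; raw copy is kept
    return records
```

Because the raw copy is written before parsing, a rejected record is never lost: <code>rejected</code> holds the path of each ignored payload together with the reason, and <code>replay</code> can be run again with a corrected parser over the same files.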

Revision as of 08:59, 9 September 2024

Data Lake Knowledge Center

Data Ingestion