Dl/DatasetRepository: Difference between revisions

* A data ingestion app can create a dataset via the API
* A data ingestion app can add a dataframe to a dataset via the API
</div>
</div>
<p></p>
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview">Dataset SLA Management</div>
<div class="mw-collapsible-content">
Many datasets receive a new dataframe at a fixed frequency. If a new dataframe does not show up within the expected time, we need to be alerted so that we can fix the underlying issue, for example a broken data pipeline.
So we allow:
* A user can define the expected frequency of new dataframes for a dataset.
* The system monitors the dataset for new dataframes; if a new dataframe misses the expected frequency, it sends an alert.
* A user can also see the published time of each new dataframe, to check how often a dataset missed its freshness SLA.
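
The freshness check described above could be sketched roughly as follows. This is only an illustration, not the Dataset Repository's actual implementation: the names <code>FreshnessSla</code>, <code>is_sla_missed</code> and <code>count_missed</code>, and the optional grace period, are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class FreshnessSla:
    """Hypothetical SLA definition: one new dataframe per `frequency`."""
    dataset: str
    frequency: timedelta
    grace: timedelta = timedelta(0)  # assumed extra slack before alerting


def is_sla_missed(sla: FreshnessSla,
                  last_published: Optional[datetime],
                  now: datetime) -> bool:
    """True if no dataframe arrived within frequency + grace of the last one."""
    if last_published is None:
        # Nothing was ever published, so treat the dataset as stale.
        return True
    return now - last_published > sla.frequency + sla.grace


def count_missed(sla: FreshnessSla, published_times: List[datetime]) -> int:
    """Count gaps between consecutive dataframes that exceeded the SLA."""
    ordered = sorted(published_times)
    allowed = sla.frequency + sla.grace
    return sum(1 for prev, cur in zip(ordered, ordered[1:])
               if cur - prev > allowed)
```

A monitoring daemon would call something like <code>is_sla_missed</code> periodically and send an alert when it returns true, while <code>count_missed</code> corresponds to reviewing published times to see how often the SLA was missed.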
</div>
</div>
<p></p>
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview">Dataset Quality Management</div>
<div class="mw-collapsible-content">
* Allow users to define a set of rules to check dataset quality
* A web UI renders dataset quality
* A backend daemon checks dataset quality based on the rules
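
As a rough sketch of what rule-based quality checking might look like, the example below evaluates a list of user-defined rules against the rows of a dataframe. The rule types (<code>not_null</code>, <code>min_row_count</code>) and the <code>evaluate</code> function are hypothetical and chosen only for illustration; the real rule format is not specified here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Rule:
    """Hypothetical quality rule: a name plus a check over all rows."""
    name: str
    check: Callable[[List[Row]], bool]


def not_null(column: str) -> Rule:
    """Rule that passes when every row has a non-null value in `column`."""
    return Rule(f"{column} is not null",
                lambda rows: all(r.get(column) is not None for r in rows))


def min_row_count(n: int) -> Rule:
    """Rule that passes when the dataframe has at least `n` rows."""
    return Rule(f"row count >= {n}", lambda rows: len(rows) >= n)


def evaluate(rows: List[Row], rules: List[Rule]) -> Dict[str, bool]:
    """Run every rule and return a pass/fail report keyed by rule name."""
    return {rule.name: rule.check(rows) for rule in rules}
```

A backend daemon could run <code>evaluate</code> on each new dataframe and store the report, which the web UI would then render.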
</div>
</div>
</div>
</div>
<p></p>
<p></p>

Latest revision as of 00:40, 7 March 2023

Data Lake Knowledge Center

Purpose