Dl/Best Practices: Difference between revisions

From stonehomewiki
<p> [[dl/home|Data Lake Knowledge Center]] </p>
= Platform =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview">Spark</div>
<div class="mw-collapsible-content">
Apache Spark is a good platform for both batch and streaming data processing. Advantages:
* Scalable
* Well supported (Databricks backs the project)
* Well adopted
* Supported by many cloud providers (AWS EMR, Azure HDInsight, [https://cloud.google.com/dataproc GCP Dataproc], OCI Data Flow)
</div>
</div>
<p></p>


= Data Ingestion =

Revision as of 09:33, 9 September 2024

Data Lake Knowledge Center

Platform

Data Ingestion

Data Governance