Dl/Overview: Difference between revisions

From stonehomewiki
Line 5: Line 5:
<div class="mw-collapsible-preview">ETL Flow</div>
<div class="mw-collapsible-content">
<div class="mw-collapsible-content">
{| class="wikitable grid mono section"
{{#mermaid:
|-
! Diagram
! Description
|-
|{{#mermaid:
graph TD
     Scheduler[Apache Airflow/Scheduler]
Line 26: Line 21:
     USER --5 git push-->ER
}}
}}
| style="vertical-align:top;" |
<br />
 
* 1 Airflow Scheduler triggers the DAG (the DAG is generated from metadata)
** The ETL job is a task within an Airflow DAG
Line 32: Line 28:
* 3 ETL executor uses the dbt library to submit the job to Apache Spark via a JDBC interface (e.g. via Thrift Server)
* 4 Thrift Server takes the SQL and passes it to Apache Spark for execution
|}
</div>
</div>
</div>
</div>
<p></p>
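The numbered steps of the ETL flow can be sketched in plain Python. This is only an illustration under stated assumptions, not the wiki's actual implementation: the metadata fields (`dag_id`, `schedule`, `models`), the target name `thrift`, the profiles directory `/etc/dbt`, and the host name are all hypothetical. The sketch expands metadata into one ETL task per model (step 1), builds the `dbt run` command an executor might invoke (step 3), and shows connection settings of the kind a dbt Spark profile uses for a Thrift Server (steps 3 and 4). It deliberately avoids importing Airflow so that it runs on its own.

```python
import shlex
from dataclasses import dataclass

# Hypothetical metadata: one entry per DAG. The field names are
# assumptions for illustration, not the wiki's real schema.
METADATA = [
    {
        "dag_id": "daily_sales",
        "schedule": "@daily",
        "models": ["stg_orders", "fct_sales"],
    },
]


@dataclass(frozen=True)
class EtlTask:
    """One ETL job, i.e. one task inside a generated DAG."""
    dag_id: str
    task_id: str
    command: list


def dbt_command(model, target="thrift", profiles_dir="/etc/dbt"):
    """Build the dbt CLI call for one model (target and path are assumed)."""
    return [
        "dbt", "run",
        "--select", model,
        "--target", target,
        "--profiles-dir", profiles_dir,
    ]


def plan_tasks(metadata):
    """Step 1: expand metadata into one ETL task per model, per DAG."""
    tasks = []
    for entry in metadata:
        for model in entry["models"]:
            tasks.append(
                EtlTask(entry["dag_id"], f"etl_{model}", dbt_command(model))
            )
    return tasks


def spark_thrift_output(host, port=10000, schema="default"):
    """Steps 3-4: settings of the kind a dbt Spark profile holds
    when connecting through a Spark Thrift Server."""
    return {
        "type": "spark",
        "method": "thrift",
        "host": host,
        "port": port,
        "schema": schema,
    }


if __name__ == "__main__":
    for task in plan_tasks(METADATA):
        print(task.dag_id, task.task_id, shlex.join(task.command))
    print(spark_thrift_output("spark-thrift.example"))
```

In a real deployment each `EtlTask` would presumably become an Airflow task that runs the command; keeping the planning step as a pure function makes the metadata-to-DAG mapping easy to test without a scheduler.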

Revision as of 18:02, 25 November 2025

Data Lake Knowledge Center

ETL