Dl/Overview: Difference between revisions
From stonehomewiki
Stonezhong (talk | contribs) (→ETL) |
Stonezhong (talk | contribs) (→ETL) |
||
| Line 5: | Line 5: | ||
<div class="mw-collapsible-preview">ETL Flow</div> | <div class="mw-collapsible-preview">ETL Flow</div> | ||
<div class="mw-collapsible-content"> | <div class="mw-collapsible-content"> | ||
{{#mermaid: | |||
graph TD | graph TD | ||
Scheduler[Apache Airflow/Scheduler] | Scheduler[Apache Airflow/Scheduler] | ||
| Line 26: | Line 21: | ||
USER --5 git push-->ER | USER --5 git push-->ER | ||
}} | }} | ||
<br /> | |||
* 1 The Airflow Scheduler triggers a DAG (the DAG is generated from metadata) | * 1 The Airflow Scheduler triggers a DAG (the DAG is generated from metadata) | ||
** The ETL job is a task within an Airflow DAG | ** The ETL job is a task within an Airflow DAG | ||
| Line 32: | Line 28: | ||
* 3 The ETL executor uses the dbt library to submit the job to Apache Spark via a JDBC interface (e.g. via Thrift Server) | * 3 The ETL executor uses the dbt library to submit the job to Apache Spark via a JDBC interface (e.g. via Thrift Server) | ||
* 4 Thrift Server takes the SQL and passes it to Apache Spark for execution | * 4 Thrift Server takes the SQL and passes it to Apache Spark for execution | ||
</div> | </div> | ||
</div> | </div> | ||
<p></p> | <p></p> | ||
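The flow in steps 1, 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the <code>EtlJob</code> record, its fields, the helper names and the <code>spark_thrift</code> target name are all hypothetical. It also assumes the executor shells out to the dbt command line (<code>dbt run --select ... --target ...</code>), whereas the page only says that the executor "uses dbt library". The sketch groups metadata records into DAGs (one task per ETL job) and builds the dbt command each task would run; the connection to Thrift Server would be configured in the dbt profile, which is not shown here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class EtlJob:
    """Hypothetical metadata record describing one ETL job."""
    dag_id: str     # DAG the job belongs to
    task_id: str    # task name inside that DAG
    dbt_model: str  # dbt model the task should run


def group_jobs_by_dag(jobs: List[EtlJob]) -> Dict[str, List[EtlJob]]:
    """Group metadata records by DAG id, keeping the input order.

    Each key stands for one generated DAG; each job in the value list
    stands for one task within that DAG.
    """
    dags: Dict[str, List[EtlJob]] = {}
    for job in jobs:
        dags.setdefault(job.dag_id, []).append(job)
    return dags


def dbt_command(job: EtlJob, target: str = "spark_thrift") -> List[str]:
    """Build the argv an executor could pass to a subprocess.

    `target` is an assumed name of a dbt profile target that points at
    the Spark Thrift Server; it is not taken from the original page.
    """
    return ["dbt", "run", "--select", job.dbt_model, "--target", target]
```

Returning the command as an argument list rather than a single string keeps it safe to hand to <code>subprocess.run</code> without shell quoting.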