Dl/Overview: Difference between revisions

From stonehomewiki
No edit summary
 
(2 intermediate revisions by the same user not shown)
Line 2: Line 2:
= Data Tiers =
= Data Tiers =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview">Using RDBMS</div>
<div class="mw-collapsible-preview">Data Tiers</div>
<div class="mw-collapsible-content">
<div class="mw-collapsible-content">
{| class="wikitable grid mono section"
{| class="wikitable grid mono section"
Line 41: Line 41:


* dimension tables and fact tables that form a star schema
* dimension tables and fact tables that form a star schema
* a star schema is designed in such a way that it can answer any question about a business area.
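
As an illustration of the point above, here is a minimal sketch of a star schema with one fact table joined to two dimension tables. It uses in-memory SQLite as a stand-in for the lake's SQL engine, and every table name, column name and value (<code>fact_sales</code>, <code>dim_product</code>, <code>dim_date</code>) is a hypothetical example rather than part of this wiki's actual schema.

```python
import sqlite3

# In-memory SQLite stands in for the lake's SQL engine (an assumption made
# only for illustration); all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO dim_date VALUES (10, 2025, 1), (11, 2025, 2);
INSERT INTO fact_sales VALUES (1, 10, 20.0), (1, 11, 30.0), (2, 10, 5.0);
""")


def sales_by_category(connection):
    """Answer one business question by joining the fact table to dim_product."""
    return connection.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


def sales_by_month(connection):
    """Answer a different question from the same fact table via dim_date."""
    return connection.execute("""
        SELECT d.year, d.month, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_id = f.date_id
        GROUP BY d.year, d.month
        ORDER BY d.year, d.month
    """).fetchall()


print(sales_by_category(conn))
print(sales_by_month(conn))
```

Both questions are answered from the same fact table. Only the dimension being joined and grouped on changes, which is what lets a single star schema serve many questions about one business area.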
<hr /><br />
<hr /><br />


Line 47: Line 48:


* Various query results for specific business questions, materialized in tables
* Various query results for specific business questions, materialized in tables
** Queries are generated from the star schema in the gold tier
* Tables may be replicated to an RDBMS for BI tools to access (sometimes you can expose them directly, e.g. via Spark Thrift Server)
* Tables may be replicated to an RDBMS for BI tools to access (sometimes you can expose them directly, e.g. via Spark Thrift Server)
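
The materialize-then-replicate flow described above could look roughly like the following sketch. Two in-memory SQLite databases stand in for the lake's SQL engine and for the RDBMS that a BI tool reads from. The table names, the <code>replicate</code> helper and the sample data are all hypothetical, and a real pipeline would more likely use Spark and a JDBC writer than this hand-rolled copy.

```python
import sqlite3

# "lake" stands in for the data lake's SQL engine and "bi_db" for the RDBMS
# a BI tool connects to. Both are in-memory SQLite, purely for illustration.
lake = sqlite3.connect(":memory:")
bi_db = sqlite3.connect(":memory:")

# Build a tiny hypothetical star schema, then materialize the answer to one
# business question as its own table with CREATE TABLE ... AS SELECT.
lake.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (product_id INTEGER, amount REAL);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO fact_sales VALUES (1, 20.0), (1, 30.0), (2, 5.0);

CREATE TABLE sales_by_category AS
SELECT p.category AS category, SUM(f.amount) AS total_amount
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
GROUP BY p.category;
""")


def replicate(source, target, table):
    """Copy one small materialized table to the target, replacing any old copy.

    Hypothetical helper: the table name is interpolated into the SQL, so it
    must come from trusted code, never from user input.
    """
    rows = source.execute(
        f"SELECT category, total_amount FROM {table}"
    ).fetchall()
    target.execute(f"DROP TABLE IF EXISTS {table}")
    target.execute(f"CREATE TABLE {table} (category TEXT, total_amount REAL)")
    target.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    target.commit()
    return len(rows)


copied = replicate(lake, bi_db, "sales_by_category")
print(copied, "rows replicated")
```

Because the BI tool only ever reads the small precomputed table, its queries avoid re-running the join against the fact table. The trade-off is that the copy is only as fresh as the last replication run.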


Line 61: Line 63:
graph TD
graph TD
     Scheduler[Apache Airflow/Scheduler]
     Scheduler[Apache Airflow/Scheduler]
     ETLE[ETL Executor&lt;Airflow Task&gt;]
     ETLE[ETL Executor#40;Airflow Task#41;]
     LC[Local ETL Code]
     LC[Local ETL Code]
     ER[ETL Code Repo]
     ER[ETL Code Repo]
     JDBC[JDBC&lt;Thrift Server&gt;]
     JDBC[JDBC#40;Thrift Server#41;]
     User[User&lt;Data Engineer&gt;]
     User[User#40;Data Engineer#41;]
     Spark[Apache Spark]
     Spark[Apache Spark]
     Scheduler --2: trigger--> ETLE
     Scheduler --2: trigger--> ETLE

Latest revision as of 21:02, 25 November 2025

Data Lake Knowledge Center

Data Tiers

ETL

BI Connection