Dl/Overview: Difference between revisions
From stonehomewiki
Stonezhong (talk | contribs) No edit summary |
Stonezhong (talk | contribs) (→ETL) |
| (2 intermediate revisions by the same user not shown) | |||
| Line 2: | Line 2: | ||
= Data Tiers = | = Data Tiers = | ||
<div class="toccolours mw-collapsible mw-collapsed expandable"> | <div class="toccolours mw-collapsible mw-collapsed expandable"> | ||
<div class="mw-collapsible-preview"> | <div class="mw-collapsible-preview">Data Tiers</div> | ||
<div class="mw-collapsible-content"> | <div class="mw-collapsible-content"> | ||
{| class="wikitable grid mono section" | {| class="wikitable grid mono section" | ||
| Line 41: | Line 41: | ||
* dimension tables and fact tables that form a star schema | * dimension tables and fact tables that form a star schema | ||
* a star schema is designed so that it can answer a broad range of questions about a business area. | |||
<hr /><br /> | <hr /><br /> | ||
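As a rough illustration of the star schema idea above, the following sketch builds one fact table and two dimension tables and answers two different business questions from the same schema. It uses Python's built-in `sqlite3` module only to stay self-contained; the table names (`dim_date`, `dim_product`, `fact_sales`), the columns and the sample rows are hypothetical and do not come from the wiki's actual data lake.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny, hypothetical star schema: one fact table, two dimensions."""
    conn.executescript(
        """
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE fact_sales (
            date_id INTEGER REFERENCES dim_date(date_id),
            product_id INTEGER REFERENCES dim_product(product_id),
            amount REAL
        );
        """
    )
    # Made-up sample rows, for illustration only.
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [(1, 2024, 1), (2, 2024, 2)])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)", [(10, "books"), (20, "games")])
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(1, 10, 5.0), (1, 20, 7.5), (2, 10, 2.5)],
    )


def sales_by_category(conn):
    """Business question 1: total sales per product category."""
    return conn.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales f JOIN dim_product p ON f.product_id = p.product_id
        GROUP BY p.category ORDER BY p.category
        """
    ).fetchall()


def sales_by_month(conn):
    """Business question 2: total sales per month, from the same fact table."""
    return conn.execute(
        """
        SELECT d.year, d.month, SUM(f.amount)
        FROM fact_sales f JOIN dim_date d ON f.date_id = d.date_id
        GROUP BY d.year, d.month ORDER BY d.year, d.month
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
print(sales_by_category(conn))
print(sales_by_month(conn))
```

Both questions are answered by joining the same fact table to a different dimension, which is the property the bullet above describes.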
| Line 47: | Line 48: | ||
* Various query results for specific business questions, materialized in tables | * Various query results for specific business questions, materialized in tables | ||
** Queries are generated from the star schema in the gold tier | |||
* Tables may be replicated to an RDBMS for BI tools to access (sometimes you can expose them directly, e.g. via Spark Thrift Server) | * Tables may be replicated to an RDBMS for BI tools to access (sometimes you can expose them directly, e.g. via Spark Thrift Server) | ||
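The materialize-and-replicate step described above might look roughly like the sketch below: a query is run against a source database and its result is written as a plain table into a separate target database that a BI tool could read. Two in-memory `sqlite3` databases stand in for the gold tier and the RDBMS; the helper name `materialize`, the table `sales_by_category` and the sample data are assumptions made for illustration, not part of the wiki's ETL code.

```python
import sqlite3


def materialize(source, target, table, query):
    """Run `query` on `source` and store its result as `table` in `target`.

    Hypothetical helper: `table` is assumed to be a trusted identifier,
    since it is interpolated into the SQL text.
    """
    cur = source.execute(query)
    columns = [d[0] for d in cur.description]  # column names of the result set
    rows = cur.fetchall()
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    # Rebuild the table on every run so the copy reflects the latest query result.
    target.execute(f'DROP TABLE IF EXISTS "{table}"')
    target.execute(f'CREATE TABLE "{table}" ({col_list})')
    target.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    target.commit()
    return len(rows)


# Stand-in for the gold tier: a small fact table with made-up rows.
gold = sqlite3.connect(":memory:")
gold.execute("CREATE TABLE fact_sales (category TEXT, amount REAL)")
gold.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)",
    [("books", 5.0), ("games", 7.5), ("books", 2.5)],
)

# Stand-in for the RDBMS that a BI tool would query.
serving = sqlite3.connect(":memory:")
copied = materialize(
    gold,
    serving,
    "sales_by_category",
    "SELECT category, SUM(amount) AS total FROM fact_sales GROUP BY category ORDER BY category",
)
print(copied, serving.execute("SELECT * FROM sales_by_category").fetchall())
```

Dropping and recreating the target table keeps the sketch simple; a real pipeline would more likely load into a staging table and swap, so that readers never see a half-written result.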
| Line 61: | Line 63: | ||
graph TD | graph TD | ||
Scheduler[Apache Airflow/Scheduler] | Scheduler[Apache Airflow/Scheduler] | ||
ETLE[ETL Executor | ETLE[ETL Executor#40;Airflow Task#41;] | ||
LC[Local ETL Code] | LC[Local ETL Code] | ||
ER[ETL Code Repo] | ER[ETL Code Repo] | ||
JDBC[JDBC | JDBC[JDBC#40;Thrift Server#41;] | ||
User[User | User[User#40;Data Engineer#41;] | ||
Spark[Apache Spark] | Spark[Apache Spark] | ||
Scheduler --2: trigger--> ETLE | Scheduler --2: trigger--> ETLE | ||