Dl/glossary: Difference between revisions
From stonehomewiki
Stonezhong (talk | contribs) No edit summary |
| Line 34: | Line 34: | ||
<div class="mw-collapsible-content"> | <div class="mw-collapsible-content"> | ||
A URI that uniquely identifies an asset, for example: | A URI that uniquely identifies an asset, for example: | ||
* <code> | * <code>asset://s3/bucket_name/foo.parquet</code> -- represents a Parquet file stored in AWS S3 | ||
* <code> | * <code>asset://mysql/myserver/mydb/foo</code> -- represents a table in MySQL: the server name is myserver, the database name is mydb, and the table name is foo | ||
* <code> | * <code>asset://mysql/myserver/mydb/foo/?batch_id=1&</code> -- represents the same MySQL table (server myserver, database mydb, table foo) with a filter: only rows whose batch_id column equals 1 | ||
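To illustrate how the example URIs above break down, here is a minimal Python sketch that splits an asset URI into its storage type, path segments, and filters. The function name <code>parse_asset_uri</code> and the returned tuple layout are hypothetical, chosen only for this illustration; this glossary entry does not define an official parser.

```python
from urllib.parse import urlsplit, parse_qs


def parse_asset_uri(uri):
    """Split an asset URI into (store, path segments, filters).

    Hypothetical helper for illustration only; the tuple layout is an
    assumption, not something defined by the glossary.
    """
    parts = urlsplit(uri)
    if parts.scheme != "asset":
        raise ValueError(f"not an asset URI: {uri!r}")

    # The first component after "asset://" names the storage type, e.g. s3 or mysql.
    store = parts.netloc

    # Remaining path components, ignoring empty pieces from extra slashes.
    segments = [s for s in parts.path.split("/") if s]

    # Query parameters act as filters; keep the first value given for each key.
    filters = {key: values[0] for key, values in parse_qs(parts.query).items()}

    return store, segments, filters


if __name__ == "__main__":
    print(parse_asset_uri("asset://s3/bucket_name/foo.parquet"))
    print(parse_asset_uri("asset://mysql/myserver/mydb/foo/?batch_id=1&"))
```

In this sketch the trailing <code>&</code> in the last example is harmless, because <code>parse_qs</code> skips empty query fields by default.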
</div> | </div> | ||
</div> | </div> | ||
| Line 45: | Line 45: | ||
<div class="mw-collapsible-preview">Dataset</div> | <div class="mw-collapsible-preview">Dataset</div> | ||
<div class="mw-collapsible-content"> | <div class="mw-collapsible-content"> | ||
It is a | It is a set of dataframes that share a common schema. | ||
* dataset name is not unique, but name + major_version + minor_version is unique | * dataset name is not unique, but name + major_version + minor_version is unique | ||