Dl/DataType
From stonehomewiki
Revision as of 04:56, 23 August 2023 by Stonezhong
Introduction
Definition
A DataType object describes the schema of nested data.
Fields
id: UUID
Primary key
type: str
The type of the data, for example "integer", "number", or "object". Type names follow the JSON Schema standard (https://json-schema.org/).
properties: Optional[dict]
If type is "object", this field lists all properties.
description: Optional[str]
Human-readable documentation for this type.
items: Optional[DataType]
If type is "array", this specifies the array element type
Examples:
# This represents a "string" type
{
    "id": "617b2e86-9698-4b99-8956-57ce99d8de39",
    "type": "string"
}
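To show how the fields compose for nested data, here is a minimal Python sketch that models DataType as a dataclass and builds an "object" type containing an "array" property. The class layout, the to_dict helper, and the "symbol" and "prices" property names are assumptions made for illustration and are not taken from the data lake's code. The sketch also assumes that properties maps property names to DataType objects and that items may be omitted for non-array types.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataType:
    """Hypothetical in-memory model of the DataType fields listed above."""

    type: str
    properties: Optional[dict[str, DataType]] = None
    description: Optional[str] = None
    items: Optional[DataType] = None
    # Assumption: a random UUID is generated when no id is supplied.
    id: uuid.UUID = field(default_factory=uuid.uuid4)

    def to_dict(self) -> dict:
        """Serialize to a plain dict, omitting fields that are not set."""
        result = {"id": str(self.id), "type": self.type}
        if self.description is not None:
            result["description"] = self.description
        if self.properties is not None:
            result["properties"] = {
                name: prop.to_dict() for name, prop in self.properties.items()
            }
        if self.items is not None:
            result["items"] = self.items.to_dict()
        return result


# An object with a string property and an array-of-number property.
stock_quote = DataType(
    type="object",
    description="One stock quote record",
    properties={
        "symbol": DataType(type="string"),
        "prices": DataType(type="array", items=DataType(type="number")),
    },
)
doc = stock_quote.to_dict()
```

Each nested DataType carries its own id in this sketch. Whether nested types are stored inline or referenced by id is not stated on this page.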
Considerations
url should have enough information to locate the data
The url field should have enough information for the user to locate the data. For example, "s3://mubucket/stock_quotes/2023-08-20.jsonl" is a good url if your datalake lives in only one AWS region. If your datalake spans multiple AWS regions, you should put the region ID in the url so you know which region the bucket belongs to.
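One possible way to carry the region ID is sketched below. The page does not prescribe a url layout, so the "?region=..." query parameter, the DataLocation tuple, and the parse_data_url helper are all hypothetical conventions chosen for this example.

```python
from typing import NamedTuple, Optional
from urllib.parse import parse_qs, urlsplit


class DataLocation(NamedTuple):
    """Hypothetical result type: where a piece of data lives."""

    bucket: str
    key: str
    region: Optional[str]


def parse_data_url(url: str) -> DataLocation:
    """Split an s3 url into bucket, key, and an optional region.

    Assumption: the region, when present, is passed as a
    "region" query parameter appended to the url.
    """
    parts = urlsplit(url)
    if parts.scheme != "s3":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    region_values = parse_qs(parts.query).get("region")
    region = region_values[0] if region_values else None
    return DataLocation(
        bucket=parts.netloc,
        key=parts.path.lstrip("/"),
        region=region,
    )


# Single-region datalake: the plain url is enough.
single = parse_data_url("s3://mubucket/stock_quotes/2023-08-20.jsonl")

# Multi-region datalake: the url also names the region.
multi = parse_data_url(
    "s3://mubucket/stock_quotes/2023-08-20.jsonl?region=us-west-2"
)
```

Keeping the region inside the url means a consumer needs no extra lookup to decide which regional endpoint to call.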