Dl/DataType: Difference between revisions
From stonehomewiki
Stonezhong (talk | contribs) |
= Considerations =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview">Referencing a type</div>
<div class="mw-collapsible-content">
A type can be referenced in the format <code>{"$ref": "#/$defs/TYPE_ID"}</code>, where <code>TYPE_ID</code> is the id of the referenced type.
For example, the following references the type whose id is "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa":
<pre><nowiki>
{"$ref": "#/$defs/aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"}
</nowiki></pre>
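The lookup implied by this format can be sketched in Python as follows. This is a minimal sketch, not the actual implementation: the function name <code>resolve_ref</code>, the in-memory <code>defs</code> dictionary, and the "number" definition stored in it are assumptions made for illustration.

```python
from typing import Any

REF_PREFIX = "#/$defs/"


def resolve_ref(ref: dict[str, Any], defs: dict[str, Any]) -> Any:
    """Return the type definition that a {"$ref": ...} object points to."""
    pointer = ref["$ref"]
    if not pointer.startswith(REF_PREFIX):
        raise ValueError(f"unsupported reference: {pointer!r}")
    # Everything after the prefix is the id of the referenced type.
    type_id = pointer[len(REF_PREFIX):]
    if type_id not in defs:
        raise KeyError(f"unknown type id: {type_id!r}")
    return defs[type_id]


# Hypothetical registry: the definition stored under this id is invented.
defs = {"aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa": {"type": "number"}}
resolved = resolve_ref(
    {"$ref": "#/$defs/aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"}, defs
)
```

Rejecting references that do not start with <code>#/$defs/</code> keeps the sketch limited to the one format described above.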
</div>
</div>
<p></p>
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview">Possible extension of the JSON schema to support more primitive types</div>
<div class="mw-collapsible-content">
JSON Schema supports only a few primitive types, such as string, number, and integer. However, many systems support more primitive types than JSON Schema does. For example, a Parquet file can have "timestamp" columns, but "timestamp" is not a primitive type in the JSON Schema specification. We will extend the JSON Schema specification with more common primitive types so that it fits many of the DataLake use cases.
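
One way such an extension could look is a table of extra primitive type names checked alongside the core ones. The sketch below is only an assumption about the design: the names <code>CORE_TYPES</code>, <code>EXTENDED_TYPES</code>, and <code>check_primitive</code> are hypothetical, and it assumes timestamp values are carried as ISO 8601 strings.

```python
from datetime import datetime
from typing import Any, Callable


def _is_timestamp(value: Any) -> bool:
    # Assumption: timestamps are serialized as ISO 8601 strings.
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
    except ValueError:
        return False
    return True


# A few primitive types that JSON Schema already defines.
# bool is excluded explicitly because it is a subclass of int in Python.
CORE_TYPES: dict[str, Callable[[Any], bool]] = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
}

# Hypothetical extension: primitive types JSON Schema does not define.
EXTENDED_TYPES: dict[str, Callable[[Any], bool]] = {
    "timestamp": _is_timestamp,
}


def check_primitive(type_name: str, value: Any) -> bool:
    """Return True if value conforms to the named primitive type."""
    checkers = {**CORE_TYPES, **EXTENDED_TYPES}
    if type_name not in checkers:
        raise ValueError(f"unknown primitive type: {type_name!r}")
    return checkers[type_name](value)
```

Keeping the extended types in a separate table makes it easy to see which names go beyond the JSON Schema specification.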
</div>
</div>
<p></p>
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview">Description field helps to improve data discovery</div>
<div class="mw-collapsible-content">
We build a full-text index on the description field. This allows users to search for a type or field by keywords that appear in its description, which improves data discovery.
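
To illustrate the idea, here is a toy in-memory inverted index over description text. It is a sketch under stated assumptions, not the indexing mechanism actually used: the names <code>build_index</code> and <code>search</code> and the entry names "point", "point.x", and "point.y" are hypothetical, and a real deployment would more likely rely on a dedicated search engine.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Split text into lowercase alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(descriptions: dict[str, str]) -> dict[str, set[str]]:
    """Map each keyword to the names of the types or fields that mention it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, description in descriptions.items():
        for token in tokenize(description):
            index[token].add(name)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the names whose description contains every keyword in query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Hypothetical entries: names are invented, descriptions mirror this page.
descriptions = {
    "point": "Represent a point on a two dimensional canvas",
    "point.x": "The x coordinate value, measured in miles",
    "point.y": "The y coordinate value, measured in miles",
}
index = build_index(descriptions)
```

With this index, <code>search(index, "miles")</code> matches both coordinate fields, while <code>search(index, "canvas")</code> matches only the type itself.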
</div>
</div>
<p></p>
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview">Schema definition can also include constraints</div>
<div class="mw-collapsible-content">
These constraints help us validate data uniformly, in a content-agnostic way.
For example, the following represents a structure with two number member fields, "x" and "y":
<pre><nowiki>
{
    "id": "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb",
    "type": "object",
    "properties": {
        "x": {
            "type": {"$ref": "#/$defs/aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"},
            "description": "The x coordinate value, measured in miles",
            "minimum": 0.0
        },
        "y": {
            "type": {"$ref": "#/$defs/aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"},
            "description": "The y coordinate value, measured in miles",
            "minimum": 0.0
        }
    },
    "description": "Represents a point on a two-dimensional canvas"
}
</nowiki></pre>
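A content-agnostic check of such a constraint might look like the sketch below. It is only an illustration: <code>validate_object</code> is a hypothetical helper that handles nothing but the <code>minimum</code> constraint and does not resolve the <code>$ref</code> of each field.

```python
from typing import Any


def validate_object(schema: dict[str, Any], value: dict[str, Any]) -> list[str]:
    """Return a list of constraint violations; an empty list means valid."""
    errors: list[str] = []
    for name, field_schema in schema.get("properties", {}).items():
        if name not in value:
            continue  # This sketch does not treat missing fields as errors.
        field_value = value[name]
        minimum = field_schema.get("minimum")
        if minimum is not None and field_value < minimum:
            errors.append(f"{name}: {field_value} is less than minimum {minimum}")
    return errors


# The point structure from the example above, as a Python dictionary.
point_schema = {
    "id": "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb",
    "type": "object",
    "properties": {
        "x": {
            "type": {"$ref": "#/$defs/aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"},
            "description": "The x coordinate value, measured in miles",
            "minimum": 0.0,
        },
        "y": {
            "type": {"$ref": "#/$defs/aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"},
            "description": "The y coordinate value, measured in miles",
            "minimum": 0.0,
        },
    },
    "description": "Represents a point on a two-dimensional canvas",
}

ok_errors = validate_object(point_schema, {"x": 1.0, "y": 2.0})
bad_errors = validate_object(point_schema, {"x": -1.0, "y": 2.0})
```

Because the validator reads the constraint from the schema rather than from the data, the same function works for any object type that declares a <code>minimum</code>.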
</div>
</div>
<p></p>