Dl/Product Release Checklist: Difference between revisions

* You should use a centralized logging system so it is easy to search logs from the different services and components of your product.
* Your logs should not contain sensitive information or offensive words.
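One way to keep sensitive values out of logs is to mask them before a record is written. The following is a minimal sketch using Python's standard <code>logging</code> module. The <code>SENSITIVE</code> pattern, the <code>redact</code> helper and the <code>RedactingFilter</code> class are hypothetical names chosen for illustration, and the list of keys to mask is an assumption that each product would need to adapt.

```python
import logging
import re

# Assumed pattern: mask values written as "password=...", "token=..." or "secret=...".
SENSITIVE = re.compile(r"(password|token|secret)=\S+", re.IGNORECASE)


def redact(text: str) -> str:
    """Replace the value of every sensitive key=value pair with ***."""
    return SENSITIVE.sub(lambda m: f"{m.group(1)}=***", text)


class RedactingFilter(logging.Filter):
    """Logging filter that masks sensitive values before a record is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the final message first so values passed as args are masked too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # keep the record; only its text is changed


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("demo")
    logger.addFilter(RedactingFilter())
    logger.info("login attempt user=%s password=%s", "bob", "hunter2")
```

The filter is attached to a logger here; attaching it to a handler instead would cover records from every logger that reaches that handler.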
</div>
</div>
<p></p>
= Metrics =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview"></div>
<div class="mw-collapsible-content">
* Your application should emit telemetry (time series) to a metrics system (e.g. Amazon CloudWatch).
* You should have a centralized UI for viewing telemetry from the different services and components of your product.
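As an illustration of what emitting a time series data point can look like, here is a minimal sketch that records the latency of a call. <code>MetricPoint</code>, <code>InMemorySink</code> and <code>timed</code> are hypothetical names; the in-memory sink is only a stand-in, and a real deployment would forward the points to the metrics system in use.

```python
import time
from dataclasses import dataclass, field


@dataclass
class MetricPoint:
    name: str
    value: float
    unit: str
    timestamp: float = field(default_factory=time.time)  # seconds since the epoch


class InMemorySink:
    """Stand-in for a real metrics backend; it just stores points in a list."""

    def __init__(self):
        self.points = []

    def emit(self, point: MetricPoint) -> None:
        self.points.append(point)


def timed(sink, name, func, *args, **kwargs):
    """Call func and emit its wall-clock latency in milliseconds to the sink."""
    start = time.perf_counter()
    try:
        return func(*args, **kwargs)
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        sink.emit(MetricPoint(name, elapsed_ms, "Milliseconds"))
```

Emitting in a <code>finally</code> block means the latency is recorded even when the call raises.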
</div>
</div>
<p></p>
= Alarms =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview"></div>
<div class="mw-collapsible-content">
* You should define alarms based on your telemetry.
* An alarm should be able to notify your DevOps team, for example via PagerDuty.
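A common way to define an alarm on a time series is to fire when a metric stays above a threshold for several consecutive data points. The sketch below shows that rule under stated assumptions: <code>Alarm</code> and <code>evaluate</code> are hypothetical names, and <code>notify</code> is any callable that delivers the message, standing in for a real paging integration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Alarm:
    name: str
    threshold: float
    periods: int  # consecutive breaching data points required to fire

    def is_firing(self, datapoints: Sequence[float]) -> bool:
        """True if the most recent `periods` data points all exceed the threshold."""
        recent = datapoints[-self.periods:]
        return len(recent) == self.periods and all(v > self.threshold for v in recent)


def evaluate(alarm: Alarm, datapoints: Sequence[float], notify: Callable[[str], None]) -> bool:
    """Check the alarm and send a notification if it is firing."""
    if alarm.is_firing(datapoints):
        notify(f"ALARM {alarm.name}: last {alarm.periods} data points above {alarm.threshold}")
        return True
    return False
```

Requiring several consecutive breaches, rather than a single one, is a design choice that trades detection speed for fewer false pages.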
</div>
</div>
<p></p>
= Security =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview"></div>
<div class="mw-collapsible-content">
* Security vulnerability assessment
** Make sure your product does not have security vulnerabilities
* Access Control
** Prevent unauthorized access to protected information.
*** Access could be "read", "write", "delete", "list", etc.
* Access Audit
** Make sure access to the product is tracked. The tracked information should include:
*** Who is accessing?
*** What kind of access? (read, write, delete, list, etc.)
*** When did the access happen?
*** What has been accessed?
** The access audit log should be organized so that it is easy to search.
** The access audit log should be retained for a reasonable period, and the retention period should comply with government regulations.
* SSO Authentication
** Your web UI should use SSO to authenticate users. An anti-pattern is to have your product maintain its own usernames and passwords (e.g. the current Airflow for Tier-1 and Tier-2).
*** Having 4~5 products that each maintain their own usernames and passwords is a nightmare!
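The four audit questions listed under Access Audit (who, what kind of access, when, and what was accessed) map naturally onto a structured log record. The following is a minimal sketch; <code>AuditEvent</code> and <code>audit_line</code> are hypothetical names, and writing one JSON object per line is an assumption made here because it tends to be easy to search.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    actor: str      # who is accessing
    action: str     # what kind of access: read, write, delete, list, ...
    resource: str   # what has been accessed
    timestamp: str  # when the access happened (ISO 8601, UTC)


def audit_line(actor: str, action: str, resource: str,
               now: Optional[datetime] = None) -> str:
    """Build one searchable JSON line describing a single access."""
    when = now or datetime.now(timezone.utc)
    event = AuditEvent(actor, action, resource, when.isoformat())
    return json.dumps(asdict(event), sort_keys=True)


if __name__ == "__main__":
    print(audit_line("alice", "read", "reports/q1.csv"))
```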
</div>
</div>
<p></p>
= Service Availability =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview"></div>
<div class="mw-collapsible-content">
* Highly Available
** Your service should be highly available. A common pattern is to have redundancy, so that if your active server goes down, your standby server can take over. The switch is expected to be automatic.
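The automatic switch described above can be reduced to a simple rule: route to the first server, in priority order, that passes a health check. The sketch below illustrates only that selection step. <code>pick_active</code> is a hypothetical name, and <code>is_healthy</code> is an assumed callable that a real system would back with an actual health probe.

```python
from typing import Callable, Optional, Sequence


def pick_active(servers: Sequence[str],
                is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first healthy server in priority order, or None if all are down."""
    for server in servers:
        if is_healthy(server):
            return server
    return None


if __name__ == "__main__":
    # Simulated health state: the active server is down, the standby is up.
    health = {"active": False, "standby": True}
    print(pick_active(["active", "standby"], lambda s: health[s]))
```

Running this selection on every health-check cycle, without a human in the loop, is what makes the failover automatic.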
</div>
</div>
<p></p>
= Capacity =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview"></div>
<div class="mw-collapsible-content">
* You should provision your service above its day-to-day capacity needs. For example, your service should be prepared to handle 200% of your normal traffic.
* You should hold capacity reviews regularly. A common practice is to review capacity every year and book capacity for the entire year (including predicted growth).
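The two capacity rules above combine into a small calculation: project next year's traffic from the predicted growth, then apply the 200% headroom. This is a minimal sketch; <code>required_capacity</code> and its parameter names are hypothetical, and applying growth before headroom is an assumption about how the two rules are meant to interact.

```python
import math


def required_capacity(normal_peak_rps: float,
                      headroom_factor: float = 2.0,
                      yearly_growth: float = 0.0) -> int:
    """Capacity (requests per second) to book for the coming year.

    headroom_factor=2.0 corresponds to handling 200% of normal traffic;
    yearly_growth=0.25 means traffic is predicted to grow by 25%.
    """
    projected = normal_peak_rps * (1.0 + yearly_growth)
    return math.ceil(projected * headroom_factor)
```

For example, a service with a normal peak of 1,000 requests per second and 25% predicted growth would book capacity for 2,500 requests per second under these assumptions.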
</div>
</div>
<p></p>
= Beta Environment =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview"></div>
<div class="mw-collapsible-content">
* Every product should have a beta environment.
* You should always deploy your change to the beta environment first and verify that nothing is broken before deploying to production.
</div>
</div>
<p></p>
= User facing document =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview"></div>
<div class="mw-collapsible-content">
* Every product should have a user-facing document.
* The user-facing document should be kept in sync with the product as it evolves.
</div>
</div>
<p></p>
= Design document =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview"></div>
<div class="mw-collapsible-content">
* Make sure you document your design.
* Make sure your design doc stays in sync whenever you change your design.
</div>
</div>
<p></p>
= CICD Pipeline =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview"></div>
<div class="mw-collapsible-content">
* Your product's development environment should support CI/CD.
* Any product using Python should have at least 80% code coverage (line-based, branch-based).
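A CI/CD pipeline can enforce the coverage requirement as a gate that fails the build when coverage drops too low. The sketch below shows the check itself, assuming the 80% minimum applies to line coverage and branch coverage separately, which is one reading of the rule above. <code>coverage_percent</code> and <code>passes_gate</code> are hypothetical names.

```python
def coverage_percent(covered: int, total: int) -> float:
    """Percentage covered; an empty total counts as fully covered."""
    return 100.0 if total == 0 else 100.0 * covered / total


def passes_gate(lines_covered: int, lines_total: int,
                branches_covered: int, branches_total: int,
                minimum: float = 80.0) -> bool:
    """True only if both line and branch coverage meet the minimum."""
    return (coverage_percent(lines_covered, lines_total) >= minimum
            and coverage_percent(branches_covered, branches_total) >= minimum)
```

In practice a coverage tool can usually enforce such a threshold itself; coverage.py, for instance, has <code>branch</code> and <code>fail_under</code> settings for this purpose.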
</div>
</div>
<p></p>
= Data Safety =
<div class="toccolours mw-collapsible mw-collapsed expandable">
<div class="mw-collapsible-preview"></div>
<div class="mw-collapsible-content">
* To protect against physical or logical data loss, you need to back up data periodically.
* A certain percentage of data loss is tolerable, since backups do not happen continuously.
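How much data loss a backup schedule implies can be estimated with simple arithmetic: in the worst case, everything written since the last backup is lost. The following sketch works that out as a fraction of the total data set. <code>worst_case_loss_fraction</code> and its parameters are hypothetical, and a constant write rate is assumed for simplicity.

```python
def worst_case_loss_fraction(backup_interval_hours: float,
                             new_data_per_hour: float,
                             total_data: float) -> float:
    """Fraction of the data set at risk if a failure hits just before the next backup.

    new_data_per_hour and total_data must use the same unit (for example GB).
    """
    if total_data <= 0:
        raise ValueError("total_data must be positive")
    at_risk = backup_interval_hours * new_data_per_hour
    return min(1.0, at_risk / total_data)
```

With daily backups, 1 GB of new data per hour and a 1,000 GB data set, up to 24 GB (2.4%) could be lost. Backing up more often lowers that figure proportionally.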
</div>
</div>
</div>
</div>
<p></p>
<p></p>

Latest revision as of 17:55, 7 February 2024
