Dl/devops: Difference between revisions
From stonehomewiki
Stonezhong (talk | contribs) |
Stonezhong (talk | contribs) No edit summary |
| (5 intermediate revisions by the same user not shown) | |||
| Line 1: | Line 1: | ||
<p> [[dl/home|Data Lake Knowledge Center]] </p> | |||
= Purpose = | = Purpose = | ||
<div class="toccolours mw-collapsible mw-collapsed expandable"> | <div class="toccolours mw-collapsible mw-collapsed expandable"> | ||
<div class="mw-collapsible-preview"></div> | <div class="mw-collapsible-preview"></div> | ||
<div class="mw-collapsible-content"> | <div class="mw-collapsible-content"> | ||
* Fix problems promptly to make sure the team | * Fix problems promptly to make sure the team delivers what it promised with high quality. | ||
</div> | </div> | ||
</div> | </div> | ||
| Line 14: | Line 16: | ||
For any typical DevOps task, make sure the runbooks or SOPs are well documented. This helps: | For any typical DevOps task, make sure the runbooks or SOPs are well documented. This helps: | ||
* | * New team members ramp up quickly. | ||
* | * Create a standard way of solving the same type of problem, which reduces human mistakes. | ||
* | * Prevent people from wasting time, since they only need to follow the runbooks or SOPs. | ||
* | * The team predict the workload and the time needed to solve DevOps issues, based on the runbooks or SOPs. | ||
</div> | |||
</div> | |||
<p></p> | |||
<div class="toccolours mw-collapsible mw-collapsed expandable"> | |||
<div class="mw-collapsible-preview">Always try to fix the root cause</div> | |||
<div class="mw-collapsible-content"> | |||
Operational failures often reveal weaknesses in the underlying product. In such cases, also consider fixing the underlying product and improving its robustness so that it is more resistant to operational failure. | |||
</div> | |||
</div> | |||
<p></p> | |||
<div class="toccolours mw-collapsible mw-collapsed expandable"> | |||
<div class="mw-collapsible-preview">Make sure any failures are reported</div> | |||
<div class="mw-collapsible-content"> | |||
For example, any data pipeline failure that requires a human to restart the pipeline should have a ticket cut. A failure that occurs without the team being notified must be avoided. | |||
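
One way to make failure reporting hard to skip is to wrap every pipeline run in a helper that cuts a ticket before the error propagates. The following is a minimal sketch, not a prescribed implementation: the names `run_with_failure_report` and `cut_ticket` are hypothetical, and `cut_ticket` stands in for whatever ticketing system the team actually uses.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")

logger = logging.getLogger("pipeline")


def run_with_failure_report(
    name: str,
    step: Callable[[], T],
    cut_ticket: Callable[[str, str], None],
) -> T:
    """Run a pipeline step; on failure, log it, cut a ticket, and re-raise.

    `cut_ticket(title, body)` is a hypothetical callback that files a ticket
    in the team's ticketing system.
    """
    try:
        return step()
    except Exception as exc:
        # Record the full traceback so the ticket can be investigated later.
        logger.exception("Pipeline %s failed", name)
        cut_ticket(f"Pipeline {name} failed", f"{type(exc).__name__}: {exc}")
        # Re-raise so the scheduler also sees the run as failed.
        raise
```

Re-raising after the ticket is cut keeps the failure visible to the scheduler as well, so a failed run is never silently marked as successful.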
</div> | </div> | ||
</div> | </div> | ||
<p></p> | <p></p> | ||