Dl/devops
From stonehomewiki
Purpose
- Fix problems promptly to make sure the team delivers what it promised with high quality.
Best Practices
Always have a runbook or SOP
For any typical DevOps task, make sure you have well-documented runbooks or SOPs. Doing so:
- Helps new team members ramp up quickly.
- Creates a standard way of solving the same type of problem, which reduces human mistakes.
- Prevents people from wasting time, since they only need to follow the runbook or SOP.
- Helps the team predict the workload and the time needed to resolve DevOps issues, based on the runbooks or SOPs.
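As a rough illustration of the last point, the sketch below estimates weekly operational workload from runbooks. Everything in it is hypothetical: the `Runbook` class, the `estimate_hours` helper, the runbook names, the per-run durations, and the expected run counts are made up for the example and do not describe any real team.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Runbook:
    name: str
    minutes_per_run: float  # assumed typical time to follow the runbook once


def estimate_hours(runbooks: List[Runbook], expected_runs: Dict[str, int]) -> float:
    """Sum (time per run x expected number of runs) over all runbooks, in hours."""
    total_minutes = sum(
        rb.minutes_per_run * expected_runs.get(rb.name, 0) for rb in runbooks
    )
    return total_minutes / 60


# Hypothetical numbers, for illustration only.
runbooks = [
    Runbook("restart_pipeline", 15),
    Runbook("rotate_credentials", 30),
]
expected_runs = {"restart_pipeline": 8, "rotate_credentials": 2}

weekly_hours = estimate_hours(runbooks, expected_runs)
print(f"Estimated operational workload: {weekly_hours} hours per week")
```

An estimate like this is only as good as the per-run durations recorded in the runbooks, so it is worth updating them whenever a task takes noticeably longer or shorter than documented.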
Always try to fix the root cause
Operational failures often reveal weaknesses in the underlying product. In such cases, also consider fixing the product itself and improving its robustness, so that it is more resistant to operational failure.
Make sure any failures are reported
For example, any data pipeline failure that requires a human to restart the pipeline should have a ticket cut. A failure that occurs without the team being notified must be avoided.
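One way to make sure a failure never goes unreported is to wrap each pipeline run so that any exception is logged and a ticket is cut automatically. The following is a minimal sketch, not a definitive implementation: `run_with_reporting` is a hypothetical helper, and `cut_ticket` is a placeholder callback standing in for whatever ticketing system the team uses. Here it just records tickets in a list.

```python
import logging
from typing import Callable, List, Tuple

logger = logging.getLogger("pipeline")


def run_with_reporting(
    name: str,
    pipeline: Callable[[], None],
    cut_ticket: Callable[[str, str], None],
) -> bool:
    """Run a pipeline; on failure, log the error and cut a ticket.

    Returns True if the pipeline succeeded, False otherwise.
    """
    try:
        pipeline()
    except Exception as exc:
        # Log the full traceback, then notify the team through a ticket.
        logger.exception("Pipeline %s failed", name)
        cut_ticket(f"Pipeline {name} failed", f"{type(exc).__name__}: {exc}")
        return False
    return True


# Hypothetical stand-in for a real ticketing client.
tickets: List[Tuple[str, str]] = []


def record_ticket(title: str, body: str) -> None:
    tickets.append((title, body))


def broken_pipeline() -> None:
    raise RuntimeError("source table missing")


ok = run_with_reporting("daily_load", broken_pipeline, record_ticket)
print(ok, tickets)
```

Passing the ticketing function in as a parameter keeps the wrapper independent of any particular ticketing tool and makes it easy to test.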