A nice little nugget of a problem was handed to me today: identify ways to help an operations team shrink its maintenance/deployment window for production system updates, which has somehow grown to xx hours, and achieve zero downtime (or as close to it as possible).
The environment is complicated in the extreme: a highly regulated industry, compliance requirements, clustered servers, high availability, PCI security zones, 3rd party software/service providers, cloud service providers/integrations (SaaS and PaaS), frequent commercial software upgrades/patches, vendor constraints on database schema changes, disaster recovery dependencies, and a legion of upstream and downstream data integration dependencies.
For the last year I've been carefully planting seeds of certain ideas in various conversations with key stakeholders within an organization, to begin the gradual introduction of concepts and practices such as DevOps, Continuous Deployment, and Continuous Operations. Now that a sufficient level of pain has been experienced, there is broad consensus that change is needed.
"He was not in a hurry, 'hurry' being one human concept he had failed to grok at all. He was sensitively aware of the key importance of correct timing in all acts - but with the Martian approach: correct timing was accomplished by waiting."
- Stranger in a Strange Land, by Robert A. Heinlein

I have some ideas, but as a good researcher, my first order of business is to review current directions, trends, and peer articles. This posting will be a place for me to share some of the information that may be of interest to others:
Zero Downtime, Instant Deployment and Rollback
Jevgeni Kabanov (ZeroTurnaround)
Pragmatic Continuous Delivery, at W-JAX 2012
Continuous Operations for Zero Downtime Deployments
The Virtualization Practice
Deploying the Netflix API
Best Practices for Zero Risk, Zero Downtime Database Maintenance
VMware vSphere High Availability 5.0 Deployment Best Practices
Free Ebook: Continuous Delivery — What It Is and How to Get Started