Tuesday, November 26, 2013

2013-11-26 Tuesday - Deployment Optimization



A nice little nugget of a problem was handed to me today: identify ways to help an operations team reduce their system maintenance / deployment window [for production system updates], which has somehow grown to xx hours, and achieve zero downtime (or as close to it as possible).

The environment is complicated in the extreme: a highly regulated industry, compliance requirements, clustered servers, high availability, PCI security zones, third-party software/service providers, cloud service providers/integrations (SaaS and PaaS), frequent commercial software upgrades/patches, vendor constraints on database schema changes, disaster recovery dependencies, and a legion of upstream and downstream data integration dependencies.

For the last year I've been carefully planting seeds of certain ideas in various conversations with key stakeholders within an organization - to begin the gradual introduction of concepts and practices such as DevOps, Continuous Deployment, and Continuous Operations. Now that a sufficient level of pain has been experienced, there is a broad consensus and acceptance that there needs to be change.

"He was not in a hurry, 'hurry' being one human concept he had failed to grok at all. He was sensitively aware of the key importance of correct timing in all acts — but with the Martian approach: correct timing was accomplished by waiting."
Stranger in a Strange Land, by Robert A. Heinlein

I have some ideas, but as a good researcher, my first order of business is to review current directions, trends, and peer articles. This post will be a place for me to share some of the information that may be of interest to others:

Zero Downtime, Instant Deployment and Rollback
http://www.ebaytechblog.com/2013/11/21/zero-downtime-instant-deployment-and-rollback/

Jevgeni Kabanov (ZeroTurnaround)
Pragmatic Continuous Delivery, at W-JAX 2012
http://vimeo.com/79959315

Continuous Operations for Zero Downtime Deployments
http://www.virtualizationpractice.com/continuous-operations-for-zero-downtime-deployments-22680/

The Virtualization Practice
http://www.virtualizationpractice.com/

Deploying the Netflix API
http://techblog.netflix.com/2013/08/deploying-netflix-api.html


Cloud Architecture Tutorial
Constructing Cloud Architecture the Netflix Way
Gluecon May 23rd, 2012, by Adrian Cockcroft
http://www.slideshare.net/adrianco/netflix-architecture-tutorial-at-gluecon

Cassandra in the Netflix Architecture, Denis Sheahan
CassandraEU London March 28th, 2012
http://www.slideshare.net/acunu/cassandra-eu-2012-netflixs-cassandra-architecture-and-open-source-efforts

Patterns for Continuous Delivery, Reactive, High Availability, DevOps and Cloud Native Open Source with Netflix OSS
Adrian Cockcroft + Ben Christensen, YOW! Workshop, Dec 2013
https://speakerdeck.com/adrianco/patterns-for-continuous-delivery-reactive-high-availability-devops-and-cloud-native-open-source-with-netflixoss

Best Practices for Zero Risk, Zero Downtime Database Maintenance
http://www.oracle.com/us/products/database/311390-133499.pdf

VMware vSphere High Availability 5.0 Deployment Best Practices
http://www.vmware.com/files/pdf/techpaper/vmw-vsphere-high-availability.pdf

Free Ebook: Continuous Delivery — What It Is and How to Get Started
http://info.puppetlabs.com/download-free-continuous-delivery-ebook.html

The Phoenix Project, A Novel About IT, DevOps & Helping Your Business Win
http://www.amazon.com/Phoenix-Project-DevOps-Helping-Business/dp/0988262592/

How Draw Something Scaled to 50 million New Users, in 50 Days, with Zero Downtime
http://www.infoq.com/presentations/games-scalability-omgpop

I Ain't Afraid of No Downtime: Scaling Continuous Deployment, by Cody Powell
http://www.codypowell.com/taods/2012/04/i-aint-afraid-of-no-downtime-scaling-continuous-deployment.html

Mandi Walls' free ebook, Building a DevOps Culture [Kindle]
http://www.amazon.com/Building-DevOps-Culture-Mandi-Walls-ebook/dp/B00CBM1WFC

Daily Dose of DevOps: 27 People to Follow on Twitter
http://puppetlabs.com/blog/daily-dose-devops-27-people-follow


Selected QCON 2013 San Francisco presentations:

Adopting Continuous Delivery, Adjusting your Architecture
Rachel Laycock, ThoughtWorks
http://qconsf.com/system/files/presentation-slides/Adopting%20Continuous.pdf

Build Your Own PaaS the Netflix Way
Sudhir Tonse, Manager, Cloud Platform Infrastructure, Netflix
http://qconsf.com/system/files/presentation-slides/BuildYourOwnPaaSTheNetflixWay-QConSF.pdf

Facebook Infrastructure
Pedro Canahuati, Director, Infrastructure Operations
http://qconsf.com/system/files/presentation-slides/ScalingtheOperationsOrganizationatFacebook.pdf

Tools:
Liquibase:
  • Improved checksum performance
  • CORE-1509: Significantly decreased memory usage, especially with large sql files
  • CORE-1533: Performance improvements in dropAll

ZeroTurnaround's LiveRebel:

log4j2:
  • "Log4j 2 can automatically reload its configuration upon modification"
  • "Log4j 2 contains next-generation Asynchronous Loggers based on the LMAX Disruptor library. In multi-threaded scenarios Asynchronous Loggers have 10 times higher throughput and orders of magnitude lower latency than Log4j 1.x"

  • Note the performance benchmark results recently posted on takipiblog.com


Puppet Labs:
