Today’s post is about the AWS S3 outage reported by TechCrunch: https://techcrunch.com/2017/02/28/amazon-aws-s3-outage-is-breaking-things-for-a-lot-of-websites-and-apps/
These types of incidents are becoming more and more common. Tons of companies have placed a ton of faith in the Amazon ecosystem, and time and time again, it looks like Amazon has let them down. When these things break, they break at a MASSIVE scale (see “AWS outage knocks Amazon, Netflix, Tinder and IMDb in MEGA data collapse”, https://www.theregister.co.uk/2015/09/20/aws_database_outage/ ).
There were other outages in 2012 and 2013 (see http://research.omicsgroup.org/index.php/Amazon_Web_Services ), and probably more that are unlisted. I think it’s an interesting challenge that Amazon is tackling, but I feel like more and more of the web is putting all of its eggs into one giant basket.
I wonder: if we were to build a truly scalable system that is unlikely to be impacted by an outage like this, it might make sense to diversify the system’s infrastructure across multiple services. Maybe some redundancy at the DNS layer, then some more at the load balancer (LB), some more in how things are replicated and localized, and so on… Just something to reflect on after today’s outage: “How can I prevent my organization from being impacted by this?”
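
To make the "multiple services" idea a bit more concrete, here is a minimal sketch of application-level failover: try the primary storage provider first, and fall back to a replica hosted somewhere else if it fails. Everything in it is hypothetical. The provider names, `fetch_with_failover`, and the two stand-in fetch functions are made up for illustration and do not call any real SDK. It also assumes the same objects have already been replicated to both providers, which is the hard part in practice.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every configured provider failed to return the object."""


# A provider is a (name, fetch function) pair. The fetch function takes an
# object key and returns the object's bytes, or raises on any failure.
Provider = Tuple[str, Callable[[str], bytes]]


def fetch_with_failover(key: str, providers: Sequence[Provider]) -> Tuple[str, bytes]:
    """Try each provider in order and return (provider name, data) from the
    first one that succeeds."""
    errors: List[str] = []
    for name, fetch in providers:
        try:
            return name, fetch(key)
        except Exception as exc:  # any failure means "try the next provider"
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real storage clients (assumptions, not real APIs).
def broken_primary(key: str) -> bytes:
    raise ConnectionError("primary storage is unreachable")


def healthy_secondary(key: str) -> bytes:
    return f"contents of {key}".encode()
```

With those stand-ins, `fetch_with_failover("logo.png", [("primary", broken_primary), ("secondary", healthy_secondary)])` skips the failing primary and returns the copy from the secondary. A real setup would also want timeouts and health checks so a slow provider does not stall every request, and this only covers the storage layer, not DNS or the LB.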