Service Outages Illustrate Need For 3rd Party SLAs

This post was originally published on the Rigor Web Performance blog.

It is still early in 2015, but there have already been numerous instances of web technology platforms experiencing service disruptions lasting over an hour. These disruptions often have a ripple effect across the internet, impacting anyone who integrates with the service.

A couple of weeks ago, I wrote an article detailing one such outage related to New Relic’s RUM beacon. Earlier this year, a Facebook outage caused major service interruptions for partners such as Instagram and Vimeo, among many others. A Facebook spokesperson told The Verge that the outage “occurred after we introduced a change that affected our configuration systems.”

Outages with this kind of wide-ranging impact bring to light the need for organizations to have service level agreements (SLAs) with all 3rd-party integrations and services that they host. These agreements are already commonplace for “mission critical” services such as cloud infrastructure, CDNs, and ISPs. However, seemingly harmless integrations with services such as social media networks, ad-serving networks, trackers, or marketing analytics plugins have largely escaped this practice.

These “harmless” integrations can quickly become the source of serious performance problems for your website or web application. This is why I encourage sites to have clearly defined SLAs in place for all 3rd parties that they host and to monitor these services with a neutral 3rd-party monitoring platform that reports on the availability of the service.

Right now, most 3rd parties do not offer real-time monitoring of their scripts, and the few that do rely on their own internal monitoring. At Rigor, we often advise customers to be wary of reports on vendor performance that originate from the vendor itself. We operate under the simple mantra “trust, but verify”.

When working with 3rd-party providers that require hosted scripts on your site, here are some items to discuss when negotiating a clear SLA:

  • An annual percentage uptime guarantee
  • A process for reimbursing site owners if uptime drops below the guarantee
  • A neutral 3rd-party monitoring platform to report on the availability of the service
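
To make the first two items concrete, here is a minimal Python sketch of how an annual uptime guarantee translates into a downtime budget, and how results from a monitoring platform could be checked against it. The function names, the 99.9% figure, and the idea of counting failed checks are illustrative assumptions, not terms from any particular vendor's SLA or any monitoring product's API.

```python
# Hypothetical helpers for reasoning about an annual uptime guarantee.
# All names and numbers here are illustrative assumptions.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(uptime_guarantee_pct: float) -> float:
    """Yearly downtime budget implied by an annual uptime guarantee."""
    return MINUTES_PER_YEAR * (1 - uptime_guarantee_pct / 100)


def measured_uptime_pct(failed_checks: int, total_checks: int) -> float:
    """Uptime percentage observed by a monitor that runs periodic checks."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return 100 * (total_checks - failed_checks) / total_checks


def sla_breached(failed_checks: int, total_checks: int,
                 uptime_guarantee_pct: float) -> bool:
    """True when observed uptime falls below the guaranteed percentage."""
    return measured_uptime_pct(failed_checks, total_checks) < uptime_guarantee_pct


if __name__ == "__main__":
    # Assumed example: a 99.9% guarantee and 20 failed checks out of 10,000.
    print(round(allowed_downtime_minutes(99.9), 1), "minutes of downtime allowed per year")
    print("SLA breached:", sla_breached(20, 10_000, 99.9))
```

Under these assumptions, a 99.9% annual guarantee leaves a budget of roughly 525.6 minutes (a little under nine hours) per year, so a single multi-hour outage can consume a large share of it. The reimbursement terms themselves are a matter for negotiation and are deliberately left out of the sketch.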

Hopefully, as site owners become more aware of the impact of “harmless” 3rd-party integrations and services, the demand for properly optimized scripts, improved monitoring and reporting, and greater organizational accountability will rise.

Published by

Chapman Lever

The Most Interesting Man in the World
