Site Reliability Engineering (SRE) with Niall Murphy

November 26, 2019

It seems Site Reliability Engineering (SRE) is everywhere. But what exactly is it and how does it fit into the modern software development landscape?

On Nov 26th, we have the honor of hosting a one-day session on SRE with one of the masterminds behind the growing movement, Niall Murphy.

Site Reliability Engineering - the Run part of DevOps

We've all heard the DevOps mantra "you build it, you run it". It's a great idea, but what does it mean in practice?

In June we had Donovan Brown here to share how he and Microsoft do DevOps, with a focus on software engineering and how to build and ship great products faster. Now we have the opportunity to take that conversation to the next stage! This time we have invited Niall Murphy, the "father" of SRE, to share what SRE is. Having influenced how Google and Microsoft have adopted SRE to improve the way they operate their cloud services, Niall will give us insights into how we can run our services better as well.

Here's a quick summary of what SRE can include:

  • Systems engineering and automation
  • Measuring service level objectives
  • Monitoring systems
  • Release management (at scale without downtime)
  • On-call procedures
  • Incident management

So join us for an interesting talk on how you can take running your systems to the next level!

Agenda

08.30 - 09.00, Breakfast

09.00 - 10.30, Presentations

SRE for small and large companies

SRE is often perceived as a useful but relatively narrow role, appropriate only for large-scale systems engineering in very large organizations and irrelevant to everyone else. We hope to convince you that, at its core, SRE contains a set of principles that apply as readily to a single-person startup as they do to Google. Along the way, we'll try to present some evidence, which even your boss might find compelling, for why your organization should adopt some standard SRE practices.

10.30 - 11.00, Discussions, wrap-up

About Niall

Niall Murphy is the global head of Azure Site Reliability Engineering (SRE) in Microsoft’s Dublin, Ireland office, from which he leads teams across the world looking after critical components of the Azure cloud. He has worked in Internet infrastructure since the mid-1990s, is the author/co-author of numerous best-selling and award-winning books (Site Reliability Engineering most notably), and is probably one of the few people in the world who holds degrees in Computer Science, Mathematics, and Poetry Studies. He lives in Dublin, Ireland, with his wife and two special needs children.

Make sure to check out Niall's book on SRE: Site Reliability Engineering: How Google Runs Production Systems, https://landing.google.com/sre/books/

Venue

The session will be held at Hobo Hotel in central Stockholm, at Brunkebergstorg 4. The event is free for anyone to attend; just register by clicking the Register button at the top of this page.