16 Nov 2018

2nd European Chaos Engineering Day - Wednesday 5 December


The goal of the workshop is to gather the European chaos engineering community. You will see great talks, learn about chaos engineering and advanced DevOps technology, and meet cool people.

Overview of the program (long version below):

  • Lorin Hochstein, Senior Software Engineer at Netflix, Distinguished keynote speaker, What I’ve learned doing chaos at Netflix
  • Philipp Leitner, Assistant Professor at Chalmers University, Gothenburg, AWS Lambda and #serverless. What’s all the fuzz about?
  • Nazareno V. Feito Matias, Principal engineer at Oracle, Chaos Engineering at Oracle
  • Long Zhang, PhD student in computer science at KTH Royal Institute of Technology, A Chaos Engineering System for Live Analysis and Falsification of Exception-handling in the JVM
  • Benoit Baudry, Professor at KTH Royal Institute of Technology and Director of the Castor Software Research Center, Diversity and Chaos

Participation is free, register by RSVP at.

Program:

  • Lorin Hochstein, Senior Software Engineer at Netflix, Distinguished keynote speaker
    • Title: What I’ve learned doing chaos at Netflix
    • Abstract: Fred Brooks once observed that “the programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination.” And yet, software systems fail in ways that simultaneously defy the imaginations of their creators and are strikingly similar to failures in physical systems. In this talk, I’ll share what I’ve learned about failures in distributed systems based on my experiences working at Netflix. I’ll discuss the strategies we employed for building and applying chaos engineering tools to find vulnerabilities, which strategies worked well, and which ones did not go as well as we hoped.
    • Bio: Lorin Hochstein is a Senior Software Engineer on the CORE (Cloud Operations & Reliability Engineering) Team at Netflix, where he works on ensuring that Netflix remains available. He was previously Senior Software Engineer at SendGrid Labs, Lead Architect for Cloud Services at Nimbis Services, Computer Scientist at the University of Southern California’s Information Sciences Institute, and Assistant Professor in the Department of Computer Science and Engineering at the University of Nebraska–Lincoln. He has a PhD in computer science from the University of Maryland.
  • Philipp Leitner, Assistant Professor at Chalmers University, Gothenburg
    • Title: AWS Lambda and #serverless. What’s all the fuzz about?
    • Abstract: To some, AWS Lambda and other “serverless” technologies embody the future of cloud computing. Yet non-trivial industrial success stories are, at least today, few and far between. In this talk we will explore the idea of serverless computing, and discuss (based on recent research results) promises and challenges in industrial adoption. We’ll introduce the “serverless mindset”, and discuss what kinds of applications lend themselves well to being built on top of AWS Lambda. Finally, we will also glimpse into the future of serverless, and discuss some secret (and some not-so-secret) plans that cloud providers have for serverless.
  • Benoit Baudry, Professor at KTH Royal Institute of Technology and Director of the Castor Software Research Center
    • Title: Diversity and Chaos
  • Nazareno V. Feito Matias, Principal engineer at Oracle
    • Title: Chaos Engineering at Oracle
    • Abstract: This talk is not centered on a tool that people may not use. Instead, it focuses on how to run chaos engineering experiments in a company with many environments, acquired products, and a plethora of different teams, and on how difficult it currently is to get management buy-in for chaos engineering. I will also cover things to consider when designing a chaos engineering tool or system (growing pains, basically). A chaos engineering tool does not only shut down servers; it can also have a certain intelligence, backed by mathematical models, applied statistics, and some machine learning. I might also demonstrate our chaos engineering tool, MadBull (similar to Chaos Monkey, but written in Python, 100% terminal/CLI, and targeting Oracle Cloud).
  • Long Zhang, PhD student in computer science at KTH Royal Institute of Technology
    • Title: A Chaos Engineering System for Live Analysis and Falsification of Exception-handling in the JVM
    • Abstract: I will introduce a novel design and implementation of a chaos engineering system in Java called ChaosMachine. It provides a unique and actionable analysis of exception-handling capabilities in production, at the level of try-catch blocks. I will then share some interesting evaluations of our approach, which reveal both strengths and weaknesses of the resilience code of a software system at the level of exception handling.

More information and registration