The James Webb Space Telescope — making 300 points of failure reliable

If you asked any Site Reliability or DevOps engineer how they feel about a deployment plan with over 300 single points of failure, you’d see a lot of nauseous faces and an outbreak of nervous tics! However, NASA has decided that the best way to design, deploy and operate the JWST depends on more than 300 steps succeeding, one after the other, with no recourse or second chance. How did we get into this situation and (more importantly) why are we confident that this will succeed?


If they gave me years too, or however long a NASA project takes, then sure, I could manage 300 points of failure as well.