Uncovering bugs in Distributed Storage Systems during Testing (not in production!) – Deligiannis et al. 2016
We interviewed technical leaders and senior managers in Microsoft Azure regarding the top problems in distributed system development. The consensus was that one of the most critical problems today is how to improve testing coverage so that bugs can be uncovered during testing and not in production. The need for better testing techniques is not specific to Microsoft; other companies such as Amazon and Google, have acknowledged that testing methodologies have to improve to be able to reason about the correctness of increasingly more complex distributed systems that are used in production.
The AWS team used formal methods with TLA+, which was highly effective but falls short of checking the actual executable code. The Microsoft IronFleet team used the Dafny language and program verifier to verify system correctness and compile it to…
View original post 1,019 more words