Evaluating BFT Protocols for Spire

Henry Schuh & Sam Beckley

Our goal is to survey Byzantine Fault Tolerance protocol implementations, and identify the best option that meets the performance and scalability requirements of the Spire system. Benchmarking results, conclusions, and a discussion of next steps are available in the presentation slides. This project is for Advanced Distributed Systems and Networks (Spring 2017).

Overview

SCADA (Supervisory Control And Data Acquisition) systems run the world's infrastructure, controlling everything from regional power stations to nuclear aircraft carriers. In the past these systems communicated using proprietary protocols. As a result, there was little thought given to security, and very few cyber-threats to these systems, Stuxnet being the notable exception. Now, however, many of these systems are being plugged into the internet as a means of saving money and can no longer rely on an air gap to protect them. The DSN lab has developed a system, Spire, to provide SCADA systems with intrusion tolerance and other security features.

The core of Spire is Prime, an intrusion-tolerant consensus algorithm. While Prime guarantees bounded performance under attack - a must for SCADA systems that need to respond to events in 100s of milliseconds - it brings with it complexity that does not scale well, putting a limit on the number of Prime replicas there can be in the Spire system and ultimately limiting the number of intrusions the system can tolerate.

Our work has been looking into alternatives to the current Prime implemetation to provide better performance and scalability. We first considered using trusted hardware to reduce the number of replicas needed to tolerate a fault, though ultimately this proved unfeasible as the hardware was too slow. We then began a survey of other intrusion tolerant protocols, and ultimately selected BFT-SMART as a viable candidate to achieve high performance in Spire. Our results and discussion are available in the presentation slides.

Resources

BFT-SMART

BFT-SMART is based on the BFT protocol defined in Cachin's Yet Another Visit to Paxos. The implementation is presented in State Machine Replication for the Masses with BFT-SMART by Sousa and Alchieri. The code is available on GitHub.

Prime

Prime is presented in Prime: Byzantine Replication Under Attack by Amir et al. Past versions are available at the project site.

PBFT

PBFT is presented Castro and Liskov's Practical Byzantine Fault Tolerance. Its implementation is available at the project site.

Min-BFT

Min-BFT is presented in Efficient Byzantine Fault Tolerance by Veronse et al. Its implementation is available at the project site.

Spinning

Spinning is presented in Spin One's Wheels? Byzantine Fault Tolerance with a Spinning Primary by Veronse et al. Its implementation is available at the project site.

BFT-Mencius

BFT-Mencius is presented in Bounded Delay in Byzantine-Tolerant State Machine Replication by Milosevic et al. Currently, there is no available implementation.