|
Scalable Byzantine Replication Under Attack
An NSF grant 0716620 (August 2007 - July 2010) to Johns Hopkins University. A component of the NSF Cyber Trust program.
Principal Investigator: Yair Amir.
Overview
As network environments become increasingly hostile, even well protected
distributed information systems, constructed with security in mind, are likely
to be compromised. Hence, architecting large-scale distributed systems that
function correctly and provide adequate performance even when parts of them are
compromised is one of the most important challenges. Byzantine replication is
emerging as a promising direction to mitigate server compromises. Experience
with Byzantine replication protocols reveals considerable shortcomings in the
underlying theoretical foundation. Namely, existing fault models, metrics, and
correctness criteria used to reason about and construct Byzantine replication
algorithms, fail to capture properties that manifest themselves in wide-area
environments. Since all existing Byzantine replication algorithms were designed
to meet the standard safety and liveness criteria, they all exhibit critical
vulnerabilities not covered by the standard models. This project will develop
the theoretical foundation, architectural framework, and algorithmic techniques
for a scalable wide-area Byzantine replication system that provides strong
performance guarantees under attack. This includes:
- Expanding the existing theoretical fault models to better encapsulate the
unique characteristics and performance vulnerabilities associated with scalable
wide-area Byzantine replication systems.
- Defining useful metrics for evaluating and comparing different
architectures, configurations, and algorithms with a focus on their performance
under sophisticated attacks.
- Developing an architectural framework for scalable Byzantine replication
that can be customized to topology, performance, and resiliency requirements of
specific wide-area systems.
- Developing specific algorithms for wide-area environments that provide
strong performance guarantees under attack.
Students
Results and Current Activities
- We have shown that standard leader-based replication protocols are vulnerable to performance attacks
that can render them useless in practice even though they meet the traditional correctness criteria of safety
and liveness.
 
- We have devised a new correctness criterion, Bounded-Delay, to assess the performance of state
machine replication protocols when under attack.
 
- We developed Prime (Performance-Oriented Replication In Malicious Environments), a new state machine
replication protocol that achieves a new level of resilience to performance attacks and meets the Bounded-Delay
correctness criterion.
 
- Collaborating with the Navigators group while at the University
of Lisboa, Portugal, we analyzed the extent to which randomized intrusion-tolerant replication protocols (which do
not rely on a leader for coordination) are vulnerable to performance degradation under attack. We devised a timing
attack that can likely be used to significantly reduce performance.
 
- We developed a new view change protocol for Prime that offers a fundamentally stronger performance guarantee
by preventing faulty servers from delaying progress by causing long view changes.
 
- We designed and are implementing a scalable attack-resilient architecture for intrusion-tolerant wide-area
replication.
Related Publications
-
Byzantine Replication Under Attack
Yair Amir, Brian Coan, Jonathan Kirsch, John Lane
In Proceedings of the 38th IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2008),
Anchorage, Alaska, June 2008, pp. 197-206.
Byzantine-resilient replication protocols satisfy two standard
correctness criteria, safety and liveness, in the presence of
Byzantine faults. In practice, however, faulty processors can, in some
protocols, significantly degrade performance by causing the system to
make progress at an extremely slow rate. While ``correct'' in the
traditional sense, systems vulnerable to such performance degradation
are of limited practical use in adversarial environments. This paper
argues that techniques for mitigating such performance attacks are
needed to bridge this ``practicality gap'' for intrusion-tolerant
replication systems. We propose a new performance-oriented correctness
criterion, and we show how failure to meet this criterion can lead to
performance degradation. We present a new Byzantine replication
protocol that achieves the criterion and evaluate its performance in
fault-free configurations and when under attack.
-
Prime: Byzantine Replication Under Attack
Yair Amir, Brian Coan, Jonathan Kirsch, John Lane
Technical Report CNDS-2009-4.
Presentations
|