We focused our work on the client-daemon communication in Spines. We develpoped a socket-like API for unreliable, best-effort UDP communication, and also for the session-based TCP reliable communication. This brings us the level of transparency necessary for making Spines easy to use in the socket programming paradigm, in a first step towoards complete transparency. We also analyzed the possibilities of poviding IP multicast service using Spines while using only simple unicast communication at the network level.
We developed an end-to-end reliability over our hop-by-hop reliability approach. We have a complete socket capability, similar to a TCP socket that flows over the overlay end-to-end. As a by product of our approach, we can now provide a TCP-fair implementation of an efficient user-level reliable protocol.
We demonstrated that employing hop-by-hop reliability techniques considerably reduces the average latency and jitter of reliable communication while still being fair with external Internet traffic. In order to deploy our protocols over the Internet we considered networking aspects such as congestion control, internal and external fairness, flow control and end-to-end reliability.
We showed that the benefit of hop-by-hop reliability greatly overcomes the overhead associated with reliable overlay routing given by factors such as processing overhead and CPU scheduling, and achieves much better performance compared to standard end-to-end TCP connections deployed on the same overlay network.
We designed a framework for application level, transparent reliable multicast using the hop-by-hop reliability in Spines. The framework includes end-to-end reliablility, congestion and flow control, and relaxed semantics over reliable multicast that handle partitions, merges, crashes and recoveries. We started the implementation of this framework in our overlay infrastructure.
We investigated some of the survivability aspects of Spines, both in wireless and wired environments. We developed a mechanism of trust based on monitoring the abnormal behaviour of overlay nodes, and an acusation system that would eventually reroute packets to avoid untrusted nodes. We released the first version of Spines (www.spines.org) under a standard BSD licence.
We implemented best-effort multicast in Spines, with an interface that resembles the standard IP Multicast service. Our preliminary tests show that Spines is very scalable with the number of senders, receivers and groups. We plan to release a new version of Spines that incorporates best effort multicast soon.
We have benchmarked the replication server with and without the PostgreSQL database both in a local area cluster and on general wide area networks using the Emulab facility hosted by the University of Utah.
We were able to obtain performance results that show the efficiency of our replication architecture due to the use of an enhanced synchronization algorithm. We show that latency is not a limiting factor in attaining high throughput in wide-area network environments. We are able to sustain similar aggregate throughput on both local area and wide-area setups, outperforming existing synchronous replication solutions and providing grounds for a wide range of applications to adopt replication as a measure for fault tolerance and high availability.
Wackamole is a tool that helps with making a cluster highly available. It manages a bunch of virtual IPs that should be available to the outside world at all times. Wackamole ensures that exactly one machine within the cluster is listening on each virtual IP address that Wackamole manages. If it discovers that particular machines within the cluster are not alive, it will almost immediately ensure that other machines acquire the virtual IP addresses the down machines were managing. At no time will more than one connected machine be responsible for any virtual IP.
Wackamole also works toward achieving a balanced distribution of the public IPs within the cluster it manages.
Wackamole uses the membership notifications provided by the Spread Toolkit , also developed in our lab, to generate a consistent state that is agreed upon among all of the connected Wackamole instances. Wackamole uses this knowledge to ensure that all of the public IP addresses served by the cluster will be covered by exactly one Wackamole instance. We have designed and formally proven the correctness of the algorithm used by Wackamole.
Wackamole now supports four platforms, Linux, FreeBSD, Solaris 8, and Mac OSX. Development has also focused on making Wackamole more robust and fixing deployment issues we received from users. Based on email queries and downloads Wackamole has started to make an impact as a different model for IP failover for clusters and to be used in practice.
An On-Demand Secure Routing Protocol Resilient to Byzantine Failures |
ps,
ps.gz,
pdf.
In ACM Workshop on Wireless Security (WiSe) , Atlanta, Georgia, September
28 2002.
Baruch Awerbuch,
Dave Holmer,
Cristina Nita-Rotaru,
and Herbert Rubens.
An ad hoc wireless network is an autonomous self-organizing system of
mobile nodes connected by wireless links where nodes not in direct
range can communicate via intermediate nodes. A common technique used
in routing protocols for ad hoc wireless networks is to establish the
routing paths on-demand, as opposed to continually maintaining a
complete routing table. A significant concern in routing is the
ability to function in the presence of byzantine failures which
include nodes that drop, modify, or mis-route packets in an attempt to
disrupt the routing service.
|
On the Performance of Consistent Wide-Area Database Replication |
Technical Report CNDS-2003-1, September 2002.
Yair Amir, Claudiu Danilov, Michal Miskin-Amir, Jonathan Stanton and Ciprian Tutu. In this paper we design a generic, consistent replication architecture that enables transparent database replication and we present the optimizations and tradeoffs of the chosen design. We demonstrate the practicality of our approach by building a prototype that replicates a PostgreSQL database system. We provide experimental results for consistent wide-area database replication. We claim that the use of an optimized synchronization engine is the key to building a practical synchronous replication system for wide-area network settings. |
Reliable Communication in Overlay Networks |
ps,
ps.gz,
pdf.
In the Proceedings of the IEEE International Conference on
Dependable Systems and Networks (DSN03), San Francisco, June 2003.
Yair Amir and Claudiu Danilov.
Reliable point-to-point communication is usually achieved in overlay
networks by applying TCP/IP on the end nodes of a connection.
This paper presents an hop-by-hop reliability approach that
considerably reduces the latency and jitter of reliable connections.
Our approach is feasible and beneficial in overlay networks that
do not have the scalability and interoperability requirements of
the global Internet.
|
N-Way Fail-Over Infrastructure for Survivable Servers and Routers. |
To appear in the Proceedings of the IEEE International Conference on
Dependable Systems and Networks (DSN03), San Francisco, June 2003.
Yair Amir, Ryan Caudy, Ashima Munjal, Theo Schlossnagle and Ciprian Tutu. Maintaining the availability of critical servers and routers is an important concern for many organizations. At the lowest level, IP addresses represent the global namespace by which services are accessible on the Internet. We introduce Wackamole, a completely distributed software solution based on a provably correct algorithm that negotiates the assignment of IP addresses among the currently available servers upon detection of faults. This reallocation ensures that at any given time any public IP address of the server cluster is covered exactly once, as long as at least one physical server survives the network fault. The same technique is extended to support highly available routers. The paper presents the design considerations, algorithm specification and correctness proof, discusses the practical usage for server clusters and for routers, and evaluates the performance of the system. |
On November 15, 2002 we released version 2.0.0 of Wackamole, an NxWay fail-over for IP addresses in a cluster. Version 2.0.0 supports the Linux, FreeBSD, Solaris 8, and Mac OSX operating systems. Wackmole is available at www.wackamole.org.
Both Spines and Wackamole project have experienced a steady stream of downloads from our website including commercial, individual, and academic users over the last year. All in all we registered 60 distinct downloads for Spines and over 1000 downloads of Wackamole. We know of several organizations that use Wackamole in production both as NxWay failover for servers and as NxWay failover for routers (which is interesting because we never thought about it ourselves).
The OASIS Dem/Val project used Wackamole to provide failover for edge routers. The same project also used the replication technology to maintain consistent state among different wide area sites. The replication technology is currently evaluated by the Future Combat System (FCS) project.