Overview of High Availability
AMPS is designed for high-performance, mission-critical applications. Those systems typically need to meet availability guarantees, and to meet those guarantees they need to be fault tolerant. It's not realistic to expect that networks will never fail, components will never need to be replaced, or servers will never need maintenance. A fault-tolerant application keeps working as designed even when part of the system fails or is taken offline for maintenance. AMPS is designed with this approach in mind: it assumes that components will occasionally fail or need maintenance, and helps you build systems that meet their guarantees even when part of the system is offline.
When you plan for high availability, the first step is to ensure that each part of your system has the ability to continue running and delivering correct results if any other part of the system fails. You also ensure that each part of your system can be independently restarted without affecting the other parts of the system.
The AMPS server includes the following features that help ensure high availability:
Transaction logging writes messages to persistent storage. In AMPS, the transaction log is not only the definitive record of what messages have been processed, it is also fully queryable by clients. Highly available systems make use of this capability to keep a consistent view of messages for all subscribers and publishers. The AMPS transaction log is described in detail in the chapter on Record and Replay Messages.
Replication allows AMPS instances to copy messages to other instances. AMPS replication is peer-to-peer, and any number of AMPS instances can replicate to any number of AMPS instances. Replication can be filtered by topic. By default, an AMPS instance only replicates messages that were published directly to that instance. An instance can also replicate messages that it received via replication by using passthrough replication: the ability for an instance to pass replicated messages on to other AMPS instances.
Heartbeat monitoring actively detects when a connection is lost. Each client configures the heartbeat interval for its own connection.
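The trade-off behind a per-connection heartbeat setting can be illustrated with a small simulation. This is a minimal sketch, not the AMPS implementation or client API: `HeartbeatMonitor`, `detect_loss`, and the integer clock are all hypothetical names and simplifications introduced for illustration.

```python
class HeartbeatMonitor:
    """Hypothetical helper: remembers when the peer was last heard from."""

    def __init__(self, timeout, now=0):
        self.timeout = timeout    # silence longer than this means the link is lost
        self.last_seen = now

    def on_heartbeat(self, now):
        self.last_seen = now

    def is_alive(self, now):
        return (now - self.last_seen) <= self.timeout


def detect_loss(interval, timeout, peer_dies_at, horizon):
    """Simulate a clock ticking once per second.

    The peer sends a heartbeat every `interval` seconds until `peer_dies_at`.
    Returns the first second at which the monitor declares the connection
    lost, or None if no loss is detected before `horizon`.
    """
    monitor = HeartbeatMonitor(timeout)
    for now in range(1, horizon + 1):
        if now < peer_dies_at and now % interval == 0:
            monitor.on_heartbeat(now)
        if not monitor.is_alive(now):
            return now
    return None


# A fast connection with a short timeout notices the failure quickly;
# a slower connection with a longer timeout tolerates more silence.
fast = detect_loss(interval=1, timeout=2, peer_dies_at=4, horizon=60)
slow = detect_loss(interval=5, timeout=12, peer_dies_at=11, horizon=60)
```

In this model the timeout has to be longer than the heartbeat interval, otherwise a healthy connection would be reported as lost between two heartbeats.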
The only communication between instances of AMPS is through replication. AMPS instances do not share state through the filesystem or any out-of-band communication.
AMPS high availability and replication do not rely on a quorum or a controller instance. Each instance of AMPS processes messages independently. Each instance of AMPS manages connections and subscriptions locally, for maximum availability.
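The difference between default replication and passthrough replication can be shown with a toy model. This is only a sketch under stated assumptions: the `Instance` class, its methods, and the integer message ids are hypothetical, and the model says nothing about how AMPS is configured or how it implements replication internally. It mirrors the behavior described above: each instance keeps its own log, instances communicate only by replicating messages, a destination can be filtered by topic, and an instance forwards replicated messages only when passthrough is enabled.

```python
class Instance:
    """Toy stand-in for a messaging instance (hypothetical, not the AMPS API)."""

    def __init__(self, name, passthrough=False):
        self.name = name
        self.passthrough = passthrough
        self.log = []             # stand-in for this instance's transaction log
        self.seen = set()         # message ids already recorded, to stop loops
        self.destinations = []    # (instance, allowed topics or None for all)

    def replicate_to(self, other, topics=None):
        self.destinations.append((other, topics))

    def publish(self, msg_id, topic, data):
        """A client publishes directly to this instance: always replicated."""
        self._record(msg_id, topic, data)
        self._forward(msg_id, topic, data)

    def receive_replicated(self, msg_id, topic, data):
        """A message arrives from another instance via replication."""
        if msg_id in self.seen:
            return                # already have it; do not record or forward again
        self._record(msg_id, topic, data)
        if self.passthrough:      # forward replicated messages only with passthrough
            self._forward(msg_id, topic, data)

    def _record(self, msg_id, topic, data):
        self.seen.add(msg_id)
        self.log.append((topic, data))

    def _forward(self, msg_id, topic, data):
        for dest, topics in self.destinations:
            if topics is None or topic in topics:
                dest.receive_replicated(msg_id, topic, data)


# Chain A -> B -> C. Without passthrough on B, C never sees A's messages.
a, b, c = Instance("A"), Instance("B"), Instance("C")
a.replicate_to(b)
b.replicate_to(c)
a.publish(1, "orders", "first")    # reaches B only
b.passthrough = True
a.publish(2, "orders", "second")   # reaches B and, via passthrough, C
```

Because every instance records and forwards messages on its own, there is no controller or quorum in the model: removing any one instance leaves the others able to accept and record publishes.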
The AMPS client libraries include the following features to help ensure high availability:
Heartbeat monitoring actively detects when a connection is lost. As mentioned above, the client sets the heartbeat interval on a connection-by-connection basis. This allows you to configure a longer timeout on higher-latency connections or for less critical operations, and a shorter timeout on fast connections or for clients that must detect failover quickly.
Automatic reconnection and failover allow clients to reconnect automatically when a disconnection occurs, and to locate and connect to an active instance.
Reliable publication from clients, including an optional persistent message store. This allows message publication to survive client restarts as well as server failover.
Subscription recovery and transaction log playback allow clients to recover the state of their messaging after restarts.
When used with a regular subscription or a SOW and subscribe, the HAClient can restore the subscription at the point the client reconnects to AMPS.
When used with a bookmark subscription, the HAClient can provide the ability to resume at the point the client lost the connection. These features guarantee that clients receive all messages published in the order published, including messages received while the clients were offline. Replay and resumable subscription features are provided by the transaction log, as described in Record and Replay Messages.
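The combination of failover and resuming from a bookmark can be sketched as follows. This is an illustrative model only, not the HAClient API: `Server`, `ResumingClient`, and their methods are hypothetical, integer sequence numbers stand in for bookmarks, and a single shared list stands in for a transaction log that has been replicated to both instances.

```python
class Server:
    """Toy instance that can replay its log after a given bookmark."""

    def __init__(self, name, log):
        self.name = name
        self.log = log      # shared list standing in for a replicated transaction log
        self.up = True

    def replay_after(self, bookmark):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return [(seq, data) for seq, data in self.log if seq > bookmark]


class ResumingClient:
    """Hypothetical client that fails over and resumes where it left off."""

    def __init__(self, servers):
        self.servers = servers    # candidate instances, in order of preference
        self.current = None
        self.last_bookmark = 0    # bookmark of the last message processed
        self.received = []

    def _connect(self):
        for server in self.servers:
            if server.up:
                self.current = server
                return
        raise ConnectionError("no active instance available")

    def poll(self):
        """Fetch everything after the last bookmark, failing over if needed."""
        if self.current is None:
            self._connect()
        try:
            batch = self.current.replay_after(self.last_bookmark)
        except ConnectionError:
            self._connect()       # locate another active instance
            batch = self.current.replay_after(self.last_bookmark)
        for seq, data in batch:
            self.received.append(data)
            self.last_bookmark = seq
        return len(batch)


log = []
primary, backup = Server("primary", log), Server("backup", log)
client = ResumingClient([primary, backup])

log.extend([(1, "a"), (2, "b")])
client.poll()                     # reads "a" and "b" from the primary
primary.up = False                # primary fails while more messages arrive
log.extend([(3, "c"), (4, "d")])
client.poll()                     # fails over to the backup, resumes after bookmark 2
```

The client tracks the bookmark of the last message it processed rather than the last message the server sent, so after failover it receives each message once, in publication order, with no gap for the time it was disconnected.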
For details on each client library, see the developer's guide for that library. Further samples can be found in the client distributions, available from the 60East website at http://www.crankuptheamps.com/develop.