Example: Regional Distribution with HA

Combining the first two scenarios allows your application to distribute messages as required and to have high availability in each region. This involves having two or more servers in each region, as shown in the figure below.

Figure: Replication across regions, with high availability within each region.

Each region is configured as a Group, which indicates that the instances within that region should be treated as equivalent and are intended to have the same topics and messages. Within each group, the instances replicate to each other using sync acknowledgment, so that publishers and subscribers can fail over between the instances. Since a client in a given region does not connect to a server outside the region, the replication links between regions can use async acknowledgment. This can reduce the amount of time that an application publishing to AMPS must store outgoing messages before receiving an acknowledgment that a given message is persisted. (Setting these links to use async acknowledgment does not affect the speed of replication or change the behavior of replication in any other way. This setting only specifies when an instance of AMPS acknowledges the message as persisted.)

The instances in each region are configured to be part of a Group for that region, since these instances are intended to have the same topics and messages. Within a region, the instances replicate to each other using sync acknowledgment. Replication connections to instances at the remote site use async acknowledgment.

In a configuration like the one above, an application must only be allowed to fail over to other instances in its own region. Since replication to other regions uses async acknowledgments, a publisher may have received an acknowledgment that a given message is persisted before it is stored in instances in the other regions, or a subscriber may have received a persisted acknowledgment for a message that has not yet been persisted in other regions.

The sync replication links within each region are configured to downgrade to async acknowledgment (either via whatever monitoring/server health system is in use or via an AMPS action), so that publishers do not retain an unworkably large number of messages if one of the instances goes offline for an extended period of time. As with all connections where instances replicate to each other, replication must be configured in each direction: from NewYork 1 to NewYork 2 as well as from NewYork 2 to NewYork 1. (AMPS may optimize this to a single network connection if possible.)
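As a rough illustration of the AMPS action option, the fragment below sketches a scheduled check that downgrades a sync destination that has fallen behind. The module names and options follow the AMPS action modules as commonly documented, but the thresholds (`15s`, `30s`) are hypothetical values chosen for the example, and the exact options should be confirmed against the configuration reference for your AMPS version.

```xml
<!-- Sketch only: belongs inside the AMPSConfig element of each instance.
     The interval and age values are hypothetical, not recommendations. -->
<Actions>
  <Action>
    <On>
      <!-- Run the check on a fixed schedule. -->
      <Module>amps-action-on-schedule</Module>
      <Options>
        <Every>15s</Every>
      </Options>
    </On>
    <Do>
      <!-- Downgrade sync destinations whose oldest unacknowledged
           message is older than the configured age. -->
      <Module>amps-action-do-downgrade-replication</Module>
      <Options>
        <Age>30s</Age>
      </Options>
    </Do>
  </Action>
</Actions>
```

A deployment that downgrades links this way would normally also need a way to restore sync acknowledgment once the peer has caught up, whether through another action or through the external monitoring system.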

Each instance at a site provides passthrough replication to the other local instance for both the local group and the remote groups. This ensures that once a message arrives at the local group (either from a publisher or over replication from a remote group), it is fully distributed to the local group. To optimize bandwidth, at the risk of increasing the chances of message loss if an entire region goes offline, each instance at a site only passes through messages from the local group to remote sites. This configuration balances fault tolerance and performance and attempts to minimize the bandwidth consumed between the sites.

Each instance at a site replicates to the remote sites. The instance specifies one Destination for each remote site, with the servers at the remote site listed as failover equivalents for the remote site. With the passthrough configuration, this ensures that each message is delivered to each remote site exactly once. Whichever server at the remote site receives the message distributes it to the other server using passthrough replication. Note that some features of AMPS, such as distributed queues (though not LocalQueue or GroupLocalQueue), require full passthrough to ensure correct delivery of messages.
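The Destination layout described above might look roughly like the following for NewYork 1. This is a minimal sketch, not a complete configuration: the host names, ports, message type, and topic name are hypothetical, and other required sections (such as the incoming transports and logging) are omitted.

```xml
<!-- Sketch of the replication-related parts of the NewYork 1 configuration.
     Host names, ports, topic, and message type are hypothetical. -->
<AMPSConfig>
  <Name>NewYork-1</Name>
  <!-- Every instance in the region uses the same group name. -->
  <Group>NewYork</Group>

  <Replication>
    <!-- Local peer: sync acknowledgment, pass through messages from all groups. -->
    <Destination>
      <Name>NewYork-2</Name>
      <Group>NewYork</Group>
      <Topic>
        <MessageType>json</MessageType>
        <Name>orders</Name>
      </Topic>
      <SyncType>sync</SyncType>
      <PassThrough>.*</PassThrough>
      <Transport>
        <Type>amps-replication</Type>
        <InetAddr>newyork-2.example.com:10004</InetAddr>
      </Transport>
    </Destination>

    <!-- Remote site: async acknowledgment, pass through only messages that
         originate in the local group. The two InetAddr entries list the
         London servers as failover equivalents for a single destination. -->
    <Destination>
      <Name>London</Name>
      <Group>London</Group>
      <Topic>
        <MessageType>json</MessageType>
        <Name>orders</Name>
      </Topic>
      <SyncType>async</SyncType>
      <PassThrough>NewYork</PassThrough>
      <Transport>
        <Type>amps-replication</Type>
        <InetAddr>london-1.example.com:10004</InetAddr>
        <InetAddr>london-2.example.com:10004</InetAddr>
      </Transport>
    </Destination>
  </Replication>
</AMPSConfig>
```

Under the same assumptions, NewYork 2 would mirror this block with NewYork 1 as its local destination, and the London instances would swap the group names and addresses.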

With this configuration, publishers at each site publish to a local AMPS instance. Subscribers subscribe to messages from their local AMPS instances. Both publishers and subscribers use the high availability features of the AMPS client libraries to ensure that if the primary local AMPS instance fails, they automatically fail over to the other instance. Replication is used to deliver both high availability and disaster recovery. In the table below, each row represents a replication destination. Servers in brackets are represented as sets of InetAddr elements in the Destination definition.

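| Server | Group | Destinations | PassThrough |
| --- | --- | --- | --- |
| NewYork 1 | NewYork | NewYork 2 / sync ack | `.*` |
| NewYork 1 | NewYork | [London 1, London 2] / async ack | NewYork |
| NewYork 2 | NewYork | NewYork 1 / sync ack | `.*` |
| NewYork 2 | NewYork | [London 1, London 2] / async ack | NewYork |
| London 1 | London | London 2 / sync ack | `.*` |
| London 1 | London | [NewYork 1, NewYork 2] / async ack | London |
| London 2 | London | London 1 / sync ack | `.*` |
| London 2 | London | [NewYork 1, NewYork 2] / async ack | London |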
