Example: Regional Distribution with HA
Copyright 2013-2024 60East Technologies, Inc.
Combining the first two scenarios allows your application to distribute messages as required and to have high availability in each region. This involves having two or more servers in each region, as shown in the figure below.
Each region is configured as a Group, indicating that the instances within that region should be treated as equivalent, and are intended to have the same topics and messages. Within each group, the instances replicate to each other using sync acknowledgments, to ensure that publishers and subscribers can fail over between the instances. Because a client in a given region does not connect to a server outside the region, we can configure the replication links between the regions to use async acknowledgment, which could potentially reduce the amount of time that an application publishing to AMPS must store outgoing messages before receiving an acknowledgment that a given message is persisted. (Setting these links to use async acknowledgment does not affect the speed of replication or change the behavior of replication in any other way -- this setting only specifies when an instance of AMPS acknowledges the message as persisted.)
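The effect of the acknowledgment setting on a publisher can be illustrated with a toy model. The sketch below is not AMPS code: the function name, the link tuples, and all of the millisecond figures are invented for illustration. It assumes that an instance sends the persisted acknowledgment once the message is stored locally and every sync destination has acknowledged it, while async destinations are not waited on.

```python
# Toy model only: this does not use or imitate the AMPS API, and the
# timings below are invented numbers chosen to make the contrast visible.

def persisted_ack_delay_ms(local_persist_ms, links):
    """Return when the publisher would see the persisted acknowledgment.

    links is a list of (ack_type, downstream_ack_ms) tuples, where
    downstream_ack_ms is the time at which that destination's own
    acknowledgment arrives back at this instance.
    """
    waits = [local_persist_ms]
    for ack_type, downstream_ack_ms in links:
        if ack_type == "sync":
            # Assumption: a sync link holds back the acknowledgment.
            waits.append(downstream_ack_ms)
        elif ack_type != "async":
            raise ValueError(f"unknown acknowledgment type: {ack_type!r}")
    return max(waits)


# Hypothetical links for Chicago 1: local peer, then the two remote regions.
chicago_1_links = [("sync", 3), ("async", 45), ("async", 90)]
# The same links with every acknowledgment type forced to sync.
all_sync_links = [("sync", delay) for _, delay in chicago_1_links]

mixed_delay = persisted_ack_delay_ms(1, chicago_1_links)
all_sync_delay = persisted_ack_delay_ms(1, all_sync_links)
print(mixed_delay, all_sync_delay)
```

In both cases the messages reach the remote regions at the same time in this model. Only the moment at which the publisher may discard its stored copy changes.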
The figure below shows the expanded detail of the configuration for these servers.
The instances in each region are configured to be part of a Group for that region, since these instances are intended to have the same topics and messages. Within a region, the instances replicate to each other using sync acknowledgment. Replication connections to instances at the remote site use async acknowledgment.
In a configuration like the one above, an application must only be allowed to fail over to other instances in its own region. Since replication to other regions uses async acknowledgments, a publisher may have received an acknowledgment that a given message is persisted before it is stored in instances in the other regions, or a subscriber may have received a persisted acknowledgment for a message that has not yet been persisted in other regions.
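One way to enforce the in-region failover rule is to validate each application's failover list before handing it to the client library. The following is a minimal sketch in plain Python. The `INSTANCE_GROUPS` mapping and the `check_failover_list` helper are hypothetical names introduced here, not part of any AMPS client library.

```python
# Hypothetical helper: not part of the AMPS client libraries.
# Maps each instance in this example to the Group (region) it belongs to.
INSTANCE_GROUPS = {
    "Chicago 1": "Chicago",
    "Chicago 2": "Chicago",
    "NewYork 1": "NewYork",
    "NewYork 2": "NewYork",
    "London 1": "London",
    "London 2": "London",
}


def check_failover_list(client_region, servers):
    """Return a copy of servers if all are in client_region, else raise.

    An instance name that is not in INSTANCE_GROUPS raises KeyError.
    """
    outside = [s for s in servers if INSTANCE_GROUPS[s] != client_region]
    if outside:
        raise ValueError(
            f"failover list for {client_region} includes "
            f"out-of-region instances: {outside}"
        )
    return list(servers)


chicago_servers = check_failover_list("Chicago", ["Chicago 1", "Chicago 2"])
print(chicago_servers)
```

A list that mixes regions, such as `["Chicago 1", "NewYork 1"]` for a Chicago client, is rejected with a `ValueError` rather than silently allowing failover across an async link.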
The instances are configured to downgrade replication from sync to async acknowledgment (either via whatever monitoring/server health system is in use or via an AMPS action) to ensure that publishers do not retain an unworkably large number of messages in the event that one of the instances goes offline for an extended period of time. As with all connections where instances replicate to each other, this replication must be configured to have a connection in each direction, from Chicago 1 to Chicago 2 as well as from Chicago 2 to Chicago 1. (AMPS may optimize this to a single network connection if possible.)
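The downgrade decision can be sketched as a simple check that a monitoring system might run. This is an illustrative model only: the function name, the link dictionary, and the 30-second threshold are assumptions made for this example, and the sketch does not show how the downgrade itself is requested from AMPS.

```python
# Illustrative model of a monitoring check. The names, the data shape and
# the threshold are hypothetical; this does not call AMPS.

def links_to_downgrade(links, max_unacked_age_s):
    """Return the sync links whose oldest unacknowledged message is too old.

    links maps a destination name to (ack_type, oldest_unacked_age_s).
    Links that already use async acknowledgment are never selected.
    """
    return [
        name
        for name, (ack_type, oldest_unacked_age_s) in links.items()
        if ack_type == "sync" and oldest_unacked_age_s > max_unacked_age_s
    ]


# Hypothetical snapshot for Chicago 1 while Chicago 2 is offline.
chicago_1_links = {
    "Chicago 2": ("sync", 45.0),
    "NewYork": ("async", 120.0),
    "London": ("async", 0.2),
}

stale = links_to_downgrade(chicago_1_links, max_unacked_age_s=30.0)
print(stale)
```

Once the link to the offline peer is treated as async, publishers connected to Chicago 1 receive persisted acknowledgments again and can release the messages they were holding.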
Each instance at a site provides passthrough replication to the other instance at that site for both the local group and the remote groups. This ensures that once a message arrives at the local group (whether published by a local client or replicated from a remote group), it is fully distributed to the local group. To optimize bandwidth, at the risk of increasing the chance of message loss if an entire region goes offline, each instance at a site passes through only messages from the local group to remote sites. This configuration balances fault tolerance and performance and attempts to minimize the bandwidth consumed between the sites.
Each instance at a site replicates to the remote sites. The instance specifies one Destination for each remote site, with the servers at the remote site listed as failover equivalents for the remote site. With the passthrough configuration, this ensures that each message is delivered to each remote site exactly once. Whichever server at the remote site receives the message distributes it to the other server using passthrough replication. Notice that some features of AMPS, such as distributed queues (though not LocalQueue or GroupLocalQueue), require full passthrough to ensure correct delivery of messages.
With this configuration, publishers at each site publish to a local AMPS instance. Subscribers subscribe to messages from their local AMPS instances. Both publishers and subscribers use the high availability features of the AMPS client libraries to ensure that if the primary local AMPS instance fails, they automatically fail over to the other instance. Replication is used to deliver both high availability and disaster recovery.

In the table below, each row represents a replication destination. Servers in brackets are represented as sets of InetAddr elements in the Destination definition.
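| Server | Group | Destination | PassThrough |
|---|---|---|---|
| Chicago 1 | Chicago | Chicago 2 / sync ack | .* |
| Chicago 1 | Chicago | [NewYork 1, NewYork 2] / async ack | Chicago |
| Chicago 1 | Chicago | [London 1, London 2] / async ack | Chicago |
| Chicago 2 | Chicago | Chicago 1 / sync ack | .* |
| Chicago 2 | Chicago | [NewYork 1, NewYork 2] / async ack | Chicago |
| Chicago 2 | Chicago | [London 1, London 2] / async ack | Chicago |
| NewYork 1 | NewYork | NewYork 2 / sync ack | .* |
| NewYork 1 | NewYork | [Chicago 1, Chicago 2] / async ack | NewYork |
| NewYork 1 | NewYork | [London 1, London 2] / async ack | NewYork |
| NewYork 2 | NewYork | NewYork 1 / sync ack | .* |
| NewYork 2 | NewYork | [Chicago 1, Chicago 2] / async ack | NewYork |
| NewYork 2 | NewYork | [London 1, London 2] / async ack | NewYork |
| London 1 | London | London 2 / sync ack | .* |
| London 1 | London | [Chicago 1, Chicago 2] / async ack | London |
| London 1 | London | [NewYork 1, NewYork 2] / async ack | London |
| London 2 | London | London 1 / sync ack | .* |
| London 2 | London | [Chicago 1, Chicago 2] / async ack | London |
| London 2 | London | [NewYork 1, NewYork 2] / async ack | London |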
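The table follows a regular pattern, so the rows can be derived from a map of regions to servers. The sketch below generates the rows in plain Python: the local peer gets sync acknowledgment with a passthrough of `.*`, and each remote region gets async acknowledgment with both remote servers as failover equivalents and a passthrough of the local group name. The `REGIONS` mapping and the `destination_rows` function are names invented for this example, and the output is a plain tuple listing, not AMPS configuration syntax.

```python
# Sketch that derives the replication destinations described in this
# example. REGIONS and destination_rows are hypothetical names.

REGIONS = {
    "Chicago": ["Chicago 1", "Chicago 2"],
    "NewYork": ["NewYork 1", "NewYork 2"],
    "London": ["London 1", "London 2"],
}


def destination_rows(regions):
    """Return (server, group, targets, ack_type, passthrough) tuples."""
    rows = []
    for group, servers in regions.items():
        for server in servers:
            # Local peers: sync acknowledgment, pass through every group.
            for peer in servers:
                if peer != server:
                    rows.append((server, group, [peer], "sync", ".*"))
            # Remote regions: async acknowledgment, one destination per
            # region listing its servers as failover equivalents, passing
            # through only messages from the local group.
            for remote_group, remote_servers in regions.items():
                if remote_group != group:
                    rows.append(
                        (server, group, list(remote_servers), "async", group)
                    )
    return rows


ROWS = destination_rows(REGIONS)
for server, group, targets, ack_type, passthrough in ROWS:
    target_text = targets[0] if len(targets) == 1 else "[" + ", ".join(targets) + "]"
    print(f"{server:<10} {group:<8} {target_text} / {ack_type} ack  {passthrough}")
```

Generating the rows this way makes it easier to check that adding a region or a third server per region keeps every instance's destinations consistent with the rules in this section.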