Configuring Replication

Configuring replication involves configuring two or more instances of AMPS. For testing purposes, both instances can reside on the same physical host before deployment into a production environment. When both instances run on one machine, the performance characteristics will differ from production, which makes this setup more useful for testing configuration correctness than for testing overall performance.

Any instance that is intended to receive messages via replication must define an incoming replication transport as one of the Transports for the instance. An instance may have only one incoming replication transport.

Any instance that is intended to replicate messages to another instance must specify a Replication stanza in the configuration file with at least one Destination. An instance can have multiple Destination declarations: each one defines a single outgoing replication connection.

For details on setting up a replication transport, see Configuring Incoming Replication Transports. For details on configuring destinations, see Configuring Outgoing Replication Destinations.

In AMPS replication, instances should only be configured as part of the same Group if they are fully equivalent. That is, not only should they contain the same messages, but they should be considered failover alternatives for applications and other AMPS servers. If two servers are not intended to be fully replicated (for example, if there is one-way replication between a production server and a test server), they should have different Group values.

It's important to make sure that when running multiple AMPS instances on the same host there are no conflicting ports. AMPS will emit an error message and will not start properly if it detects that a port specified in the configuration file is already in use.

To explain this example, we assume a simple hot-hot replication case with two instances of AMPS: the first instance is named amps-1 and the second instance is named amps-2. Each instance is configured to replicate data to the other. That is, all messages published to amps-1 are replicated to amps-2 and vice versa. This configuration ensures that a message published to one instance is available on the other instance in the case of a failover (although, of course, the publishers and subscribers should also be configured for failover).

Every instance of AMPS that will participate in replication must have a unique Name among all of the instances that are part of replication.

All instances that have the same Group must be able to be treated as equivalent by AMPS replication and AMPS clients.

If two instances of AMPS should be treated differently (for example, one instance receives publishes while the other is a read-only instance that receives one-way replication), those instances should be in different groups.
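
As an illustration of the different-group case, the fragment below sketches the outgoing side of one-way replication from a production instance to a test instance. It uses only elements that appear elsewhere in this section. The instance names, group names, host name, and port are hypothetical, and the choice of async acknowledgment is an assumption made so that publishers do not wait on the test server.

```xml
<!-- Hypothetical fragment of the production instance's configuration.
     Names, groups, host, and port are illustrative only. -->
<AMPSConfig>
    <Name>amps-prod</Name>
    <Group>Production</Group>

    <!-- Other configuration omitted, including any sync destinations
         for this instance's failover partners. -->

    <Replication>
        <!-- One-way replication: production sends to test, and the
             test instance defines no Destination that points back. -->
        <Destination>
            <Name>amps-test</Name>

            <!-- Matches the Group declared by the test instance. It
                 differs from this instance's Group because the two
                 instances are not failover equivalents. -->
            <Group>Test</Group>

            <Topic>
                <MessageType>json</MessageType>
                <Name>^/orders/</Name>
            </Topic>

            <!-- Assumption: publishers should not wait for the test
                 server, so acknowledgment is asynchronous. -->
            <SyncType>async</SyncType>

            <Transport>
                <InetAddr>amps-test-server.example.com:10006</InetAddr>
                <Type>amps-replication</Type>
            </Transport>
        </Destination>
    </Replication>
</AMPSConfig>
```

In this sketch the test instance would declare `<Group>Test</Group>` and an incoming amps-replication transport on port 10006. Whether to add a PassThrough element depends on whether messages that reached production through replication should also be forwarded to the test instance.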

Replication Setup Example

We will first show the relevant portion of the configuration used in amps-1, and then we will show the relevant configuration for amps-2.

All topics to be replicated must be recorded in the transaction log. The examples below omit the transaction log configuration for brevity. Please reference the Record and Replay Messages chapter for information on how to configure a transaction log and choose which topics are recorded in the transaction log.

<AMPSConfig>
    <Name>amps-1</Name>
    <Group>DataCenter-NYC-1</Group>
    ...

    <Transports>
        <Transport>

            <!-- The amps-replication transport is required. This is a
                 proprietary message format used by AMPS to replicate
                 messages between instances. This AMPS instance will
                 receive replication messages on this transport. The
                 instance can receive messages from any number of
                 upstream instances on this transport.

                 An instance of AMPS may only define one incoming
                 replication transport.  -->

            <Name>amps-replication</Name>
            <Type>amps-replication</Type>
            <InetAddr>10004</InetAddr>
        </Transport>

        <!-- Transports for application use also need to
             be defined. An amps-replication transport can
             only be used for replication -->

        ... transports for client use here ...

    </Transports>

    ...

    <!-- All replication destinations are defined inside the Replication block. -->

    <Replication>

        <!-- Each individual replication destination defines outgoing
             replication, that is, messages being replicated from this
             instance of AMPS to another instance of AMPS. -->

        <Destination>

            <!-- The replicated topics and their respective message
                 types are defined here. AMPS allows any number of
                 Topic definitions in a Destination. -->
            <Topic>
                <MessageType>fix</MessageType>

                <!-- The Name definition specifies the name of the
                     topic or topics to be replicated. The Name option
                     can be either a specific topic name or a regular
                     expression that matches a set of topic names. -->
                <Name>topic</Name>

            </Topic>
            <Topic>
              <!-- Replicate any topic that uses the JSON message type
                   and that starts with /orders -->
              <MessageType>json</MessageType>
              <Name>^/orders/</Name>
            </Topic>

            <Name>amps-2</Name>

            <!-- Fully synchronize messages, including messages that
                 were not originally published to this instance. -->
            <PassThrough>.*</PassThrough>

            <!-- The group name of the destination instance (or instances).
                 The name specified here must match the Group defined for
                 the remote AMPS instance, or AMPS reports an error and
                 refuses to connect to the remote instance. -->
            <Group>DataCenter-NYC-1</Group>

            <!-- Replication acknowledgment can be either synchronous or
                 asynchronous. This does not affect the speed or priority
                 of the connection, but does control when this instance
                 will acknowledge the message as safely persisted. -->

            <SyncType>sync</SyncType>

            <!-- The Transport definition defines the location to which
                 this AMPS instance will replicate messages. The InetAddr
                 points to the hostname and port of the downstream replication
                 instance. The Type for a replication instance should always
                 be amps-replication. -->
            <Transport>

                <!-- The address, or list of addresses, for the replication
                     destination. -->
                <InetAddr>amps-2-server.example.com:10005</InetAddr>
                <Type>amps-replication</Type>
            </Transport>
        </Destination>
    </Replication>

    ...

</AMPSConfig>

For the configuration of amps-2, we will use the following example. Because this example is similar to the amps-1 configuration, only the differences from that configuration are called out.

<AMPSConfig>
    <Name>amps-2</Name>
    <Group>DataCenter-NYC-1</Group>

    ...


    <Transports>

        <!-- The amps-replication transport is required. This
             is a proprietary message format used by AMPS to
             replicate messages between instances. This AMPS
             instance will receive replication messages on
             this transport. The instance can receive messages
             from any number of upstream instances on this
             transport.

             An instance of AMPS may only define one incoming
             replication transport.  -->

        <Transport>
            <Name>amps-replication</Name>
            <Type>amps-replication</Type>

            <!-- The port where amps-2 listens for replication messages
                 matches the port where amps-1 is configured to send its
                 replication messages. -->
            <InetAddr>10005</InetAddr>
        </Transport>
    </Transports>

    ...

    <Replication>
        <Destination>

            <!-- The Topic definitions for amps-2 match
                 the definitions for amps-1 so that these
                 topics contain the same messages in the
                 transaction log on both instances. -->

            <Topic>
                <MessageType>fix</MessageType>
                <Name>topic</Name>
            </Topic>
            <Topic>
              <MessageType>json</MessageType>
              <Name>^/orders/</Name>
            </Topic>
            <Name>amps-1</Name>

            <!-- Fully synchronize messages, including messages that
                 were not originally published to this instance. -->
            <PassThrough>.*</PassThrough>

            <Group>DataCenter-NYC-1</Group>

            <SyncType>sync</SyncType>
            <Transport>

                <!-- The replication destination port for amps-2 is
                     configured to send replication messages to the
                     same port on which amps-1 is configured to listen
                     for them. -->
                <InetAddr>amps-1-server.example.com:10004</InetAddr>
                <Type>amps-replication</Type>
            </Transport>
        </Destination>
    </Replication>

    ...

</AMPSConfig>

These example configurations replicate the topic named topic of the message type fix, and any topic of the message type json whose name begins with /orders/, between the two instances of AMPS. To replicate more topics, add Topic blocks to the Destination on each instance.
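
As a sketch of adding a topic, the Destination below extends the Topic list from the examples above with one more block. The topic pattern ^/trades/ is hypothetical. The same block would be added to the Destination on both instances, and the new topic must also be recorded in the transaction log.

```xml
<Destination>
    <Topic>
        <MessageType>fix</MessageType>
        <Name>topic</Name>
    </Topic>
    <Topic>
        <MessageType>json</MessageType>
        <Name>^/orders/</Name>
    </Topic>

    <!-- Added for illustration: a hypothetical set of JSON topics
         whose names start with /trades/ -->
    <Topic>
        <MessageType>json</MessageType>
        <Name>^/trades/</Name>
    </Topic>

    <!-- Name, PassThrough, Group, SyncType, and Transport are
         unchanged from the examples above and are omitted here. -->
</Destination>
```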

Downstream Persistence Acknowledgment: Sync vs Async

When publishing to a topic that is recorded in the transaction log, it is recommended that publishers request a persisted acknowledgment. The persisted acknowledgment message is how AMPS notifies the publisher that a message received by AMPS is considered to be safely persisted, as specified in the configuration. (The AMPS client libraries automatically request this acknowledgment on each publish command when a publish store is present for the client -- that is, any time that the client is configured to ensure that the publish is received by the AMPS server.)

Depending on the replication destination configuration for the AMPS instance that receives the message, that persisted acknowledgment message will be delivered to the publisher at different times in the replication process.

There are two options: sync (synchronous) or async (asynchronous) acknowledgment. These two acknowledgment SyncType options control when the instance of AMPS will acknowledge the message as persisted. In other words, this controls when the message publisher will receive a persisted acknowledgment.

AMPS will not return a persisted acknowledgment to the publisher for a message until:

- The message has been stored to the local transaction log (and SOW as applicable), and
- All downstream replication destinations using sync acknowledgment have acknowledged the message.

The acknowledgment type (SyncType) has no effect on how an instance of AMPS replicates the message to other instances of AMPS. The process of sending messages is identical for the instance sending messages. The instance that receives the messages has no information on the acknowledgment type the upstream link has configured, so all incoming messages are processed in the same way. The acknowledgment type only affects whether the instance pushing the message must receive an acknowledgment from that Destination before it will acknowledge a message as having been persisted.

It's typical for an instance of AMPS to have multiple destinations with different acknowledgment types. When this is the case, the instance can acknowledge a message when all destinations using sync acknowledgment have acknowledged the message. No destinations using async acknowledgment are considered.
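
As a sketch of such a mixed configuration, the Replication block below gives amps-1 two destinations. The first is the amps-2 destination from the example above. The second destination, including its name, group, host name, and port, is hypothetical and stands in for any instance that is not a failover alternative.

```xml
<Replication>
    <!-- Failover partner in the same group. This instance waits for
         the acknowledgment from amps-2 before it acknowledges a
         message as persisted. -->
    <Destination>
        <Name>amps-2</Name>
        <Group>DataCenter-NYC-1</Group>
        <Topic>
            <MessageType>json</MessageType>
            <Name>^/orders/</Name>
        </Topic>
        <PassThrough>.*</PassThrough>
        <SyncType>sync</SyncType>
        <Transport>
            <InetAddr>amps-2-server.example.com:10005</InetAddr>
            <Type>amps-replication</Type>
        </Transport>
    </Destination>

    <!-- Hypothetical remote instance in a different group. Messages
         are replicated in the same way, but this instance does not
         wait for the acknowledgment from this destination. -->
    <Destination>
        <Name>amps-remote</Name>
        <Group>DataCenter-LON-1</Group>
        <Topic>
            <MessageType>json</MessageType>
            <Name>^/orders/</Name>
        </Topic>
        <SyncType>async</SyncType>
        <Transport>
            <InetAddr>amps-remote-server.example.com:10004</InetAddr>
            <Type>amps-replication</Type>
        </Transport>
    </Destination>
</Replication>
```

With this configuration, amps-1 can send a persisted acknowledgment once the message is stored locally and amps-2 has acknowledged it, regardless of the state of the async destination.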

The figure below shows the cycle of a message being published in a replicated instance, and the persisted acknowledgment message being returned back to the publisher. Notice that, with this configuration, the publisher will not receive an acknowledgment if the remote destination is unavailable.

60East recommends that when you use sync replication, you consider setting a policy for downgrading the link when a destination is offline, as described in Downgrading Acknowledgments for a Destination.

Diagram showing sequence for a publish with sync acknowledgment replication destination

The sequence for a destination that uses async acknowledgment is different.

For a destination that uses async acknowledgment, the AMPS instance replicating the message can send a persisted acknowledgment message back to the publisher as soon as the message is stored in the local transaction log and SOW stores. The instance does not wait for acknowledgment from the destination, which means that acknowledgment can happen before the replicated instance has stored the message.

The figure below shows the cycle of a message being published with a SyncType configuration set to async acknowledgment.

Diagram showing sequence for a publish with async acknowledgment replication destination

By default, replication destinations do not affect when a message is delivered to a subscription. Optionally, a subscriber can request the fully_durable option on a bookmark subscription (that is, a replay from the transaction log). When the fully_durable option is specified, AMPS does not deliver a message to that subscriber until all replication destinations using sync acknowledgment have acknowledged the message.

Every instance of AMPS that accepts publish commands or SOW delete commands, or that allows consumption of messages from queues, should specify at least one destination that uses sync acknowledgment. If a publisher or queue consumer may fail over between two (or more) instances of AMPS, those instances should specify sync acknowledgment between them. This prevents a situation where a message is lost because an instance fails immediately after acknowledging the message to a publisher.

A destination configured for sync acknowledgment can be downgraded to async acknowledgment while AMPS is running. This can be useful in cases where a server is offline for an extended period of time due to hardware failure or persistent network issues. While the destination is downgraded, AMPS considers that destination to be using async acknowledgment, as described in the next section.

Downgrading Acknowledgments for a Destination

AMPS provides the ability to temporarily downgrade a replication link from synchronous to asynchronous acknowledgment. This feature is useful to relieve memory or storage pressure on publishers if a downstream AMPS instance proves unstable or unresponsive, or experiences latency so excessive that it should be considered offline. A link can be downgraded using an action or explicitly downgraded from the AMPS administrative console. Likewise, a link that has previously been downgraded can be upgraded using an action or from the AMPS administrative console.

When a replication link is downgraded, that link will use async acknowledgment until the link upgrades or until AMPS restarts.

Downgrading a replication link to async (asynchronous) acknowledgment means that a persisted acknowledgment no longer waits for the downstream instance to confirm that it has committed the message to its transaction log or SOW store. For messages already in flight, AMPS immediately treats the downstream instance as having acknowledged them. If AMPS was waiting for an acknowledgment from that instance before delivering a persisted acknowledgment, AMPS sends the persisted acknowledgment as soon as the link is downgraded.

Downgrading the acknowledgment type reduces the reliability guarantees provided by that replication link. Because those guarantees are reduced, the publisher can remove messages that it would have to retain if the guarantees were enforced.

The result of a link being downgraded is:

- The number of messages that the publisher must retain is reduced, but
- The downgraded link is unsafe for the publisher to fail over to, and
- The downgraded link is unsafe for a bookmark subscriber to fail over to.

Automatic downgrade is most suitable for a situation where an instance should be considered offline or unavailable. If an instance is configured to use an action to downgrade the acknowledgment type, it should also be configured to use an action to upgrade acknowledgment.

Downgrading a destination means that this instance will not wait for that destination to acknowledge a message before acknowledging that message to publishers or upstream instances. It does not affect any other behavior of the instance.

A publisher or queue consumer must not fail over from this instance to a destination that has been downgraded to async acknowledgment. This can cause message loss, since the upstream instance may have acknowledged a message that the downstream instance has not yet processed.

A bookmark subscriber must not fail over from this instance to a destination that has been downgraded to async acknowledgment. This can cause replay gaps, since that destination is no longer considered when determining whether a message is persisted.

Automatically Downgrading and Upgrading Acknowledgment

AMPS can be configured to automatically downgrade a replication link to async if the remote side of the link cannot keep up with persisting messages or becomes unresponsive. This option prevents unreliable links from holding up publishers but increases the chances of a single instance failure resulting in message loss, as described above. AMPS can also be configured to automatically upgrade a replication link that has previously been downgraded.

Since downgrading a link to a destination affects the consistency and durability guarantees provided by the set of AMPS instances as a whole, use caution when configuring the parameters. In general, it's a good idea to set an interval that is larger than the amount of time at which an instance would be considered to be unresponsive or offline.

Automatic downgrade is implemented as an AMPS action. To configure automatic downgrade, add the appropriate action to the configuration file as shown below:

<AMPSConfig>
    ...
    <Actions>
        <Action>
            <On>
                <Module>amps-action-on-schedule</Module>
                <Options>

                    <!--This option determines how often AMPS
                        checks whether destinations have fallen
                        behind. In this example, AMPS checks
                        destinations every 15 seconds. In most
                        cases, 60East recommends setting this to
                        half of the Interval setting. -->
                    <Every>15s</Every>
                </Options>
            </On>
            <Do>
                <Module>amps-action-do-downgrade-replication</Module>
                <Options>

                    <!--The maximum amount of time for a destination
                        to fall behind. If AMPS has been waiting for
                        an acknowledgment from the destination for
                        longer than the Age, AMPS downgrades the
                        destination. In this example, AMPS downgrades any
                        destination for which an acknowledgment has taken
                        longer than 300 seconds. -->
                    <Age>300s</Age>
                </Options>
            </Do>
            <Do>
                <Module>amps-action-do-upgrade-replication</Module>
                <Options>

                   <!-- The threshold for upgrading the replication link
                        back to sync acknowledgment. If the destination is
                        behind by less than this amount, and was previously
                        downgraded to async acknowledgment, AMPS will upgrade
                        to sync acknowledgment. -->
                    <Age>10s</Age>
                </Options>
            </Do>
        </Action>
    </Actions>
   ...
</AMPSConfig>

In this configuration file, AMPS checks every 15 seconds whether a destination has fallen behind by more than 300 seconds. A destination that has fallen behind by more than 300 seconds should no longer be considered online. Typically, this threshold would be set to a duration longer than the time at which monitoring of that instance would produce alerts that the instance is unavailable. The same scheduled check upgrades a previously downgraded destination once it is behind by less than 10 seconds.

AMPS downgrades the destination to async acknowledgment. That destination will no longer be considered when acknowledging messages to publishers. Once the link to the destination is downgraded, connections to this instance should not consider that destination to be safe for failover until the link has again been upgraded.

All publishers using a publish store should be able to hold the number of messages published, at peak message volume, over a period equal to the interval between downgrade checks plus the downgrade threshold. With the configuration above, that period is 15 seconds plus 300 seconds, or 315 seconds, so a publisher with a peak rate of 10,000 messages per second should, at a minimum, be able to allocate a publish store that holds 3,150,000 messages.
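
The sizing rule above can be written out as a worked calculation. The symbols are introduced here for illustration and are not AMPS configuration names: R_peak is the peak publish rate, T_check is the Every value of the scheduled check, and T_age is the Age threshold of the downgrade action.

```latex
\[
N_{\min} = R_{\text{peak}} \times \left( T_{\text{check}} + T_{\text{age}} \right)
         = 10{,}000\ \tfrac{\text{messages}}{\text{s}} \times (15\ \text{s} + 300\ \text{s})
         = 3{,}150{,}000\ \text{messages}
\]
```

For comparison, the same publisher with a 15-second check and a 60-second threshold would need 10,000 × 75 = 750,000 messages, which shows how directly the downgrade threshold drives the required publish store size.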

In some cases, it is important that an instance maintain a minimum number of destinations that use sync acknowledgment. For those cases, an instance-level Tuning parameter is available that prevents the action from downgrading a destination if doing so would reduce the number of destinations that use sync acknowledgment below the configured limit. This parameter does not guarantee that a specific destination will continue using sync acknowledgment. It only limits whether AMPS will downgrade a destination that meets the downgrade criteria. AMPS will not upgrade a destination that has previously been downgraded if a connection is lost, even if this means that the number of currently connected destinations that use sync acknowledgment is less than the configured minimum. See the section on Instance-Level Configuration for details.

