Configuring Replication

Configuring replication involves two or more instances of AMPS. For testing purposes, both instances can reside on the same physical host before deployment into a production environment. Because the performance characteristics of a single machine differ from production, a single-host setup is more useful for verifying configuration correctness than for testing overall performance.

Any instance that is intended to receive messages via replication must define an incoming replication transport as one of the Transports for the instance. An instance may have only one incoming replication transport.

Any instance that is intended to replicate messages to another instance must specify a Replication stanza in the configuration file with at least one Destination. An instance can have multiple Destination declarations: each one defines a single outgoing replication connection.

In AMPS replication, instances should only be configured as part of the same Group if they are fully equivalent. That is, not only should they contain the same messages, but they should be considered failover alternatives for applications and other AMPS servers. If two servers are not intended to be fully replicated (for example, if there is one-way replication between a production server and a test server), they should have different Group values.
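
As a sketch of the one-way case described above, the fragments below assume a hypothetical production instance named amps-prod that replicates to a hypothetical test instance named amps-test. The instance names, Group values, host name, port, and topic pattern are all illustrative assumptions. Async acknowledgment is chosen here only on the assumption that production should not wait on a test server; a production instance that accepts publishes would typically also define a sync destination, which is omitted. The elements themselves are the ones explained in the full example later in this section. The two fragments belong to two separate configuration files.

```xml
<!-- Configuration file for the production instance (hypothetical names).
     It replicates to the test instance, so it defines a Destination. -->
<AMPSConfig>
    <Name>amps-prod</Name>
    <Group>Production</Group>
    ...
    <Replication>
        <Destination>
            <Name>amps-test</Name>

            <!-- Matches the Group of the test instance, which is
                 deliberately different from this instance's Group. -->
            <Group>Test</Group>

            <Topic>
                <MessageType>json</MessageType>
                <Name>^/orders/</Name>
            </Topic>

            <!-- Assumption: production does not wait for the test server. -->
            <SyncType>async</SyncType>

            <Transport>
                <InetAddr>amps-test-server.example.com:10006</InetAddr>
                <Type>amps-replication</Type>
            </Transport>
        </Destination>
    </Replication>
    ...
</AMPSConfig>

<!-- Configuration file for the test instance (hypothetical names).
     It only receives replication: it defines an incoming replication
     transport and no Destination pointing back to production. -->
<AMPSConfig>
    <Name>amps-test</Name>
    <Group>Test</Group>
    ...
    <Transports>
        <Transport>
            <Name>amps-replication</Name>
            <Type>amps-replication</Type>
            <InetAddr>10006</InetAddr>
        </Transport>
        ...
    </Transports>
    ...
</AMPSConfig>
```

Because the two instances have different Group values, neither AMPS replication nor AMPS clients treat the test instance as a failover alternative for production.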

It's important to make sure that when running multiple AMPS instances on the same host there are no conflicting ports. AMPS will emit an error message and will not start properly if it detects that a port specified in the configuration file is already in use.
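
For a single-host test setup, a minimal sketch of non-conflicting replication ports might look like the following. The use of localhost and the specific port numbers are assumptions for illustration; transports for client use in each instance would need distinct ports in the same way. The two fragments belong to two separate configuration files.

```xml
<!-- First instance on the test host: listens for replication on 10004
     and replicates to the second instance on 10005. -->
<AMPSConfig>
    <Name>amps-1</Name>
    ...
    <Transports>
        <Transport>
            <Name>amps-replication</Name>
            <Type>amps-replication</Type>
            <InetAddr>10004</InetAddr>
        </Transport>
        ...
    </Transports>
    <Replication>
        <Destination>
            <Name>amps-2</Name>
            ...
            <Transport>
                <InetAddr>localhost:10005</InetAddr>
                <Type>amps-replication</Type>
            </Transport>
        </Destination>
    </Replication>
</AMPSConfig>

<!-- Second instance on the same host: the listening port differs (10005),
     and its Destination points back at the first instance's port (10004). -->
<AMPSConfig>
    <Name>amps-2</Name>
    ...
    <Transports>
        <Transport>
            <Name>amps-replication</Name>
            <Type>amps-replication</Type>
            <InetAddr>10005</InetAddr>
        </Transport>
        ...
    </Transports>
    <Replication>
        <Destination>
            <Name>amps-1</Name>
            ...
            <Transport>
                <InetAddr>localhost:10004</InetAddr>
                <Type>amps-replication</Type>
            </Transport>
        </Destination>
    </Replication>
</AMPSConfig>
```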

The walkthrough that follows assumes a simple hot-hot replication case with two instances of AMPS: the first instance is named amps-1 and the second instance is named amps-2. Each instance is configured to replicate data to the other. That is, all messages published to amps-1 are replicated to amps-2 and vice versa. This configuration ensures that a message published to one instance is available on the other instance in the case of a failover (although, of course, the publishers and subscribers should also be configured for failover).

Every instance of AMPS that will participate in replication must have a unique Name among all of the instances that are part of replication.

All instances that have the same Group must be able to be treated as equivalent by AMPS replication and AMPS clients.

If two instances of AMPS should be treated differently (for example, one instance receives publishes while the other is a read-only instance that receives one-way replication), those instances should be in different groups.

We will first show the relevant portion of the configuration used in amps-1, and then we will show the relevant configuration for amps-2.

All topics to be replicated must be recorded in the transaction log. The examples below omit the transaction log configuration for brevity. Please reference the Record and Replay Messages chapter for information on how to configure a transaction log and choose which topics are recorded in the transaction log.

<AMPSConfig>
    <Name>amps-1</Name>
    <Group>DataCenter-NYC-1</Group>
    ...

    <Transports>
        <Transport>

            <!-- The amps-replication transport is required.
                 This is a proprietary message format used by
                 AMPS to replicate messages between instances.
                 This AMPS instance will receive replication messages
                 on this transport. The instance can receive
                 messages from any number of upstream instances
                 on this transport.

                 An instance of AMPS may only define one incoming replication
                 transport.  -->

            <Name>amps-replication</Name>
            <Type>amps-replication</Type>
            <InetAddr>10004</InetAddr>
        </Transport>

        <!-- Transports for application use also need to
             be defined. An amps-replication transport can
             only be used for replication. -->

        ... transports for client use here ...

    </Transports>

    ...

    <!-- All replication destinations are defined inside the Replication block. -->

    <Replication>

        <!--
             Each individual replication destination defines outgoing
             replication, that is, messages being replicated from
             this instance of AMPS to another instance of AMPS.
          -->

        <Destination>

            <!-- The replicated topics and their respective message types are defined here. AMPS
                allows any number of Topic definitions in a Destination. -->
            <Topic>
                <MessageType>fix</MessageType>

                <!-- The Name definition specifies the name of the topic or topics to be replicated.
                    The Name option can be either a specific topic name or a regular expression that
                    matches a set of topic names. -->
                <Name>topic</Name>

            </Topic>
            <Topic>
              <!-- Replicate any topic that uses the JSON message type
                   and that starts with /orders -->
              <MessageType>json</MessageType>
              <Name>^/orders/</Name>
            </Topic>

            <Name>amps-2</Name>

            <!-- Fully synchronize messages, including messages that
                 were not originally published to this instance. -->
            <PassThrough>.*</PassThrough>

            <!-- The group name of the destination instance (or instances). The name specified
                here must match the Group defined for the remote AMPS instance, or AMPS reports
                an error and refuses to connect to the remote instance. -->
            <Group>DataCenter-NYC-1</Group>

            <!-- Replication acknowledgment can be either synchronous or
                 asynchronous. This does not affect the speed or priority of
                 the connection, but does control when this instance will
                 acknowledge the message as safely persisted. -->

            <SyncType>sync</SyncType>

            <!-- The Transport definition defines the location to which this AMPS instance will
                replicate messages. The InetAddr points to the hostname and port of the
                downstream replication instance. The Type for a replication instance should
                always be amps-replication. -->
            <Transport>

                <!-- The address, or list of addresses, for the replication destination. -->
                <InetAddr>amps-2-server.example.com:10005</InetAddr>
                <Type>amps-replication</Type>
            </Transport>
        </Destination>
    </Replication>

    ...

</AMPSConfig>

The configuration of amps-2 is shown in the following example. Because this configuration is similar to the one for amps-1, only the differences from the amps-1 configuration are called out.

<AMPSConfig>
    <Name>amps-2</Name>
    <Group>DataCenter-NYC-1</Group>

    ...


    <Transports>

            <!-- The amps-replication transport is required.
                 This is a proprietary message format used by
                 AMPS to replicate messages between instances.
                 This AMPS instance will receive replication messages
                 on this transport. The instance can receive
                 messages from any number of upstream instances
                 on this transport.

                 An instance of AMPS may only define one incoming replication
                 transport.  -->

        <Transport>
            <Name>amps-replication</Name>
            <Type>amps-replication</Type>

            <!-- The port where amps-2 listens for replication messages matches the port where
                amps-1 is configured to send its replication messages. This AMPS instance will
                receive replication messages on this transport. The instance can receive
                messages from any number of upstream instances on this transport. -->
            <InetAddr>10005</InetAddr>
        </Transport>
    </Transports>

    ...

    <Replication>
        <Destination>

            <!-- The Topic definitions for amps-2 match
                 the definitions for amps-1 so that these
                 topics contain the same messages in the
                 transaction log on both instances. -->

            <Topic>
                <MessageType>fix</MessageType>
                <Name>topic</Name>
            </Topic>
            <Topic>
              <MessageType>json</MessageType>
              <Name>^/orders/</Name>
            </Topic>
            <Name>amps-1</Name>

            <!-- Fully synchronize messages, including messages that
                 were not originally published to this instance. -->
            <PassThrough>.*</PassThrough>

            <Group>DataCenter-NYC-1</Group>

            <SyncType>sync</SyncType>
            <Transport>

                <!-- The replication destination port for amps-2 is configured to send replication
                    messages to the same port on which amps-1 is configured to listen for them. -->
                <InetAddr>amps-1-server.example.com:10004</InetAddr>
                <Type>amps-replication</Type>
            </Transport>
        </Destination>
    </Replication>

    ...

</AMPSConfig>

These example configurations replicate the topic named topic of the message type fix, and any topic of the message type json whose name begins with /orders/, between the two instances of AMPS. To replicate more topics, add additional Topic blocks to the Destination on each instance.
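
As a sketch of adding a topic, the Destination could gain a third Topic block as shown below. The topic pattern ^/quotes/ is a hypothetical example and is not part of the configurations above. As with the other topics, it would also need to be recorded in the transaction log.

```xml
<Destination>
    <Topic>
        <MessageType>fix</MessageType>
        <Name>topic</Name>
    </Topic>
    <Topic>
        <MessageType>json</MessageType>
        <Name>^/orders/</Name>
    </Topic>

    <!-- Hypothetical additional topic: replicate any JSON topic
         whose name starts with /quotes/ -->
    <Topic>
        <MessageType>json</MessageType>
        <Name>^/quotes/</Name>
    </Topic>

    ... remaining Destination settings unchanged ...
</Destination>
```

In the hot-hot example, the same Topic block would be added to the Destination on both amps-1 and amps-2 so that the Topic definitions continue to match.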

Downstream Persistence Acknowledgment: Sync vs Async

When publishing to a topic that is recorded in the transaction log, it is recommended that publishers request a persisted acknowledgment. The persisted acknowledgment message is how AMPS notifies the publisher that a message received by AMPS is considered to be safely persisted, as specified in the configuration. (The AMPS client libraries automatically request this acknowledgment on each publish command when a publish store is present for the client -- that is, any time that the client is configured to ensure that the publish is received by the AMPS server.)

Depending on the replication destination configuration for the AMPS instance that receives the message, that persisted acknowledgment message will be delivered to the publisher at different times in the replication process.

There are two options: sync (synchronous) or async (asynchronous) acknowledgment. These two acknowledgment SyncType options control when the instance of AMPS will acknowledge the message as persisted. In other words, this controls when the message publisher will receive a persisted acknowledgment.

AMPS will not return a persisted acknowledgment to the publisher for a message until:

  • The message has been stored to the local transaction log (and SOW as applicable), and

  • All downstream replication destinations using sync acknowledgment have acknowledged the message.

The acknowledgment type (SyncType) has no effect on how an instance of AMPS replicates the message to other instances of AMPS. The process of sending messages is identical for the instance sending messages. The instance that receives the messages has no information on the acknowledgment type the upstream link has configured, so all incoming messages are processed in the same way. The acknowledgment type only affects whether the instance pushing the message must receive an acknowledgment from that Destination before it will acknowledge a message as having been persisted.

It's typical for an instance of AMPS to have multiple destinations with different acknowledgment types. When this is the case, the instance can acknowledge a message once all destinations using sync acknowledgment have acknowledged the message. Destinations using async acknowledgment are not considered.
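
A minimal sketch of an instance with mixed acknowledgment types is shown below. It assumes a hypothetical additional instance named amps-dr in a hypothetical group DataCenter-LON-1; the host name and port for that instance are also assumptions. PassThrough and other settings are omitted for brevity.

```xml
<Replication>

    <!-- Failover peer: this instance waits for amps-2 to acknowledge
         a message before sending the persisted acknowledgment. -->
    <Destination>
        <Name>amps-2</Name>
        <Group>DataCenter-NYC-1</Group>
        <Topic>
            <MessageType>json</MessageType>
            <Name>^/orders/</Name>
        </Topic>
        <SyncType>sync</SyncType>
        <Transport>
            <InetAddr>amps-2-server.example.com:10005</InetAddr>
            <Type>amps-replication</Type>
        </Transport>
    </Destination>

    <!-- Hypothetical remote instance: messages are replicated in the
         same way, but this destination is not considered when deciding
         whether to send the persisted acknowledgment. -->
    <Destination>
        <Name>amps-dr</Name>
        <Group>DataCenter-LON-1</Group>
        <Topic>
            <MessageType>json</MessageType>
            <Name>^/orders/</Name>
        </Topic>
        <SyncType>async</SyncType>
        <Transport>
            <InetAddr>amps-dr-server.example.com:10005</InetAddr>
            <Type>amps-replication</Type>
        </Transport>
    </Destination>

</Replication>
```

With a configuration along these lines, a publisher receives the persisted acknowledgment once the message is stored locally and amps-2 has acknowledged it, regardless of the state of amps-dr.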

The figure below shows the cycle of a message being published in a replicated instance, and the persisted acknowledgment message being returned to the publisher. Notice that, with sync acknowledgment configured, the publisher will not receive an acknowledgment if the remote destination is unavailable.

60East recommends that when you use sync replication, you consider setting a policy for downgrading the link when a destination is offline, as described in Downgrading Acknowledgments for a Destination.

The sequence for a destination that uses async acknowledgment is different.

For a destination that uses async acknowledgment, the AMPS instance replicating the message can send a persisted acknowledgment message back to the publisher as soon as the message is stored in the local transaction log and SOW stores. The instance does not wait for acknowledgment from the destination, which means that acknowledgment can happen before the replicated instance has stored the message.

The figure below shows the cycle of a message being published with a SyncType configuration set to async acknowledgment.

By default, replication destinations do not affect when a message is delivered to a subscription. Optionally, a subscriber can request the fully_durable option on a bookmark subscription (that is, a replay from the transaction log). When the fully_durable option is specified, AMPS does not deliver a message to that subscriber until all replication destinations using sync acknowledgment have acknowledged the message.

Every instance of AMPS that accepts publish commands or SOW delete commands, or that allows consumption of messages from queues, should specify at least one destination that uses sync acknowledgment. If a publisher or queue consumer may fail over between two (or more) instances of AMPS, those instances should specify sync acknowledgment between them. This prevents a situation where a message could be lost if an instance fails immediately after acknowledging a message to a publisher.

A destination configured for sync acknowledgment can be downgraded to async acknowledgment while AMPS is running. This can be useful in cases where a server is offline for an extended period of time due to hardware failure or persistent network issues. While the destination is downgraded, AMPS considers that destination to be using async acknowledgment, as described in the next section.

Downgrading Acknowledgments for a Destination

AMPS provides the ability to temporarily downgrade a replication link from synchronous to asynchronous acknowledgment. This feature is useful to relieve memory or storage pressure on publishers should a downstream AMPS instance prove unstable, unresponsive, or be experiencing excessive latency to the point that it should be considered to be offline. A link can be downgraded using an action or explicitly downgraded from the AMPS administrative console. Likewise, a link that has previously been downgraded can be upgraded using an action or from the AMPS administrative console.

When a replication link is downgraded, that link will use async acknowledgment until the link upgrades or until AMPS restarts.

Downgrading a replication link to using async (asynchronous) acknowledgment means that any persisted acknowledgment message that a publisher may be waiting on will no longer wait for the downstream instance to confirm that it has committed the message to its downstream Transaction Log or SOW store. AMPS immediately considers the downstream instance to have acknowledged the message for existing messages, which means that if AMPS was waiting for acknowledgment from that instance to deliver a persisted acknowledgment, AMPS immediately sends the persisted acknowledgment when the instance is downgraded.

Downgrading the acknowledgment type reduces the reliability guarantees provided by that replication link. Because those guarantees are reduced, the publisher can remove messages that it would have to retain if the guarantees were enforced.

The result of a link being downgraded is:

  • The number of messages that the publisher must retain is reduced, but

  • The downgraded link is unsafe for the publisher to fail over to

  • The downgraded link is unsafe for a bookmark subscriber to fail over to

Automatic downgrade is most suitable for a situation where an instance should be considered offline or unavailable. If an instance is configured to use an action to downgrade the acknowledgment type, it should also be configured to use an action to upgrade acknowledgment.

Downgrading a destination means that this instance will not wait for that destination to acknowledge a message before acknowledging that message to publishers or upstream instances. It does not affect any other behavior of the instance.

A publisher or queue consumer must not fail over from this instance to a destination that has been downgraded to async acknowledgment. This can cause message loss, since the upstream instance may have acknowledged a message that the downstream instance has not yet processed.

A bookmark subscriber must not fail over from this instance to a destination that has been downgraded to async acknowledgment. This can cause replay gaps, since that destination is no longer considered when determining whether a message is persisted.

Automatically Downgrading and Upgrading Acknowledgment

AMPS can be configured to automatically downgrade a replication link to async if the remote side of the link cannot keep up with persisting messages or becomes unresponsive. This option prevents unreliable links from holding up publishers but increases the chances of a single instance failure resulting in message loss, as described above. AMPS can also be configured to automatically upgrade a replication link that has previously been downgraded.

Since downgrading a link to a destination affects the consistency and durability guarantees provided by the set of AMPS instances as a whole, use caution when configuring the parameters. In general, it's a good idea to set an interval that is larger than the amount of time at which an instance would be considered to be unresponsive or offline.

Automatic downgrade is implemented as an AMPS action. To configure automatic downgrade, add the appropriate action to the configuration file as shown below:

<AMPSConfig>
    ...
    <Actions>
        <Action>
            <On>
                <Module>amps-action-on-schedule</Module>
                <Options>

                    <!-- This option determines how often AMPS checks whether destinations have fallen
                         behind. In this example, AMPS checks destinations every 15 seconds. In most
                         cases, 60East recommends setting this to half of the Age setting. -->
                    <Every>15s</Every>
                </Options>
            </On>
            <Do>
                <Module>amps-action-do-downgrade-replication</Module>
                <Options>

                    <!-- The maximum amount of time for a destination to fall behind. If AMPS has been
                         waiting for an acknowledgment from the destination for longer than the
                         Age, AMPS downgrades the destination. In this example, AMPS downgrades any
                         destination for which an acknowledgment has taken longer than 300 seconds. -->
                    <Age>300s</Age>
                </Options>
            </Do>
            <Do>
                <Module>amps-action-do-upgrade-replication</Module>
                <Options>

                    <!-- The threshold for upgrading the replication link back to sync
                         acknowledgment. If the destination is behind by less than this
                         amount, and was previously downgraded to async acknowledgment,
                         AMPS will upgrade to sync acknowledgment. -->

                    <Age>10s</Age>
                </Options>
            </Do>
        </Action>
    </Actions>
   ...
</AMPSConfig>

In this configuration file, AMPS checks every 15 seconds to see whether a destination has fallen behind by more than 300 seconds. A destination that has fallen behind by more than 300 seconds should no longer be considered online. Typically, this threshold would be set to a duration longer than the time at which monitoring of that instance would produce alerts that the instance is unavailable.

When a destination exceeds the threshold, AMPS downgrades the destination to async acknowledgment. That destination will no longer be considered when acknowledging messages to publishers. Once the link to the destination is downgraded, connections to this instance should not consider that destination to be safe for failover until the link has again been upgraded. In this example, a previously downgraded destination is upgraded back to sync acknowledgment when it is behind by less than 10 seconds.

All publishers using a publish store should be able to hold a number of messages equal to the number of messages published, at peak message volume, for a time period equal to the periodicity of the downgrade check plus the threshold for downgrade. With the configuration above, that period is 15 seconds plus 300 seconds, or 315 seconds. A publisher that publishes at a peak rate of 10,000 messages per second should therefore, at a minimum, be able to allocate a publish store that holds 315 × 10,000 = 3,150,000 messages.

In some cases, it is important that an instance maintain a minimum number of destinations that use sync acknowledgment. For those cases, an instance-level Tuning parameter is available that will prevent the action from downgrading a connection if doing so would reduce the number of destinations that use sync acknowledgment below the configured limit. This parameter does not guarantee that a specific destination will continue using sync acknowledgment: it only limits whether AMPS will downgrade a destination that meets the downgrade criteria. AMPS will not upgrade a destination that has previously been downgraded if a connection is lost, even if this means that the number of currently connected destinations that use sync acknowledgment is less than the configured minimum. See the AMPS Configuration Guide section on instance-level parameters for details.

Copyright 2013-2024 60East Technologies, Inc.