Replication Basics
Before planning an AMPS replication topology, it can be helpful to understand the basics of how AMPS replication works. This section presents a general overview of the concepts that are discussed in more detail in the following sections.
Replication is point-to-point. Each replication connection involves exactly two AMPS instances: a source that provides messages and a destination that receives messages.
Replication is always "push" replication. In AMPS, the source configures a destination and pushes messages to that destination. (It is possible to configure the source to wait for the destination to connect rather than actively making an outgoing connection, but replication is still a "push" from the source to the destination once that connection is made.) The source must be configured to push messages to the destination, and the source guarantees that a message to be replicated is not removed from the transaction log until the destination has acknowledged it.
Replication is one-link by default. By default, an instance of AMPS only replicates messages that are published directly to that instance by a client. Optionally, an instance of AMPS can be configured to also replicate messages that arrive over replication. Adding this configuration is typically required if there are more than two instances in a replicated set of AMPS instances. See the PassThrough Replication topic for details.
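The one-link default can be illustrated with a small toy model. This is a sketch only, not AMPS code or configuration: the Instance class, its passthrough flag, and the build_chain helper are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Toy stand-in for an AMPS instance (hypothetical, for illustration only)."""
    name: str
    passthrough: bool = False  # hypothetical flag modelling pass-through replication
    destinations: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def receive(self, message, via_replication=False):
        # Every message that arrives is recorded locally.
        self.log.append(message)
        # By default, only messages published directly by a client are replicated.
        if via_replication and not self.passthrough:
            return
        for destination in self.destinations:
            destination.receive(message, via_replication=True)


def build_chain(passthrough_on_b):
    """Build a three-instance chain A -> B -> C."""
    a = Instance("A")
    b = Instance("B", passthrough=passthrough_on_b)
    c = Instance("C")
    a.destinations.append(b)
    b.destinations.append(c)
    return a, b, c
```

In this model, a message published to A reaches C only when B is built with passthrough enabled, which is why a set of more than two instances typically needs the extra configuration.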
Replication relies on the transaction log. AMPS replicates the commands as preserved in the transaction log. This means, for example, that the results of delta publishes are replicated as fully-merged messages, since fully-merged messages are stored in the transaction log. Likewise, if duplicate messages arrive over different paths, only the first message to arrive will be stored in the transaction log, and that message is the one that will be replicated.
Replication always provides messages to a destination in the order in which the messages are recorded in the transaction log of the instance sending the message. Messages that are not stored in the transaction log cannot be replicated.
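As a rough illustration of these two properties (duplicates are stored once, and replication follows log order), consider the toy model below. It is a sketch under stated assumptions rather than a description of AMPS internals: the TransactionLog class is hypothetical, and identifying a message by a (publisher, sequence number) pair is an assumption made for this example.

```python
class TransactionLog:
    """Toy transaction log (hypothetical, for illustration only)."""

    def __init__(self):
        self.entries = []   # messages in the order they were recorded
        self._seen = set()  # assumed message identity: (publisher, sequence)

    def record(self, publisher, seq, payload):
        """Store the message unless the same message was already recorded."""
        key = (publisher, seq)
        if key in self._seen:
            return False    # duplicate that arrived over another path
        self._seen.add(key)
        self.entries.append((publisher, seq, payload))
        return True

    def replication_stream(self):
        """Messages are provided to a destination in log order."""
        return list(self.entries)
```

If the same message arrives twice over different paths, only the first arrival is appended, and the stream offered to a destination keeps the order in which entries were recorded.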
Replication provides a command stream. In AMPS replication, the server replicates the results of publish, delta_publish, and sow_delete commands once those results are written to the transaction log. Each individual command is replicated, for low latency and fine-grained control of what is replicated. If a command is not in the transaction log (for example, a maintenance action has removed the journal that contains that command, or the command is for a topic that is not recorded in the transaction log), that command will not be replicated.
Replication is intended to guarantee that the command stream for a set of topics on one instance is present on the other instance, with the ordering of each message source preserved. This means that there can be only one connection from a given upstream instance to a given downstream instance.
Replication is customizable by topic, message type, and content. AMPS can be configured to replicate the entire transaction log, or any subset of the transaction log. This makes it easy to use replication to populate view servers, test environments, or similar instances that require only partial views of the source data.
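Conceptually, selecting a subset of the transaction log for a destination looks like the sketch below. This is not AMPS configuration syntax: the select_for_replication function, the dictionary layout of a log entry, and the use of a Python callable as a content filter are all assumptions made for illustration.

```python
def select_for_replication(log, topics, content_filter=None):
    """Return the log entries a destination would receive.

    log            -- list of dicts with "topic" and "data" keys (assumed layout)
    topics         -- set of topic names configured for the destination
    content_filter -- optional predicate applied to each entry's data
    """
    selected = []
    for entry in log:
        if entry["topic"] not in topics:
            continue  # topic is not replicated to this destination
        if content_filter is not None and not content_filter(entry["data"]):
            continue  # content does not match the filter
        selected.append(entry)
    return selected
```

For example, a view server that only needs European orders could be modelled with topics={"orders"} and a filter such as lambda data: data.get("region") == "EU".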
Replication guarantees delivery. AMPS will not remove a journal file until all messages in that journal file have been replicated to, and acknowledged by, the destination.
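The retention rule can be sketched as a low-water-mark calculation. The model below is a simplification under stated assumptions: the removable_journals function is hypothetical, each journal is represented only by its name and the sequence number of its last message, and each destination is represented by the highest sequence number it has acknowledged.

```python
def removable_journals(journals, acked_through):
    """Return the names of journals that no destination still needs.

    journals      -- list of (name, last_sequence) tuples, oldest first
    acked_through -- dict of destination name -> highest acknowledged sequence
                     (must contain at least one destination)
    """
    # A journal is safe to remove only once every destination has
    # acknowledged every message in it, so the slowest destination decides.
    low_water_mark = min(acked_through.values())
    return [name for name, last_seq in journals if last_seq <= low_water_mark]
```

In this model a single destination that falls behind holds back journal removal for the whole instance, which is the price of the delivery guarantee.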
Replication is composable. AMPS is capable of building a sophisticated replication topology by composing connections. For example, full replication between two servers is two point-to-point connections, one in each direction. The basic point-to-point nature of connections makes it easy to reason about a single connection, and the composable nature of AMPS replication allows you to build replication networks that provide data distribution and high availability for applications across data centers and around the globe.
Replication acknowledgment is configurable. The acknowledgment mode provides different guarantees: async acknowledgment provides durability guarantees for the local instance, whereas sync acknowledgment provides durability guarantees for the local instance and the downstream instance.
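One way to read the difference between the two modes is as a condition on when a message counts as durable. The function below is a minimal sketch of that reading, not AMPS behavior taken from the product: the is_durable name and its boolean parameters are hypothetical.

```python
def is_durable(mode, persisted_locally, acked_downstream):
    """Decide whether a message meets the durability guarantee for a mode.

    mode              -- "async" or "sync"
    persisted_locally -- the local instance has persisted the message
    acked_downstream  -- the downstream instance has acknowledged the message
    """
    if mode == "async":
        # Guarantee covers the local instance only.
        return persisted_locally
    if mode == "sync":
        # Guarantee covers the local instance and the downstream instance.
        return persisted_locally and acked_downstream
    raise ValueError(f"unknown acknowledgment mode: {mode!r}")
```

Under this reading, sync acknowledgment trades some latency for a guarantee that spans both instances, while async acknowledgment depends only on the local instance.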
Group identifies a set of instances that are intended to be fully equivalent. This identification is for the purposes of message contents, application failover, and AMPS replication failover. Instances that are not intended to be fully equivalent for all of these purposes should be given a different Group name, even if they are in the same data center or geographic location, or if they would be treated as equivalent for some, but not all, purposes.
More detail on each of these points is provided in this section.
Benefits of Replication
Replication can serve two purposes in AMPS:
It can increase the fault-tolerance of AMPS by creating another instance to be used should an instance fail or be taken offline.
It can deliver messages to a remote site.
To provide fault tolerance and reliable message delivery to remote sites, AMPS implements a number of guarantees and features. Those features are discussed in the following sections.
Replication in AMPS supports filtering by both topic and by message content. This granularity in filtering allows replication sources to have complete control over what messages are sent to their downstream replication instances.
Additionally, replication can improve the availability of AMPS by creating a redundant instance of an AMPS server. Using replication, all of the messages that flow into a primary instance of AMPS can be replicated to a secondary instance. If the primary instance becomes unresponsive for any reason, the secondary instance can be swapped in to begin processing message streams and requests.
When an AMPS instance is a replication source, that instance guarantees that messages will not be removed from the transaction log until all destinations have acknowledged the message.