Replaying Messages with Bookmark Subscription

One of the most useful and powerful features in AMPS is bookmark subscription, which is enabled by the transaction log. With bookmark subscription, an application requests a subscription that starts at a specific point in the transaction log. AMPS begins the subscription at the specified point, and provides messages from the transaction log.

Each message in the transaction log has a bookmark. A bookmark is an opaque, unique identifier that AMPS adds to each message recorded in the transaction log. For messages provided from a transaction log, the bookmark is included in the Bookmark header of the message. AMPS guarantees that bookmarks for the instance are monotonically increasing, which enables AMPS to rapidly find an individual bookmark within the transaction log.

A bookmark subscription simply requests that AMPS begin the subscription with the first message following the bookmark provided with the subscription. AMPS locates the bookmark in the transaction log, and begins the subscription at that point in the log.

One way to think about a bookmark subscription is that AMPS publishes to the subscribing client only those messages that:

  1. Have bookmarks after the provided bookmark

  2. Match the subscription's Topic and Filter

  3. Have been written to the transaction log

AMPS provides these messages in the order in which they were recorded to the transaction log. Since a bookmark subscription requires a transaction log, when a client requests a bookmark subscription for a topic that is not being recorded in the transaction log, AMPS returns an error.
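The three conditions above can be sketched with a toy in-memory model. This is not the AMPS implementation or a client library API: LogEntry, replay, and the use of integers as bookmarks are assumptions made for illustration (real bookmarks are opaque), and the content filter is modeled as a plain Python predicate.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass(frozen=True)
class LogEntry:
    bookmark: int   # stand-in for an opaque AMPS bookmark (assumption)
    topic: str
    data: dict


def replay(log: List[LogEntry], after: int, topic: str,
           content_filter: Callable[[dict], bool] = lambda d: True
           ) -> Iterator[LogEntry]:
    """Yield logged entries that follow `after` and match topic and filter."""
    for entry in log:                       # list order == order recorded
        if entry.bookmark <= after:         # 1. bookmark after the one provided
            continue
        if entry.topic != topic:            # 2. matches the topic...
            continue
        if not content_filter(entry.data):  #    ...and the filter
            continue
        yield entry                         # 3. only logged entries are in `log`


log = [
    LogEntry(1, "orders", {"qty": 5}),
    LogEntry(2, "quotes", {"px": 10}),
    LogEntry(3, "orders", {"qty": 50}),
    LogEntry(4, "orders", {"qty": 500}),
]
replayed = [e.bookmark for e in replay(log, after=1, topic="orders",
                                       content_filter=lambda d: d["qty"] > 10)]
```

In this model, a message that was never recorded in the log simply has no entry to replay, which mirrors the third condition.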

If the subscription requests a completed acknowledgment, that acknowledgment will be delivered to the subscription once replay has completed. Messages delivered to the subscription after the acknowledgment is delivered are from new publishes. By default, those messages are delivered once they are written to the transaction log.

AMPS allows an application to submit a comma-delimited list of bookmark values as the bookmark for a subscription request. In this case, AMPS begins replay at the oldest bookmark in the list. The client controls the bookmark provided on the subscription request. For a bookmark subscription, the AMPS server does not keep a persistent record of which bookmarks a specific client or subscription has processed. The AMPS client libraries provide facilities for easily tracking the messages which an application has processed so the application can resume at the appropriate point in the transaction log.

Requesting replay from the transaction log is how AMPS applications manage resumable subscriptions. The application keeps track of which messages have been processed, and requests replay from the appropriate point in the transaction log to resume the message stream. This record-keeping is built into the AMPS client libraries, and most often handled transparently when a bookmark store is set for the client. Notice that this means that the AMPS server itself does not track the progress of individual subscriptions, nor does the application need to inform the server of how far the subscription has progressed. The application state needed to resume the subscription is entirely handled on the application side, with no involvement by the AMPS server. (For details on how specific client libraries manage the application state, see the Developer Guide for that client library.)

Bookmark subscriptions are provided from the transaction log rather than the live publish stream. This lets AMPS adapt the pace of replay to the pace at which the subscriber is consuming replayed messages without triggering slow client offlining.

While there are similarities between a bookmark subscription used for replay and a State of the World (SOW) query, the transaction log and SOW are independent features that can be used separately. The SOW gives a snapshot of the current view of the latest data, while the transaction log can replay previous messages. Historical SOW queries provide a snapshot of the SOW at a defined point in the past, and are provided by the SOW database rather than the transaction log.

There are different ways that a client can request a bookmark replay from the transaction log. Each of these bookmark types meets a different need and enables a different recovery strategy that an application can use. The sections below describe the recovery types, the cases in which they can be used, and how the 60East clients implement them.

Replay of Full Transaction Log

The epoch bookmark, when requested on a subscription, will replay the transaction log starting at the very beginning. Once the transaction log has been replayed in its entirety, then the subscriber will begin receiving messages on the live incoming stream of messages. A subscriber does this by requesting a 0 in the bookmark header field of their subscription. The AMPS clients provide a constant for epoch, typically represented as EPOCH.

This type of bookmark can be used in a case where the subscriber has begun after the start of an event, and needs to catch up on all of the messages that have been published to the topic.

To ensure that no messages from the subscription are lost during the replay, AMPS replays messages from the transaction log until the client reaches the last message in the transaction log. Once all of the existing messages in the transaction log have been sent to the client, AMPS will cut over to the live subscription stream and provide messages to the client as soon as they are persisted.

Bookmark Replay from NOW

The NOW bookmark, when requested on a subscription, does not replay any messages from the transaction log. Instead, AMPS begins the subscription at the live stream, delivering any newly published messages that are recorded in the transaction log and match the subscription's Topic and Filter.

This type of bookmark is used when a client is concerned with messages that will be published to the transaction log, but is unconcerned with replaying the historical messages in the transaction log. This strategy is often used for applications that want to ensure that they do not miss messages, even if the application temporarily loses connectivity, but are not concerned with older messages. For this case, the application subscribes with NOW when the application starts, and then re-establishes the subscription with the most recently-processed bookmark if connectivity is lost. This resubscription behavior is typically handled by the client reconnection logic (as in the 60East HAClient implementations).

To request the NOW bookmark, a subscriber sends a subscribe command with "0|1|" in the bookmark field. The AMPS clients provide a constant for this value, typically represented as NOW.

Bookmark Replay with a Bookmark

Clients that store the bookmarks from published messages can use those bookmarks to recover from an interruption in service. By placing a subscribe query with the last bookmark recorded, a client will get a replay of all messages persisted to the transaction log after that bookmark. Once the replay has completed, the subscription will then cut over to the live stream of messages.

To perform a bookmark replay, the client places a bookmark subscription with the bookmark at which to start the subscription.

A bookmark subscription must always start at a specific point in the transaction log. The AMPS server uses a bookmark to locate that point in the transaction log and begin replay at that point.

If a bookmark is unknown (that is, the transaction log does not contain that bookmark), the AMPS server does not have a specific point at which to begin the replay and will assume that the subscriber has failed over from another instance that has not yet replicated publishes to this instance, and begin replay at NOW (the end of the transaction log).
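The rules for choosing a starting point can be summarized in a small sketch. This is a toy model under stated assumptions, not AMPS internals: the log is a sorted list of integer bookmarks (real bookmarks are opaque), and start_index is a hypothetical helper. The values "0" for EPOCH and "0|1|" for NOW come from the text above.

```python
from bisect import bisect_left
from typing import List

EPOCH = "0"     # start of the transaction log
NOW = "0|1|"    # end of the transaction log


def start_index(log_bookmarks: List[int], bookmark: str) -> int:
    """Return the index of the first log entry to replay (toy model)."""
    if bookmark == EPOCH:
        return 0                          # replay everything
    if bookmark == NOW:
        return len(log_bookmarks)         # replay nothing, go live
    wanted = int(bookmark)                # assumption: integer bookmarks
    # Bookmarks increase monotonically, so a binary search can find one.
    i = bisect_left(log_bookmarks, wanted)
    if i < len(log_bookmarks) and log_bookmarks[i] == wanted:
        return i + 1                      # first message after the bookmark
    return len(log_bookmarks)             # unknown bookmark: treated as NOW


log_bookmarks = [10, 20, 30]
```

For example, start_index(log_bookmarks, "20") selects the entry after bookmark 20, while an unknown value such as "25" falls through to the end of the log.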

Developer Note: the MOST_RECENT value

The AMPS client libraries provide a special constant value that requests that the library look up the appropriate recovery point in the bookmark store and then provide that recovery point in the subscription request. This special value is typically represented as MOST_RECENT, RECENT, or recent. When the application requests a bookmark subscription with a bookmark of MOST_RECENT, the client library looks for the most recent bookmarks processed for that subscription, then provides the appropriate bookmark or list of bookmarks when resubscribing. This process helps to ensure that the subscription begins at the last processed message, and the application receives the next unprocessed message for the subscription. If there is no record of a subscription, the AMPS clients will start with EPOCH, so that the first time a subscription is entered, the application gets the full record of available messages.

It's important to remember that the AMPS server has no knowledge of the MOST_RECENT value. MOST_RECENT itself is never sent to AMPS and never appears in the AMPS log. MOST_RECENT is simply a request to the AMPS client library to look up the exact bookmark value to provide to AMPS. The AMPS client libraries always translate a request for MOST_RECENT into either a specific value (typically a list of bookmarks) or EPOCH.
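As a rough illustration of that translation step, the sketch below models a client-side store that resolves a MOST_RECENT request before anything is sent to the server. ToyBookmarkStore, its methods, and the "bm-001" style bookmark strings are hypothetical placeholders; the real client libraries have their own bookmark store classes and may return a list of bookmarks rather than a single value.

```python
from typing import Dict, List

EPOCH = "0"  # bookmark value for the start of the transaction log


class ToyBookmarkStore:
    """Hypothetical in-memory record of processed bookmarks per subscription."""

    def __init__(self) -> None:
        self._processed: Dict[str, List[str]] = {}

    def record(self, sub_id: str, bookmark: str) -> None:
        # Called by the application after it finishes processing a message.
        self._processed.setdefault(sub_id, []).append(bookmark)

    def resolve_most_recent(self, sub_id: str) -> str:
        # Translate a MOST_RECENT request into a concrete bookmark value.
        bookmarks = self._processed.get(sub_id)
        return bookmarks[-1] if bookmarks else EPOCH


store = ToyBookmarkStore()
first_start = store.resolve_most_recent("sub-1")   # no record yet
store.record("sub-1", "bm-001")
store.record("sub-1", "bm-002")
resume_from = store.resolve_most_recent("sub-1")   # last processed bookmark
```

The value returned by resolve_most_recent is what would be placed on the subscribe request; the MOST_RECENT token itself never leaves the client.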

Bookmark Replay from a Moment in Time

The final type of bookmark supported is the ASCII-formatted timestamp. When using a timestamp as the bookmark value, the transaction log replays all messages that occurred after the timestamp, and then cuts over to the live subscription once the replay stream has been consumed.

This bookmark has the format of YYYYmmddTHHMMSS[Z] where:

  • YYYY is the four digit year.

  • mm is the two digit month.

  • dd is the two digit day.

  • T is the character separator between the date and time.

  • HH is the two digit hour.

  • MM is the two digit minute.

  • SS is the two digit second.

  • Z is an optional timezone specifier. AMPS timestamps are always in UTC, regardless of whether the timezone is included. AMPS only accepts a literal value of Z for a timezone specifier.

For example, the timestamp for January 2nd, 2015, at 12:35:00 UTC is:

20150102T123500Z

With a timestamp, the AMPS server begins replay at the closest point in the current transaction log, even if there is no message recorded at that exact moment. This means that a timestamp prior to the beginning of the journal will start a replay at EPOCH.
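A timestamp bookmark in this format can be produced with the Python standard library. The helper name timestamp_bookmark is an assumption for illustration; it converts a timezone-aware datetime to UTC first, since AMPS timestamps are always in UTC.

```python
from datetime import datetime, timedelta, timezone


def timestamp_bookmark(moment: datetime) -> str:
    """Format a timezone-aware datetime as YYYYmmddTHHMMSSZ in UTC."""
    if moment.tzinfo is None:
        # A naive datetime is ambiguous, so refuse to guess its timezone.
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


# January 2nd, 2015, at 12:35:00 UTC
example = timestamp_bookmark(datetime(2015, 1, 2, 12, 35, tzinfo=timezone.utc))

# The same instant expressed in a UTC-5 local time
local = datetime(2015, 1, 2, 7, 35, tzinfo=timezone(timedelta(hours=-5)))
same_instant = timestamp_bookmark(local)
```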

Bookmark Replay with a Starting and Stopping Point

As of version 5.3.2, AMPS allows a subscriber to specify the point at which a bookmark replay should stop. When a stopping point is specified, the subscriber receives no further messages after the replay reaches that point. If a completed acknowledgment is requested for the subscription, AMPS returns the acknowledgment when the stopping point is reached, rather than when replay reaches the point in the transaction log at which the subscription was entered.

To set a starting and stopping point, a subscription provides a subscription range specifier with the bookmark. The format of the subscription range specifier is as follows:

<begin_interval_specifier> <begin_bookmarks> : <end_bookmarks> <end_interval_specifier>

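The begin_interval_specifier is one of:

  • [ includes the starting bookmark in the replay (inclusive).

  • ( begins the replay after the starting bookmark (exclusive).

The end_interval_specifier is one of:

  • ] includes the stopping bookmark in the replay (inclusive).

  • ) stops the replay before the stopping bookmark (exclusive).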

Bookmarks provided for a subscription range specifier can be either timestamps (as described above) or literal bookmarks provided by AMPS.

For example, to replay messages received by the instance on June 4, 2020 we could construct an interval beginning at midnight on June 4 UTC (inclusive) and ending at midnight on June 5 UTC (exclusive), as follows:

[20200604T000000:20200605T000000)
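A range specifier like the one above can be assembled with a small helper. The function name and parameters are hypothetical. The "[" (inclusive start) and ")" (exclusive end) characters come from the example; the use of "(" for an exclusive start and "]" for an inclusive end is an assumption based on conventional interval notation.

```python
def range_specifier(begin: str, end: str,
                    include_begin: bool = True,
                    include_end: bool = False) -> str:
    """Build a subscription range specifier from two bookmark values."""
    opener = "[" if include_begin else "("   # "(" is assumed, see lead-in
    closer = "]" if include_end else ")"     # "]" is assumed, see lead-in
    return f"{opener}{begin}:{end}{closer}"


# All of June 4, 2020 (UTC): inclusive start, exclusive end
june_4 = range_specifier("20200604T000000", "20200605T000000")
```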

AMPS allows future timestamps in a range specifier. For example, it would be valid for an application to enter a subscription that collects messages for a full business day and then completes, even if the application is started before business hours. AMPS will begin delivering messages at the start time specified, and deliver a completed ack (if requested) and stop delivering messages at the end time specified.

The starting and stopping points are most often provided as timestamps. AMPS also allows an application to specify starting and stopping points using message bookmarks. In this case, rather than finding the specific time specified, AMPS finds the point in the log at which the bookmark was recorded.

An application can choose to submit a list of bookmarks as the starting or stopping point. In this case, AMPS will find the earliest bookmark in the starting point list (as recorded in the local transaction log) and the latest bookmark in the stopping point list (as recorded in the local transaction log), and replay as though those two bookmarks had been provided to the replay command.

Content and Topic Filtering

As with all other subscriptions, bookmark subscriptions support content filtering.

Bookmark subscriptions provide only messages from topics that are recorded in the transaction log. In other words, when a bookmark subscription uses a topic regular expression, only messages from topics that are recorded in the transaction log are provided to the subscription. This ensures that a bookmark subscription provides a consistent, repeatable stream of messages. The topics provided to the subscription are the same during replay, when only messages recorded in the transaction log are available, and after replay completes, when every publish to AMPS is available. This also ensures that a bookmark subscription that replays messages for a specific timeframe gets the same messages as bookmark subscribers that had active subscriptions during that timeframe.

Content filtering is covered in greater detail in AMPS Expressions.

Delivery Rate Control for Bookmark Subscriptions

AMPS allows subscribers to specify the maximum delivery rate for messages delivered from a bookmark subscription. A subscriber specifies the maximum rate at which AMPS should deliver messages to the subscription. AMPS then limits the rate at which replay from the transaction log occurs so that the overall rate does not exceed the specified maximum. Rate control is not available for subscriptions that use the live option.

To request rate control, a subscriber provides the rate option on the subscription. A rate can be specified in either messages per second, number of bytes delivered per second, or a multiple of the original delivery rate. For example, the following subscription option limits delivery to 1000 messages per second:

rate=1000

To limit delivery to 500KB per second, a subscriber would provide this option:

rate=500KB

To limit delivery to double the speed at which messages were originally published, a subscriber would provide this option:

rate=2X

To limit delivery to half the speed at which messages were originally published, a subscriber would provide this option:

rate=.5X

When using a rate that is a factor of the original replay speed, you may want AMPS to skip over long gaps. For example, you may want to do load testing by replaying several days' worth of operations at a 5x multiplier. In that case, however, your load test does not need to be idle when there are gaps during which no messages are produced (for example, outside of trading hours or during holidays). For this situation, AMPS provides a rate_max_gap option that sets the maximum amount of time for a replay to wait to produce a message. For example, with an option string like:

rate=5X,rate_max_gap=10s

AMPS will attempt to produce messages at 5 times the original publish rate. In the event that there is a gap between messages of more than 50 seconds in the original publish stream (that is, 10 seconds in the replay), AMPS will wait for 10 seconds and then "skip ahead" to the next message in the replay.
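The arithmetic behind this behavior can be checked with a short sketch. The function replay_wait is a hypothetical model of the pacing described above, not AMPS code: it scales the original gap by the multiplier and caps the result at rate_max_gap.

```python
from typing import Optional


def replay_wait(original_gap_s: float, multiplier: float,
                max_gap_s: Optional[float] = None) -> float:
    """Seconds a paced replay waits before producing the next message."""
    scaled = original_gap_s / multiplier     # 5X turns a 20 s gap into 4 s
    if max_gap_s is not None and scaled > max_gap_s:
        return max_gap_s                     # skip ahead after the cap
    return scaled


short_gap = replay_wait(20, 5, max_gap_s=10)    # under the cap
long_gap = replay_wait(600, 5, max_gap_s=10)    # capped by rate_max_gap
uncapped = replay_wait(600, 5)                  # no rate_max_gap set
```

With rate=5X and rate_max_gap=10s, any gap longer than 50 seconds in the original stream is replayed as a 10 second wait.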

Pausing and Resuming Bookmark Subscriptions

As of version 5.0, AMPS offers the ability to pause a bookmark subscription. When a subscriber requests that AMPS pause the subscription, AMPS stops providing messages from the bookmark subscription, but does not remove the subscription. The subscriber can then resume the subscription, and AMPS will again begin providing messages from the subscription. While the subscription is paused, AMPS maintains a record of the current position in the transaction log, and continues replay from that point when the subscription is resumed.

This feature is most useful for starting replay of a number of subscriptions at the same point in the transaction log, and ensuring that those subscriptions are resumed together and progress at the same rate. This can be useful for testing purposes, or for reconstructing a sequence of events that involve multiple subscriptions.

Notice that sending a pause command for a subscription does not affect messages that have already been sent to a client, and that bookmark replays will automatically pace themselves to the rate that a client is consuming messages. This means that, while pause can be used to request that AMPS temporarily halt a subscription, this option is not recommended to control message rate for a client that is oversubscribed.

An application may create a subscription in the paused state by including pause as an option on the initial subscribe command. To pause an active subscription, a subscriber sends a subscribe command with the existing subscription ID and the pause option. To resume a subscription, a subscriber sends a subscribe command with the subscription ID (or a comma-separated list of subscription IDs) and the resume option. The AMPS clients provide convenience functions or constants for the pause and resume options.

AMPS allows a given client to pause or resume multiple subscriptions at once.

When multiple bookmark subscriptions are resumed at the same time, AMPS will attempt to combine replay for the subscriptions. When AMPS can combine replay, AMPS will guarantee that messages across subscriptions are delivered from the same replay, which can help to preserve order across subscriptions. AMPS can combine subscriptions when they are delivered to the same client connection, were paused at the same bookmark, delivered at the same rate and are resumed with the same command. This feature can be useful for synchronizing message delivery across a number of subscriptions. When using pause and resume for this purpose, an application typically includes the pause option on a number of subscriptions when the subscriptions are created, and then resumes the subscriptions when the application is ready to begin the replay.

Pausing a subscription stops AMPS from sending messages to the client once the pause command is processed. However, any messages already on the network, or in a network buffer on the client or the server will be delivered to the client.

AMPS allows you to begin a subscription in the paused state by providing the pause option when creating the subscription.

AMPS removes a paused subscription if the subscriber disconnects: for restarting a subscription across subscriber restarts, use the basic bookmark subscription features as described above.

Conflation and Bookmark Subscriptions

AMPS supports subscription conflation for bookmark subscriptions, as described in Conflated Subscriptions.

Conflation for bookmark subscriptions works the same way that conflation for regular subscriptions works. Messages from the replay are held by AMPS for the conflation interval. If during that interval the replay finds a message with the same conflation_key value, AMPS replaces the held message with the message from the replay. At the end of the conflation interval, AMPS provides the currently held message to the subscriber. During replay from the transaction log, the conflation interval refers to the timeline of the messages being replayed. That is, a conflation interval of 1s will provide conflated messages with the same conflation_key value published during the same second, even if during transaction log replay the messages are replayed at a much higher rate.

When using conflation, the bookmark provided on a conflated message is the bookmark of the first message conflated during the interval, rather than the bookmark of the message that AMPS delivers at the end of the conflation interval.
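The following toy model illustrates conflation on the replay timeline, including which bookmark accompanies the delivered message. It is a sketch under assumptions, not the AMPS algorithm: the conflate function and the message tuples are hypothetical, and the model assumes that the interval for a key starts at the logged time of the first held message for that key.

```python
from typing import Dict, List, Tuple

# (logged_time_s, bookmark, conflation_key, data)
Message = Tuple[float, str, str, dict]


def conflate(messages: List[Message],
             interval_s: float) -> List[Tuple[str, str, dict]]:
    """Return (key, first_bookmark, latest_data) for each conflated message."""
    held: Dict[str, Tuple[float, str, dict]] = {}  # key -> (start, bookmark, data)
    out: List[Tuple[str, str, dict]] = []
    for ts, bookmark, key, data in messages:
        # Release held messages whose interval ended by this logged time.
        expired = [k for k, (start, _, _) in held.items()
                   if ts >= start + interval_s]
        for k in expired:
            _, first_bm, latest = held.pop(k)
            out.append((k, first_bm, latest))
        if key in held:
            start, first_bm, _ = held[key]
            held[key] = (start, first_bm, data)    # keep first bookmark
        else:
            held[key] = (ts, bookmark, data)       # start a new interval
    for k, (_, first_bm, latest) in held.items():  # flush at end of replay
        out.append((k, first_bm, latest))
    return out


messages = [
    (0.0, "b1", "IBM", {"px": 1}),
    (0.4, "b2", "IBM", {"px": 2}),
    (0.5, "b3", "MSFT", {"px": 9}),
    (1.2, "b4", "IBM", {"px": 3}),
]
delivered = conflate(messages, interval_s=1.0)
```

Because the intervals are computed from logged times rather than wall-clock time, the result is the same no matter how quickly the replay itself runs.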

Requesting Message Timestamps

Messages that are replayed from the transaction log will not have the timestamp field of the header populated by default. In order to request timestamps, provide the timestamp option when creating the subscription.

Selecting Message Durability Options

AMPS supports two distinct options for specifying message durability. By default, messages are provided to a bookmark subscription when they are persisted to the local transaction log.

Once replay from the transaction log is finished, AMPS sends messages to subscribers as the messages are processed. By default, AMPS waits until a message is persisted to the local transaction log before sending the message to subscribers. Since each message delivered is persisted, this approach ensures that the sequence of messages is consistent for this instance across client and server restarts, and that messages that are received by a subscriber will be available after a restart.

AMPS provides options that a subscriber can use to change the point at which AMPS delivers messages once replay from the transaction log has finished.

Using the 'fully_durable' Option for Bookmark Subscriptions

With the fully_durable option, once replay from the transaction log is finished, AMPS sends a message to the subscriber only when the message has been persisted in the local transaction log and all synchronous downstream replication destinations have acknowledged the message. This option is useful for applications where processing of a message should not begin until more than one AMPS instance has persisted the message.

This option will typically introduce more latency for incoming messages when those messages must be replicated. When this option is used and one or more of the synchronous downstream replication destinations that receives messages for this topic is offline, the instance will not deliver incoming messages until that destination comes back online or is downgraded to asynchronous replication.

Using the 'live' Option for Bookmark Subscriptions

In some cases, reducing latency may be more important than consistency. To support these cases, AMPS provides a live option on bookmark subscriptions. For bookmark subscriptions that use the live option, once replay has finished, AMPS sends messages to subscribers before the message has been persisted. This can provide a small reduction in latency at the expense of increasing the risk of inconsistency upon failover. For example, if a publisher does not republish a message after failover, your application may receive a message that is not stored in the transaction log and that other applications have not received.

The live option increases the risk of inconsistent data between your application and AMPS in the event of a failover. 60East recommends using this option only if the risk is acceptable and your application requires the small latency reduction this option provides.

Since the live option does not wait for messages to be persisted, subscriptions that use this option are subject to slow client offlining after replay from the transaction log is complete.

The rate, pause, and resume options are not supported with the live option.

Copyright 2013-2024 60East Technologies, Inc.