Understanding Message Persistence




To take advantage of transactional messaging, the publisher and the AMPS instance work together to ensure that messages are written to persistent storage. AMPS lets the publisher know when the message is persisted, so that the publisher knows that it no longer needs to track the message.

When a publisher publishes a message to AMPS, the publisher assigns each message a unique sequence number. Once the message has been written to persistent storage, AMPS uses the sequence number to acknowledge the message and let the publisher know that the message is persisted. Once AMPS has acknowledged the message, the publisher considers the message published. For safety, AMPS always writes a message to the local transaction log before acknowledging that the message is persisted. If the topic is configured for synchronous replication, all replication destinations have to persist the message before AMPS will acknowledge that the message is persisted.

For efficiency, AMPS may not acknowledge each individual message. Instead, AMPS acknowledges the most recent persisted message to indicate that all previous messages have also been persisted. Publishers that need reliable publishing do not wait for an acknowledgment before publishing more messages. Instead, they retain each message until it is acknowledged, and republish any unacknowledged messages if failover occurs. The AMPS client libraries include this functionality for persistent messaging (see the descriptions of the publish store in the client library documentation).
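The retain-until-acknowledged behavior described above can be illustrated with a minimal in-memory sketch. The class `SketchPublishStore` and its method names are hypothetical and are not the AMPS client library API; the sketch also assumes that sequence numbers start at 1 and increase by one per message.

```python
from collections import OrderedDict


class SketchPublishStore:
    """Hypothetical in-memory store; not the AMPS client library API."""

    def __init__(self):
        self._next_seq = 1  # assumption: sequence numbers start at 1
        self._unacked = OrderedDict()  # sequence number -> message, in publish order

    def store(self, message):
        """Assign the next sequence number and retain the message until acknowledged."""
        seq = self._next_seq
        self._next_seq += 1
        self._unacked[seq] = message
        return seq

    def on_persisted_ack(self, acked_seq):
        """A persisted ack for acked_seq covers every message at or below it."""
        for seq in [s for s in self._unacked if s <= acked_seq]:
            del self._unacked[seq]

    def unacknowledged(self):
        """Messages to republish after failover, with their original sequence numbers."""
        return list(self._unacked.items())


store = SketchPublishStore()
for body in ("a", "b", "c", "d"):
    store.store(body)  # the publisher keeps publishing without waiting for acks

store.on_persisted_ack(3)  # one ack covers sequence numbers 1 through 3
pending = store.unacknowledged()  # only message 4 would be republished on failover
```

Republishing with the original sequence numbers matters in this sketch: it is what would let the server recognize any message that was in fact persisted before the failover.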

Client Names and the Transaction Log

When a transaction log is configured, AMPS must distinguish between publishers in order to reliably persist and replay the message stream. AMPS uses the client name as a unique application identifier, which lets it tell a connection from a different client apart from a new connection from the same client.

The contract between AMPS and the application is that the application must provide a client name that uniquely and consistently identifies a particular instance of the client application. The same instance of the same application should use the same client name each time that instance connects, and should not use the same client name as another instance.

When the transaction log is configured, an individual instance of AMPS enforces this contract by allowing only one connection at a time with a given client name.

To enforce this, if two clients attempt to connect with the same client name:

If the clients have the same authenticated user ID (or no user ID is set, in cases where default authentication is used), AMPS will consider this a case where the same program is attempting to reconnect. AMPS will consider the existing connection to be out of date and disconnect the existing connection.
If the clients have different authenticated user IDs, AMPS will consider these to be different applications attempting to use the same client name, and refuse to allow the new connection. AMPS will disconnect the new connection.

In either case, AMPS logs the disconnect reason in the event and error log. The disconnect reason for the connection that is removed will be logged as "name in use".
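The two cases above can be summarized in a short sketch of the decision logic. This is an illustration of the documented behavior only, not AMPS server code: the `NameRegistry` class, the integer connection IDs, and the in-memory log list are all hypothetical.

```python
class NameRegistry:
    """Hypothetical model of one-connection-per-client-name enforcement."""

    def __init__(self):
        self._connections = {}  # client name -> (user_id, connection_id)
        self.log = []           # (disconnected connection_id, reason)

    def connect(self, client_name, user_id, connection_id):
        """Return True if the new connection is kept, False if it is refused."""
        existing = self._connections.get(client_name)
        if existing is not None:
            existing_user, existing_conn = existing
            if existing_user == user_id:
                # Same user (or no user ID on both): treat as a reconnect
                # and drop the out-of-date existing connection.
                self.log.append((existing_conn, "name in use"))
            else:
                # Different users: refuse and disconnect the new connection.
                self.log.append((connection_id, "name in use"))
                return False
        self._connections[client_name] = (user_id, connection_id)
        return True


registry = NameRegistry()
first = registry.connect("pub-1", "alice", 1)   # new name: accepted
second = registry.connect("pub-1", "alice", 2)  # same user: replaces connection 1
third = registry.connect("pub-1", "bob", 3)     # different user: refused
```

In this sketch, passing `None` as the user ID on both connections models the default-authentication case, since the two values compare equal.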

Message Sequence Numbers

Every message stored in the transaction log can be referred to by a bookmark, which is a combination of an identifier for the publisher and the sequence number of the message.

AMPS uses the sequence number to identify duplicate messages.

If a message arrives at AMPS (either from a publisher or over replication) with a sequence number that is equal to or lower than the highest sequence number seen for that publisher, the message is considered to be a duplicate and discarded.

There is no other significance to sequence numbers. The sequence numbers simply represent where, in the sequence of messages sent by that publisher, the current message falls.
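The duplicate check described above amounts to tracking the highest sequence number seen for each publisher. A minimal sketch follows; the `DuplicateFilter` class is hypothetical, and the sketch assumes sequence numbers are positive integers.

```python
class DuplicateFilter:
    """Hypothetical per-publisher duplicate detection by sequence number."""

    def __init__(self):
        self._highest_seen = {}  # publisher identifier -> highest sequence number

    def accept(self, publisher_id, seq):
        """Return True for a new message, False for a duplicate to discard."""
        # Assumption: sequence numbers are positive, so 0 means "nothing seen yet".
        highest = self._highest_seen.get(publisher_id, 0)
        if seq <= highest:
            return False  # equal to or lower than the highest seen: duplicate
        self._highest_seen[publisher_id] = seq
        return True


dedup = DuplicateFilter()
results = [
    dedup.accept("pub-1", 1),  # new
    dedup.accept("pub-1", 2),  # new
    dedup.accept("pub-1", 2),  # republished after failover: duplicate
    dedup.accept("pub-1", 1),  # lower than highest seen: duplicate
    dedup.accept("pub-2", 1),  # different publisher: tracked independently
]
```

Because the comparison is made per publisher, two publishers can use the same sequence numbers without conflict, which is one reason the client name contract matters.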

In most applications, message sequence numbers are automatically managed as part of the store-and-forward mechanism of the AMPS client libraries (implemented as the PublishStore for the client). The PublishStore assigns sequence numbers and manages reliable publication to AMPS.

When AMPS receives a message without a sequence number for a topic in the transaction log, AMPS creates a different publisher identifier for these messages, based on the publisher identifier and the name of the AMPS instance. AMPS uses that identifier for the origin of the message, and assigns a sequence number. This approach provides unique identifiers for the messages while preventing conflicts with sequence numbers that the publisher might use.