How Does the SOW Work?
Last updated
Copyright 2013-2024 60East Technologies, Inc.
Much like tables in a relational database, topics in the AMPS SOW persist the most recent update for each message. AMPS identifies a message by using a unique key for the message. The SOW key for a given message is similar to the primary key in a relational database: each distinct value of the key identifies a unique message. The first time a message is received with a particular SOW key, AMPS adds the message to the SOW. Subsequent messages with the same SOW key value update the message.
There are several ways to create a SOW key for a message:
Most applications specify that AMPS assigns a SOW key based on the content of the message. The fields to use for the key are specified in the SOW topic definition, and consist of one or more XPath expressions. AMPS finds the specified fields in the message and computes a SOW key based on the name of the topic and the values in these fields. 60East recommends this approach unless an application has a specific need for a different approach.
A topic can also be configured to require that a publisher provide a SOW key for each message when publishing the message to AMPS.
AMPS also supports the ability for custom SOW key generation logic to be defined in an AMPS module, which will be invoked to generate the SOW key for each message. While these SOW keys are generated automatically by AMPS, rather than being provided by the publisher, the logic to generate these keys is provided by the module, and the configuration required (if any) is determined by the module.
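To make the content-based approach concrete, the following is a minimal sketch of deriving a key from the topic name and the values of the configured key fields. It is not the algorithm AMPS uses: the hash function, the key length, and the helper names `extract_field` and `compute_sow_key` are all assumptions chosen for illustration, and the path handling covers only simple `/a/b` style expressions against a parsed JSON-like message.

```python
import hashlib
import json


def extract_field(message: dict, path: str):
    """Resolve a simple /a/b style path against a parsed message (illustrative only)."""
    node = message
    for part in path.strip("/").split("/"):
        node = node[part]
    return node


def compute_sow_key(topic: str, message: dict, key_fields: list) -> str:
    """Hypothetical key derivation: hash the topic name plus each key field value.

    AMPS computes its own SOW keys internally; this only demonstrates the idea
    that the key depends on the topic and the key fields, not on other content.
    """
    hasher = hashlib.sha256()
    hasher.update(topic.encode("utf-8"))
    for field in key_fields:
        value = extract_field(message, field)
        hasher.update(b"\x00")  # separator so adjacent values cannot run together
        hasher.update(json.dumps(value, sort_keys=True).encode("utf-8"))
    return hasher.hexdigest()[:16]


# Two messages with the same /orderId map to the same key, whatever else changes.
first = compute_sow_key("ORDERS", {"orderId": 2, "price": 120}, ["/orderId"])
second = compute_sow_key("ORDERS", {"orderId": 2, "price": 95}, ["/orderId"])
print(first == second)  # True
```

Because only the topic name and the key fields feed the hash in this sketch, changing a non-key field such as the price leaves the key unchanged, which is the property that lets a later message update an earlier one.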
The following diagrams demonstrate how the SOW works, using a SOW topic that is configured to have AMPS determine the SOW key based on the /orderId field within the message. As each message comes in, AMPS uses the contents of the /orderId field to generate a SOW key for the message. The SOW key is used to identify unique records in the SOW, so AMPS will store a distinct record for each distinct /orderId value published to this topic. The calculated SOW key will be returned in the SowKey header of messages received from the topic in the SOW.
In the previous diagram, two messages are published, and neither message has a key that matches an existing record in the ORDERS topic. Both messages are inserted as new messages.
Some time after these messages are processed, an update comes in for the order with an orderId of 2. This message changes the price from 120 to 95. Since the incoming message has an orderId of 2, it matches an existing record and overwrites the existing message for the same SOW key, as seen in the diagram below. AMPS replaces the entire record with the contents of the update.
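The insert and update behavior described above can be modeled as a dictionary keyed by SOW key. This is a simplified sketch rather than a description of AMPS internals: the `SowTopic` class is hypothetical, the `(topic, orderId)` tuple stands in for a real SOW key, and the `symbol` field and the values for order 1 are invented for the example.

```python
class SowTopic:
    """Toy model of a SOW topic: one stored record per distinct key."""

    def __init__(self, name: str, key_field: str):
        self.name = name
        self.key_field = key_field
        self.records = {}

    def publish(self, message: dict) -> str:
        # Stand-in for the SOW key: the topic name plus the key field value.
        sow_key = (self.name, message[self.key_field])
        outcome = "update" if sow_key in self.records else "insert"
        # The whole record is replaced; fields are not merged with the old record.
        self.records[sow_key] = dict(message)
        return outcome


orders = SowTopic("ORDERS", "orderId")
print(orders.publish({"orderId": 1, "symbol": "ABC", "price": 50}))   # insert
print(orders.publish({"orderId": 2, "symbol": "XYZ", "price": 120}))  # insert
print(orders.publish({"orderId": 2, "price": 95}))                    # update
print(orders.records[("ORDERS", 2)])  # {'orderId': 2, 'price': 95}
```

Note that the update for order 2 omits the `symbol` field, so the stored record no longer contains it. This reflects the full-replacement behavior: the record after an update is exactly the contents of the most recent message.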
Although the SOW key is derived from the content of the message in many cases, the SOW key is distinct from the content of the message. Each record in a SOW topic has a distinct SOW key, which is stored with the record. The SOW stores the full message in the message type format for performance. There is no re-serialization required to send a message to subscribers.
By default, a topic recorded in the SOW is persistent. For these topics, AMPS stores the contents of the SOW for that topic in a dedicated, memory-mapped file. This means that the total SOW does not need to fit into memory, and that the contents of the SOW database are maintained across server restarts. You can also define a transient SOW topic, which does not store the contents of the SOW to a persisted file.
The SOW file is separate from the transaction log, and you do not need to configure a transaction log to use a SOW. When a transaction log is present that covers the SOW topic, on restart AMPS uses the transaction log to keep the SOW up to date. When the latest transaction in the SOW is more recent than the last transaction in the transaction log (for example, if the transaction log has been deleted), AMPS takes no action. If the transaction log has newer transactions than the SOW, AMPS replays those transactions into the SOW to bring the SOW file up to date. If the SOW file is missing or damaged, AMPS rebuilds the SOW by replaying the transaction log from the beginning of the log.
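The three restart cases described above can be sketched as a single replay function. This is an illustration under stated assumptions, not the AMPS recovery implementation: transactions are modeled as `(txn_id, key, message)` tuples with increasing integer ids, a missing or damaged SOW file is represented by `None`, and the function name `recover_sow` is hypothetical.

```python
def recover_sow(sow, sow_last_txn, txn_log):
    """Bring a SOW up to date from a transaction log.

    sow          -- dict of key -> message, or None if the file is missing or damaged
    sow_last_txn -- id of the latest transaction already reflected in the SOW
    txn_log      -- list of (txn_id, key, message), ordered by txn_id
    """
    if sow is None:
        # Missing or damaged SOW: rebuild by replaying the log from the beginning.
        recovered, last_txn = {}, 0
    else:
        recovered, last_txn = dict(sow), sow_last_txn

    for txn_id, key, message in txn_log:
        # Only transactions newer than the SOW are replayed. If the SOW is
        # already ahead of the log, nothing matches and no action is taken.
        if txn_id > last_txn:
            recovered[key] = message
            last_txn = txn_id
    return recovered, last_txn


log = [(1, "A", {"price": 10}), (2, "B", {"price": 120}), (3, "B", {"price": 95})]

# SOW is behind the log: transaction 3 is replayed.
print(recover_sow({"A": {"price": 10}, "B": {"price": 120}}, 2, log))

# SOW file is missing: the whole log is replayed.
print(recover_sow(None, 0, log))
```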
When a SOW topic is persistent, each Topic must be stored in a separate file. Only one instance of AMPS can access a given file; the same copy of the SOW file cannot be used by multiple instances of AMPS.
When the SOW for a Topic is transient, AMPS does not store the SOW for this topic across restarts. In this case, AMPS will synchronize the SOW with the transaction log when the server starts to restore the state of the topic. By default, this recovery processes the entire transaction log. You can use the RecoveryPoint configuration option to specify that the topic should have only new publishes or should recover from a specific point in time (for example, you could use an environment variable to provide a timestamp to the RecoveryPoint so that AMPS recovers only the last 24 hours of messages).
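The effect of recovering a transient topic from a point in time can be sketched as a filter over the transaction log. This models only the behavior described above. It does not show the syntax or accepted values of the RecoveryPoint option, and the function name, the log layout, and the sample data are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def rebuild_transient_sow(txn_log, recovery_point=None):
    """Rebuild an in-memory SOW from a log of (timestamp, key, message) entries.

    With no recovery point, the entire log is processed. With a recovery point,
    entries published before that time are skipped.
    """
    sow = {}
    for timestamp, key, message in txn_log:
        if recovery_point is not None and timestamp < recovery_point:
            continue
        sow[key] = message
    return sow


# Fixed "now" so the example is deterministic.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
log = [
    (now - timedelta(hours=30), "order-1", {"price": 50}),
    (now - timedelta(hours=2), "order-2", {"price": 95}),
]

print(sorted(rebuild_transient_sow(log)))                             # ['order-1', 'order-2']
print(sorted(rebuild_transient_sow(log, now - timedelta(hours=24))))  # ['order-2']
```

Restricting recovery to a recent window shortens startup for a large log, at the cost of omitting records whose most recent update falls before the recovery point.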