SOW/Topic

This section describes the configuration options for recording a Topic in the State of the World.

All SOW topics require a basic definition of the messages to be recorded.

Element

Description

MessageType (required)

Type of messages to be stored. To use AMPS-generated SOW keys, the message type specified must support content filtering so that AMPS can determine the SOW key for the message. All of the default message types except binary support content filtering. Because the binary message type does not support content filtering, that type can only be used for a SOW when publishers provide explicit keys.

See Message Types in the AMPS User Guide for a discussion of the message types that AMPS loads by default. Some message types (such as Google Protocol Buffers) require additional configuration, and must be configured before the message type is used in a SOW topic.

Name (required)

The name of the SOW topic. By default, unique messages published to this topic will be stored in a topic-specific SOW database.

Every SOW requires a method of determining which messages are unique. AMPS provides several methods. See Understanding SOW Keys in the AMPS User Guide for a discussion, and the Record Identity Definition section below for the relevant configuration items.

If no Name is provided, AMPS accepts Topic as a synonym for Name to provide compatibility with versions of AMPS prior to 5.0. Notice that if the topic uses a Pattern element (described below) to record multiple logical topics in a single physical topic, the Name defines the physical topic only.

Required Considerations

A SOW topic must also consider the following options:

Option

Description

Storage and Recovery

A SOW topic must either specify the FileName to be used for persisting the topic, or declare that the topic will be in-memory only by setting the Durability to transient. See Storage and Recovery Definition below for details.

Record Identity

A SOW topic must define how to determine whether a given message is a distinct record in the topic. Typically, record identity is based on the content of one or more fields in the message content.

See Record Identity Definition below for details.

Memory and File Growth

If the SOW topic will contain a large number of records, or if an individual record will exceed the default allocation size, the topic must define the allocation size. See Memory and File Growth below for details.
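As a rough illustration of the required items above, a minimal persistent topic might be declared as follows. This is a sketch rather than a recommended configuration: the topic name orders, the json message type, the ./sow/ path, and the /orderId key field are placeholder assumptions chosen for the example.

```xml
<SOW>
  <Topic>
    <!-- Basic definition: both elements are required -->
    <Name>orders</Name>
    <MessageType>json</MessageType>

    <!-- Storage and recovery: Durability defaults to persistent, so a
         FileName is required. %n is replaced with the Name and
         MessageType of the topic. The path is an assumption. -->
    <FileName>./sow/%n.sow</FileName>

    <!-- Record identity: AMPS generates the SOW key from this field.
         /orderId is a hypothetical field name. -->
    <Key>/orderId</Key>
  </Topic>
</SOW>
```

No allocation size is set here, so the sketch relies on the default SlabSize being large enough for every record.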

Optional Considerations

A SOW topic may also provide the following options. Notice, though, that there are restrictions on how some of these options can be combined with other options.

Option

Description

Indexing

A SOW topic can declare additional hash indexes or direct AMPS to create memo indexes before a field is queried by an application to improve performance.

See Indexing Options below for details.

Historical Query

A SOW topic can keep message state to allow "point in time" historical queries of current values. Notice that this option is not required for message-by-message replay; recording the topic in a transaction log provides full replay. Instead, this option provides the ability to determine what the current value for a message was at a particular point in time (even if that value was set, and remained unchanged, long before the point in time that is being queried). See Historical Query Options below for details.

Message Enrichment

A SOW topic can modify the content of messages as they are received from an application. See Message Enrichment Options below for details.

Recording Multiple Logical Topics in One Physical Topic

A SOW topic can be declared as a regular expression topic, where multiple topic names use the same definition and are stored in the same physical file. This can be useful when converting a system from topic-based routing, or any situation where a set of topics use the same message structure, the same method for determining the key, and each topic has a relatively small set of messages. See Multiple Logical Topics in One Physical Topic for details.

Storage and Recovery Definition

These options specify the storage for the topic and recovery behavior for the topic if the topic is not durably persisted or if the recovery file is not present.

Element

Description

FileName

The file where the State of the World (SOW) data will be stored.

This element is required for SOW topics with a Durability of persistent (the default) because those topics are persisted to the filesystem. This is not required for SOW topics with a Durability of transient.

This element should contain the path and the file name. The path can be either an absolute path or a path relative to the current working directory of the AMPS process.

Within this element, the escape %n will be replaced with the Name and MessageType of the topic. This can be a convenient way to avoid having to retype the topic name in this element.

Two different topics must not share the same file. Two instances of AMPS must not share the same file.

Durability

Defines the data durability of a SOW topic. SOW databases specified as persistent are stored to the file system and retain their data across instance restarts. Those specified as transient are not persisted to the file system and are recreated each time the AMPS instance restarts.

Notice that when the Durability is transient, and the topic is recorded in the transaction log, each time AMPS starts, AMPS will recover the state of the topic from the transaction log. The recovery begins at the RecoveryPoint specified for the topic, which defaults to epoch, the beginning of the transaction log. (For persistent topics, AMPS recovers from the last message written to the SOW topic, or from the RecoveryPoint if the SOW file is removed.)

Default: persistent

Valid values: persistent or transient

When a value of persistent is specified, the FileName element must be present.

Synonyms: Duration is also accepted for this parameter for backward compatibility with configurations prior to 4.0.0.1.

RecoveryPoint

For SOW topics that are covered by the transaction log, the point from which to recover the SOW if the SOW file is removed, or if the SOW topic has a Durability of transient.

This configuration item allows two values:

  • epoch recovers the SOW from the beginning of the transaction log

  • now recovers the SOW from the current point in the transaction log

Default: epoch

Expiration

The length of time a record should live in the SOW database for this topic. The expiration time is stored on each message, so changing the expiration time in the configuration file will not affect the expiration of messages already in the SOW.

AMPS accepts interval values for the Expiration, using the interval format described in the AMPS Configuration Guide section on units, or one of the following special values:

  • A value of disabled specifies that AMPS will not process SOW expiration for this topic. In this case, AMPS saves any expiration value set on a message by the publisher, but does not process expiration. This value must be set to disabled (the default) if History is enabled for this topic.

  • A value of enabled specifies that AMPS will process SOW expiration for this topic, with no expiration set by default. Instead, AMPS uses the value set on the individual messages (with no expiration set for messages that do not contain an expiration value).

Expiration must be disabled if History is enabled.

Default: disabled (messages never expire)
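The storage and recovery options above might be combined as in the sketch below. The topic names, key fields, file path, and the 15m interval are assumptions for illustration only. The first topic is only recovered on startup if it is also recorded in a transaction log, which is configured elsewhere and is not shown here.

```xml
<SOW>
  <!-- In-memory topic: no FileName is needed. When the topic is covered
       by the transaction log, AMPS rebuilds it at startup. Here recovery
       starts from the current point in the log instead of from the
       beginning (epoch, the default). -->
  <Topic>
    <Name>quotes</Name>
    <MessageType>json</MessageType>
    <Key>/symbol</Key>
    <Durability>transient</Durability>
    <RecoveryPoint>now</RecoveryPoint>
  </Topic>

  <!-- Persistent topic with an expiration interval. History must not be
       enabled on a topic that processes expiration. -->
  <Topic>
    <Name>sessions</Name>
    <MessageType>json</MessageType>
    <Key>/sessionId</Key>
    <FileName>./sow/%n.sow</FileName>
    <Expiration>15m</Expiration>
  </Topic>
</SOW>
```

Because the expiration time is stored on each message, changing 15m later would only apply to messages stored after the change.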

Record Identity Definition

Each SOW topic must define how AMPS will determine which messages are unique. An application can either have AMPS determine the key from one or more configured Key fields, or provide a SOW key with the publish command each time a message is published to AMPS. AMPS also supports custom SowKey generators supplied as plugin modules.

See the Understanding SOW Keys section of the AMPS User Guide for a full discussion. The following table lists the available configuration items for specifying how AMPS determines the SowKey for a message:

Options for Record Identity

Element

Description

Key

Specifies an XPath-based identifier within each message that AMPS will use to generate a SOW key, which determines whether a message is unique. This element can be specified multiple times to create a composite key from the combined value of the specified Key elements.

When one or more Key elements are specified for the SOW, AMPS generates the SOW key for each message. When no Key fields are specified and no KeyGenerator is specified, publishers must explicitly provide the SOW key for each message when the message is published.

60East recommends configuring a Key and having AMPS generate the SOW key for a message unless your application has specific needs that make this impractical.

AMPS automatically creates a hash index for the set of fields specified in the Key elements.

There is no default for this element.

KeyDomain

The seed value for SowKeys used within the topic when AMPS generates the SOW key. The default is the topic name, but it can be changed to a string value to unify SowKey values between different topics.

For example, if your application has a ShippingAddress SOW and a CreditRating SOW that both use /customerID as the SOW key, you can use a KeyDomain to ensure that the generated SowKey for a given /customerID is identical for both SOW topics. This does not affect how AMPS processes the SOW topics, but can make correlating information from different SOW topics easier in your application.

This option can only be specified when one or more Key fields are specified. When a SOW key generator module is used, or the publisher must send a SOW key, this option is not valid.

Default: The name of the SOW topic.

KeyGenerator

Specifies the SOW key generator module to use for this topic. When this configuration element is present, AMPS calls the specified module to generate a SOW key for each incoming message.

Default: Unset (no SOW key generator module). When there is no SOW key generator module specified, AMPS uses the specified Key fields if the Key fields are provided. If no generator is specified and no Key fields are specified, AMPS requires publishers to set a SOW key on each message published.

A KeyGenerator element contains the following elements:

  • Module: Required within a KeyGenerator element. The name of the module. This module must be loaded elsewhere in the configuration file.

  • Options: Contains one or more XML elements. These elements are provided to the key generator module as options. The options provided depend on the key generator. The creator of the key generator module must document the options for that module.
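The three ways of configuring record identity described in this table could look roughly like the following. Everything named here is hypothetical: the positions and events topics, the customer key domain, the example-key-generator module, and its ExampleOption element are invented for the sketch. A real key generator module defines its own options and must be loaded elsewhere in the configuration file.

```xml
<SOW>
  <!-- Shared KeyDomain: both topics key on /customerID, so a given
       /customerID value produces the same SowKey in each topic. -->
  <Topic>
    <Name>ShippingAddress</Name>
    <MessageType>json</MessageType>
    <FileName>./sow/%n.sow</FileName>
    <Key>/customerID</Key>
    <KeyDomain>customer</KeyDomain>
  </Topic>
  <Topic>
    <Name>CreditRating</Name>
    <MessageType>json</MessageType>
    <FileName>./sow/%n.sow</FileName>
    <Key>/customerID</Key>
    <KeyDomain>customer</KeyDomain>
  </Topic>

  <!-- Composite key: the combined value of both fields identifies a record. -->
  <Topic>
    <Name>positions</Name>
    <MessageType>json</MessageType>
    <FileName>./sow/%n.sow</FileName>
    <Key>/account</Key>
    <Key>/symbol</Key>
  </Topic>

  <!-- Key generator module: no Key or KeyDomain elements. The module
       name and option are placeholders. -->
  <Topic>
    <Name>events</Name>
    <MessageType>json</MessageType>
    <FileName>./sow/%n.sow</FileName>
    <KeyGenerator>
      <Module>example-key-generator</Module>
      <Options>
        <ExampleOption>value</ExampleOption>
      </Options>
    </KeyGenerator>
  </Topic>
</SOW>
```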

Memory and File Growth

The SOW topic configuration also specifies how the SOW file is allowed to grow. See the Setting SOW Parameters section in the Operations Best Practices topic of the AMPS User Guide for detailed recommendations. The configuration options for controlling how the file is allocated and how the file grows are listed below:

Element

Description

SlabSize

The size of each allocation for the SOW file, as a number of bytes. When AMPS needs more space for the SOW, it requests this amount of space from the operating system. This effectively sets the maximum message size that AMPS guarantees can be stored in the SOW. This size includes headers set by AMPS on the message.

60East recommends setting this value only if you will be storing messages larger than the default SlabSize or if performance or capacity testing indicates a need to tune SOW performance. If you plan to store messages larger than the default setting, 60East recommends a starting value of several times the maximum message size. For example, if your maximum message size is 2MB, a good starting point for SlabSize would be 8MB.

If it becomes necessary to tune the SlabSize, see the AMPS User Guide for a full discussion about tuning the SlabSize.

Default: 5MB

Maximum: 1GB

InitialSlabCount

The number of SOW slabs that AMPS will allocate on startup.

Default: 1

Maximum: 1024
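For the 2MB maximum message size mentioned above, a starting configuration might resemble this sketch. The topic name, key field, file path, and the InitialSlabCount of 4 are assumptions. The SlabSize is written as a plain byte count, on the assumption that 8MB is taken as 8 * 1024 * 1024 bytes.

```xml
<SOW>
  <Topic>
    <Name>documents</Name>
    <MessageType>json</MessageType>
    <FileName>./sow/%n.sow</FileName>
    <Key>/documentId</Key>

    <!-- 8388608 bytes = 8 * 1024 * 1024, about four times a 2MB
         maximum message size -->
    <SlabSize>8388608</SlabSize>

    <!-- Allocate four slabs at startup instead of the default of one -->
    <InitialSlabCount>4</InitialSlabCount>
  </Topic>
</SOW>
```

The values are starting points only; capacity testing is the better guide to whether either setting needs to change.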

Indexing Options

AMPS automatically creates a memo index for a field within a SOW topic when that field is used (for example, is used in a view or is queried by an application).

In addition, AMPS automatically creates a hash index (the primary key index) for the combination of fields used to define the SOW key. The options in this section allow you to manage index creation for a SOW topic.

Indexing is described in more detail in the Indexing SOW Topics section of the AMPS User Guide.

Element

Description

HashIndex

AMPS provides the ability to do fast lookup for SOW records based on specific fields.

When one or more HashIndex elements are provided, AMPS creates a hash index for the fields specified in the element. These indexes are created on startup and are kept up to date as records are added, removed, and updated.

The HashIndex element contains a Key element for each field in the hash index.

AMPS uses a hash index when a query uses an exact string match for all of the fields in the index. AMPS does not use hash indexes for range queries or regular expressions.

AMPS automatically creates a hash index for the set of fields specified in the set of Key fields for the SOW, if those fields are specified.

Index

AMPS automatically creates memo index fields as needed. This can include the first time a particular field is used in a query. AMPS supports the ability to create memo indexes for specific fields during startup using the Index configuration option.

When one or more Index elements are provided, AMPS creates memo indexes for any field specified in an Index element on startup, before a query that uses that field runs. Otherwise, AMPS indexes each field the first time a query uses the field. Adding one or more Index configurations to a SOW/Topic can improve retrieval performance the first time a query that contains the indexed fields runs for large SOW topics.

ExpectedKeyCountHint

For SOW topics that will contain a large number of distinct keys, providing an expected key count allows AMPS to pre-size the data structure that holds the key. This can provide a performance improvement for publishers by avoiding cases where AMPS has to resize the data structure.

There is no default for this value. When no value is provided, AMPS does not pre-size data structures for the SOW.

When provided, the value of this option should be the power of 2 closest to the maximum number of keys the topic is expected to hold. For example, a topic that is expected to hold 500,000,000 distinct records could set a value of 536870912 (that is, 2 ^ 29).
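The indexing options in this table might be combined as sketched below. The field names are hypothetical, and the sketch assumes that an Index element contains the XPath identifier of the field to index; check that form against the Indexing SOW Topics section of the AMPS User Guide before relying on it.

```xml
<SOW>
  <Topic>
    <Name>orders</Name>
    <MessageType>json</MessageType>
    <FileName>./sow/%n.sow</FileName>

    <!-- AMPS creates the primary key hash index for this field automatically -->
    <Key>/orderId</Key>

    <!-- Additional hash index, used only when a query has an exact
         string match on both fields -->
    <HashIndex>
      <Key>/customerId</Key>
      <Key>/status</Key>
    </HashIndex>

    <!-- Memo index built at startup rather than on first query
         (element content is an assumption) -->
    <Index>/region</Index>

    <!-- About 500,000,000 keys expected: the closest power of 2 is 2^29 -->
    <ExpectedKeyCountHint>536870912</ExpectedKeyCountHint>
  </Topic>
</SOW>
```

With this definition, a range query on /status would not use the hash index, since hash indexes apply only to exact matches on every indexed field.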

Historical Query Options

A SOW topic can, optionally, maintain the ability to query current values at a specific point in time. To specify this, include a History element in the topic configuration. The History element must include the following options:

Element

Description

History/Window (required if History is present)

Required when a History element is present.

For a historical SOW, the length of time to store history. For example, when the value is 1w, AMPS will store one week of history for this SOW.

Used within the History element.

History/Granularity (required if History is present)

Required when a History element is present.

For a historical SOW, the granularity of the history to store. For many applications, it is not necessary for AMPS to store all of the updates to the SOW. This parameter sets the resolution at which AMPS will save the state of a message. A value of 0s or equivalent specifies that AMPS will preserve every update within the Window.

For example, when you set a granularity of 1m, AMPS will save the state of the message no more frequently than once per minute, even when the state of the message is updated several times a minute.

Used within the History element.
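A historical SOW using the example values from this table (one week of history, saved at most once per minute) could be sketched as follows. The topic name, key field, and file path are placeholders.

```xml
<SOW>
  <Topic>
    <Name>prices</Name>
    <MessageType>json</MessageType>
    <FileName>./sow/%n.sow</FileName>
    <Key>/symbol</Key>

    <!-- Expiration is left at its default of disabled, as required
         when History is enabled. -->
    <History>
      <!-- Keep one week of history -->
      <Window>1w</Window>
      <!-- Save the state of a message no more than once per minute -->
      <Granularity>1m</Granularity>
    </History>
  </Topic>
</SOW>
```

Setting Granularity to 0s instead would preserve every update within the Window.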

Message Enrichment

AMPS can modify a message as it is published to the SOW topic from an application. See State of the World Message Enrichment in the AMPS User Guide for details.

Element

Description

Preprocessing

When present, specifies the message enrichment to be performed before AMPS determines the SOW key for the message.

The Preprocessing element must contain one or more Field elements that specify the enrichment to perform.

Enrichment

When present, specifies the message enrichment to be performed after AMPS determines the SOW key for the message.

The Enrichment element must contain one or more Field elements that specify the enrichment to perform.
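The difference between the two elements is when they run relative to key generation, which the sketch below tries to show. The field names are hypothetical, and the contents of the Field elements (an expression followed by AS and a destination field, using CONCAT) are an assumption about the enrichment syntax; see State of the World Message Enrichment in the AMPS User Guide for the supported form.

```xml
<SOW>
  <Topic>
    <Name>orders</Name>
    <MessageType>json</MessageType>
    <FileName>./sow/%n.sow</FileName>

    <!-- Runs before the SOW key is determined, so the Key below can
         use the field that preprocessing constructs. -->
    <Preprocessing>
      <Field>CONCAT(/region, '-', /orderNumber) AS /orderKey</Field>
    </Preprocessing>

    <Key>/orderKey</Key>

    <!-- Runs after the SOW key is determined. -->
    <Enrichment>
      <Field>/price * /quantity AS /total</Field>
    </Enrichment>
  </Topic>
</SOW>
```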

Multiple Logical Topics in One Physical Topic

AMPS can store the last values for a set of topics that match the same naming pattern and use the same configuration in a single set of SOW data structures and a single physical SOW file. See Storing Multiple Logical Topics in One Physical Topic in the AMPS User Guide for details.

Element

Description

Pattern (required)

When present, declares that this topic will record multiple logical topics into one physical data structure and file, and specifies the pattern to use to determine if the topic that a message is published to will be captured in this topic.

Physical topics that include multiple logical topics have the benefits and limitations described in the AMPS User Guide.

There is no default for this element.

When this element is present, the Topic cannot specify History, Preprocessing, or Enrichment.

Legacy Protocols Note: The legacy header formats do not include support for subscribing to or querying from topics that use this element.
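A topic that collects several logical topics might be declared along these lines. The names, file path, and the regular expression are assumptions for the sketch; confirm the pattern syntax against the AMPS User Guide.

```xml
<SOW>
  <Topic>
    <!-- Name identifies the physical topic only -->
    <Name>regional-orders</Name>

    <!-- Messages published to any topic whose name matches this pattern
         (for example, orders_emea or orders_apac) are stored here. -->
    <Pattern>^orders_.*$</Pattern>

    <MessageType>json</MessageType>
    <FileName>./sow/regional-orders.sow</FileName>
    <Key>/orderId</Key>

    <!-- History, Preprocessing, and Enrichment cannot be specified
         when Pattern is present. -->
  </Topic>
</SOW>
```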

Copyright 2013-2024 60East Technologies, Inc.