This chapter is for users who are new to AMPS and want to quickly get a simple instance of AMPS running. It describes how to install AMPS on a Linux system, explains the layout of the AMPS distribution, and uses the included spark command line AMPS client to send and receive a simple message. If you are on a Windows system without easy access to a Linux installation, a section at the end of the chapter includes information on configuring a Linux virtual machine.
This section covers the following topics:
Basic information on installing the AMPS server.
Information on running the server, including a description of the command line options for the server.
A brief description of the JSON format, which is used for the examples in this introduction.
An introduction to the basic command-line client included in the AMPS distribution.
AMPS runs on x64 Linux. This section describes how to install a development or evaluation system on Windows or macOS.
This section describes the monitoring interfaces provided for the AMPS server.
The AMPS engine binary is named ampServer and is found in $AMPSDIR/bin. Start the AMPS engine with a single command line argument that includes a valid path to an AMPS configuration file. You use the configuration file to enable and configure the AMPS features that your application will use. This guide discusses the most commonly used configuration options for each feature. The full set of options is described in the AMPS Configuration Guide.
The AMPS server generates a minimal sample configuration file with the --sample-config option. You can save the sample configuration file to $AMPSDIR/amps_config.xml with the following command line:
The server sample configuration only provides configuration for using AMPS to subscribe to and publish to ad hoc topics. The sample configuration file does not include any persistence for AMPS messages.
The file enables the instance monitoring interface (the "Galvanometer"), including the ability to query and subscribe to topics using a websocket connection.
A production configuration would likely provide persistent event and error logging to a file to allow an operations team to troubleshoot the instance and would typically persist monitoring statistics to a file. Such a configuration would likely enable additional message delivery features for certain topics and would also include configuration for high-availability and disaster recovery. The configuration would typically configure AMPS actions to perform routine maintenance.
On older processor architectures (and in some emulated environments) ampServer will start the ampServer-compat binary. The ampServer-compat binary avoids using hardware instructions that are not available on these systems.
You can also set the AMPS_PLATFORM_COMPAT environment variable to force ampServer to start the ampServer-compat binary. 60East recommends using this option only on systems that do not support the hardware instructions used in the standard binary. The ampServer-compat binary will not perform as well as ampServer, since it uses fewer hardware optimizations.
Once you have a configuration file saved to $AMPSDIR/amps_config.xml, you can start AMPS with that file as follows:
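The two steps described above (saving the sample configuration, then starting the server with it) could be scripted along the following lines. This is a minimal Python sketch, not part of the AMPS distribution: the helper names are hypothetical, and the sketch assumes only what this section states, namely that the binary is $AMPSDIR/bin/ampServer, that --sample-config writes the configuration to standard output, and that the server takes the path to the configuration file as its argument.

```python
import os
import subprocess


def server_binary(ampsdir):
    """Return the path to the ampServer binary under the installation directory."""
    return os.path.join(ampsdir, "bin", "ampServer")


def write_sample_config(ampsdir, config_path):
    """Run `ampServer --sample-config` and save its standard output to config_path."""
    result = subprocess.run(
        [server_binary(ampsdir), "--sample-config"],
        check=True,           # raise if the server binary reports an error
        capture_output=True,  # the sample configuration is written to stdout
        text=True,
    )
    with open(config_path, "w") as config_file:
        config_file.write(result.stdout)
    return config_path


def start_command(ampsdir, config_path):
    """Build the argument list that starts AMPS with the given configuration file."""
    return [server_binary(ampsdir), config_path]
```

For example, with the installation directory in a variable named ampsdir, calling write_sample_config(ampsdir, os.path.join(ampsdir, "amps_config.xml")) and then passing the result of start_command to subprocess.run would start the server in the foreground.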
If your first start-up is successful, you should see AMPS display a simple message similar to the following to let you know that your instance has started correctly.
The version numbers and dates will be appropriate for the version that you've started.
If you see this, congratulations! You have successfully cranked up the AMPS!
The AMPS server binary supports the following command line options:
--verify-config: Parse and verify the specified configuration file, then exit.
--sample-config: Produce a minimal AMPS config.xml file to standard output, then exit.
--dump-config: Process the specified configuration file, resolving any Include directives and expanding environment variables. Dump the resulting file to standard output.
--version: Print the AMPS version string, then exit.
--help: Print usage information for the command line options accepted by the ampServer program, then exit.
--daemon: Run AMPS as a daemon process.
-D<variable>=<value>: Set the specified environment variable to the specified value when running the AMPS process. AMPS accepts any number of -D options. For example, to set the variable AMPS_PATH to /mnt/fast/AMPS, use the command line option -DAMPS_PATH=/mnt/fast/AMPS
The current release of AMPS is available for evaluation download from the 60East website at http://www.crankuptheamps.com/evaluate.
To get started, download the Linux installation to a directory on your Linux system.
Installing AMPS is simply a matter of unpacking the distribution. The distribution contains the complete set of libraries and dependencies needed to run the AMPS server on a typical Linux server distribution. No additional software or packages are necessary for the server itself.
To install AMPS, unpack the distribution in the directory where you want the binaries and libraries to be stored. For the remainder of this guide, the installation directory will be referred to as $AMPSDIR, as if an environment variable with that name were set to the correct path.
Within $AMPSDIR are the following sub-directories:
bin: AMPS engine binaries and utilities
docs: Documentation
lib: Library dependencies
sdk: Include files for the AMPS extension API
AMPS includes support for a wide variety of message types, as well as the ability to develop custom message types and to send binary payloads. This section focuses on JSON as the main message type used for samples in this guide. We use JSON for the guide because the format is simple, easily readable, and already in use in many environments.
JSON format is a simple, standardized message format. JSON has two basic constructs:
Objects that consist of key / value pairs
Arrays of values
JSON supports hierarchical construction: the value for a key can be a single value, an array of values, or another set of key/value pairs. For example, the following JSON message includes two nested sets of key/value pairs. Notice that a key only needs to be unique within each set of values -- the name value for the ship does not conflict with the name value for the character.
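To make the nesting rule concrete, here is a minimal Python sketch that builds a document of this shape and reads it back. The field values are invented for illustration; only the structure (a ship object and a character object that each have their own name key, plus an array of values) follows the description above.

```python
import json

# Illustrative values only; the structure is what matters for this sketch.
document = {
    "ship": {
        "name": "Aurora",   # "name" within the ship object
        "crew_size": 4,
    },
    "character": {
        "name": "Riley",    # "name" within the character object: no conflict
        "skills": ["navigation", "repair"],  # an array of values
    },
}

# Serialize to JSON text and parse it back.
text = json.dumps(document)
parsed = json.loads(text)

# Each "name" is looked up within its own enclosing object.
ship_name = parsed["ship"]["name"]
character_name = parsed["character"]["name"]
```

Because each name key lives in a different object, the two lookups return different values even though the key is spelled the same.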
Many AMPS applications use JSON as the payload. In addition, the amps protocol used to send commands to AMPS represents commands in a simplified subset of JSON. For example, a publish command might look like:
The command to AMPS, using the amps protocol, can be treated as a JSON document which contains the header information for AMPS -- in this case, a publish to the topic test-topic. The header is followed by the message body, the payload of the command.
While the amps protocol is implemented as a subset of JSON, you can use any message type with the amps protocol. The header for the command will still be JSON, while the body can be in the message type of your choice, as in the sample below, which publishes to an XML topic:
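The header-then-body layout can be illustrated with a short Python sketch that decodes a leading JSON header and leaves the rest of the command untouched as the payload. This is only an illustration of the idea: the header field names used here ("c" for the command and "t" for the topic) and the topic name test-xml-topic are assumptions made for the sketch, and any framing the real protocol uses on the wire is not shown.

```python
import json


def split_command(raw):
    """Split a command string into (header dict, body string).

    The header is decoded as one JSON document. Everything after it is
    returned unchanged as the payload, whatever its message type.
    """
    header, end = json.JSONDecoder().raw_decode(raw)
    return header, raw[end:]


# Hypothetical header field names ("c", "t"): assumptions for this sketch.
json_publish = '{"c":"publish","t":"test-topic"}{"id":1,"message":"Hello, World!"}'
xml_publish = '{"c":"publish","t":"test-xml-topic"}<message><id>1</id></message>'

json_header, json_body = split_command(json_publish)
xml_header, xml_body = split_command(xml_publish)
```

In both cases the header parses as JSON, while the body is passed through as-is: a JSON document in the first command and an XML fragment in the second.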
The AMPS client libraries create and parse AMPS headers. For example, the publish method in the AMPS client libraries creates the appropriate header for a publish command based on the provided parameters.
Your applications use the Message and Command interfaces of the AMPS client libraries to work with the AMPS headers. There is no need for your application to parse or serialize the AMPS headers directly.
Welcome to the Advanced Message Processing System (AMPS) from 60East Technologies! AMPS is designed to help you quickly and easily develop and deploy data-intensive applications with demanding requirements for low latency and high performance. AMPS takes a nontraditional approach to messaging, storage, and analytics that is designed from the ground up for streaming data and highly-parallelized multicore systems.
AMPS isn't a traditional database or messaging product. This guide presents a brief introduction to help you understand the capabilities of AMPS and how AMPS operates.
AMPS is widely used for applications such as:
Tradeplant operations (including backtesting and historical analysis)
Risk calculations
Elastic worker farms
View servers
Message flow integration and "shock absorbers"
AMPS combines a set of capabilities that cut across traditional boundaries between applications that work with data.
AMPS is built around a fast messaging engine that supports both publish and subscribe (fan-out) and queued (competitive consumption) message delivery with full content filtering.
AMPS also provides an integrated database that applications can use as a current value cache, key/value document store, and fully queryable database -- or all of these at once. With this database, AMPS includes a built-in aggregation and analytics engine for near-real time analysis of streaming data, including aggregation across multiple topics or message formats.
Integrated message logging provides the ability to record and replay streams of messages with full fidelity.
AMPS is designed from the ground up for enterprise deployment at scale. AMPS provides an extensive set of high-availability features, including integrated replication and automatic failover and recovery for applications. Detailed monitoring and statistics are available through a RESTful interface for ease of data collection and integration with enterprise monitoring and management systems.
Authentication and entitlement capability applies to every operation in AMPS, for fine-grained control over permissions to meet enterprise policy and regulatory requirements. Access to data can be controlled at a topic level, at a message level (content-based security), or at the level of individual fields within a message (limiting the fields a given user has access to view).
60East developed AMPS to serve the needs of some of the most demanding data-intensive applications on the planet. The feature set and capabilities have been engineered for the highest levels of performance, designed for ease of use, and proven in production applications worldwide.
AMPS is designed to be a developer-friendly product. 60East recommends reading about AMPS with a running instance of AMPS and your development environment of choice available. Although 60East makes every effort to clearly describe how AMPS works, there is no substitute for seeing exactly how a running instance behaves (not to mention the advantages of being able to try out ideas or do quick prototyping while you read).
The table below lists the main parts of the AMPS documentation:
Overview of AMPS functionality.
This is a good place to start if you are new to AMPS or if you are familiar with older versions of AMPS.
Detailed description of AMPS functionality, including guidance and best practices.
If you have detailed questions about how AMPS works, refer to the AMPS User Guide.
Guide to AMPS configuration.
This guide shows the configuration file syntax and accepted values. The guide also includes useful examples of configuring commonly-used options.
Guide to the RESTful monitoring interface and the AMPS statistics database.
Use this guide when creating a monitoring strategy or when collecting statistics about an instance.
Description of the commands sent from an AMPS client to the AMPS server and responses from the server.
Client Language Developer Guides
Guide to using a client library to work with AMPS.
This guide uses the spark command line utility for basic examples for simplicity, although a production installation would use an application to perform these functions.
This section describes the overall AMPS approach and the features AMPS provides.
The AMPS messaging system is designed around a few simple principles:
Parallelize work and minimize waits and blocking to take full advantage of modern multisocket, multicore systems.
Eliminate redundant or unused work by only performing tasks that are necessary to provide the functionality requested by a given operation.
Reduce or eliminate cross-system coordination by solving the full range of data delivery and storage problems commonly faced by data-intensive applications.
Provide a small, flexible set of commands for ease of use.
Provide multiple delivery paradigms supporting both publish-subscribe delivery (many to many) and message queues (single consumption of a message) as well as the ability to query the state of a topic at a point in time.
Stay application-focused to provide exactly the capabilities that are heavily used in demanding high-performance applications.
Stay hardware aware and build for the future by engineering for next-generation commodity hardware and designing AMPS to fully exploit non-uniform memory access (NUMA), flash-based storage, and high-bandwidth networking.
These concepts are the foundation of how AMPS works and are helpful for understanding how to best use AMPS.
To best take advantage of AMPS, applications typically use the built-in features of AMPS rather than their traditional equivalents.
For example, rather than keeping a separate, independent record of each message published to AMPS for audit purposes, applications most often use one of the data persistence features in AMPS. This speeds development and simplifies deployment by eliminating integration effort, and also solves potential correctness issues which could be caused if messages in persistent storage become inconsistent with the messages provided through the messaging system. With AMPS, the messaging system itself can contain a fully-queryable and replayable record of the system.
As another example, AMPS provides integrated replication rather than relying on an external process. AMPS replication is aware of the format and semantics of the transaction log, the configuration of the instance, and the commands sent by publishers. This integration allows AMPS to very efficiently provide a full-fidelity message stream and to provide "self-healing" for an instance to catch up when it has been offline. Further, the message store used for replication (the AMPS transaction log) is also used for durable subscriptions and message replay. Designing and implementing these features together reduces complexity, storage requirements, and overhead to enable both capabilities. Within the AMPS server, the implementation uses a sophisticated parallelized algorithm for storage and replay that reduces overall latency and prevents slow consumers or replication destinations from affecting faster consumers. The overall result is to simplify configuration and application development, provide strong consistency and reliability guarantees, and provide the highest possible level of performance.
As a final example, rather than requiring a complex topic structure or requiring applications to oversubscribe and discard messages, AMPS provides both topic filtering and content filtering. AMPS includes an expressive filter grammar to provide precise selection of messages of interest to an individual subscriber. AMPS provides this capability to fully decouple publishers and subscribers. With AMPS, there's no need to maintain and administer a granular topic structure. Precise filtering and routing improves both network and processor utilization by providing only actionable messages to a subscriber. Likewise, for many applications, there is no need for a publisher to be aware of the processing performed by the subscriber or by AMPS itself.
The examples above highlight just a few of the capabilities AMPS provides and how the AMPS approach simplifies development, administration, and operations while providing reliability and performance benefits over conventional systems.
Some of the highlights of AMPS features include:
Topic based publish and subscribe, including full support for regular expressions to specify topic names.
Content filtering based on XPath identifiers (to specify the fields of a message) and SQL-92 (to form a predicate), with added support for Perl-Compatible Regular Expressions (PCRE2).
Message queues including content filtering for both publishers and subscribers, configurable strategies for delivery fairness, and truly distributed queues that can efficiently enforce queue semantics and delivery guarantees across a replicated network of AMPS instances.
Content-aware messaging support for a wide range of message types, including standard formats such as JSON, FIX, MessagePack, XML, Google Protocol Buffers, and BSON. AMPS also supports simple key/value pairs in FIX format (called NVFIX to emphasize that the format uses name/value pairs rather than FIX tags), and a high-performance binary protocol called BFlat. AMPS also supports uninterpreted binary messages, and allows you to create composite message types from existing types to easily combine messages of different types in a single payload.
An integrated database and record-aware current value storage (called State of the World, or SOW), with optional historical query capability.
Historical replay of message streams, including the ability to preserve the total message ordering across independent topics.
Integrated replication and high availability, including automatic resynchronization for instances that fail over.
Aggregation and Complex Event Processing (CEP), including the ability to aggregate information across different message streams and message streams of different formats.
Advanced messaging capabilities such as atomic query-and-subscribe, incremental (delta) updates, and out-of-focus notifications that tell a subscription when a record no longer matches.
Built-in statistics and monitoring, with data provided via a standard RESTful interface.
Integrated authentication and entitlement across all AMPS features.
Actions for automating AMPS functionality, including both routine maintenance tasks and dataflow-aware processing (such as alerting in response to slowdowns or invalid data).
Client development kits for popular programming languages such as Java, C#/.NET, C++, Python, JavaScript, and Go.
Extensibility API in the AMPS server for adding message types, extending the functions available to the AMPS query language, adding new actions, integrating with enterprise authentication and entitlement systems, and more.
This guide provides an overview of the most commonly used functionality of AMPS, but it is not intended to cover all of the features of AMPS or provide an exhaustive discussion of any individual feature. As mentioned earlier, the AMPS User Guide provides full details about AMPS features.
Welcome to the AMPS Server Documentation! This set of documentation contains detailed information on the AMPS server itself for version 5.3.4.
If you are looking for developer docs for client libraries, or previous versions of the AMPS server documentation, see the documentation page on the 60East web site.
Here are some suggested starting points:
You can also visit the AMPS Server FAQ site for frequently asked questions, and the AMPS developer pages for resources on developing applications with AMPS.
New to AMPS
Beginning an Evaluation of AMPS
Understanding an AMPS Feature
(see the chapter on the feature in question)
Planning a Deployment of AMPS
Developing Applications with AMPS
(developer guide and API reference for your language of choice -- available from the AMPS ) (further reading in the and for features you will use)
Verifying Configuration Options
Troubleshooting an Issue
Contacting 60East Support
The AMPS server runs on 64-bit Linux operating systems. If you do not have access to a Linux system or a recent version of Windows, 60East recommends creating a Linux virtual machine to host the instance of AMPS. This is a convenient option for development systems and allows you to easily experiment with different AMPS configurations on a dedicated system.
This section provides general information for creating a virtual machine image for use as a local development or evaluation environment.
This section assumes that you are familiar with Linux and the virtualization program you will be working with. It focuses on information specific to AMPS.
If your development system is running a recent version of Windows, then Windows Subsystem for Linux 2 is a good option for developing with AMPS. Getting AMPS running is simply a matter of starting a Linux shell, downloading AMPS, and following the directions for Linux.
Notice that Windows Subsystem for Linux 2 does not provide access to some of the functionality that the AMPS server expects: in particular, the AMPS NUMA subsystem may not be able to determine the physical processor layout and may report warnings on startup. Nevertheless, this can be a very good option for doing AMPS evaluation and development on a Windows system.
When creating the virtual machine image, 60East recommends the following parameters:
x64 processor
At least 4GB of memory allocated to the virtual machine
Minimum of 120GB drive space (most will be consumed by the operating system image)
At least 2 virtual processors
AMPS itself can run with less memory, processor, and disk capacity than recommended here. However, these settings will typically provide reasonable performance and enough capacity to do basic development work.
When installing AMPS on VirtualBox, 60East strongly recommends setting the network hardware emulation to use the Paravirtualized network adapter (virtio-net). For recent versions of Linux, performance is dramatically improved (even over the loopback interface) when using this setting.
AMPS runs well on any Linux distribution that meets the basic requirements. The Ubuntu Linux distribution is a good choice, and is frequently used by both customers and the 60East developers as a development workstation environment. Visit https://www.ubuntu.com/download to download the latest released version of Ubuntu.
Whichever distribution you choose, 60East recommends that you download the .iso file and use that file to install the operating system.
AMPS itself doesn't require anything beyond a basic operating system distribution. For the best experience while you are evaluating and getting to know AMPS, 60East recommends that you choose a profile optimized for software development or desktop use.
Select the following additional packages if your distribution does not already install them:
Python 2.6/2.7 or 3. The utility scripts in the AMPS distribution require Python.
Java runtime environment (1.7 or more recent). The spark command line AMPS client is written in Java, and requires a JRE. This guide assumes that you have a JRE available, and presents examples using spark.
g++, gdb, and your IDE of choice if you will be developing C++ applications with AMPS.
A web browser such as Firefox or Google Chrome
AMPS is designed and tested to use a Linux-based filesystem such as ext4 or filesystems that provide full native Linux filesystem semantics (for example, using nfs mounted filesystems for testing or archival purposes).
Mounting another type of filesystem (for example, an NTFS volume) in a VM, container, or WSL 2 may cause failures or unexpected results, since that approach may not provide all of the filesystem operations that AMPS uses.
When using a container, VM, or WSL 2, make sure that AMPS and the files that AMPS creates are hosted on filesystems that support Linux file operations, in particular, that a process running under the Linux environment can memory map files hosted on that filesystem.
Once you have created the virtual machine image and installed your Linux distribution of choice, you can install and start AMPS as described in Installing AMPS.
AMPS provides the spark utility as a command line interface for interacting with an AMPS server. spark provides many of the capabilities of the AMPS client libraries through this interface. The utility lets you execute AMPS commands from the command line. spark is a Java application, and requires Java runtime environment version 1.7 or later on the system.
spark is most commonly used for ad hoc testing or simple maintenance tasks. For more complicated tasks or more sophisticated maintenance, 60East recommends using one of the client libraries (such as the AMPS Python Client).
To test spark with the sample configuration, run the following command:
This command tests connectivity to the AMPS server running at port 9007 on the local system. It confirms that the server is listening on that port using the default protocol for AMPS and accepts JSON messages on that port. The command should produce output like the following:
You can read more about spark and other useful tools for troubleshooting AMPS in the Utilities chapter of the AMPS User Guide.
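If Java or spark is not yet available, a much coarser check is to confirm that something accepts TCP connections on the port. The following is a minimal Python sketch with a hypothetical helper name. It does not speak the AMPS protocol and so does not replace the spark connectivity test described above: it only shows whether a listener is present on the given host and port (9007 in the sample configuration).

```python
import socket


def port_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError if the connection is refused or times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, port_is_listening("localhost", 9007) returns True when a process on the local system is accepting connections on port 9007.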
When the Admin interface is configured (as it is in the sample configuration), you can get information about the state of the AMPS instance using either the Galvanometer monitoring tool or the RESTful interface to the AMPS statistics.
The Galvanometer is a JavaScript-based application that runs in your browser and provides a visualization of the data provided by the RESTful interface. The Galvanometer also includes a lightweight read-only AMPS client application, based on the JavaScript client library.
The RESTful admin interface is a lightweight view of the statistics database that AMPS maintains.
These two interfaces are available at the following URIs:
In the URIs above, <host> is the host the AMPS instance is running on and <port> is the administration port configured in the configuration file (this is 8085 in the sample configuration).
Monitoring applications typically collect information from the RESTful statistics interface. Interactive or ad hoc monitoring can use either the Galvanometer, or the interface offered by the monitoring application in use locally once the statistics are collected.
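As a small illustration of how a monitoring script might address these interfaces, the Python sketch below builds the two URIs from a host and administration port and fetches the top-level statistics resource. The function names are hypothetical, and the sketch assumes only the two paths given in this section (the root path for the Galvanometer and /amps for the RESTful statistics). It makes no assumption about the format of the response and returns the body as text.

```python
import urllib.request


def admin_urls(host, port):
    """Return the Galvanometer and RESTful statistics URIs for an instance."""
    base = f"http://{host}:{port}"
    return {
        "galvanometer": f"{base}/",
        "statistics": f"{base}/amps",
    }


def fetch_statistics_root(host, port, timeout=5.0):
    """Fetch the top-level RESTful statistics resource and return the body as text."""
    url = admin_urls(host, port)["statistics"]
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

With the sample configuration running locally, admin_urls("localhost", 8085) gives the addresses to open in a browser or to poll from a script.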
One of the core features of AMPS is the ability to persist the most recent update for each distinct message published to a topic. To enable this for a topic, you add the topic to the SOW.
You can think of the SOW as a database that maintains a specific set of topics, equivalent to tables. Each distinct message published to that topic is equivalent to updating a row in the table. AMPS allows applications to query the table for the current state of the topic.
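The table analogy can be sketched in a few lines of Python. This toy model is not how AMPS implements the SOW, and the class and method names are invented for illustration. It only shows the behavior described above: each distinct key holds one record, publishing a message with an existing key replaces that record, and a query returns the current state.

```python
class SowTopicSketch:
    """Toy in-memory model of a SOW topic: one stored record per distinct key."""

    def __init__(self, key_field):
        self.key_field = key_field
        self.records = {}  # key value -> most recent message with that key

    def publish(self, message):
        # A message whose key is already present replaces the stored record,
        # like updating a row. A message with a new key adds a row.
        self.records[message[self.key_field]] = dict(message)

    def query(self, predicate=lambda record: True):
        # Return copies of the current records that satisfy the predicate.
        return [dict(r) for r in self.records.values() if predicate(r)]


orders = SowTopicSketch(key_field="id")
orders.publish({"id": 1, "status": "open"})
orders.publish({"id": 2, "status": "open"})
orders.publish({"id": 1, "status": "filled"})  # replaces the record for id 1

open_orders = orders.query(lambda r: r["status"] == "open")
```

After the three publishes, the topic holds two records rather than three, and the query for open orders returns only the record for id 2.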
SOW topics also provide full support for pub/sub messaging. Applications can use a combination of queries and subscriptions as necessary. AMPS also includes a set of commands that perform an atomic query and subscribe, allowing an application to query a SOW topic and register for updates to the topic in a single operation, without risk of missing messages or receiving duplicates.
The most common uses of SOW topics include:
Quickly loading initial state for an application. For example, an application that tracks open orders can quickly retrieve a snapshot of all of the orders that are currently open, without having to wait for updates to the orders to be published.
Queryable snapshots of data flows. For example, an application that monitors telemetry data may need to quickly determine if any telemetry source has not provided an update within a given period of time. With a SOW topic, the application can run a simple query over the current state of the topic.
NoSQL document stores. SOW topics are frequently used as high-performance key/value stores: an application can choose to explicitly provide a key and store a document in the SOW. Documents can be efficiently retrieved by key, queried over the full content of the document, or any combination. As mentioned above, a consumer can retrieve the document and be automatically notified when the content of the document changes.
SOW topics are also the foundation of many of the more advanced capabilities of AMPS, including out-of-focus tracking, aggregation, and delta messaging. These are described later in this chapter.
For more information on the monitoring capabilities available in AMPS, see the chapter on in the AMPS user guide. For detailed information on the statistics available, see the .
For applications that are transitioning from topic-based routing and that, therefore, need to maintain the last value per topic for a large number of topics (hundreds, thousands, or more), AMPS provides the ability to reduce the overhead in creating a large number of identical topics that contain a single message. More details on the are available in the .
Galvanometer: http://<host>:<port>/
RESTful Statistics: http://<host>:<port>/amps
Storing a topic in the State of the World is most useful when your application needs to use the current state of the data being tracked. Storing a topic in the State of the World can be especially useful if your application would benefit from automatically receiving updates as soon as they are made (described in more detail in the Atomic Query and Subscribe topic).
The following are common uses of a SOW topic, each with one or more practical examples:
An application needs the current state of a record, but does not need to recreate the message flow that created that record:
An order fulfillment system presents a view of all currently pending orders when the application starts up.
An application needs the current state of a record or set of records, even when the topic is high-volume or quickly changing:
A warehouse management application locates the current inventory level for a product.
A taxi dispatch company locates taxis currently within 10 blocks of an event.
An application wants to be able to publish incremental updates to a record:
A customer updates her shipping address. All pending orders for the customer are automatically updated without affecting any other information in the order, and processors working with the orders are notified of the change.
An application wants to receive only the changed fields of a record:
A mobile application displays the status of an order as the order progresses through the stages of validation: the application receives only the identifier for the record and the changed fields.
An application needs the AMPS server to calculate values based on the current values of a record or set of records:
A management console constantly calculates the real-time value of pending orders. The console uses a view, calculated based on data saved in a topic in the SOW.
An application wants to store application state for quick retrieval:
An order processing system publishes statistics on each step of the process: a separate process monitors and aggregates those statistics. The SOW also maintains historical state for the topic so the monitor can easily recreate a snapshot of the state at a point in time and compare day over day status.
Of course, the examples above are just a small sample of the ways the AMPS SOW can be used.
To create a SOW topic, you configure the topic in the SOW section of the AMPS configuration file.
At a minimum, SOW topics require a Name and the MessageType of the messages to store in the SOW. If the SOW will be persistent, a FileName is required. Most often, SOW topics use AMPS to generate the SOW Key, and one or more Key definition elements are required to specify the fields that AMPS will use for the SOW Key.
For example, the following configuration file fragment specifies a SOW topic named test-sow. The topic stores JSON-format messages, and uses the /id field of incoming messages to that topic to uniquely identify messages. Records in this topic will be both maintained in memory and persisted to a file in the ./sow/ directory, so the contents of the topic will be retained across restarts of the AMPS instance. Notice that the file name specification uses the special format character %n as a placeholder for the topic name and message type.
The AMPS User Guide and AMPS Configuration Guide contain full details on configuring a SOW topic.
The practical examples later in this section use the configuration above.
A SOW topic is the basis for many of the advanced messaging features in AMPS. While not all of these features are discussed in detail in this introduction, many features of AMPS are made possible because AMPS can retain the current state of each unique message.
The advanced messaging features that the SOW enables include:
Views and aggregations over topics (including joins between topics)
Publishing incremental updates to a message (called delta publishing in AMPS)
Receiving incremental updates to a message (called delta subscription in AMPS)
Determining when a message no longer matches a filter (called out-of-focus notification in AMPS)
Providing a snapshot of an update to a rapidly changing record at regular intervals, rather than providing every update (called conflation in AMPS)
These features can greatly simplify the processing an application needs to perform, making it easier to develop applications and increasing application performance. However, for a messaging system to provide these features, whenever a message arrives, the messaging system must have access to both the current message and the previous, saved state of the message. SOW topics provide that access for AMPS, and enable the advanced messaging features.
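To make the dependency on saved state concrete, here is a minimal conceptual sketch in Python. It is not AMPS code: the function names, the flat field-by-field merge, and the sample record are all assumptions chosen for illustration. It shows why both delta publishing and out-of-focus notification need the previous state of a record as well as the incoming update.

```python
def apply_delta(saved, delta):
    """Merge changed fields into the saved record (hypothetical flat merge)."""
    merged = dict(saved)   # start from the previous, saved state
    merged.update(delta)   # overwrite only the fields present in the update
    return merged


def out_of_focus(previous, current, matches):
    """True when a record matched the filter before the update but not after."""
    return matches(previous) and not matches(current)


# Hypothetical record and filter, for illustration only.
saved = {"id": 7, "status": "pending", "address": "1 Old Rd"}
updated = apply_delta(saved, {"id": 7, "status": "shipped"})


def is_pending(record):
    return record["status"] == "pending"


print(updated)
print(out_of_focus(saved, updated, is_pending))
```

Neither function can be written without access to `saved`, which is the role the SOW plays in this simplified model.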
AMPS SOW topics persist the most recent update for each message, in the same way that a relational database stores the current state of each record. For performance, AMPS SOW topics store the full content of the message verbatim rather than storing a deserialized or "shredded" version of the message.
Each distinct record in a SOW topic is identified by a SOW key. AMPS treats the SOW key for a SOW topic the same way a relational database uses the primary key for a table: each distinct SOW key value is a unique message.
There are several ways to create a SOW key for a message. Each topic defines one of the following strategies:
Most applications specify that AMPS will calculate a SOW key based on the content of the message. The configuration of the topic specifies the field, or fields, to be used for the key.
A topic can also be configured to require that a publisher provide a SOW key for each message when publishing the message to AMPS. This strategy is less commonly used than determining the key based on the message content. However, because it does not require any explicit configuration, AMPS defaults to this strategy for identifying messages if no other strategy is specified.
AMPS also supports the ability for custom SOW key generation logic to be defined in an AMPS module, which will be invoked to generate the SOW key for each message.
Although the SOW key is derived from the content of the message in many cases, the SOW key itself is metadata, distinct from the content of the message. Each record in a SOW topic has a distinct SOW key, which is stored with the record.
For example, the diagram below shows how AMPS computes the SOW key for a topic named ORDERS with a key definition of /orderId. For each publish to the topic, AMPS uses the value of the key fields (in this case, simply /orderId) to compute a SowKey, then uses that SowKey to insert or update the appropriate record.
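The insert-or-update behavior described above can be modeled with a small Python sketch. This is a conceptual illustration only. The hashing scheme, the class name `SowTopicModel`, and the treatment of key definitions as top-level field names are assumptions, not a description of how AMPS computes or stores SOW keys.

```python
import hashlib
import json


def sow_key(message, key_fields):
    """Derive a key from the configured key fields (hypothetical scheme)."""
    # Key definitions such as "/orderId" are modeled as top-level field names.
    values = [str(message[field.lstrip("/")]) for field in key_fields]
    return hashlib.sha1("\x1f".join(values).encode("utf-8")).hexdigest()


class SowTopicModel:
    """Toy model of a SOW topic: one saved message per distinct key."""

    def __init__(self, key_fields):
        self.key_fields = key_fields
        self.records = {}  # key -> most recent message text, stored verbatim

    def publish(self, raw):
        key = sow_key(json.loads(raw), self.key_fields)
        self.records[key] = raw  # insert a new record or overwrite the old one
        return key


orders = SowTopicModel(["/orderId"])
k1 = orders.publish('{"orderId": 1, "status": "new"}')
k2 = orders.publish('{"orderId": 2, "status": "new"}')
k3 = orders.publish('{"orderId": 1, "status": "shipped"}')

print(len(orders.records))  # two distinct orderId values, so two records
print(orders.records[k1])   # the later publish replaced the earlier one
```

In this model, the key is kept separately from the message text, which mirrors the point that the SOW key is metadata rather than message content.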
At any point in time, applications can issue SOW queries to retrieve all of the messages that match a given topic and content filter. When a query is executed, AMPS will test each message in the SOW against the content filter specified and all messages matching the filter will be returned to the client. The topic can be a literal topic name or a regular expression pattern. For more information on issuing queries, please see Querying the State of the World (SOW) in the AMPS User Guide.
A SOW query is atomic. Updates that occur while the query is running, or while a client is receiving results, are not returned as part of the query.
Here's how to use spark to query the current state of an AMPS SOW topic.
This example assumes that:
You have configured a topic named test-sow in the AMPS server of message type JSON.
The test-sow topic uses the /id field of the message to calculate the key for the topic.
To retrieve the current state of the topic, an application issues the sow command. Unlike a subscription, which stays active until it is explicitly stopped (or the application disconnects), the sow command provides results for a specific point in time. Once the results are returned, the command is over.
First, publish a message or two to the test-sow topic:
Open a new terminal in your Linux environment.
Use the following command (with AMPS_DIR set to the directory where you installed AMPS) to send a single message to AMPS:
spark automatically connects to AMPS and sends a logon command with the default credentials (the current username and an empty password). With the publish command, spark reads the message from the standard input and publishes the message to the JSON topic test-sow. The command produces output similar to the following line (the rate calculation will likely be different):
When the publisher sends the message, AMPS parses the message to determine the value of the Key fields in the message, and then either inserts the message for that key, or overwrites the existing message with that key.
You can publish any number of messages this way. Each distinct id value will create a distinct record in the topic.
Next, retrieve the current contents of the topic:
Open a new terminal in your Linux environment.
Use the following command (with AMPS_DIR set to the directory where you installed AMPS) to retrieve the contents of the topic:
spark automatically connects to AMPS and sends a logon command with the default credentials (the current username and an empty password). spark then sends the sow command to AMPS. This command requests the current contents of the test-sow topic. Since the command is finished once the query is complete, spark will exit when the query results are complete.
spark shows the current contents of the topic. Notice that the output is strictly the message data, separated by newline characters. spark does not show any of the metadata for a message.
AMPS provides the ability to record topics and replay those topics at a later time. This capability is called the transaction log.
The AMPS transaction log fully supports topic and content filtering. You configure the transaction log to keep a journal of incoming messages for one or more topics, and then you can replay those messages, in order, from any point in time. With the (optional) high-availability features in the AMPS client libraries, this also provides a way to ensure that in case of failure, an application can resume the subscription without missing messages or receiving duplicate messages.
The AMPS transaction log is most often used for:
Fully Resumable Subscriptions - With the transaction log, you can ensure that an application receives all messages of interest, even in the event of a failure.
Backtesting and Audit - The transaction log allows you to replay the precise messages published, in order across all topics in the instance, at a configurable maximum rate. You can use this feature to easily audit the flow of messages, perform backtesting, or replay a sequence of events.
Capacity Planning and Stress Testing - Since the transaction log allows you to set the maximum replay rate to be a multiple of the original publish rate, you can use the transaction log to measure the load on a system at various rates, and measure the capacity of the system and the ability of your application to correctly handle increased volumes.
The transaction log is also the source of messages for:
The AMPS transaction log can typically record messages at the maximum throughput of the underlying storage device.
60East recommends storing the transaction log on a device that supports fast sequential writes, and ensuring that the device has the speed and capacity necessary to support the expected throughput. (The Operation and Deployment chapter of the AMPS User Guide includes guidance on capacity planning.)
The AMPS transaction log records messages that are published to the topics specified in the configuration file. Every publish is stored, in the order in which the AMPS instance processed the message.
For ease of maintenance, the AMPS server writes multiple sequential files, called journal files, for the transaction log rather than writing a single large file. The journals contain the full content of each message, as well as information on the topic, the publisher, the time at which the message was processed, and so on. The AMPS configuration sets the maximum size of a journal file. When a file reaches that size, AMPS begins writing to the next file.
When AMPS records a message into the transaction log, it assigns each message a bookmark. The bookmark identifies a single message, that is, a specific point in the transaction log of the local instance.
The AMPS server does not modify the contents of journal files. Once a message is written to a journal file, it is part of the transaction log and is considered to be immutable.
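The relationship between journal files, ordering, and bookmarks can be sketched as a toy append-only log in Python. This is a simplified model, not the AMPS implementation. The integer bookmarks, the rollover rule, and the class names are assumptions made for illustration, and real AMPS bookmarks have a different format.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    entries: list = field(default_factory=list)  # (bookmark, topic, payload)
    size: int = 0                                # bytes of payload written


class TransactionLogModel:
    def __init__(self, max_journal_bytes):
        self.max_journal_bytes = max_journal_bytes
        self.journals = [Journal()]
        self.next_bookmark = 1  # hypothetical: a simple increasing integer

    def record(self, topic, payload):
        current = self.journals[-1]
        # Roll over to a new journal rather than growing one file forever.
        if current.size > 0 and current.size + len(payload) > self.max_journal_bytes:
            current = Journal()
            self.journals.append(current)
        bookmark = self.next_bookmark
        self.next_bookmark += 1
        current.entries.append((bookmark, topic, payload))  # append only
        current.size += len(payload)
        return bookmark

    def replay(self, after_bookmark=0):
        """Yield recorded messages, in order, that follow the given bookmark."""
        for journal in self.journals:
            for bookmark, topic, payload in journal.entries:
                if bookmark > after_bookmark:
                    yield bookmark, topic, payload


log = TransactionLogModel(max_journal_bytes=20)
marks = [log.record("orders", '{"id": %d}' % i) for i in range(1, 6)]

# A subscriber that last processed the second message resumes after it.
resumed = [bookmark for bookmark, _, _ in log.replay(after_bookmark=marks[1])]
print(len(log.journals), resumed)
```

Because entries are never rewritten, a bookmark keeps pointing at the same message for as long as the journal that holds it exists.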
Since journal files form part of the persistent state of the server, those files should not be modified or removed while the AMPS process is running except by the AMPS process itself. The AMPS server provides a set of maintenance actions for managing journal files (see the AMPS User Guide for details).
To create a Transaction Log, you add the TransactionLog configuration element to your AMPS configuration file. You then specify a location for AMPS to create journal files, and specify the topics that you want recorded in the file.
The following configuration is the minimum configuration to create a transaction log and record a single topic:
The configuration above writes journal files to the journals directory underneath the AMPS server's current working directory. The configuration records a single topic, some-topic, of message type json to the transaction log. The Name option of the Topic configuration element can be either a literal topic name, or a regular expression that matches the names of a set of topics to be recorded.
Although this configuration works perfectly well, AMPS provides a number of additional options that are useful for managing transaction logs in production. AMPS also provides a set of administrative actions for setting the archival and retention policy for journal files.
A more complete configuration might include options along the following lines:
In this configuration, journals are created in the journals directory underneath the AMPS server's current working directory, as before. This configuration records two sets of topics and one individual topic. Taking these in the order in which they appear in the configuration file, this instance of AMPS will record:
Any topic that begins with /orders and is of message type JSON.
Any topic that begins with /status/customer and is of message type FIX.
The topic /audit/events of message type binary.
The sample above also includes a basic journal maintenance configuration. Configuring journal maintenance is strongly recommended for any instance of AMPS that will be running on a regular basis.
For this AMPS installation, the size of the journal files has been reduced from the default 1GB size to a 100MB size. This typically indicates that the instance stores less than 1GB of messages during a day, so the default journal size would include multiple days' worth of messages.
This configuration specifies a two-step maintenance process:
After 3 days, journal files will be archived to the /mnt/high-capacity/journals directory -- the directory specified in the JournalArchiveDirectory parameter of the transaction log. These journal files remain a part of the transaction log but are moved to a different location (typically on a different device with higher capacity).
After 7 days, journal files will be deleted.
AMPS will run this maintenance plan every day at 21:30 (9:30 PM) local time.
When journal files are moved to the archive directory, they continue to be part of the transaction log, but they do not have to be on the same device as the JournalDirectory. Most often, a production installation of AMPS will keep journal files that are very active on fast storage and keep a longer period of history on storage that is higher capacity and lower cost. Since these devices typically also have lower throughput, these devices are best for files that must still be retained but are infrequently used.
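As a rough illustration of the two-step policy described above (archive after 3 days, delete after 7 days), the following Python sketch classifies a journal file by age. It is a conceptual model only: the function name, the inclusive boundaries, and the choice of timestamp are assumptions, and AMPS performs this work through its own configured maintenance actions.

```python
from datetime import datetime, timedelta

ARCHIVE_AFTER = timedelta(days=3)  # from the sample policy described above
DELETE_AFTER = timedelta(days=7)


def maintenance_action(last_written, now):
    """Return the step a journal of this age would fall under (hypothetical)."""
    age = now - last_written
    if age >= DELETE_AFTER:
        return "delete"
    if age >= ARCHIVE_AFTER:
        return "archive"
    return "keep"


# Evaluate the policy at the scheduled run time of 21:30 on an arbitrary date.
now = datetime(2024, 1, 10, 21, 30)
for days in (1, 4, 8):
    print(days, maintenance_action(now - timedelta(days=days), now))
```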
Full details on these options are available in the AMPS User Guide and the AMPS Configuration Guide.
See the chapter on Record and Replay Messages in the AMPS User Guide for a more complete discussion of the transaction log and message replay.
The AMPS client libraries include samples for publishing messages and replaying messages from the transaction log. See the client library distribution for those samples.
When a topic is recorded in the SOW, an application can request the current state of the topic and simultaneously subscribe to updates from the topic. In this case, AMPS first delivers all of the messages that match the query and then provides any update to a record that matches the query. AMPS guarantees that no updates are missed or duplicated between the query and the subscription. As with a simple query, AMPS will test each message currently in the SOW against the content filter specified and all messages matching the filter will be returned to the client. When the query begins, AMPS enters a subscription with the provided filter. After the query completes, AMPS delivers messages from the subscription. In the event that a record is updated while the query is running, AMPS saves the update and delivers it immediately after the query completes.
As with a simple SOW query, the topic can be a literal topic name or a regular expression pattern. For more information on issuing queries, please see Querying the State of the World in the AMPS User Guide.
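The ordering guarantee described above (query results first, then updates, with nothing missed or duplicated in between) can be modeled in a few lines of Python. This is a single-threaded conceptual sketch, not AMPS code. The class and method names are hypothetical, and a real server must also handle concurrency, which this model ignores.

```python
class SowSubscribeModel:
    """Toy topic that supports an atomic query-then-subscribe operation."""

    def __init__(self):
        self.records = {}      # key -> current message
        self.subscribers = []  # callables invoked for each publish

    def publish(self, key, message):
        self.records[key] = message
        for notify in self.subscribers:
            notify(message)

    def sow_and_subscribe(self, matches, deliver):
        # Take the snapshot and register the subscription in a single step,
        # so no publish can fall between the query and the subscription.
        snapshot = [m for m in self.records.values() if matches(m)]

        def notify(message):
            if matches(message):
                deliver(message)

        self.subscribers.append(notify)
        for message in snapshot:
            deliver(message)


def is_pending(message):
    return message["status"] == "pending"


topic = SowSubscribeModel()
topic.publish(1, {"id": 1, "status": "pending"})
topic.publish(2, {"id": 2, "status": "shipped"})

received = []
topic.sow_and_subscribe(is_pending, received.append)

topic.publish(3, {"id": 3, "status": "pending"})    # matches: delivered
topic.publish(2, {"id": 2, "status": "delivered"})  # does not match: skipped
print(received)
```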
Here's how to use spark to query the current state of an AMPS SOW topic and subscribe to updates.
This example assumes that:
You have configured a topic named test-sow in the AMPS server of message type JSON.
The test-sow topic uses the /id field of the message to calculate the key for the topic.
To retrieve the current state of the topic and subscribe, an application issues the sow_and_subscribe command. Since the command includes a subscription, the command stays active until it is explicitly stopped (or the application disconnects).
First, publish a message or two to the test-sow topic:
Open a new terminal in your Linux environment.
Use the following command (with AMPS_DIR set to the directory where you installed AMPS) to send a single message to AMPS:
spark automatically connects to AMPS and sends a logon command with the default credentials (the current username and an empty password). With the publish command, spark reads the message from the standard input and publishes the message to the JSON topic test-sow. The command produces output similar to the following line (the rate calculation will likely be different):
When the publisher sends the message, AMPS parses the message to determine the value of the Key fields in the message, and then either inserts the message for that key, or overwrites the existing message with that key.
You can publish any number of messages this way. Each distinct id value will create a distinct record in the topic.
Next, retrieve the current contents of the topic:
Open a new terminal in your Linux environment.
Use the following command (with AMPS_DIR set to the directory where you installed AMPS) to retrieve the contents of the topic:
spark automatically connects to AMPS and sends a logon command with the default credentials (the current username and an empty password). spark then sends the sow_and_subscribe command to AMPS. This command requests the current contents of the test-sow topic and creates a subscription to the topic.
spark shows the current contents of the topic. Notice that the output is strictly the message data, separated by newline characters. spark does not show any of the metadata for a message.
spark remains running after the query completes, waiting for new publishes to arrive.
Publish more messages (or updates to the existing messages) to the topic. In the terminal you opened to publish the first messages:
Use the following command (with AMPS_DIR set to the directory where you installed AMPS) to send a message to AMPS:
Notice that the subscription receives the message.
If you close the subscriber and re-run it, you will see that the second time the subscriber runs, it receives the updated messages in the query and, again, waits for changes to arrive.
The simplest way to use AMPS is as a low-latency publish and subscribe messaging system. Publish and subscribe messaging is at the heart of AMPS, and all of the other features of AMPS build on this foundation.
In a publish and subscribe messaging system, publishers send messages without necessarily knowing which subscribers will receive the message. This decouples publishers from subscribers for maximum flexibility.
While publishers do not need specific knowledge about the subscribers, publishers are responsible for adding information to the message so that subscribers know which messages are of interest. In publish and subscribe messaging systems, including AMPS, publishers send messages to a specific topic. The topic most often indicates the type of message, and is a way for the subscriber to locate the messages of interest. For example, in an order processing system, a publisher might publish messages to an ORDERS topic. Subscribers that need to receive orders then subscribe to the ORDERS topic, and receive messages that are sent to that topic.
Each message in AMPS is published to a specific topic. The publisher chooses the topic when the message is published, and subscribers can receive messages from that topic.
Unlike many messaging systems, AMPS provides an additional layer of selectivity for subscribers. Rather than receiving every message from a given topic, an AMPS subscriber can use content filtering to receive only the messages that the subscriber needs to process. Content filtering provides several advantages. First, being more selective about the messages delivered to the subscriber makes better use of bandwidth between AMPS and the subscriber, since the subscriber only receives messages that are of interest. Subscriber code is easier to write and more efficient because the subscriber is guaranteed that the messages received have the values requested. Further, because the subscriber chooses which messages to receive, content filtering makes publishers and subscribers less tightly-coupled. A publisher does not need to know what fields are important to a subscriber, or whether a field that was previously unused is now important.
The diagram below shows the basic concept of publish and subscribe messaging:
In the diagram above, there is a Publisher sending AMPS a message to the ORDERS topic. The message being sent contains information on Ticker IBM with a Price of 125. Both of these fields are contained within the message payload itself (i.e., the message content). AMPS routes the message to Subscriber 1 because it is subscribing to all messages on the ORDERS topic. Similarly, AMPS routes the message to Subscriber 2 because it is subscribed to any messages having the Ticker equal to IBM. Subscriber 3 is looking for a different Ticker value and is not sent the message.
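The routing decision in the diagram can be expressed as a short Python sketch. This is a conceptual model of topic plus content-filter matching, not AMPS code. The `route` function and the use of Python callables as filters are assumptions, and the ticker that Subscriber 3 is waiting for (shown here as MSFT) is a hypothetical value, since the text only says it is a different ticker.

```python
# Each subscription is a topic name plus a predicate over the message content.
subscriptions = {
    "Subscriber 1": ("ORDERS", lambda m: True),                       # no filter
    "Subscriber 2": ("ORDERS", lambda m: m.get("Ticker") == "IBM"),
    "Subscriber 3": ("ORDERS", lambda m: m.get("Ticker") == "MSFT"),  # hypothetical
}


def route(topic, message):
    """Return the subscribers that should receive this message."""
    return [
        name
        for name, (sub_topic, matches) in subscriptions.items()
        if sub_topic == topic and matches(message)
    ]


print(route("ORDERS", {"Ticker": "IBM", "Price": 125}))
```

The publisher supplies only the topic and the message. Which subscribers receive it is decided entirely by the subscriptions, which is what keeps the two sides decoupled.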
Unlike many messaging systems, AMPS does not require any configuration for simple publish and subscribe. For this basic functionality, there is no need to declare topics in advance. Since there is no need to declare these topics, topics that provide basic publish and subscribe are often referred to as ad hoc topics in AMPS.
It is valid for a publisher to publish to any topic, whether or not that topic has been previously configured. Likewise, it is valid for a subscriber to subscribe to any topic, whether or not a message has been previously published to that topic or whether the topic appears in a configuration file. Features that rely on persisted messages, however, are not available without configuration for a topic.
Every topic in AMPS has a specific message type. Publishers and subscribers don't need to explicitly set the message type when publishing to or subscribing to a topic. Each connection to AMPS specifies the message type to be used for that connection -- either implicitly, by connecting to a port that provides a specific message type, or explicitly (when connecting to a port that can provide multiple message types).
AMPS allows topics that use different message types to have the same name, but considers them to be different topics. Messages published to an XML topic named quotes will not be delivered to subscriptions to a JSON topic named quotes.
AMPS allows subscribers to provide a regular expression that defines a set of topics rather than a literal topic name. This further decouples publishers and subscribers.
It's important to remember that each message is published to a specific topic. Regular expressions are only applicable to subscriptions: a publisher should not use regular expressions in the topic when publishing messages.
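The combination of message type and regular-expression topic matching can be sketched in Python as follows. This is an illustration under stated assumptions: Python's `re.search` stands in for the server's regular expression engine, the function name is hypothetical, and the exact anchoring and matching rules AMPS applies may differ.

```python
import re


def subscription_matches(sub_type, sub_topic_pattern, msg_type, msg_topic):
    """Decide whether a published message belongs to a subscription.

    A message is always published to one literal topic of one message type;
    only the subscription side may use a pattern.
    """
    if sub_type != msg_type:
        return False  # same topic name, different message type: different topic
    return re.search(sub_topic_pattern, msg_topic) is not None


# A JSON subscription to every topic whose name begins with "quotes".
print(subscription_matches("json", "^quotes", "json", "quotes-nyse"))
print(subscription_matches("json", "^quotes", "xml", "quotes-nyse"))
print(subscription_matches("json", "^quotes", "json", "orders"))
```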
Here's how to use spark to send and receive a message from AMPS. The example assumes that you're using the sample configuration file produced by the AMPS server, and that you are running spark on the same system that AMPS is running on.
First, start a subscriber:
Open a terminal in your Linux environment.
Use the following command (with AMPS_DIR set to the directory where you installed AMPS) to start a subscription:
This command starts a subscription to the JSON topic test.
spark will connect to AMPS, log on using default credentials (the current username and an empty password) and enter the subscription. Unless there are errors, the command will produce no output until a message arrives.
Leave this terminal running. When you publish a message to the test topic, spark will print the message in this terminal.
Next, publish a message to the same topic:
Open a new terminal in your Linux environment.
Use the following command (with AMPS_DIR set to the directory where you installed AMPS) to send a single message to AMPS:
As with the subscriber sample, spark automatically connects to AMPS and sends a logon command with the default credentials (the current username and an empty password). With the publish command, spark reads the message from standard input and publishes the message to the JSON topic test. The command produces output similar to the following line (the rate calculation will likely be different):
When the publisher sends the message, the subscriber should receive the message and produce the following output:
Congratulations! You've just sent your first message through AMPS.
Although this example is simple and relies on default behavior, the sample demonstrates some important AMPS concepts:
As mentioned earlier, there is no need to preconfigure simple publish/subscribe topics. Since the default configuration file doesn't specify any settings for the JSON topic test, AMPS knows to treat the topic as a simple pub/sub topic. Also, since the spark commands specify the JSON message type when connecting to the server, the topic is a JSON topic.
For simple publish and subscribe topics, AMPS delivers the message verbatim to the subscriber. AMPS doesn't interpret or normalize the message. In fact, AMPS doesn't even parse the message unless there's a need to. With this configuration and this subscription, there's no need for AMPS to parse the message, so no parsing happens.
The spark program connects to AMPS and logs on to AMPS before sending any commands. All AMPS installations include authentication and entitlement. By default, AMPS loads an authentication and entitlement policy (implemented as an AMPS module) that requires a logon, but accepts any username and password as credentials. This policy is intended for evaluating, testing and development purposes. More information on securing an AMPS instance is available in the User Guide. For this example, the important point is to be aware that an AMPS instance always has a security policy and that policy was at work even in this simple example. The default behavior for spark works with the default policy for AMPS.
In the Basic Sub-Pub example, the subscriber requested all messages on the JSON test topic. AMPS includes expressive, flexible and extensible content filtering that allows subscribers to specify exactly the messages that they want to receive. Content filtering is one of the most useful features of AMPS. When subscribers use content filtering, publishers can be completely independent of subscribers. The publisher does not need to know which parts of a message are important to subscribers. Subscribers can precisely declare the content that they are interested in, so they only receive relevant messages. Publishers do not need to be updated when subscribers add additional criteria or when new subscribers come online.
You can think of content filtering as adding a WHERE clause to the subscription. Like a WHERE clause, AMPS returns only matching messages.
AMPS content filters use a combination of XPath identifiers to locate a value within a message and SQL-92 operators for comparing those values. For example, given a JSON message like:
The following content filters would match the message:
This filter uses the equality operator, =, to compare the /note field in the message with an exact match for the string.
The LIKE operator uses Perl Compatible Regular Expressions to match a field. In this case, the regular expression matches any string that contains world, using case-insensitive matching.
The AMPS BEGINS WITH operator matches any string that begins with the exact sequence of characters provided.
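The three kinds of comparison described above can be emulated in Python to show what each one tests. This sketch does not use AMPS filter syntax and does not call AMPS. The sample message and the compared strings are hypothetical, and Python's `re` module is used as a stand-in for PCRE.

```python
import json
import re

# Hypothetical message with a /note field, for illustration only.
message = json.loads('{"id": 1, "note": "Hello, World!"}')
note = message["note"]

checks = {
    # Equality: the whole field must match the string exactly.
    "equality": note == "Hello, World!",
    # LIKE: a regular expression; (?i) requests case-insensitive matching,
    # and the pattern may match anywhere within the field.
    "LIKE": re.search(r"(?i)world", note) is not None,
    # BEGINS WITH: an exact, case-sensitive prefix comparison.
    "BEGINS WITH": note.startswith("Hello"),
}

for operator, matched in checks.items():
    print(operator, matched)
```

A subscriber receives the message only when its filter evaluates to true, so in this model a prefix comparison against "hello" (lowercase) would not match.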
Here's how to use spark to subscribe using a content filter. The example assumes that you're using the sample configuration file produced by the AMPS server and that you are running spark on the same system that AMPS is running on.
First, start a subscriber:
Open a terminal in your Linux environment.
Use the following command (with AMPS_DIR set to the directory where you installed AMPS) to start a subscription:
This command starts a subscription to the JSON topic test. This subscription will only return messages where the filter matches.
spark will connect to AMPS, log on using default credentials (the current username and an empty password) and enter the subscription. Unless there are errors, the command will produce no output until a message arrives.
Leave this terminal running. When you publish a message to the test topic that matches the filter, spark will print the message in this terminal.
Next, publish messages to the subscriber:
Open a new terminal in your Linux environment.
Use the following command (with AMPS_DIR set to the directory where you installed AMPS) to publish a message to AMPS. This message matches the filter:
Use the following command (with AMPS_DIR set to the directory where you installed AMPS) to publish a message to AMPS. This message does not match the filter:
Each time you run spark, it automatically connects to AMPS and sends a logon command with the default credentials (the current username and an empty password). With each publish command, spark reads the message from standard input and publishes the message to the JSON topic test. Each of the commands above produces output similar to the following line (the rate calculation will likely be different):
When the publisher sends a message that matches the filter, the subscriber should receive the message and produce the following output:
AMPS provides high-performance publish and subscribe messaging that requires minimal configuration and offers flexible publishing and message routing.
The AMPS client libraries include samples of basic publish and subscribe functionality. See the client library distribution for those samples.
The AMPS server and the AMPS client libraries provide various options for recovering and resuming subscriptions. Use this cross-reference to choose the recovery strategy that best matches the needs of your application.
At 60East, the most important part of what we do is helping people deploy systems that utilize AMPS and supporting them in maintaining the ongoing and intended operation of those systems. Considering that AMPS is often used in essential systems that push the limits of hardware, network and storage capacity, we know that support is essential to help you build, deploy and maintain the kinds of cutting-edge applications that we built AMPS to handle.
During the evaluation and development stages, we encourage you to share details with us about what you are building. This way, we can provide assistance with the design and architecture process. Once your application goes into production, use 60East support to help diagnose and correct issues that fall outside of the normal operation of the application.
The level of support you have available is dependent on your support agreement. For an outline of your specific support policies, please see your 60East Technologies License Agreement. Support contracts can be purchased through your 60East Technologies account representative.
You can save time if you complete the following steps before you contact 60East Technologies Support:
Check the documentation
Isolate the problem
If you require Support Services, please isolate the problem to the smallest test case possible. Capture erroneous output into a text file along with the commands used to generate the errors.
Collect your information
Your product version number.
Your operating system and its kernel version number.
The expected behavior, observed behavior and all input used to reproduce the problem.
Submit your request.
The AMPS version number used when reporting your product version number follows a format listed below. The version number is composed of the following:
Support is offered through the United States:
Other support options (such as support via phone, dedicated engineers, and so on) may be available, depending on the terms of your support agreement.
The sections on and have full details on the expression language used in AMPS.
See in the AMPS User Guide for a more complete discussion of subscribe and publish.
The scenarios above describe just a few of the most common recovery scenarios for a subscription. For recovery scenarios that aren't described above, contact 60East at for advice and guidance.
The problem may already be solved and documented in the AMPS User Guide or Configuration Guide for the product. Check the support site at where 60East Technologies also provides answers to frequently asked support questions.
In your email to include the minidump file if you have one.
Please contact 60East Technologies Support Services according to the terms of your 60East Technologies License Agreement. Visit the support site at for evaluations.
Automatically recover subscription without replaying missed messages: HAClient / Subscribe
Recover subscription and replay any messages missed while application is offline: HAClient / Transaction Log / Bookmark Subscription / Bookmark Store
Recover subscription, get current state of a set of messages upon recovery and receive updates to that state: HAClient / State of the World / SOW and Subscribe command
Web
E-Mail (non-technical)
While there is much more content beyond the scope of this document, here are some of the topics to learn about after reading this guide.
AMPS provides a rich logging framework that supports logging to many different targets including the console, syslog, and files. Every error and event message within AMPS is uniquely identified and can be filtered out or explicitly included in the logger output. The Logging section in the AMPS User Guide describes the AMPS logger configuration and the unique settings for each logging target.
Another challenge that faces developers working with high-volume data flows is the fact that not every consumer can keep up with the rate at which data arrives.
For example, an application may display a view of data that is updated hundreds or thousands of times a second. The update rate for some data can be faster than the UI framework can redraw the grid that holds the data. Without a strategy for managing these updates, the application can be unresponsive, show outdated data, consume a large amount of memory -- or have all of those problems at the same time.
To help in this situation, AMPS provides built-in support for limiting the volume of updates. This feature is called conflation.
With AMPS conflation, an application receives updates for a particular message at most once within a specified interval. When an update for a record is sent to the application, the update contains the most current state of the record at that time. The application always receives the most current data. However, no matter how many times the record is updated during the interval, the application only receives the most current update at the end of the interval.
AMPS provides two forms of conflation:
Conflated topics are declared in the server configuration. The AMPS server keeps only a single copy of the current message state for all subscribers.
Conflated subscriptions are requested by an individual subscriber. AMPS keeps a copy of the current message state for each subscription, and that copy of the message state is not shared between subscriptions.
As an example of the value of conflation, imagine a SOW topic called PRICING that contains the current price for a set of instruments, and imagine that updates to the pricing are being published to the topic in real time. Several applications subscribe to this topic to display the latest prices for a subset of the instruments in a GUI front-end.
If this GUI front-end only needs updates in two second intervals from the PRICING topic, then more frequent updates would be wasteful of network and client-side processing resources. Likewise, if the GUI front end attempted to process and display every update to the prices, the incoming volume of updates might well outpace the ability of the grid to update. Using conflation in this case can both reduce network traffic and ease the load on the application.
In this case, every instance of the application is likely to have the same performance characteristics and benefit from the same interval for conflation. Therefore, configuring a conflated topic for the server would be a good approach. If there were only a single instance of this application or the application ran intermittently (for example, a monitoring or diagnostic tool), using a conflated subscription might be more appropriate.
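The effect of conflation on the PRICING example can be illustrated with a small simulation. This is a toy model in plain Python, not the AMPS server implementation or the AMPS client API: the `conflate` function, the tuple layout, and the sample prices are all assumptions made for illustration. It shows that, within each two-second window, a subscriber sees one update per instrument carrying the latest state.

```python
from typing import Dict, Iterable, List, Tuple

# (timestamp in seconds, instrument key, price) -- hypothetical layout
Update = Tuple[float, str, float]


def conflate(updates: Iterable[Update], interval: float) -> List[Update]:
    """Deliver at most one update per key per interval, carrying the latest state.

    Toy model only: this is not how the AMPS server is implemented.
    """
    pending: Dict[str, float] = {}   # key -> latest price seen in the current window
    delivered: List[Update] = []
    window_end = None

    for ts, key, price in sorted(updates):
        if window_end is None:
            window_end = ts + interval
        # Flush every window that closed before this update arrived.
        while ts >= window_end:
            for k in sorted(pending):
                delivered.append((window_end, k, pending[k]))
            pending.clear()
            window_end += interval
        pending[key] = price  # newer state overwrites older state

    # Flush whatever is left at the end of the final window.
    if window_end is not None:
        for k in sorted(pending):
            delivered.append((window_end, k, pending[k]))
    return delivered


updates = [
    (0.0, "IBM", 100.0),
    (0.5, "IBM", 100.5),
    (1.0, "MSFT", 50.0),
    (1.9, "IBM", 101.0),
    (2.1, "IBM", 102.0),
]
result = conflate(updates, interval=2.0)
```

In this sketch, five published updates become three deliveries: the three IBM updates in the first window collapse into a single update with the price 101.0.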
The User Guide provides more information on conflation, conflated topics, and conflated subscriptions.
AMPS contains a high-performance aggregation engine, which can be used to project one topic onto another, similar to the CREATE VIEW functionality found in most RDBMS software. Views can JOIN multiple topics together, including topics with different message types.
In addition to views configured by an administrator, individual subscriptions can create ad hoc aggregates and views on demand.
For some use cases, in particular interactive applications that display large sets of records, it's useful to be able to display a subset of all of the records of interest. This saves network bandwidth by only delivering records that the application intends to display to a user, and saves CPU time in the application by removing the requirement for the application to process and discard records that aren't in the current result set.
For example, a web application may potentially show thousands of orders, but may only need to render a page of 20 records at any given time. With a paginated subscription, the application can request exactly the records it needs to render, and can be notified when those records change, are deleted, or if another record is inserted within the page.
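The idea of a page over a larger result set can be sketched in a few lines. The code below is a toy model, not the AMPS paginated subscription API: `page_of`, `page_changes`, the `price` field, and the order keys are hypothetical names chosen for illustration. It shows which records an application would need to hear about when a new record lands inside the page it is rendering.

```python
from typing import Dict, List, Tuple


def page_of(records: Dict[str, dict], sort_field: str, offset: int, size: int) -> List[str]:
    """Return the keys of the records that fall inside one page of a sorted view."""
    ordered = sorted(records, key=lambda k: (records[k][sort_field], k))
    return ordered[offset:offset + size]


def page_changes(before: List[str], after: List[str]) -> Tuple[List[str], List[str]]:
    """Return (keys that entered the page, keys that left the page)."""
    entered = [k for k in after if k not in before]
    left = [k for k in before if k not in after]
    return entered, left


# One hundred orders, of which the application renders only the first 20.
orders = {f"order-{i:03d}": {"price": 10.0 * i} for i in range(1, 101)}
first_page = page_of(orders, "price", offset=0, size=20)

# A new order sorts into the middle of the page and pushes the last row out.
orders["order-new"] = {"price": 55.0}
updated_page = page_of(orders, "price", offset=0, size=20)
entered, left = page_changes(first_page, updated_page)
```

Only two of the 101 records matter to this view after the insert: the new order that entered the page and the order that dropped off the end.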
AMPS allows you to configure a SOW topic to retain the historical state of the SOW, at a configurable granularity. You can then query for the state of the SOW at a point in time, and retrieve results from the saved state.
AMPS provides several utilities that are not essential to message processing, but can be helpful in troubleshooting or tuning an AMPS instance. The User Guide and Utility Reference describe these utilities in detail. The utilities include:
spark - a command-line client, which is a useful tool for diagnostics, such as checking the contents of a SOW topic. The spark client can also be used for simple scripting to run queries, place subscriptions and publish data.
ampserr - used to expand and examine error messages that may be observed in the logs. This utility allows a user to input a specific error code, or a class of error codes, examine the error message in more detail, and where applicable, view known solutions to similar issues.
amps-grep - used to search the AMPS errors and events log or AMPS journal files to quickly locate items of interest. The AMPS User Guide includes information on the utility, including command-line templates for common searches in the Find Information in Error Log or Transaction Log topic.
amps_sow_dump - used to inspect the contents of a SOW topic store.
amps_journal_dump - used to examine the contents of an AMPS journal file during debugging and program tuning.
More information about each of these utilities, including usage and examples, can be found in the Utilities chapter of the AMPS User Guide.
AMPS provides a monitoring interface which contains information about the state of the host system (CPU, memory, disk and network) as well as statistics about the state of the AMPS instance it is monitoring (clients, SOW state, Journal state and more). AMPS provides this information through a RESTful interface for ease of integration into existing enterprise monitoring systems.
AMPS can also record statistics in a persistent SQLite database, which can be queried using the standard SQLite toolset.
More information about the monitoring system provided in AMPS can be found in the Monitoring AMPS chapter of the AMPS User Guide. The AMPS Monitoring Guide contains information about the statistics available and how the monitoring statistics are recorded in the statistics database.
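Because the statistics database is a standard SQLite file, it can be explored with any SQLite client. The sketch below uses Python's built-in `sqlite3` module to list the tables and columns in a SQLite file without assuming anything about the schema. The table names that AMPS uses are not shown here (see the AMPS Monitoring Guide for those); the `example_samples` table and the `demo_stats.db` file are stand-ins created only so the example runs on its own.

```python
import os
import sqlite3
import tempfile
from typing import Dict, List


def describe_database(path: str) -> Dict[str, List[str]]:
    """Map each table name in a SQLite file to its column names."""
    connection = sqlite3.connect(path)
    try:
        tables = [row[0] for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        schema = {}
        for table in tables:
            # PRAGMA table_info returns one row per column; index 1 is the column name.
            quoted = '"' + table.replace('"', '""') + '"'
            columns = connection.execute(f"PRAGMA table_info({quoted})").fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        connection.close()


# Stand-in database so the sketch is self-contained; replace demo_path with the
# path to a real statistics database file to inspect it instead.
with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "demo_stats.db")
    demo = sqlite3.connect(demo_path)
    demo.execute("CREATE TABLE example_samples (sampled_at TEXT, value REAL)")
    demo.commit()
    demo.close()
    demo_schema = describe_database(demo_path)
```

Note that `sqlite3.connect` creates an empty file if the path does not exist, so check the path before pointing this at a production host.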
The Replicating Messages Between Instances chapter and the Highly Available AMPS Installations chapter in the AMPS User Guide explain the powerful high availability features that AMPS provides. These chapters describe how to use the AMPS transaction log and AMPS replication to provide failover strategies and high availability guarantees.
To provide high availability and failover, AMPS provides replication of topics between instances. A set of features in the AMPS clients work with the AMPS server to provide reliable publishing and resumable subscriptions.
The transaction log, described earlier, is the foundation of AMPS replication. AMPS replication is designed to ensure that the messages in the transaction log of one AMPS instance are also in the transaction log of another AMPS instance.
The AMPS client libraries provide optional reliable publication functionality, using a local store to retain messages until the AMPS server notifies the publisher that the message has met the persistence guarantees that the server is configured for. Typically, the persistence guarantee is configured to be the point at which the message has been confirmed to have been written to both instances in a high availability pair, but stronger guarantees (such as also having been written to an offsite disaster recovery instance, or having been written to an instance in another region) can also be configured.
The AMPS approach to high availability is based around the principle that each topic is a stream of messages. The basic concepts behind AMPS replication include:
AMPS replication is always treated as a message stream from a source instance, which pushes messages, to a destination instance, which receives messages. In many cases, an application will use two-way replication so that instances contain the same messages. For two-way replication, replication is configured in both directions.
The intent of AMPS replication is to ensure that every replicated message that the source instance is responsible for replicating has been recorded in the transaction log of the destination instance.
AMPS replicates sequences of commands (that is, each individual publish or delete) rather than the cumulative state of a set of publishes.
For a given data source (that is, an individual publisher), AMPS guarantees that it will preserve the order in which that data source provided messages. The order must be consistent both within an instance and between instances.
For full details on AMPS replication, including recommendations and best practice advice, see the Replicating Messages Between Instances chapter and the Highly Available AMPS Installations chapter in the AMPS User Guide.
AMPS includes high performance queuing built on the AMPS messaging engine and the transaction log. AMPS queues combine elements of classic message queuing with the advanced messaging features of AMPS, including content filtering, aggregation and projection, and so on.
AMPS queues help you easily solve some common messaging problems:
Ensuring that a message is only processed once.
Distributing tasks across workers in a fair manner.
Ensuring that a message that has been delivered is processed.
Ensuring that when a worker fails to process a message, that message is re-delivered.
These uses of messaging require different behavior than the scenarios discussed in the section on Subscribing and Publishing to Topics. For basic subscribe and publish, each message is delivered to any number of subscribers. With queues, each message is fully processed by only one subscriber.
While it's possible to create applications with these properties by using the other features of AMPS, message queues provide these functions built into the AMPS server for additional performance, simple administration, and ease of development.
AMPS queues also allow you to:
Replicate messages between AMPS instances while preserving delivery guarantees.
Create views and aggregates based on the current contents of a queue.
Filter messages with specific content into specific queues.
Provide a subscriber only messages that contain specific content.
Provide a single published message to multiple queues.
Aggregate multiple topics into a single queue.
Provide content aware entitlement for security.
Provide prioritization of messages within a queue, so higher-priority messages are processed first.
Provide a synchronization point that guarantees that all messages prior to that point have been processed before messages after that point are delivered.
When an application needs to receive messages, there is little difference between subscribing to a queue and subscribing to a pub/sub topic. Both delivery models use the subscribe command, and both delivery models can provide a filter to specify messages of interest. Both types of topics provide the same message objects in the AMPS Client interfaces.
Once a message is received from a queue, however, the application must let AMPS know when the message is successfully processed. This acknowledgment lets AMPS know that the application is finished with the message, and has capacity to receive another message. In addition, if the queue is configured to retry messages if an application fails to process the message (at-least-once delivery), acknowledging the message indicates to AMPS that the message has been processed successfully and can be removed from the queue.
Within AMPS, the server maintains an in-memory list of all of the messages currently available for delivery in a given queue and a list of all of the messages awaiting acknowledgment from subscribers. The messages themselves are stored in the transaction log for the instance. When a message has been successfully processed, the acknowledgment for that message is also stored in the transaction log.
There is no separate storage required for a queue, since messages are recorded in the transaction log. Likewise, even when a message is removed from the queue, AMPS maintains a persisted record of that message and the acknowledgment in the transaction log. Given that the transaction log contains a full record of messages and acknowledgments, AMPS queues are persistent across server restarts, and can be replicated to other instances. (For details on replicating queues, see the AMPS User Guide.)
Keeping the delivery state -- that is, the queue itself -- independent of the topic in the transaction log has several other advantages. Since the set of messages in the queue is maintained separately from the physical storage for those messages, a queue in AMPS can hold messages from any number of underlying topics. Content filtering can be applied to the queue to selectively add messages to the queue: in fact, the same topic can easily be split into independent queues using content filtering. Messages to a single topic can also be included in multiple, independent queues (for example, one queue for immediate processing, and another queue for end-of-day auditing and reconciliation).
AMPS includes the ability for a given consumer to declare the capacity of that consumer, using the max_backlog option on a queue subscription. This option declares the number of messages that the consumer is willing to have delivered at a given time. Using this option can improve throughput, since AMPS can ensure that a consumer is never idle waiting for a new message. This also helps AMPS to balance message delivery across consumers in the most efficient manner, as measured by the current available capacity of each consumer. For example, a consumer on a small VM might take 200 milliseconds, on average, to process a message, and might declare a max_backlog of 2. A consumer running on a larger physical server, in contrast, might take 50 ms on average to process a message, and might therefore declare a max_backlog of 8 or more. The maximum allowed backlog for a subscriber is configured for each queue, so that queues that hold large units of work can set a smaller maximum value than queues that provide smaller tasks.
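To make the effect of declared capacity concrete, here is a toy dispatcher written in plain Python. It is not the AMPS delivery algorithm or the AMPS client API: the `Consumer` class and `dispatch` function are hypothetical, and the policy (always pick the consumer with the most free capacity) is a simplifying assumption based on the description above. The two consumers use the backlog values from the example: 2 for the small VM and 8 for the larger server.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Consumer:
    name: str
    max_backlog: int            # messages the consumer is willing to hold at once
    in_flight: List[str] = field(default_factory=list)

    @property
    def free_capacity(self) -> int:
        return self.max_backlog - len(self.in_flight)


def dispatch(messages: List[str], consumers: List[Consumer]) -> List[str]:
    """Hand each message to the consumer with the most free capacity.

    Returns the messages that could not be delivered because every consumer was full.
    """
    waiting = []
    for message in messages:
        target = max(consumers, key=lambda c: c.free_capacity)
        if target.free_capacity <= 0:
            waiting.append(message)   # stays queued until a consumer acknowledges work
        else:
            target.in_flight.append(message)
    return waiting


small_vm = Consumer("small-vm", max_backlog=2)
big_server = Consumer("big-server", max_backlog=8)
waiting = dispatch([f"task-{i}" for i in range(12)], [small_vm, big_server])
```

With twelve tasks and a combined backlog of ten, the larger server ends up holding eight tasks, the small VM holds two, and the last two tasks wait in the queue until an acknowledgment frees capacity.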
Queues are intended to guarantee delivery of each message to a single consumer that processes the messages. Use queues when the problem you are solving requires a message to be processed once. When you need to distribute messages to a large number of consumers, use the AMPS pub/sub delivery model.
For example, a queue is a natural fit for a system that allocates work, such as a system that runs software builds or that executes financial transactions. A system that provides notifications to a large number of systems (for example, a system that distributes bids to sellers or a system that communicates status to a user interface) is a more natural fit for the pub/sub delivery model.
Queues are often used to solve problems like:
Guaranteeing that a given set of work is distributed fairly across a set of workers, while each unit of work is only performed once:
A system that performs CPU-intensive calculations needs to ensure that any time a request comes into the system, it is serviced by the next available worker.
A distributed compute grid has workers that vary widely in capacity. Each worker declares its capacity to AMPS. Workers with more capacity free receive work before workers with less free capacity, improving overall throughput for the compute grid.
Guaranteeing that a specific message is fully processed once, regardless of the number of subscribers:
A system that processes refunds enters the refund orders into a queue. Each message is delivered to one, and only one, worker. If the worker successfully processes the message, the worker removes the message from the queue. If the worker fails, AMPS automatically delivers the message to another worker, ensuring the message is processed.
The AMPS client libraries, starting in version 5.0, are queue-aware and contain features to make it easier to work with queues and create the application behavior that you need. See the Developer Guide for the client library of your choice for details on how to use these features.
For further details on message queues and how they function, the chapter on Message Queues in the AMPS User Guide presents a more complete discussion.
As described above, both the topic that holds the messages for the queue and the queue topic itself must be recorded in the AMPS transaction log.
In addition, the queue itself must be declared in the SOW element of the AMPS configuration.
For example, the configuration below records the topics Work and WorkToDo in the AMPS transaction log:
With these topics added to the transaction log, we can configure a WorkToDo queue that provides queuing for the messages in the Work topic.
This declares a queue named WorkToDo that provides queuing for the messages in the Work topic. By setting the semantics to at-least-once, the topic is configured to redeliver a queue message to another subscriber in the event that a subscriber fails to process that message successfully.
This example also provides an example of recommended configuration options to help manage message lifetime in cases where a message cannot be processed successfully, or where no consumers are available to process the message.
For this queue, we provide the following options:
Expiration specifies that a message will be available for, at most, 12 hours from the time it enters the queue.
MaxDeliveries specifies that a given message can be delivered from the queue at most 5 times: after that, the message is considered unprocessable and is removed from the queue.
LeasePeriod specifies that a consumer has 10 minutes from the time the message is sent to acknowledge the message or the message will be automatically returned to the queue.
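The interaction of these three options can be sketched with a small in-memory model. This is a toy written in plain Python, not the AMPS server or client API: the `ToyQueue` class and its methods are hypothetical, time is passed in explicitly as seconds, and the choice to apply expiration only to messages that are not currently leased is an assumption of the sketch rather than documented AMPS behavior. The constants mirror the values described above (12 hours, 5 deliveries, 10 minutes).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

EXPIRATION = 12 * 60 * 60   # seconds a message may stay in the queue
MAX_DELIVERIES = 5          # delivery attempts before the message is dropped
LEASE_PERIOD = 10 * 60      # seconds a consumer has to acknowledge


@dataclass
class Entry:
    body: str
    enqueued_at: float
    deliveries: int = 0
    lease_expires: Optional[float] = None   # None means "available for delivery"


class ToyQueue:
    def __init__(self) -> None:
        self.entries: Dict[int, Entry] = {}
        self.removed: List[str] = []     # bodies dropped without being acknowledged
        self._next_id = 0

    def publish(self, body: str, now: float) -> int:
        self._next_id += 1
        self.entries[self._next_id] = Entry(body, enqueued_at=now)
        return self._next_id

    def _sweep(self, now: float) -> None:
        for msg_id, entry in list(self.entries.items()):
            if entry.lease_expires is not None:
                if now < entry.lease_expires:
                    continue                  # still leased to a consumer
                entry.lease_expires = None    # lease lapsed: return to the queue
            expired = now - entry.enqueued_at >= EXPIRATION
            exhausted = entry.deliveries >= MAX_DELIVERIES
            if expired or exhausted:
                self.removed.append(entry.body)
                del self.entries[msg_id]

    def deliver(self, now: float) -> Optional[int]:
        """Lease the oldest available message and return its id, or None."""
        self._sweep(now)
        for msg_id, entry in self.entries.items():
            if entry.lease_expires is None:
                entry.deliveries += 1
                entry.lease_expires = now + LEASE_PERIOD
                return msg_id
        return None

    def acknowledge(self, msg_id: int) -> None:
        """Successful processing removes the message from the queue."""
        self.entries.pop(msg_id, None)


queue = ToyQueue()
queue.publish("refund-1", now=0)
attempt_1 = queue.deliver(now=0)      # leased until t=600
attempt_2 = queue.deliver(now=300)    # nothing available: the message is still leased
attempt_3 = queue.deliver(now=600)    # lease lapsed, so the message is redelivered
queue.acknowledge(attempt_3)
```

In this model, a message that is never acknowledged is redelivered each time its lease lapses, and is dropped once it has been delivered five times.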
AMPS also provides a mechanism for publishing expired messages to a dead-letter queue, as well as a wide variety of options for controlling delivery.
See the Message Queues chapter in the AMPS User Guide for a more complete discussion of message queues, including discussions of advanced features, replicated queues, and so on.
The AMPS client libraries include samples for working with message queues. See the client library distribution for those samples.
AMPS offers a wide array of messaging features to solve a variety of messaging scenarios. This section presents some basic mappings between common messaging scenarios and the AMPS features that support those scenarios. Of course, this list is just a sampling of the types of applications that use AMPS.
Simple, low-latency publish and subscribe (many to many messaging) with no need to persist messages.
Ad hoc Publish and Subscribe
Publish and subscribe with a replayable audit trail.
Transaction Log and Bookmark Subscription
Snapshot of the current state of a set of messages (for example, graphing the elapsed time for all pending orders).
State of the World (SOW)
Creating a view server that aggregates information about a high-velocity data feed for reporting.
State of the World (SOW) / Views and Aggregation / Transaction Log and Replication
Snapshot of the current state of a set of messages followed by updates to those messages (for example, showing the current status of a set of orders when a UI starts and then showing real-time updates to those messages).
State of the World (SOW) / SOW and Subscribe from client application
Ensuring that a given message is processed once, by a single subscriber (for example, a workload distribution system).
Message Queues (Queues use the Transaction Log)
Replaying messages from a point in time.
Transaction Log and Bookmark Subscription
Transforming messages as they are published to AMPS.
State of the World (SOW) and Enrichment
Producing aggregate data for a stream of messages.
State of the World (SOW) and Views or Aggregated Subscriptions
Coordinating work across a set of independent workers who are each assigned discrete tasks.
Message Queues
Dividing work among a set of workers who each update a portion of a record.
State of the World (SOW) and Delta Publish
Providing highly available messaging with multiple servers providing failover.
Transaction Log and Replication
The scenarios above describe just a few of the more common scenarios in which AMPS is used. For messaging scenarios that aren't described above, contact 60East at http://support.crankuptheamps.com/ for advice and guidance.
Now that you understand the basics of how AMPS works, you have two potential paths forward in your usage of the product:
The following sections provide more information about each of these paths and also briefly describe some use cases for AMPS.
When preparing to deploy AMPS, consult the Deployment Checklist whitepaper, available on the 60East website.
Each language-specific Development Guide explains how to install, configure, and develop applications that use AMPS. In order to develop applications using an AMPS client, you must understand the basic concepts of AMPS, such as topics, subscriptions, messages and SOW.
One of the most common questions during evaluation of AMPS is how best to measure and quantify the overall performance of the application that uses AMPS.
There are several factors that are included in any meaningful discussion of performance:
AMPS is designed to use the underlying hardware as efficiently as possible, and does not suffer from artificial bottlenecks that limit performance.
The implication, though, is that the performance available from an installation of AMPS depends on the capacity of the underlying hardware.
In particular, pay attention to:
Storage device speed and bandwidth (for applications that persist data)
Memory speed
Network speed and capacity
Often, you can come up with the theoretical maximum performance of a system based on the underlying hardware. For example, storage that can only write 80MB/s would be unsuitable for a system that needs to retain messages that arrive at a sustained rate of 100MB/s.
Likewise, a system with 64GB of memory would see reduced performance for lookups on a 128GB data set, so benchmarking an application that retains 128GB of active data on a system with 64GB of memory will produce very different results than the same benchmark run on a system with 256GB of memory.
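The arithmetic behind these two examples is simple enough to write down. The helper functions below are illustrative names, not part of any AMPS tooling, and the model is deliberately rough: it ignores buffering, compression, and the memory used by the operating system and by AMPS itself.

```python
def backlog_growth_mb(arrival_mb_s: float, write_mb_s: float, seconds: float) -> float:
    """Unwritten data (MB) that accumulates when messages arrive faster than storage can write."""
    return max(0.0, arrival_mb_s - write_mb_s) * seconds


def fraction_in_memory(data_set_gb: float, memory_gb: float) -> float:
    """Upper bound on the share of the data set that can be resident in memory at once."""
    return min(1.0, memory_gb / data_set_gb)


# Storage that writes 80MB/s, fed at a sustained 100MB/s, for one hour.
one_hour_backlog = backlog_growth_mb(arrival_mb_s=100, write_mb_s=80, seconds=3600)

# A 128GB active data set on hosts with 64GB and 256GB of memory.
resident_small = fraction_in_memory(data_set_gb=128, memory_gb=64)
resident_large = fraction_in_memory(data_set_gb=128, memory_gb=256)
```

The 20MB/s shortfall adds up to 72,000MB (72GB) of unwritten data after an hour, and at most half of the 128GB data set can be in memory at once on the 64GB host.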
Most Linux distributions and installations are, by default, tuned for interactive desktop usage. This is convenient when developing applications, but can produce reduced performance as compared with a well-tuned server.
AMPS is designed for high-throughput, low latency messaging. This means that AMPS typically performs better with a realistic workload than with a very small number of messages. It is typically not useful to run a performance test with a small number of messages and then attempt to extrapolate the performance at scale.
As an example, imagine a test that deploys a Docker container from scratch, starts AMPS, sends and receives a single message, and then shuts down the container and uses the elapsed time from the start of the test to the time that the container shuts down as the "single message throughput time". That number will be orders of magnitude slower than the actual time that it takes for AMPS to deliver the message: most of the time in the test is consumed by overhead unrelated to delivering an individual message.
Although it is unlikely that anyone would create a test with as much overhead as the scenario above, it is not uncommon to have hidden overhead in a test. Likewise, there are often "economies of scale" that the system (including AMPS) can take advantage of at production-level usage and that are not available at unrealistically low messaging rates.
A realistic test should avoid measuring overhead that would not be present in a production environment. If the requirement of the application is to have latency within a certain threshold when AMPS is processing messages at a sustained rate from a dozen publishers, the results of a test will be more accurate the more closely the test approximates that scenario.
In particular, as much as possible, build your tests to:
Have similar use of connections as the production application. If a given application will have multiple subscribers in production, do not use a single subscriber in performance testing and assume that parallel processing offers no benefit.
Have similar message volumes as the production application. Do not assume that you can use a rate of 100 messages per second to predict latency or processing time of an application that will need to process 1000 or 10000 messages per second.
Have similar message sizes as the production application. Do not assume that a 1MB message size in test will have the same performance characteristics as a 250KB message (or a 5MB message) in production.
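One way to keep setup overhead out of the numbers is to time only the steady-state round trips and to report percentiles rather than a single figure. The harness below is a minimal sketch in plain Python. `measure_latency` and `fake_round_trip` are hypothetical names, and `fake_round_trip` is a placeholder that does no messaging at all: in a real test it would be replaced by a publish-and-receive call against a test server, using production-like message sizes, rates, and connection counts.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(send_and_wait: Callable[[bytes], None],
                    payload: bytes,
                    messages: int,
                    warmup: int = 100) -> Dict[str, float]:
    """Time each round trip, in seconds, after discarding warm-up iterations."""
    for _ in range(warmup):
        send_and_wait(payload)           # not timed: absorbs one-time startup costs

    samples: List[float] = []
    for _ in range(messages):
        start = time.perf_counter()
        send_and_wait(payload)
        samples.append(time.perf_counter() - start)

    cuts = statistics.quantiles(samples, n=100)   # 99 cut points: cuts[49] is the median
    return {
        "count": float(len(samples)),
        "median": cuts[49],
        "p99": cuts[98],
        "max": max(samples),
    }


def fake_round_trip(payload: bytes) -> None:
    """Placeholder for a real publish-and-receive call against a test server."""
    sum(payload)


report = measure_latency(fake_round_trip, payload=b"x" * 1024, messages=1000)
```

Container startup, server startup, and connection setup all happen outside the timed loop, so they do not distort the per-message figures.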
When benchmarking different implementation ideas, compare equivalent work. In some cases, having the AMPS server do additional work does not add noticeable latency due to the efficiencies (and parallel processing) in AMPS. In other cases, having the server do additional work may add more latency. In either case, accurately measuring throughput and latency must measure the cost of doing equivalent work in the application.
For example, suppose your application will use AMPS delta subscriptions (that is, have AMPS automatically calculate the differences between an update to a message and the current state of the message). Rather than comparing a subscription that uses that option to a subscription that does not based solely on when messages arrive at the client, compare the cost of having AMPS calculate the difference with the cost of having the application calculate the difference, and evaluate that cost based on the total throughput numbers for a realistic number of subscribers.
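For a fair comparison, the application-side alternative has to do the same work. The sketch below shows roughly what an application would do to compute the difference itself. It is a simplified illustration in plain Python: the `delta` function, the `id` key field, and the sample order are assumptions, and the exact contents of an AMPS delta message are not described here.

```python
import json
from typing import Any, Dict


def delta(previous: Dict[str, Any], current: Dict[str, Any], key_field: str) -> Dict[str, Any]:
    """Return the key field plus only the top-level fields whose values changed."""
    changed = {name: value for name, value in current.items()
               if name != key_field
               and (name not in previous or previous[name] != value)}
    return {key_field: current[key_field], **changed}


before = {"id": 42, "symbol": "IBM", "price": 100.0, "qty": 500}
after = {"id": 42, "symbol": "IBM", "price": 101.5, "qty": 500}

update = delta(before, after, key_field="id")

# Compare how much data each approach would put on the wire as JSON.
full_bytes = len(json.dumps(after))
delta_bytes = len(json.dumps(update))
```

An equivalent-work benchmark would include this per-message comparison, and the cost of retaining the previous state for every record, in the application-side measurement.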
AMPS is carefully designed to include functionality that reduces end-to-end latency in the system, and to provide server-side capability where performing those functions on the server improves overall performance.
When evaluating performance, take advantage of those capabilities to get an accurate measure of how an application would perform in a production environment.
For example, if your application needs to append a calculated field to every published message, use message enrichment (or the AMPS delta publish functionality) rather than a process that extracts, rewrites, and updates the full message. Likewise, if your application will only process a subset of messages to the topic, use AMPS content filtering to ensure that AMPS only provides actionable messages rather than oversubscribing and discarding messages in your application.
If you have questions on whether your application is using the built-in capabilities of AMPS in the most effective way possible, contact 60East support for an engineer to review your design.
AMPS runs on any 64-bit Linux system. For best performance in a development environment, 60East recommends that the system have a minimum of 4GB of memory available.
For basic functional evaluation and development, AMPS runs well in a virtual machine, in a container, or in a WSL2 shell on Windows, as well as on a Linux host.
AMPS is designed to help you quickly and easily develop and deploy data-intensive applications with demanding requirements for low latency and high performance. AMPS takes a nontraditional approach to messaging, storage, and analytics that is designed from the ground up for streaming data and highly-parallelized multicore systems.
AMPS is based on an incredibly fast messaging engine that supports multiple messaging paradigms, as well as providing persistent current value caching (effectively, an integrated database), content filtering and continuous query, historical replay, aggregation and analytics, message enrichment, focus tracking, partial updates and change tracking, and more.
Furthermore, AMPS is designed and engineered specifically for next generation computing environments. The architecture, design and implementation of AMPS allow it to exploit the parallelism inherent in emerging multi-socket, multi-core commodity systems and the low latency and high bandwidth of 10Gb Ethernet and faster networks. AMPS is designed to detect and take advantage of the capabilities of the hardware of the system on which it runs.
AMPS was designed to improve performance and reduce latency in real-world messaging deployments by focusing on the entire lifetime of a message from the message's origin to the time at which a subscriber takes action on the message. AMPS considers the full message lifetime, rather than just the "in flight" time, and allows you to optimize your applications to conserve network bandwidth and subscriber CPU utilization -- typically the first elements of a system to reach the saturation point in real messaging systems.
For example, to distribute work across a set of independent processors, you would use AMPS message queues, whereas if your application requires a content-aware last value cache, you would use a Topic in the AMPS State of the World.
60East provides access to technical support during the evaluation process.
To prepare to evaluate AMPS, 60East recommends the following process:
Engage 60East support with a description of the evaluation goals, and put in place any agreements necessary to make evaluation go more smoothly (such as mutual non-disclosure agreements).
Define the detailed goals of the feasibility phase of the evaluation. Typically, these break down into:
Functional Capability - This represents what the evaluation project needs to be able to do. (For example: accurately receive NFVIX messages and deliver them to the appropriate subscriber or subscribers while maintaining the ability to replay 30 days of history at any point.)
Performance Goals - This represents the service level for the evaluation. (For example: Maximum latency to reach client processing of 250ms for current messages, no more than 1s to first message for beginning a replay at an arbitrary depth in history.)
Capacity Goals - This represents the total volume of work being evaluated. (For example: The system needs to reach capability and performance goals while processing 10,000 messages per second ingestion.)
Develop an initial design and testing plan. During this process, teams use the AMPS documentation to understand how to use AMPS to meet the goals of the evaluation. Teams engage 60East support as necessary to resolve any questions that emerge or get advice on tradeoffs and options to achieve the evaluation goals.
Once a design and testing plan is complete, review the design and testing plan with the 60East engineering team, adjusting as necessary.
Implement the design and tests. If questions or issues emerge, consult with 60East support to resolve the issues.
Test and review the results with 60East.
Evaluate the deployment and maintenance phase of the evaluation. Typically, this involves:
Operations and Performance at Scale - Evaluate the application in a production-like environment at scales at or near production volumes.
Maintenance Goals - Develop and test the maintenance, support, and upgrade plan as described in the 60East deployment checklist.
Follow up on any open issues and complete the evaluation.
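The performance and capacity goals in the process above are easier to review when they are written down as numbers that a test run either meets or misses. The following is a minimal sketch, using the example figures from the text (250ms latency, 1s to first replayed message, 10,000 messages per second). The class and function names are hypothetical and are not part of any AMPS library.

```python
from dataclasses import dataclass


@dataclass
class EvaluationGoals:
    """Hypothetical container for the example goals given in the text."""
    max_latency_ms: float = 250.0                 # current messages, to client processing
    max_replay_first_message_ms: float = 1000.0   # time to first message of a replay
    min_ingest_rate_per_s: float = 10_000.0       # sustained ingestion volume


def check_goals(goals, latencies_ms, replay_first_message_ms, messages, seconds):
    """Return a dict of goal name -> True/False for one test run."""
    return {
        "latency": max(latencies_ms) <= goals.max_latency_ms,
        "replay": replay_first_message_ms <= goals.max_replay_first_message_ms,
        "capacity": messages / seconds >= goals.min_ingest_rate_per_s,
    }


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    print(check_goals(EvaluationGoals(), [12.0, 40.5, 180.0], 640.0, 650_000, 60))
```

Recording the goals this way gives the team and 60East the same pass/fail criteria when reviewing test results.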
On one path, you may want to learn how to configure, deploy, and administer your own instance of AMPS. For this path, see the AMPS User Guide, which provides complete information for system administrators who are responsible for the deployment, availability and management of data to other users.
Alternatively, you may need to develop an application to work with AMPS, using one of the Developer Guides for Java, Python, C++, or C#. For this path, visit the developer page at https://www.crankuptheamps.com/develop/ to download one of the evaluation kits.
In preparing to deploy your instance of AMPS, you must size your host environment according to multiple dimensions: memory, storage, CPU, and network. The Operation and Deployment chapter in the AMPS User Guide provides guidelines and best practices for configuring the host environment. The chapter also specifies recommended settings for running AMPS on a Linux operating system.
Advice on preparing to deploy AMPS in production is available under your support agreement from 60East. 60East provides review of configuration and application architecture on demand, and new deployments are especially encouraged to take advantage of this review.
You will also need an installed and running AMPS server to use the product. Although you can type and compile programs that use AMPS without a running server, you will get the most benefit by running the programs against a working server. Visit the 60East website for an evaluation version of AMPS.
The Operation and Deployment chapter in the AMPS User Guide discusses the Linux settings that are most often configured in a way that limits the performance of AMPS on a host. Before taking final benchmarks, tune the Linux host according to those guidelines.
The Introduction to AMPS includes information on how to set up a basic development environment for AMPS.
For an overview of the features of AMPS, see the Introduction to AMPS.
To understand which features are most commonly used for a given scenario or application pattern, see the feature cross-reference in the Introduction to AMPS. This provides a quick guide to help you focus on learning the features that are most relevant to the problem at hand.
Once you have done a basic evaluation of AMPS, there are two typical paths forward in usage of the product:
On one path, you may want to learn how to configure, deploy, and administer an instance of AMPS. For this path, see the AMPS User Guide, which provides complete information for system administrators who are responsible for the deployment, availability and management of data to other users.
Alternatively, you may need to develop an application to work with AMPS, using one of the Developer Guides for Java, Python, C++, or C#. For this path, download one of the client distributions from the AMPS developer page at https://www.crankuptheamps.com/develop/. The client distributions include a set of examples and an AMPS server configuration that works with the examples.
The following sections provide more information about each of these paths and also briefly describe some use cases for AMPS.
In preparing to deploy your instance of AMPS, you must size your host environment according to multiple dimensions: memory, storage, CPU, and network. The Operation and Deployment chapter in the AMPS User Guide provides guidelines and best practices for configuring the host environment. The chapter also specifies recommended settings for running AMPS on a Linux operating system.
Each language-specific Development Guide explains how to install, configure, and develop applications that use AMPS. In order to develop applications using an AMPS client, you must understand the basic concepts of AMPS, such as topics, subscriptions, messages and SOW.
You will also need an installed and running AMPS server to use the product. Typically, a team will use a server licensed for evaluation during the initial stages of development, then transition to a full license as the evaluation completes and the team prepares to deploy the application.
Thank you for choosing to evaluate the Advanced Message Processing System (AMPS) from 60East Technologies!
AMPS is designed to make it easy to develop and deploy data-intensive applications with demanding requirements for low latency and high performance. AMPS takes a nontraditional approach to messaging, storage, and analytics that is designed from the ground up for streaming data and highly-parallelized multicore systems.
AMPS is more than a publish and subscribe system. It is a feature-rich platform that enables you to easily build data intensive applications that provide previously unattainable low latency and high performance. AMPS combines a set of capabilities that cut across traditional component divisions. 60East designed the capabilities based on the needs of some of the most demanding data-intensive applications on the planet, and engineered the capabilities to work together seamlessly and provide the kind of performance and latency that those applications demand.
AMPS isn't a traditional database or messaging product. This guide presents a brief introduction to AMPS and contains information on evaluating AMPS.
This guide is designed to be used alongside other parts of the AMPS documentation:
Introduction to AMPS: Overview of AMPS features and capabilities, intended as a starting point for learning AMPS.
AMPS User Guide: Description of AMPS server-side features.
AMPS Configuration Guide: Guide to AMPS configuration.
The AMPS client distributions also contain guides discussing the programming model for applications and how to use the client libraries effectively.
In addition, 60East maintains an FAQ on the 60East support site at http://support.crankuptheamps.com, which answers frequently asked questions about the AMPS product.
For evaluation purposes, 60East recommends starting with the Introduction to AMPS for an overview of the features of AMPS, including a cross-reference as to which features are most commonly used together in particular application scenarios.
The AMPS server is supported on the following platforms:
Linux 64-bit (2.6 kernel or later) on x86_64 compatible processors
While 2.6 is the minimum kernel version supported, AMPS will select the most efficient mechanisms available to it and thus reaps greater benefit from more recent kernel and CPU versions.
The AMPS distribution contains all of the supporting libraries and dependencies needed to run on a typical Linux server installation: no further software is required.
Some utilities provided with the AMPS server have additional dependencies. These utilities are not required to run the server, but can make it easier to troubleshoot and test on the system that hosts the AMPS instance:
spark, a basic command line client that supports a subset of AMPS functionality, requires Java 1.7 or later.
The utilities for inspecting AMPS files (amps_sow_dump, amps_clients_ack_dump, and so on) require a Python installation.
amps-grep requires a Python installation.
amps-sqlite3 requires a Python installation and the sqlite3 package for your distribution (often, but not always, installed by default).
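The platform and utility requirements above can be checked on a candidate host before installing. This is a minimal sketch using only the Python standard library. It only checks that executables are present on the PATH. It does not verify the Java version, and treating a `sqlite3` executable as evidence of the sqlite3 package is an assumption that may not hold on every distribution.

```python
import platform
import re
import shutil


def kernel_at_least(release, minimum=(2, 6)):
    """Compare the leading 'major.minor' of a kernel release string to a minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


def check_host():
    """Report which documented prerequisites appear to be satisfied on this host."""
    has_python = shutil.which("python3") is not None or shutil.which("python") is not None
    return {
        "linux": platform.system() == "Linux",
        "x86_64": platform.machine() in ("x86_64", "AMD64"),
        "kernel_2_6_or_later": kernel_at_least(platform.release()),
        "java_for_spark": shutil.which("java") is not None,   # version not checked
        "python_for_utilities": has_python,
        "sqlite3_for_amps_sqlite3": shutil.which("sqlite3") is not None,  # assumption
    }


if __name__ == "__main__":
    for name, ok in check_host().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

Only the first three checks matter for running the server itself. The remaining entries apply to the optional utilities.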
For existing customers, evaluation and development licenses are typically covered in the existing licensing agreement. Contact the team that manages the license agreement or 60East for details.
For new customers, you can register for your evaluation, obtain an evaluation license, and receive instructions for downloading AMPS from the Evaluate AMPS page of the 60East Website.
The registration process covers the terms of the evaluation license. You can use the support website, as described in Obtaining Evaluation Support, for any questions that arise during your evaluation, including both licensing questions and technical questions.
For existing customers, an outline of your specific support benefits and policies is available in your 60East Technologies License Agreement. Support contracts can be purchased through your 60East Technologies account representative. Existing customers will also typically already have a non-disclosure agreement in place, allowing development teams to discuss the details of their applications with 60East for the purposes of troubleshooting issues or answering questions about AMPS.
For new customers, contact support (as described in the following sections) for any issues that emerge during your evaluation. Support for evaluation purposes is typically available for new customers without a support contract. If your company requires a non-disclosure agreement with 60East before discussing technical information, please contact support to begin the process of creating and executing an agreement.
You can save time if you complete the following steps before you contact 60East Technologies Support:
Check the documentation
The problem may already be solved and documented in the User Guide or Configuration Guide for the product. 60East Technologies also provides answers to frequently asked support questions on the support web site at http://support.crankuptheamps.com.
Isolate the problem
If you require Support Services, please isolate the problem to the smallest test case possible. Capture erroneous output into a text file along with the commands used to generate the errors.
Collect your information
Your product version number.
Your operating system and its kernel version number.
The expected behavior, observed behavior and all input used to reproduce the problem.
Submit your request.
If you have a minidump file, be sure to include that in your email to crash@crankuptheamps.com.
The AMPS version number used when reporting your product version number has the following format:
MAJOR.MINOR.FEATURE.HOTFIX.TIMESTAMP.TAG
Each AMPS version number component has the following breakdown:
MAJOR: Increments when there are any backward-incompatible changes in functionality, file formats, client network formats or configuration; or when deprecated functionality is removed. May introduce major new functionality or include internal improvements that introduce major behavioral changes. Verification level: Megacert.
MINOR: Increments when functionality is added in a backwards-compatible way, or when functionality is deprecated. May include internal improvements, including internal improvements that introduce minor behavioral changes or changes to network formats used only by the AMPS server (such as replication). Verification level: Megacert.
FEATURE: Increments for previews of new features. May introduce behavioral changes to fix incorrect behavior, enable new functionality or to enhance performance. May include internal enhancements that do not introduce behavioral changes. Note: A feature level of 0 indicates a long-term stable release. A feature level above zero indicates the current feature level (a preview of the next long-term stable release). Verification level: Kilocert.
HOTFIX: A release for a critical defect impacting a customer. A hotfix release is designed to be 100% compatible with the release it fixes (that is, a release with the same MAJOR.MINOR.FEATURE version). May introduce behavioral changes to fix incorrect behavior. May document previously undocumented features or extend surface area to improve usability for existing features. Verification level: Cert.
TIMESTAMP: Proprietary build timestamp. Does not affect verification level.
TAG: Identifier that corresponds to precise code used in the release. Does not affect verification level.
The certification levels are defined in the following table. Notice that, in all cases, 60East will certify at a higher level if time permits or if a change involves a critical part of AMPS (such as replication or internal utility classes that are widely used).
Megacert: Performance and long-haul testing. Full regression suite and stress-testing suite, including replication testing and application scenario tests. Full unit testing suite, including new unit tests to verify correct behavior of bugfixes in this release. Time required: less than 2 weeks.
Kilocert: Full regression suite and stress-testing suite, including replication testing and application scenario tests. Full unit testing suite, including new unit tests to verify correct behavior of bugfixes in this release. Time required: less than 1 week.
Cert: Full unit testing suite, including new unit tests to verify correct behavior of bugfixes in this release. Replication testing suite if release affects replication code. Time required: 4 hours.
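When comparing two builds in a support request, it can help to split the version number into the components described above. The following is a minimal sketch that assumes the components are joined with periods in the order listed (MAJOR.MINOR.FEATURE.HOTFIX.TIMESTAMP.TAG). The version strings are made up for illustration, and the mapping from changed component to verification level is one reading of the tables above rather than a 60East tool.

```python
from typing import NamedTuple


class AmpsVersion(NamedTuple):
    major: int
    minor: int
    feature: int
    hotfix: int
    timestamp: str
    tag: str


def parse_version(text):
    """Split a version string into its six components (format is an assumption)."""
    major, minor, feature, hotfix, timestamp, tag = text.strip().split(".", 5)
    return AmpsVersion(int(major), int(minor), int(feature), int(hotfix), timestamp, tag)


def is_long_term_stable(version):
    """A feature level of 0 indicates a long-term stable release."""
    return version.feature == 0


def same_release_line(a, b):
    """Hotfixes are designed to be compatible within the same MAJOR.MINOR.FEATURE."""
    return a[:3] == b[:3]


# Verification level associated with each component in the table above.
LEVEL_FOR_COMPONENT = {
    "major": "Megacert",
    "minor": "Megacert",
    "feature": "Kilocert",
    "hotfix": "Cert",
}


def verification_level(old, new):
    """Return the level for the most significant component that differs, or None."""
    for name in ("major", "minor", "feature", "hotfix"):
        if getattr(old, name) != getattr(new, name):
            return LEVEL_FOR_COMPONENT[name]
    return None


if __name__ == "__main__":
    installed = parse_version("5.3.4.12.20240101.abc123")   # hypothetical version
    candidate = parse_version("5.3.4.13.20240202.def456")   # hypothetical version
    print(same_release_line(installed, candidate), verification_level(installed, candidate))
```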
Please contact 60East Technologies Support Services according to the terms of 60East Technologies License Agreement (whether that is an existing license or evaluation license).
Support is offered through the United States:
Web:
E-mail:
Support:
Other support options may be available for existing customers, depending on the terms of the support agreement.
Thank you for choosing the Advanced Message Processing System (AMPS) from 60East for evaluation.
This guide provides information to help you evaluate AMPS for your application.
This guide covers the following topics:
General introduction to evaluating AMPS.
Description of the suggested evaluation process and how to get started with developing applications with AMPS.
Performance measurement guidance and considerations.
Suggested paths after reading this guide.
Thank you for choosing the Advanced Message Processing System (AMPS) from 60East Technologies. AMPS is a feature-rich message processing system that delivers previously unattainable low-latency and high-throughput performance to users. AMPS provides both publish-and-subscribe messaging and high-performance message queuing.
AMPS is designed to help you quickly and easily develop and deploy data-intensive applications with demanding requirements for low latency and high performance.
AMPS combines aspects of a traditional message bus, message queue, database, view server, analytics and event processing engine. The features that AMPS provides are designed to be easy to use, to work well together, and to provide high performance.
The 60East documentation is intended to be used with a working (development) environment of AMPS available so that you can quickly explore the concepts discussed.
60East recommends starting with the Introduction to AMPS to become familiar with AMPS, and then reading the sections of the AMPS User Guide and AMPS Configuration Guide for the features that your application will use.
The Introduction to AMPS provides an overall introduction to AMPS, including information on setting up a development environment, the basic concepts and features of AMPS, and general advice on which features combine effectively for specific scenarios. The 60East documentation is intended to be used with a working (development) environment of AMPS available so that you can quickly explore the concepts discussed.
The AMPS Evaluation Guide provides advice on evaluating AMPS, including a suggested evaluation process, tips on monitoring and measuring performance in an evaluation environment, and information on how to effectively partner with 60East on an evaluation of AMPS.
The AMPS User Guide -- this guide -- provides a complete overview of the features of AMPS and how to deploy and administer an instance of AMPS.
The AMPS Configuration Guide describes the AMPS configuration file and the options available to specify the behavior of an instance of AMPS.
The Deployment Checklist is a short document providing recommendations for deploying AMPS into a shared environment, whether that environment will be used for production, test, or development.
These guides cover the general features of AMPS. This site provides additional guides, such as guides for developing applications with AMPS, a guide to the statistics available for monitoring, and so on.
For developers, becoming familiar with the Developer Guide for the AMPS Client library that you will be using is also recommended. The developer page on the 60East web site contains reference material and links to download client libraries. Full source code (including example applications) is available for all client libraries. For many client libraries, 60East also includes pre-built binaries and makes binary distributions available through popular package management sites. Notice, however, that the pre-built distributions do not contain documentation, source code, or examples.
For developers, 60East also provides an AMPS Command Reference that describes the commands to the AMPS server and responses from the AMPS server. Once you are familiar with the features you will use, as described in the user guide, the Developer Guide for your client library of choice and the AMPS Command Reference provide details on how an application communicates with the AMPS server.
The AMPS server is supported on the following platforms:
Linux 64-bit (2.6 kernel or later) on x86_64 compatible processors
While 2.6 is the minimum kernel version supported, AMPS will select the most efficient mechanisms available to it and as a result, reaps greater benefit from more recent kernel and CPU versions.
The AMPS distribution contains all of the supporting libraries and dependencies needed to run on a typical Linux server installation: no further software is required.
Some utilities provided with the AMPS server have additional dependencies. These utilities are not required to run the server, but can make it easier to troubleshoot and test on the system that hosts the AMPS instance:
spark, a basic command line client that supports a subset of AMPS functionality, requires Java 1.7 or later.
The utilities for inspecting AMPS files (amps_sow_dump, amps_clients_ack_dump, and so on) require a Python installation.
amps-grep requires a Python installation.
Welcome to the Advanced Message Processing System (AMPS) from 60East Technologies.
AMPS is a feature-rich message processing system that delivers previously unattainable low-latency and high-throughput performance to users. AMPS provides both publish-and-subscribe messaging and high-performance message queuing. AMPS also provides current value caching / message database functionality, analysis and aggregation.
AMPS, the Advanced Message Processing System, is built around an incredibly fast messaging engine that supports both publish-subscribe messaging and queuing. AMPS combines the capabilities necessary for scalable high-throughput, low-latency messaging in realtime deployments such as in financial services. AMPS goes beyond basic messaging to include advanced features such as high availability, historical replay, aggregation and analytics, content filtering and continuous query, last value caching, focus tracking, and more.
Furthermore, AMPS is designed and engineered specifically for next generation computing environments. The architecture, design and implementation of AMPS allows the exploitation of parallelism inherent in emerging multi-socket, multi-core commodity systems and the low-latency, high-bandwidth of 10Gb Ethernet and faster networks. AMPS is designed to detect and take advantage of the capabilities of the hardware of the system on which it runs.
AMPS does more than just route and deliver messages. AMPS was designed to lower the latency in real-world messaging deployments by focusing on the entire lifetime of a message from the message's origin to the time at which a subscriber takes action on the message. AMPS considers the full message lifetime, rather than just the "in flight" time, and allows you to optimize your applications to conserve network bandwidth and subscriber CPU utilization -- typically the first elements of a system to reach the saturation point in real messaging systems.
AMPS offers both topic and content based subscription semantics, which makes it different from most other messaging platforms. Some of the highlights of AMPS include:
Topic and content based publish and subscribe
Message queuing, including content-based filtering and configurable strategies for delivery fairness
Client development kits for popular programming languages such as Java, C#, C++, C, Python, and JavaScript
Built-in support for FIX, NVFIX, JSON, BSON, MessagePack, BFlat, Google Protocol Buffers, and XML messages. AMPS also supports uninterpreted binary messages, and allows you to create composite message types from existing message types.
State of the World queries
Historical State of the World queries
Easy to use command interface
Full Perl compatible regular expression matching
Content filters with SQL92 WHERE clause semantics
Built-in latency statistics and client status monitoring
Advanced subscription management, including delta publish and subscriptions and out-of-focus notifications
Basic CEP capabilities for real-time computation and analysis
Aggregation within topics and joins between topics, including joins between different message types
Replication for high availability
Fully queryable transaction log
Message replay functionality
Fully-integrated authentication and entitlement system, including content-based entitlement for fine-grained control
Optional encryption (SSL) between client and server
Extensibility API for adding message types, user-defined functions, user-specified actions, authentication, and entitlement functionality
This manual is divided into the following parts:
Part One presents introductory material and a brief overview of AMPS
Part Two explains the features of AMPS, including information on the following features:
Subscribe and Publish, the basic building blocks of AMPS applications
The expression language and functions used to take advantage of the content-aware features of AMPS are covered in AMPS Expressions and AMPS Functions
Record and Replay Messages using the AMPS transaction log
Competitive message consumption with Message Queues
The Message Types that AMPS supports for content-aware processing
Current value caching and database functions using State of the World (SOW) topics
State of the World topics enable many of the other advanced features in AMPS, such as:
This section also contains detailed chapters on specific topics, such as the AMPS filter language. Both application developers and administrators should become familiar with this section.
Part Three discusses AMPS deployment and operations, including:
This section is most useful for those with a focus on AMPS operations, although the information presented here is helpful for developers who want to design high-performance, high-availability applications that are easy to deploy and maintain.
Additional chapters provide reference information:
Optionally-Loaded Modules describes special-purpose modules that are included in the AMPS distribution but are not loaded by default
File Format Versions lists the file formats used by each AMPS version
This manual is an introduction to the 60East Technologies AMPS product. It assumes that you have a working knowledge of Linux and uses the following conventions.
Text: Standard document text
Code: Inline code fragment
Variable: Variables within commands or configuration
Parameter (required): Required parameters in parameter tables
Optional: Optional parameters in parameter tables
The AMPS documentation also includes the following types of notes:
Inside boxes with this icon, you will find information that's important to keep in mind when working with AMPS. These are typically recommendations that should generally be followed, but may not be applicable in special cases.
Inside boxes with this icon, you will find important information and guidelines that require special consideration or caution when using AMPS to ensure the proper functioning of the system and to avoid any potential issues or risks.
Inside boxes with this icon, you will find usage warnings or information that is critical for ensuring that AMPS functions correctly.
Additionally, here are the constructs used for displaying content filters, XML, code, command line, and script fragments.
Command lines will be formatted as in the following example:
This section describes how to install and start AMPS. It describes the file structure of the AMPS distribution and how to configure a simple AMPS instance.
The Getting Started with AMPS section in the Introduction to AMPS covers setting up a basic development environment. This section includes a reference to the AMPS server command line options and provides information to help create a production deployment of AMPS.
To install AMPS, unpack the distribution for your platform where you want the binaries and libraries to be stored. For the remainder of this guide, the installation directory will be referred to as $AMPSDIR, as if an environment variable with that name was set to the correct path.
Within $AMPSDIR are the following sub-directories:
bin: AMPS engine binaries and utilities
docs: Documentation
lib: Library dependencies
sdk: Include files for the AMPS extension API
For an outline of your specific support policies, please see your 60East Technologies License Agreement. Support contracts can be purchased through your 60East Technologies account representative.
You can save time if you complete the following steps before you contact 60East Technologies Support:
Check the documentation
The problem may already be solved and documented in the User Guide or Configuration Reference Guide for the product. 60East Technologies also provides answers to frequently asked support questions on the support website at: http://support.crankuptheamps.com.
Isolate the problem
If you require Support Services, please isolate the problem to the smallest test case possible. Capture erroneous output into a text file along with the commands used to generate the errors.
Collect your information
Your product version number.
Your operating system and its kernel version number.
The expected behavior, observed behavior and all input used to reproduce the problem.
Submit your request.
If you have a minidump file, be sure to include that in your email to crash@crankuptheamps.com.
Please contact 60East Technologies Support Services according to the terms of your 60East Technologies License Agreement.
Support is offered through the United States:
Web:
E-mail:
Support:
Other support options (such as support via phone) may be available depending on the terms of your support agreement.
The AMPS server generates a minimal sample configuration file with the --sample-config option. You can save the sample configuration file to $AMPSDIR/amps_config.xml by running $AMPSDIR/bin/ampServer with the --sample-config option and redirecting standard output to that file.
On older processor architectures, ampServer will start the ampServer-compat binary. The ampServer-compat binary avoids using hardware instructions that are not available on these systems.
You can also set the AMPS_PLATFORM_COMPAT environment variable to force ampServer to start the ampServer-compat binary. 60East recommends using this option only on systems that do not support the hardware instructions used in the standard binary. The ampServer-compat binary will not perform as well as ampServer, since it uses fewer hardware optimizations.
Once you have a configuration file saved to $AMPSDIR/amps_config.xml, you can start AMPS by running $AMPSDIR/bin/ampServer with the path to that file as its command line argument.
If your first start-up is successful, you should see AMPS display a simple message similar to the following to let you know that your instance has started correctly.
The version numbers and dates will be appropriate for the version that you've started.
If you see this, congratulations! You have successfully cranked up the AMPS!
The AMPS server binary supports the following command line options:
The AMPS engine binary is named ampServer and is found in $AMPSDIR/bin. Start the AMPS engine with a single command line argument that includes a valid path to an AMPS configuration file. You use the configuration file to enable and configure the AMPS features that your application will use. This guide discusses the most commonly used configuration options for each feature. The full set of options is described in the AMPS Configuration Guide.
The sample configuration file generated by AMPS includes a very minimal configuration. The client language distributions include a sample configuration file that sets up AMPS to work with the samples provided with that client, and the AMPS Configuration Guide contains a full description of the configuration items with sample configuration snippets.
--verify-config: Parse and verify the specified configuration file, then exit.
--sample-config: Produce a minimal AMPS config.xml file to standard output, then exit.
--dump-config: Process the specified configuration file, resolving any Include directives and expanding environment variables. Dump the resulting file to standard output.
--version: Print the AMPS version string, then exit.
--help: Print usage information for the command line options accepted by the ampServer program, then exit.
--daemon: Run AMPS as a daemon process.
-D<variable>=<value>: Set the specified environment variable to the specified value when running the AMPS process. AMPS accepts any number of -D options. For example, to set the variable AMPS_PATH to /mnt/fast/AMPS, use the command line option -DAMPS_PATH=/mnt/fast/AMPS.
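For scripted deployments, the options above can be assembled into an argument list before the server is launched. The following is a minimal sketch that only builds the list and does not execute anything. The helper name `amps_command`, the default install path, and the ordering of options before the configuration path are assumptions for illustration. Check the output of --help on your installation for the exact usage.

```python
import os


def amps_command(amps_dir, *options, config=None, env=None):
    """Build an argv list for ampServer; nothing is executed here."""
    argv = [amps_dir.rstrip("/") + "/bin/ampServer"]
    # Each entry in env becomes one -D<variable>=<value> option.
    for name, value in (env or {}).items():
        argv.append(f"-D{name}={value}")
    argv.extend(options)
    if config is not None:
        argv.append(config)
    return argv


if __name__ == "__main__":
    amps_dir = os.environ.get("AMPSDIR", "/opt/amps")  # hypothetical default path
    config = amps_dir.rstrip("/") + "/amps_config.xml"
    print(amps_command(amps_dir, "--sample-config"))
    print(amps_command(amps_dir, "--verify-config", config=config))
    print(amps_command(amps_dir, config=config, env={"AMPS_PATH": "/mnt/fast/AMPS"}))
```

A list built this way can be passed to `subprocess.run`. For --sample-config, pass an open file as `stdout` to save the generated configuration.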
To create a production configuration of AMPS, you configure the instance to meet the needs of the application (or applications) that will use the instance.
An overview of the most commonly used features is available in the Introduction to AMPS guide. This guide, the AMPS User Guide, describes the features in detail. The AMPS Configuration Guide lists the required and optional configuration for each feature.
Typically, all instances of AMPS will configure:
The instance Name (this is required).
The Admin interface for the instance, to make monitoring available. This typically includes setting a path to persist the instance statistics database.
Logging for the instance (at a minimum of info level for production instances, typically at trace level for development, testing, or UAT instances).
One or more Transports to allow incoming connections to the AMPS server.
Administrative actions to create a scheduled maintenance plan for the statistics database and the logs.
When run with the --sample-config flag, the ampServer binary writes a minimal sample configuration to stdout. Options that require site-specific information (for example, the path to the statistics database or log files) are commented out in the sample.
Instances of AMPS may then add configuration to take advantage of advanced messaging features (such as the State of the World, Aggregation and Analytics, the ability to Record and Replay Messages, and so on), to add resiliency by Replicating Messages Between Instances (typically required for Highly Available AMPS Installations), and so on.
AMPS is a rich message delivery system. At the core of the system, the AMPS engine is highly optimized for publish and subscribe delivery. In this style of messaging, publishers send messages to a message broker (such as AMPS) which then routes and delivers messages to the subscribers. "Pub/Sub" systems, as they are often called, are a key part of most enterprise message buses, where publishers broadcast messages without necessarily knowing all of the subscribers that will receive them. This decoupling of the publishers from the subscribers allows maximum flexibility when adding new data sources or consumers.
AMPS can route messages from publishers to subscribers using a topic identifier and/or content within the message's payload. For example, in the figure above, there is a Publisher sending AMPS a message pertaining to the LN_ORDERS topic. The message being sent contains information on Ticker "IBM" with a Price of 125. Both of these properties are contained within the message payload itself (i.e., the message content). AMPS routes the message to Subscriber 1 because it is subscribing to all messages on the LN_ORDERS topic. Similarly, AMPS routes the message to Subscriber 2 because it is subscribed to any messages having the Ticker equal to "IBM". Subscriber 3 is looking for a different Ticker value and is not sent the message.
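The routing decision described above can be modeled in a few lines. The following is a minimal in-process sketch, not the AMPS client API: the Subscription class and route function are hypothetical, Subscriber 2 and Subscriber 3 are assumed to subscribe to the LN_ORDERS topic, and the "MSFT" value for Subscriber 3 is invented to stand in for "a different Ticker value".

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Subscription:
    name: str
    topic: str
    # Optional predicate over the message payload (stands in for a content filter).
    content_filter: Optional[Callable[[dict], bool]] = None

    def matches(self, topic, payload):
        if topic != self.topic:
            return False
        return self.content_filter is None or self.content_filter(payload)


def route(topic, payload, subscriptions):
    """Return the names of the subscriptions that should receive the message."""
    return [s.name for s in subscriptions if s.matches(topic, payload)]


subscriptions = [
    Subscription("Subscriber 1", "LN_ORDERS"),
    Subscription("Subscriber 2", "LN_ORDERS", lambda m: m.get("Ticker") == "IBM"),
    # "MSFT" is a hypothetical value chosen for this sketch.
    Subscription("Subscriber 3", "LN_ORDERS", lambda m: m.get("Ticker") == "MSFT"),
]

recipients = route("LN_ORDERS", {"Ticker": "IBM", "Price": 125}, subscriptions)
print(recipients)
```

In this model, Subscriber 1 matches on topic alone, Subscriber 2 matches on topic and content, and Subscriber 3 is skipped because its predicate rejects the payload.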
AMPS provides the ability for the server to conflate messages to a subscription. When a subscription requests conflation, the server will retain messages for that subscription for a certain period of time, the conflation interval, and provide the latest update to that message once a message has been retained for that interval. In effect, AMPS guarantees that a subscription will receive no more than one update for a given message per conflation interval.
Conflated subscriptions provide a way to reduce the bandwidth and processing for a subscriber in cases where a subscriber needs periodic updates with the current state of a message, rather than the complete message stream. AMPS provides per-subscription conflation for cases where only a small number of subscribers require conflation, or if conflation is required only in unusual cases. If multiple subscribers will have the same conflation needs, consider using Conflated Topics.
For example, imagine an application that monitors selected stocks and displays the current prices on a large screen, which refreshes every few seconds. This application may use the same topics as a trading desk, but has very different needs for data freshness and completeness. Since updates to each symbol will only be displayed every few seconds, the application only needs point-in-time updates of the prices, rather than the full stream of price changes. To meet this need, the application could specify that the subscription conflates price updates by tickerId with a conflation interval of two seconds. For each distinct value of the tickerId field, AMPS will retain messages for two seconds. If another message with the same tickerId is processed for the subscription during the conflation interval, that message completely replaces the previous message. At the end of the two second conflation interval, the message is delivered to the application. This lets the application receive an up-to-date price at most every two seconds, without having to process a large number of updates that will never be displayed. This approach also ensures that the price is never more than two seconds out of date, which means that each time the screen is refreshed, the price is current.
As mentioned in the example above, if the subscription uses tickerId for conflation and the following sequence of messages arrives during a conflation interval:
AMPS delivers only the last message for that tickerId:
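To make the effect of conflation concrete, here is a small simulation of a single conflation interval. It is a sketch of the behavior described above rather than of how the AMPS server is implemented: the conflate function, the price field, and the sample messages are all hypothetical.

```python
def conflate(messages, key_fields):
    """Keep only the latest message for each distinct conflation key."""
    latest = {}
    for message in messages:
        key = tuple(message.get(field) for field in key_fields)
        # A later message with the same key completely replaces the earlier one.
        latest[key] = message
    return list(latest.values())


# Hypothetical messages arriving within one conflation interval.
interval_messages = [
    {"tickerId": 1, "price": 10.00},
    {"tickerId": 2, "price": 55.10},
    {"tickerId": 1, "price": 10.05},
    {"tickerId": 1, "price": 10.10},
]

delivered = conflate(interval_messages, ["tickerId"])
for message in delivered:
    print(message)
```

Four messages arrive, but only two are delivered: the last update for tickerId 1 and the only update for tickerId 2. The order of the output here is an artifact of the Python dictionary and should not be read as a statement about delivery order in AMPS.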
Notice that when a subscription is conflated, AMPS does not guarantee that messages are delivered precisely in the order in which they arrived in AMPS, since the latest update is delivered based on the conflation interval.
When the timestamp option is used with conflated subscriptions, AMPS provides the timestamp for the first message conflated.
Conflated subscriptions reduce the bandwidth for a subscription, and may reduce the processing resources required for a subscription. However, rather than immediately delivering messages, AMPS retains messages in memory for the conflation interval. This can increase the memory required for the subscription.
AMPS contains other features for conflating messages and reducing bandwidth. Conflated subscriptions are most appropriate when:
Network bandwidth is at a premium, and you would like AMPS to spend slightly more processing time and potentially more memory to reduce the bandwidth needs of the application.
Each subscription has different conflation needs. For example, if each subscription has a dramatically different conflation interval, or needs to conflate by different fields. If most subscribers will use a similar conflation interval and use the same fields for conflation, using a Conflated Topic can provide equivalent results with lower overhead.
The conflation needs are relatively predictable and consistent for the subscription. If you need the application to conflate messages only when processing is slow or there are bursts of message traffic, client-side conflation provides that ability and may be a better choice than a conflated subscription. See the developer guide for your programming language of choice for details.
The considerations above are general guidance to help you consider options and choose a conflation strategy.
You can also combine approaches as necessary. For example, if most of your subscriptions require a 3 second conflation interval by tickerId, while a few subscriptions require a 15 second interval, you could create a Conflated Topic with a 3 second interval. Those subscriptions that require a 15 second interval could subscribe with that interval. This provides both sets of subscriptions with the intervals that they need.
To request conflation on a subscription, set the following options on the subscription:
conflation=n
Specifies whether to conflate this subscription. The value provided can be a time interval, auto, or none.
When present and set to a value other than none, enables conflation for the subscription.
Can also be set to auto, which requests that AMPS attempt to determine an appropriate conflation interval based on client consumption.
Recognizes the same time specifiers used in the AMPS configuration file (for example, 100ms or 1s or 1m).
Defaults to none.
conflation_key=[keys]
When conflation is enabled, specifies the fields to use to determine message uniqueness. The format of this option is a comma-delimited list of XPath identifiers within brackets.
For example, to conflate based on the value of the /tickerId and /customerId fields within a message, the value of this option would be:
[/tickerId,/customerId]
Defaults to the SOW key fields for SOW topics.
No default for non-SOW topics. This option is required for non-SOW topics.
This option is not valid with the oof option unless the keys provided are identical to the keys for the topic.
This option can only be used when conflation is also specified.
For example, to request a 10 second conflation interval with messages conflated on the [/orderId] field, you would use the following options string:
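As an illustration, the two options can be assembled programmatically. The helper below is hypothetical, and it assumes that individual options within the options string are separated by commas; check the AMPS Command Reference for the authoritative syntax.

```python
def conflation_options(interval, key_fields=None):
    """Build an options string from a conflation interval and optional key fields.

    Assumes options are joined with commas; this helper is not part of any AMPS client.
    """
    options = [f"conflation={interval}"]
    if key_fields:
        # Keys are a comma-delimited list of identifiers within brackets.
        options.append("conflation_key=[" + ",".join(key_fields) + "]")
    return ",".join(options)


options = conflation_options("10s", ["/orderId"])
print(options)
```

Under those assumptions, the helper produces a string that sets a 10 second interval and conflates on /orderId, which an application would then supply as the options for its subscribe command.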
One thing that differentiates AMPS from classic messaging systems is its ability to route messages based on message content. Instead of a publisher declaring metadata describing the message for downstream consumers, the publisher can simply publish the message content to AMPS and let AMPS examine the native message content to determine how best to deliver the message.
The ability to use content filters greatly reduces the problem of oversubscription that occurs when topics are the only facility for subscribing to message content. The topic space can be kept simple by using content filters to deliver only the desired messages. The topic space can reflect broad categories of messages and does not have to be polluted with metadata that is usually found in the content of the message. In addition, many of the advanced features of AMPS such as out-of-focus messaging, aggregation, views, and SOW topics rely on the ability to filter content.
Content-based messaging is somewhat analogous to database queries that include a WHERE clause. Topics can be considered tables into which rows are inserted (or updated). A subscription is similar to issuing a SELECT from the topic table with a WHERE clause to limit the rows which are returned. Topic-based messaging is analogous to a SELECT on a table with no limiting WHERE clause.
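The analogy can be shown directly with an in-memory SQLite database. This sketch only illustrates the comparison; it does not involve AMPS, and the table contents are invented for the example.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# The topic plays the role of a table; each published message is a row.
connection.execute("CREATE TABLE LN_ORDERS (ticker TEXT, price REAL)")
connection.executemany(
    "INSERT INTO LN_ORDERS VALUES (?, ?)",
    [("IBM", 125.0), ("MSFT", 410.0), ("IBM", 126.5)],
)

# Topic-based subscription: every row, no WHERE clause.
topic_only = connection.execute(
    "SELECT ticker, price FROM LN_ORDERS ORDER BY rowid"
).fetchall()

# Content-based subscription: the WHERE clause acts like a content filter.
filtered = connection.execute(
    "SELECT ticker, price FROM LN_ORDERS WHERE ticker = 'IBM' ORDER BY rowid"
).fetchall()

print(topic_only)
print(filtered)
```

The important difference from a database query is that a subscription keeps delivering matching messages as they are published, while a SELECT returns once.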
AMPS uses a combination of XPath-based identifiers and SQL-92 operators for content filtering. Some examples are shown below:
For more information about how content is handled within AMPS and the syntax of AMPS filters, see the discussion of AMPS expressions and identifiers later in this guide.
AMPS provides the ability to perform atomic subscription replacement. This allows you to replace the filter, change the topic, or update the options for a subscription.
The most common use for this capability is for an application to change the filter for a subscription. For example, a GUI that is providing a view of a set of orders may need to add or remove an order from the set of orders being displayed. By replacing the content filter with a filter that tracks the updated set of orders, the application can do this without missing messages, getting duplicate messages, or having to manage more than one subscription.
Replacing a filter is an atomic operation. That is, the application is guaranteed not to miss messages that are in both the original and replacement subscription, and is guaranteed to receive all messages for the new subscription as of the point at which the replacement happens.
To replace a subscription, applications re-submit the subscription using the subscription ID of the previous subscription. See the Developer Guide of the client library you are using and the AMPS Command Reference for details.
When replacing a sow_and_subscribe command (described later in the guide), AMPS runs the SOW command again and provides any messages that were not previously in the result set to the application. See the section called Replacing Subscriptions with SOW and Subscribe for details.
Notice that some options on an initial subscription limit the support for replace on a subscription. In those cases, the limitation is described when the option is described.
AMPS allows you to replace the content filter on an existing subscription. When this happens, AMPS begins sending messages on the subscription that match the new filter. When an application needs to bring more messages into scope, this can be more efficient than creating another subscription.
For example, an application might start off with a filter such as the following:
The application might then need to bring other regions into scope, for example:
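The idea of widening a filter in place can be modeled with a toy subscription object. This is an in-process sketch and not the AMPS client API: the Subscription class, the region values, and the filter expressions shown in the comments are all hypothetical. The lock stands in for the atomicity that AMPS provides on the server.

```python
import threading


class Subscription:
    """Toy model of a subscription whose filter can be swapped atomically."""

    def __init__(self, sub_id, topic, content_filter):
        self.sub_id = sub_id
        self.topic = topic
        self._filter = content_filter
        self._lock = threading.Lock()
        self.delivered = []

    def replace_filter(self, new_filter):
        # The swap happens under the same lock used for delivery, so each
        # message is evaluated against exactly one filter.
        with self._lock:
            self._filter = new_filter

    def offer(self, message):
        with self._lock:
            if self._filter(message):
                self.delivered.append(message)


# Stands in for a filter such as: /region = 'WesternUS'
sub = Subscription("orders-view", "orders", lambda m: m["region"] == "WesternUS")
sub.offer({"id": 1, "region": "WesternUS"})
sub.offer({"id": 2, "region": "EasternUS"})  # out of scope for the original filter

# Stands in for a filter such as: /region = 'WesternUS' OR /region = 'EasternUS'
sub.replace_filter(lambda m: m["region"] in ("WesternUS", "EasternUS"))
sub.offer({"id": 3, "region": "EasternUS"})
sub.offer({"id": 4, "region": "WesternUS"})

delivered_ids = [m["id"] for m in sub.delivered]
print(delivered_ids)
```

Message 2 arrived before the replacement and is not delivered retroactively in this model, while messages 3 and 4 are both delivered on the same subscription.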
AMPS allows a subscription to replace the topic on a subscription. When the topic is replaced, AMPS re-evaluates the subscription as it does when a filter is replaced. If the subscription is updated to include a topic that the user does not have permission to subscribe to, the replace operation succeeds, but no messages will be delivered on that topic.
AMPS allows a subscription to replace some of the options on the subscription. In this case, the subscription is evaluated as though the topic or filter has been replaced. Any new messages generated after the subscription is replaced use the new options. However, AMPS does not replay or re-query previous messages to apply the options.
For example, if a sow_and_subscribe command did not previously specify Out-of-Focus tracking and adds this option, AMPS generates the appropriate Out-of-Focus messages from the replace point forward. AMPS does not recreate Out-of-Focus messages that would have previously been generated by the subscription.
If the subscription uses pagination (see Managing Result Sets), the replacement must contain the full set of pagination options provided on the original subscription. For a paginated subscription, the replacement may not change the topic of the subscription. Instead, close the existing subscription and create a new subscription with a different topic.
AMPS includes an expression language that combines elements of XPath and SQL-92's WHERE clause. This expression language is used whenever the AMPS server refers to the contents of a message, including:
Content filtering
Constructing fields for message enrichment
Creating projected fields for views
AMPS uses a common syntax for each of these purposes, and provides a common set of operators and functions. AMPS also provides special directives for message enrichment, and aggregation functions for projecting views.
For example, when an expression is used as a content filter, any message for which the expression returns true matches the content filter. When an expression is used to construct a field for message enrichment or view projection, the expression is evaluated and the result that the expression returns is used as the content of the field.
The quickest way to learn AMPS expressions is to think of each as a combination of identifiers that tell AMPS where to find data in a message, and operators that tell AMPS what to do with that data. Each AMPS expression produces a value. The way AMPS uses that value depends on where the expression is used. For example, in a content filter, AMPS uses the value of the expression to determine whether a message matches the filter. When constructing a field, AMPS uses the value of the expression as the contents of the field.
Consider a simple example of an expression used as a filter. Imagine AMPS receives the following JSON message:
Using an AMPS expression, you can easily construct a content filter that matches the message:
There are three parts to this expression. The first part, /name, is an identifier that tells AMPS to look for the contents of the name field at the top level of the JSON document. The second part of the filter, =, is the equality operator, which tells AMPS to compare the values on either side of the operator and return true if the values match. The final part of the filter, 'Gyro', is a string literal for the equality operator to use in the comparison. When an expression is used in a content filter, a message matches the filter when the expression returns true. The expression returns true for the sample message, so the sample message matches the filter.
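The three parts of the filter /name = 'Gyro' can be mirrored in a short Python sketch. This is an illustration of the idea, not the AMPS expression engine: the resolve and equals_filter helpers are hypothetical, the sample JSON message (including its order field) is invented, and AMPS handling of missing fields is not modeled.

```python
import json


def resolve(identifier, message):
    """Follow an identifier such as /a/b through nested objects; None if absent."""
    value = message
    for step in identifier.strip("/").split("/"):
        if not isinstance(value, dict) or step not in value:
            return None
        value = value[step]
    return value


def equals_filter(identifier, literal):
    """Build a predicate equivalent in spirit to: <identifier> = <literal>."""
    return lambda message: resolve(identifier, message) == literal


# Hypothetical message; only the name field comes from the text above.
message = json.loads('{"name": "Gyro", "order": {"qty": 2}}')

name_filter = equals_filter("/name", "Gyro")  # identifier, operator, literal
matches = name_filter(message)
print(matches)
```

The identifier locates the data, the operator compares it with the literal, and the resulting boolean decides whether the message matches.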
The identifier syntax is a subset of XPath, as described in the section on Identifiers. The comparison syntax is similar to SQL-92.
Notice that AMPS makes no rigid guarantees as to the number of times a given expression is evaluated or when that evaluation will take place. AMPS will evaluate the expression as needed.
AMPS has the ability to allow a subscriber to retrieve only the relevant parts of a message, in the same way that a SQL query can retrieve only specified fields from a table. For example, consider a topic that stores an event ID, a short description, and a detailed event record. A UI that presents an overview of the contents of the topic might only need the event ID and short description to present a high-level view of the topic contents, while retrieving the detailed event record when a user explicitly requests the details for a specific record.
With select lists, AMPS allows an individual subscription to control which fields are retrieved from a subscription or query. In the example above, the subscription would include a select list that requests that AMPS provide the event ID and description, while excluding any other field. To do this, the application would include the following option on the command used to retrieve data for the overview: select=[-/,+/event_id,+/description]
When provided by an application as a part of a command to AMPS, a select list is applied after any content filtering is applied. The select list specifies the contents of the subscription, but does not affect the underlying messages, and the contents of the subscription select list do not affect filter evaluation or query results.
As mentioned above, to provide a select list on a command, add the keyword select and a comma-delimited list of field directives to the options for a subscription or query in AMPS.
Each field directive is a combination of an inclusion specifier and an AMPS identifier.
For example, the field directive +/event_id has an inclusion specifier of + and the AMPS identifier of /event_id. This field directive specifies that the /event_id field is included in the message returned to the subscriber.
AMPS recognizes the following inclusion specifier values:
-
Explicitly exclude the field for the identifier immediately following.
+
Explicitly include the field for the identifier immediately following.
Identifiers for individual fields follow the syntax described in the Identifiers section.
For select lists, AMPS also recognizes the special field directive of -/ to specify that all fields should be excluded and the special field directive of +/ to specify that all fields should be included.
If no field directive in the select list applies to a given field in a message, that field is included in the message.
If a field is covered by multiple field directives, AMPS respects the most specific field directive. In other words, a select list that contains the field directives +/,-/details will include all fields except the details field. A select list that contains the field directives -/event,+/event/description will include the /event/description subfield, but no other contents of the /event field. (If an identifier is provided twice in the same select list, AMPS uses the first field specifier that contains the identifier.)
With select lists, AMPS does not create fields that are not in the original message. This means that if the select list requests a field that does not exist in the original message, the message delivered to the subscriber will not contain that field.
Notice that a select list only changes how a message is delivered to the subscriber that the select list applies to. The original message is unaffected, and the complete message is delivered to any subscriber that does not specify a select list.
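The select list rules described above (default inclusion, most specific directive wins, first duplicate wins, and no creation of missing fields) can be sketched in Python for JSON-like messages. This is a simplified model under stated assumptions, not the AMPS implementation: the function names and the sample message are hypothetical, and nested objects that end up empty are dropped from the result as a simplification.

```python
def parse_select_list(text):
    """Parse a select list such as [-/,+/id] into {path_steps: include_flag}."""
    directives = {}
    for item in text.strip()[1:-1].split(","):
        item = item.strip()
        sign, identifier = item[0], item[1:]
        steps = tuple(step for step in identifier.split("/") if step)
        # setdefault keeps the first directive when an identifier repeats.
        directives.setdefault(steps, sign == "+")
    return directives


def is_included(path, directives):
    """Apply the most specific directive covering the path; include by default."""
    for length in range(len(path), -1, -1):
        if path[:length] in directives:
            return directives[path[:length]]
    return True


def apply_select_list(message, directives, path=()):
    """Return a copy of the message containing only the selected fields."""
    result = {}
    for key, value in message.items():
        child = path + (key,)
        if isinstance(value, dict):
            nested = apply_select_list(value, directives, child)
            if nested:
                result[key] = nested
        elif is_included(child, directives):
            result[key] = value
    return result


# Hypothetical message for illustration.
message = {
    "id": 17,
    "name": "Alex",
    "complaint": "late delivery",
    "pocket_contents": {"left": "lint", "right": "keys"},
}

overview = apply_select_list(message, parse_select_list("[-/,+/id,+/complaint]"))
print(overview)
```

Because the function builds a new dictionary, the original message is left untouched, which mirrors the point that a select list affects only what is delivered to the requesting subscriber.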
AMPS contains related functionality that may be more appropriate for some applications:
To modify a message as it is published to AMPS, use Enrichment and Preprocessing. With those features, the original publish message is modified and the modified message is stored in AMPS and sent to all subscribers.
AMPS also offers the ability to create a view of a set of messages that aggregates data across a set of messages and produces a result (for example, the total value of all open orders for each customer). See the chapter on Aggregating and Analyzing Data in AMPS for more details.
For example, consider an original message like the following JSON document:
An application might only need to see the id and complaint description. To retrieve just those fields of a message, the application could add the following option to the command that retrieves the message:
This select list tells AMPS to remove all fields from the message except for the /id field and the /complaint field. With this select list, the message above will be delivered as:
Likewise, an application could want to know the name of the person making the complaint and the contents of that person's left pocket:
From the original message, the result of providing this select list would be:
Last, consider an application that wants to see everything in the message except the pocket_contents. That application could provide an option such as:
With that specifier, AMPS provides any field in the message except the pocket_contents, producing the following result:
AMPS identifiers use a subset of XPath to specify values in a message. AMPS identifiers specify the value of an attribute or element in an XML message, and the value of a field in a JSON, FIX or NVFIX message. Given that the identifier syntax is only used to specify values, the subset of XPath used by AMPS does not include wildcards, relative paths, array manipulation, predicates or functions.
For example, when messages are in this XML format:
The following identifier specifies the Symbol element of an Order message:
The following identifier specifies the update attribute of an Order message:
For FIX and NVFIX, you specify fields using / and the tag name. AMPS interprets FIX and NVFIX messages as though they were an XML fragment with no root element. For example, to specify the value of FIX tag 55 (symbol), use the following identifier:
Likewise, for JSON or other types that represent an object, you navigate through the object structure using the / to indicate each level of nesting.
AMPS only guarantees support for field identifiers that are valid step names in XPath. For example, AMPS does not guarantee that it can process or filter on a field named Fits&Starts.
AMPS also supports an optional bracketed field identifier syntax that extends the characters available for field names. For example, the following step name:
refers to a field name of Not Xpath Name at the root level of the message. This syntax allows spaces to be used in field names in AMPS expressions, even though this is not a valid step name in XPath. Notice that not all message types support field names with embedded spaces or other special characters. For example, the Not Xpath Name identifier is not a valid element name in XML, nor would it be a valid field name in Google Protocol Buffers.
AMPS checks the syntax of identifiers when parsing an expression. AMPS does not try to predict whether an identifier will match messages within a particular topic. It is not an error to submit an identifier that can never match due to the limitations of the message type. For example, AMPS allows you to use an identifier like /OrderQty in a filter submitted for a FIX connection, even though FIX messages only use numeric tags, or an identifier like /DataPackage/RunDate in a filter submitted for a BFlat connection, even though BFlat does not support nested elements.
The message type is responsible for constructing a set of identifiers from a message. In most cases, the mapping is simple. However, see the documentation for the message type for details, or if the mapping is unclear. For example, a composite-local message type adds the number of the part to the beginning of each XPath within the part (so, a top-level field of /name in the first part of the message has an identifier of /0/name).
AMPS expressions are designed to work exactly as expected if you are familiar with XPath path specifiers and SQL-92 predicates. This section describes in detail how AMPS evaluates the syntax, operators, and functions available in the AMPS expression language.
AMPS expressions combine the following elements:
Identifiers specify a field in a message. When evaluating an expression, AMPS replaces identifiers with values from the message or set of messages being evaluated.
Literal values are explicit values in an AMPS expression, such as 'IBM' or 42.
Operators and functions such as =, <, >, *, and UNIX_TIMESTAMP().
Every AMPS expression produces a value. The way that AMPS uses the value depends on the context in which AMPS evaluates the expression. For example, if the expression is used for a filter, the message is considered to match the filter when the expression returns true. When an expression is used to project a field, the result of the expression is used as the value of the projected field.
A topic is a string that is used to declare a subject of interest for purposes of routing messages between publishers and subscribers. Topic-based Publish and Subscribe (Pub/Sub) is the simplest form of Pub/Sub filtering. All messages are published with a topic designation to the AMPS engine, and subscribers will receive messages for topics to which they have subscribed.
For example, in the diagram above there are two publishers: Publisher 1 and Publisher 2 which publish to the topics LN_ORDERS and NY_ORDERS, respectively. Messages published to AMPS are filtered and routed to the subscribers of a respective topic. For example, Subscriber 1, which is subscribed to all messages for the LN_ORDERS topic will receive everything published by Publisher 1. Subscriber 2, which is subscribed to the regular expression topic ".*_ORDERS" will receive all orders published by Publisher 1 and 2.
Regular expression matching makes it easy to create topic paths in AMPS. Some messaging systems require a specific delimiter for paths. AMPS allows you the flexibility to use any delimiter. However, 60East recommends using characters that do not have significance in regular expressions, such as forward slashes. For example, rather than using northamerica.orders as a path, use northamerica/orders.
AMPS does not restrict the characters that can be present in a topic name. However, notice that topic names that contain regular expression characters (such as . or *) will be interpreted as regular expressions by default, which may cause unexpected behavior.
Topics that begin with /AMPS are reserved. The AMPS server publishes messages to topics that begin with /AMPS as described in the Event Topics section. Some versions of the AMPS client libraries may internally publish to /AMPS/devnull. Your applications should not publish to topics that begin with /AMPS, as publishes to those topics may fail.
Each topic has an associated message type. Each client connection to AMPS also has an associated message type. A given client connection can only publish to topics with the same message type, and can only receive messages from topics with the same message type.
AMPS does not require explicit configuration of a topic for publishers to send messages to the topic and subscribers to receive messages from the topic. However, if there is no configuration for the topic, AMPS does not persist messages to the topic, so no features that depend on having a persisted message state (for example, replay, aggregation, State of the World, and so on) are available for that topic. The message will be delivered to subscriptions that are active when the message is published, but the message will not be persisted or retained. These "ad hoc" topics are useful for low-latency delivery of messages that are only useful at the time that they are published.
With AMPS, a subscriber can use a regular expression to simultaneously subscribe to multiple topics that match the given pattern. This feature can be used to effectively subscribe to topics without knowing the topic names in advance.
Notice that a message cannot be published to a topic pattern. The topic for a given message is unambiguously specified using a literal string. From the publisher’s point of view, it is publishing a message to a topic. A publisher does not publish to a topic pattern.
When a subscription is sent to AMPS, the topic for the subscription is interpreted as a regular expression if the topic includes special regular expression characters. Otherwise, the topic must be an exact match.
Some examples of regular expressions to match a set of topics are included in the table below:
^trade$
Matches only "trade".
^client.*
Matches "client", "clients", "client001", etc.
.*trade.*
Matches "NYSEtrades", "ICEtrade", etc.
trade.info
Matches "trade/info", "trade-info", "every/trade/info", etc.
For more information regarding the regular expression syntax supported within AMPS, see the Regular Expressions section.
AMPS can be configured to disallow regular expression topic matching for subscriptions. See the AMPS Configuration Guide for details.
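The matching rule described above (treat the subscription topic as a regular expression only when it contains special characters, otherwise require an exact match) can be approximated with Python's re module. This is a sketch under assumptions: the set of characters treated as special is a guess, Python's regular expression dialect may differ from the one AMPS supports, and unanchored matching is inferred from the table above.

```python
import re

# Assumed set of characters that mark a topic as a regular expression.
REGEX_CHARACTERS = set(".^$*+?()[]{}|\\")


def topic_matches(subscription_topic, published_topic):
    """Return True if a message on published_topic matches the subscription topic."""
    if any(ch in REGEX_CHARACTERS for ch in subscription_topic):
        # Unanchored search, so "trade.info" can match "every/trade/info".
        return re.search(subscription_topic, published_topic) is not None
    # No special characters: the topic must match exactly.
    return subscription_topic == published_topic


for pattern, topic in [
    ("^trade$", "trade"),
    ("^client.*", "client001"),
    (".*trade.*", "NYSEtrades"),
    ("trade.info", "every/trade/info"),
    ("LN_ORDERS", "LN_ORDERS"),
]:
    print(pattern, topic, topic_matches(pattern, topic))
```

The sketch also shows why delimiters such as . can surprise: the pattern trade.info matches trade-info as well as trade/info, because . matches any single character.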
AMPS guarantees that, for each AMPS instance, each subscription to a topic receives messages in the order in which AMPS received the messages (with the exception of messages that have been returned to a message queue for redelivery or the results of a query). Before a given message is delivered to a subscriber, all previous messages for that topic are delivered to the subscriber. AMPS does this by enforcing a total order across the instance for all messages received from publishers, including messages received via replication. When AMPS is using a transaction log, that order is preserved in the transaction log for the instance, and persists across instance restarts. When replaying from the transaction log, a subscriber will always receive messages in the same order in which messages were originally delivered by that instance of AMPS.
This guarantee also applies across topics for subscriptions that involve multiple topics, for all topics except views, queues, and conflated topics. Views and queues guarantee that every message on the view or the queue appears in the order in which the message was published. However, the computation involved in producing messages for views and queues may introduce some amount of processing latency, and AMPS does not delay messages on other topics while performing these computations. For a queue that provides at-least-once delivery, if a processor fails and returns a message to the queue, that message will be redelivered (which means that the new processor may receive the message out of order). Likewise, when AMPS is providing conflation (either through a conflated topic or the conflation options on a subscription), AMPS does not provide ordering guarantees for conflated messages.
Applications often use this guarantee to publish checkpoint messages, indicating some external state of the system, to a checkpoint topic. For example, you might publish messages marking the beginning of a business day to a checkpoint topic, MARKERS, while the ORDERS topic records the orders during that day. Subscribers to the regular expression ^(ORDERS|MARKERS)$ are guaranteed to receive the message that marks the business day before any of the messages published to the ORDERS topic for that day, since AMPS preserves the original order of the messages.
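The checkpoint pattern can be pictured as filtering a single, totally ordered list of messages. The following sketch models the instance-wide order as a Python list; the QUOTES topic, the payloads, and the deliver function are hypothetical and exist only to show that a multi-topic subscription sees messages in the order the instance recorded them.

```python
import re

# Messages in the order the (simulated) instance received them.
instance_order = [
    ("MARKERS", "start of business day"),
    ("ORDERS", "order 1"),
    ("QUOTES", "quote 1"),  # hypothetical topic outside the subscription
    ("ORDERS", "order 2"),
]


def deliver(pattern, messages):
    """Return the messages whose topic matches the pattern, preserving order."""
    topic_regex = re.compile(pattern)
    return [(topic, payload) for topic, payload in messages if topic_regex.search(topic)]


received = deliver(r"^(ORDERS|MARKERS)$", instance_order)
for topic, payload in received:
    print(topic, payload)
```

Because the subscriber's view is a filtered slice of one ordered sequence, the marker always precedes the orders that were published after it.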
For messages constructed by AMPS, such as the output of a view, AMPS processes messages for each topic in the order in which they arrive (unless conflation is requested) and delivers each calculated message to subscribers as soon as the calculation is finished and a message is produced. This keeps the latency low for each individual topic. However, this means that while AMPS guarantees the order in which messages are produced within each view, messages produced for views that do simple operations will generally take less time to be produced than messages for views that perform complex calculations or require more complicated serialization. This means that AMPS guarantees ordering within view topics, but does not guarantee that messages for separate view topics arrive in a particular order.
The figure below shows a possible ordering for messages received on an underlying topic and two views that use the topic:
Notice that within each topic, AMPS enforces an absolute order. However, the Simple View produces the results of Message 3 before the Complex View produces the results of Message 2. AMPS delivers the message for each topic as soon as possible.
When providing messages received via replication (see Replicating Messages Between Instances), the principles on message ordering provided above still apply. AMPS records messages into the local transaction log in the order in which messages are received by the instance and provides messages to subscribers in that order. AMPS uses the sequence of publishes assigned by the original publisher and the order assigned by the upstream instance to ensure that all replicated messages are received and recorded in order with no gaps or duplicates.
Each instance of AMPS replicates messages to downstream destinations in the order in which messages are recorded in the transaction log.
AMPS does not enforce a global total ordering across a replication topology. This peer-to-peer approach means that an AMPS instance can continue accepting messages from publishers and providing messages to subscribers even when the remote side of a replication link is offline or if replication is delayed due to network congestion. However, if two messages are published to different instances at the same time by different publishers, the two instances may record a different overall message order for those messages, even though message order from each publisher is preserved.
Communication between applications and the AMPS server uses AMPS messages. AMPS messages are received or sent for every operation in AMPS. Each AMPS message has a specific type and consists of a set of headers and a payload. The headers are defined by AMPS and formatted according to the protocol specified for the connection. Typically, applications use the standard amps protocol, which uses a JSON document for headers. The payload, if one is present, is the content of the message and is in the format specified by the message type.
Messages received from AMPS have the same format as messages to AMPS. These messages also have a specific type, with a header formatted according to the protocol and a payload of the specified message type. For example, AMPS uses ack messages, short for acknowledgment, to report the status of commands. AMPS uses publish messages to deliver messages on a subscription, and so on for other commands and other messages.
Let's consider a complete interaction between an application and the AMPS server as an example. When a client subscribes to a topic in AMPS, the client sends a subscribe message to AMPS that contains the information about the requested subscription and, by default, a request for an acknowledgment that the subscription has been processed. AMPS returns an ack message when the subscription is processed that indicates whether the subscription succeeded or failed, and then begins providing publish messages for new messages on the subscription. The publish messages continue as messages that match the subscription arrive at the AMPS server. If the application needs to stop the subscription, the application sends an unsubscribe message to the AMPS server, indicating the subscription to end. Once the AMPS server processes the unsubscribe message, the server will no longer send messages for that subscription to the application. Should the application disconnect, the AMPS server removes all subscriptions for that connection (whether or not the application sends an unsubscribe command first).
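The exchange described above can be sketched as a toy, in-memory model. This is only an illustration of the message flow and not the AMPS wire protocol or client API: the ToyServer class, its methods, and the Payload key are hypothetical, and the dictionary keys simply reuse header names such as Command, Topic, CommandId, and Status. The actual message format is defined in the AMPS Command Reference.

```python
class ToyServer:
    """In-memory stand-in that mimics the subscribe/ack/publish/unsubscribe flow."""

    def __init__(self):
        self.subscriptions = {}  # CommandId -> topic

    def handle(self, message):
        """Process a command from the client; return an ack for a subscribe."""
        if message["Command"] == "subscribe":
            self.subscriptions[message["CommandId"]] = message["Topic"]
            return {"Command": "ack",
                    "CommandId": message["CommandId"],
                    "Status": "Success"}
        if message["Command"] == "unsubscribe":
            # Stop delivering messages for the named subscription.
            self.subscriptions.pop(message["CommandId"], None)
        return None

    def deliver(self, topic, payload):
        """Return one publish message per subscription that matches the topic."""
        return [{"Command": "publish", "Topic": topic, "Payload": payload}
                for subscribed in self.subscriptions.values()
                if subscribed == topic]

    def disconnect(self):
        """A disconnect removes every subscription for the connection."""
        self.subscriptions.clear()


server = ToyServer()
ack = server.handle({"Command": "subscribe", "CommandId": "sub-1", "Topic": "ORDERS"})
delivered = server.deliver("ORDERS", '{"id": 1}')
server.handle({"Command": "unsubscribe", "CommandId": "sub-1"})
after_unsubscribe = server.deliver("ORDERS", '{"id": 2}')
```

In this model the ack carries the same CommandId as the subscribe it acknowledges, and no publish messages are produced once the subscription has been removed.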
Messages to and from AMPS are described in more detail in the AMPS Command Reference, available on the 60East website and included in the AMPS client SDKs.
In this version of AMPS, the communication transports used by AMPS accept message sizes of up to 200MB in a single command to AMPS. Messages larger than 200MB may be rejected by the transport as invalid. Should your use of AMPS require larger message sizes, contact 60East support.
This version of AMPS limits messages to 200MB in total size
The AMPS Command Reference contains a full list of headers for each command. The table below lists some commonly used headers.
Topic
The topic that the message applies to.
For commands to AMPS, this is the topic that AMPS will apply the command to. For messages from AMPS, this is the topic from which the message originated.
Command
The command type of the message. Each message has a specific command type.
For example, messages that contain data from a query over a SOW topic have a command of sow, while messages that contain data from a publish command have a command of publish and messages that acknowledge a command to AMPS have a command type of ack.
CommandId
An identifier used to correlate responses from AMPS with an initial command.
For example, ack messages returned by AMPS contain the CommandId provided with the command they acknowledge, and subscriptions can be updated or removed using the CommandId provided with the subscribe command.
SowKey
This header is included on messages from a SOW topic by default. AMPS will omit this header when the subscription or SOW query includes the no_sowkey option.
CorrelationId
A user-specified identifier for the message.
Publishers can set this identifier on messages. AMPS does not parse, change, or interpret this identifier in any way.
This header is limited to characters used in Base64 encoding.
Status
Set on ack messages to indicate the results of the command, such as Success or Failure.
Reason
Set on ack messages to indicate the reason for the Status acknowledgment.
Timestamp
Optionally set on publish messages and sow messages to indicate the time at which the local AMPS instance processed the message.
To receive a timestamp, the SOW query or subscription must include the timestamp option on the command that creates the subscription or runs the query. The timestamp is returned in ISO-8601 format.
This section presents a few of the commonly used headers. See the AMPS Command Reference for a full description of AMPS messages.
AMPS does not provide the ability to add custom header fields. However, AMPS composite message types provide an easy way to add an additional section to a message type that contains metadata for the message. Since composite message type parts fully support AMPS content filtering, this approach provides more flexibility and allows for more sophisticated metadata than simply adding a header field. See the Composite Messages section for details.
For messages received from a State of the World (or SOW) topic, an identifier that AMPS assigns to the record for this message. SOW topics are described the section.
The logical operators are NOT, AND, and OR, in order of precedence. These operators have the usual Boolean logic semantics.
As with other operators, you can use parentheses to group operators and affect the order of evaluation.
AMPS also provides a regular expression comparison operator, LIKE, to provide regular expression matching on string values. A pattern is used for the right side of the LIKE operator. A pattern must be provided as a literal, quoted value. For more on regular expressions and the LIKE comparison operator, please see the section on Regular Expressions.
The string comparison operators described in the section called String Comparison Functions are usually more efficient than equivalent LIKE expressions, particularly when used to compare multiple literal patterns, or when the only purpose of the regular expression is to perform case-insensitive matching. Use LIKE operations when it is not practical to represent the filter condition with the string comparison operators.
The comparison operators can be loosely grouped into equality comparisons and range comparisons. The basic equality comparison operators, in precedence order, are ==, =, >, >=, <, <=, !=, and <>. The == comparison and the = comparison are treated as the same operator and produce the same results.
If these binary operators are applied to two operands of different types, AMPS attempts to convert strings to numbers. If conversion succeeds, AMPS uses the numeric values. If conversion fails because the string cannot be meaningfully converted to a number, strings are always considered to be greater than numbers. The operators consider an empty string to be NULL.
The following table shows some examples of how AMPS compares different types.
There are also set and range comparison operators. The BETWEEN operator can be used to check whether a value falls within a range.
The range used in the BETWEEN operator is inclusive of both operands, meaning the expression /A BETWEEN 0 AND 100 is equivalent to /A >= 0 AND /A <= 100.
For example:
The IN operator can be used to perform membership operations on sets of values. The IN operator returns true when the value on the left of the IN appears in the set of values in the IN clause. For example:
The IN operator returns true for the set of records that would be returned by an equivalent set of = comparisons joined by OR. The following two statements return the same set of records:
This equivalence means that NULL values in either the field being evaluated, or the set of values provided to the IN clause, always return false.
This also means that, for string values, the IN operator performs exact, case-sensitive matching.
When evaluating against a set of values, the IN operator typically provides better performance than using a set of OR operators. That is, a filter written as /firstName IN ('Joe', 'Kathleen', 'Frank', 'Cindy', 'Mortimer') will typically perform better than an equivalent filter written as /firstName = 'Joe' OR /firstName = 'Kathleen' OR /firstName = 'Frank' OR /firstName = 'Cindy' OR /firstName = 'Mortimer'.
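The semantics of IN for string values can be modeled with a short Python sketch. This is an illustrative model of the rules stated above, not AMPS code: the helper names is_null and amps_in are hypothetical, and None stands in for a NULL value.

```python
def is_null(value):
    """Model of the documented rule: a missing value or an empty string is NULL."""
    return value is None or value == ""


def amps_in(value, candidates):
    """Model IN as a chain of '=' comparisons joined by OR.

    A NULL on either side never produces a match, and string
    comparison is exact and case-sensitive.
    """
    if is_null(value):
        return False
    return any(not is_null(candidate) and value == candidate
               for candidate in candidates)


names = ("Joe", "Kathleen", "Frank", "Cindy", "Mortimer")
exact_match = amps_in("Joe", names)      # exact match
wrong_case = amps_in("joe", names)       # differs only by case
null_field = amps_in(None, names)        # NULL field
null_in_set = amps_in("", ("",))         # empty strings are NULL on both sides
```

Only the first call matches. The case-mismatched value and both NULL cases produce no match, which mirrors the equivalence with a set of = comparisons.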
AMPS expressions allow you to group parts of the expression using parentheses. Parts of an expression inside parentheses are evaluated together. 60East recommends using parentheses to group independent parts of an expression to ensure that the expression is evaluated in the expected order. For example, in this expression:
The clause /counter % 3 is evaluated first, and the result of that evaluation is compared to 0.
Within a group, elements are evaluated left to right in precedence order. For example, given the filter below:
AMPS evaluates expression2, then expression3 (since AND has higher precedence than OR), and if they evaluate to false, then expression1 will be evaluated.
AMPS does not guarantee that all parts of an expression will be evaluated if the result of an expression can be determined after only evaluating part of the expression. For example, given the expression:
AMPS only guarantees that B_FUNCTION(/b) will be evaluated if A_FUNCTION(/a) returns false.
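A small Python sketch can make this guarantee concrete. It assumes the two function calls are joined by OR, which is consistent with the guarantee stated above. The functions a_function and b_function are hypothetical stand-ins that record when they are called. Python always skips the right side when the left side is true, whereas AMPS only states that evaluation of the right side is not guaranteed.

```python
calls = []


def a_function(value):
    """Stand-in for A_FUNCTION; records that it was evaluated."""
    calls.append("A")
    return value > 0


def b_function(value):
    """Stand-in for B_FUNCTION; records that it was evaluated."""
    calls.append("B")
    return value > 0


def evaluate(a, b):
    """Evaluate a_function(a) OR b_function(b) and report which functions ran."""
    calls.clear()
    result = a_function(a) or b_function(b)
    return result, list(calls)


left_true = evaluate(1, 1)    # left side is true, so the right side is skipped
left_false = evaluate(-1, 1)  # left side is false, so the right side must run
```

A filter should therefore not depend on side effects or on error handling in a clause that may never be evaluated.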
Each value in AMPS is assigned a data type when the message type module parses the value. AMPS operators and functions attempt to convert values into compatible types, based on the type of operation. For example, the * operator (multiplication) will attempt to convert all values to numeric values, while the CONCAT function (string concatenation) will attempt to convert all values to strings. In effect, a value in AMPS can be transparently treated as any type to which it can be meaningfully converted.
Internally, AMPS uses the data types in the table below. As mentioned above, the message type module is responsible for assigning the type of a value from an incoming message as part of the parsing process. For some types, such as JSON, XML, FIX and NVFIX, the parser infers the type of the value from the field. For other types, such as MessagePack, BFLAT, Google Protocol Buffers or BSON, the message itself contains information about the type of the field.
As mentioned above, the AMPS expression language does not limit the value to the type assigned by the message type module. Instead, a value in AMPS can be used in any context.
For example, given the following JSON document:
The values of /a and /b can be used as either string values or numeric values. AMPS will automatically convert these values as necessary, and AMPS considers the string or numeric representation to be equally correct and valid.
The following table lists the data types in the AMPS expression language:
Numeric values in AMPS are always typed as either integers or floating point values. Integer values that are less than or equal to the LONG_MAX limit in AMPS are signed; larger values are unsigned. AMPS message types convert the original numeric types (or original representation for message types that do not have typed values) into the internal AMPS type system for the purposes of expression evaluation.
Within expressions, integer values are all numerals, with no decimal point, and can have a value in the same range as a 64-bit integer. For example:
Within expressions, all numerals with a decimal point are floating-point numbers. AMPS interprets these numerals as double-precision floating point values. For example:
or, in scientific notation:
AMPS automatically converts strings that contain numeric values to numbers when strings are used with an operator, function or comparison that expects a numeric value.
AMPS uses the following rules for type promotion when evaluating numeric expressions:
If any of the values in the expression is NaN, the result is NaN.
Otherwise, if any of the values in the expression is floating point, the result is floating point.
Otherwise, all of the values in the expression are integers, and the result is an integer.
Notice that, for division in particular, the results returned are affected by the type of the values. For example, the expression 1 / 5 evaluates to 0 since the result is interpreted as an integer. In comparison, the expression 1.0 / 5 evaluates to 0.2 since the result is interpreted as a floating point value.
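The promotion rules above can be illustrated with a minimal Python model of division. The function name amps_divide is hypothetical. Rounding integer results toward zero is an assumption made for this sketch (the text only shows the positive case 1 / 5), and division by zero is not modeled.

```python
import math


def amps_divide(left, right):
    """Model the documented promotion rules for the '/' operator."""
    # Rule 1: NaN in any operand produces NaN.
    if any(isinstance(v, float) and math.isnan(v) for v in (left, right)):
        return math.nan
    # Rule 2: any floating point operand produces a floating point result.
    if isinstance(left, float) or isinstance(right, float):
        return left / right
    # Rule 3: all integers produce an integer result.
    # Truncation toward zero is an assumption of this sketch.
    quotient = abs(left) // abs(right)
    return quotient if (left < 0) == (right < 0) else -quotient


integer_result = amps_divide(1, 5)       # 0, both operands are integers
float_result = amps_divide(1.0, 5)       # 0.2, one operand is floating point
nan_result = amps_divide(math.nan, 5)    # NaN propagates
```

When a fractional result is needed from integer fields, making one operand floating point (for example, multiplying by 1.0) changes the result type under these rules.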
When creating expressions for AMPS, string literals are indicated with single or double quotes. For example:
AMPS supports the following escape sequences within string literals:
Additionally, any character which follows a backslash will be treated as a literal character.
AMPS string operations have no restrictions on character set, and correctly handle embedded NULL characters (\x00) and characters outside of the 7-bit ASCII range. AMPS string operations are not Unicode-aware.
XPath expressions are considered to be NULL when they evaluate to an empty or nonexistent field reference. NULL values follow SQL-92 semantics.
This means that comparisons with NULL are never true (in other words, even if /a is NULL, /a != NULL is false and /a == NULL is also false).
In numeric expressions where the operands or results are not a valid number, the XPath expression evaluates to NaN (not a number). The rules for applying the AND and OR operators against NULL and NaN values are outlined in the tables below:
Likewise, direct comparisons with NULL are never true (so, if /b is NULL, /b == NULL does not produce a true value, and neither does /b != NULL). AMPS, like SQL-92, provides an IS NULL predicate for testing whether a value is NULL, and an IS NOT NULL predicate for testing whether a value is not NULL.
There is also an IS NAN predicate for checking that a value is NaN (not a number).
To reliably check for the existence of a NULL value, you must use the IS NULL predicate, as in the filter: /optionalField IS NULL
To reliably check that a value is not NULL, you must use the IS NOT NULL predicate or negate the value of an IS NULL test: /optionalField IS NOT NULL and NOT /optionalField IS NULL are equivalent.
AMPS also provides a COALESCE() function that accepts a set of values and returns the first value that is not NULL. For example, given the following filter expression:
AMPS will return the first value that is not NULL, and compare that value to the constant string 'restricted'. Notice that, to make the intent of the filter clear, this example provides a constant value for AMPS to return from the COALESCE if all of the field values are NULL.
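The behavior of COALESCE with a constant fallback can be modeled in a few lines of Python. This is a sketch of the semantics only: the field names code and status and the fallback 'unknown' are hypothetical, None stands in for a missing field, and an empty string is treated as NULL as described in this chapter.

```python
def is_null(value):
    """A missing value or an empty string is treated as NULL."""
    return value is None or value == ""


def coalesce(*values):
    """Return the first value that is not NULL, or None if every value is NULL."""
    for value in values:
        if not is_null(value):
            return value
    return None


first = {"code": "", "status": "restricted"}
second = {"code": "", "status": None}

# The first non-NULL value is compared to the constant 'restricted'.
first_matches = coalesce(first["code"], first["status"], "unknown") == "restricted"
# Every field is NULL, so the constant fallback is returned and compared instead.
second_matches = coalesce(second["code"], second["status"], "unknown") == "restricted"
```

Supplying the constant as the last argument means the comparison always has a non-NULL operand, so the result is a definite true or false rather than NULL.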
Many messaging applications are designed for high performance and use a simplified message structure. For applications that use compound types, AMPS includes the ability to parse and filter on the contents of nested data structures.
For performance, AMPS parses nested data structures into a set of values. As with single-valued (or scalar) values, the AMPS expression language refers to a parsed set of values that is common to all message types rather than the underlying data.
The AMPS message types treat compound data types as a set of paths with corresponding scalar values. A field that only contains other fields is represented as a step in the path to the primitive values that it contains.
AMPS parses compound types as follows:
Any field that contains a scalar value is represented as an identifier/value pair.
Any field that contains other fields is represented as a step in the path to that value.
The following JSON document is a simple example.
With this document, AMPS produces the following parsed value:
In the parsed representation, the outer
and middle
fields contain no data of their own. They serve only as containers for the inner
field which contains data.
Notice that the intermediate paths do not have an explicit scalar value.
With a more complex document the parsed representation continues to follow the same principles, as shown in the following example.
The representation of the above message in the AMPS expression language would typically be as follows:
As with the first example, fields that do not directly contain a value do not have an explicit scalar value. Values with the same identifier are represented as an array of values with that identifier.
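The parsing principles above can be sketched in Python. This is an illustrative model, not the AMPS parser. The function names flatten and parse are hypothetical, and the two documents are hypothetical inputs chosen to be consistent with the paths /outer/middle/inner, /outer/array, and /outer/compound/... used in this section. Treating a path with a single value as a scalar is an assumption of the sketch.

```python
def flatten(value, path="", parsed=None):
    """Collect every scalar value under the path that leads to it."""
    if parsed is None:
        parsed = {}
    if isinstance(value, dict):
        # A field that contains other fields is only a step in the path.
        for key, child in value.items():
            flatten(child, f"{path}/{key}", parsed)
    elif isinstance(value, list):
        # Array elements share the path of the array itself.
        for item in value:
            flatten(item, path, parsed)
    else:
        parsed.setdefault(path, []).append(value)
    return parsed


def parse(document):
    """Represent single values as scalars and repeated paths as arrays."""
    return {path: values[0] if len(values) == 1 else values
            for path, values in flatten(document).items()}


simple = {"outer": {"middle": {"inner": 5}}}
compound = {"outer": {
    "array": ["a1", "a2", "a3"],
    "compound": {
        "A": "middle-A",
        "B": "middle-B",
        "C": [{"C1": "first-C1", "D1": "first-D1"},
              {"C1": "second-C1", "D1": "second-D1"}],
    },
}}

simple_parsed = parse(simple)
compound_parsed = parse(compound)
```

In the result, intermediate steps such as /outer and /outer/middle never appear as keys, and the two C1 values collapse into one array under /outer/compound/C/C1.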
LIKE
The string to be compared
The pattern to evaluate the string against
Case-sensitive
Returns true if the string to be compared matches the pattern.
For example, the following filter uses a PCRE backreference to return true for any message where the /state
field contains two identical characters in a row.
This operator is not unicode-aware.
When a function or operator that expects a numeric type is provided with a string, AMPS will attempt to convert string values to numeric types as necessary. When converting string values, AMPS recognizes the same numeric formats in message data as are supported in the AMPS expression language (see ). If the string is in an unrecognized format, AMPS converts the string to NaN.
The COALESCE function, like other functions in AMPS, is not array-aware. This means that when one of the XPath expressions provided to COALESCE specifies an array in the original message, AMPS provides the first item in the array to the COALESCE function. See for details.
Multiple values with identical paths are represented as an array. For more information on arrays in the AMPS expression language, see .
1 < 2                TRUE
10 < '2'             FALSE, '2' can be converted to a number
'2.000' <> '2.0'     TRUE, no conversion to numbers since both are strings
2 = 2.0              TRUE, numeric comparison
10 < 'Crank It Up'   TRUE, strings are greater than numbers
10 < ''              FALSE, an empty string is considered to be NULL
10 > ''              FALSE, an empty string is considered to be NULL
'' = ''              FALSE, an empty string is considered to be NULL
'' IS NULL           TRUE, an empty string is considered to be NULL
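The comparison rules behind these examples can be modeled with a short Python sketch. The function names are hypothetical and the model covers only the rules stated in this chapter: NULL operands never compare as true, two strings are compared as strings, mixed types are compared numerically when the string converts, and an unconvertible string is greater than any number. Python's float() accepts a few spellings (such as 'inf') that AMPS may treat differently.

```python
def is_null(value):
    """A missing value or an empty string is treated as NULL."""
    return value is None or value == ""


def to_number(value):
    """Return a numeric value, or None if a string cannot be converted."""
    if isinstance(value, (int, float)):
        return value
    try:
        return float(value)
    except ValueError:
        return None


def compare(left, right):
    """Return -1, 0, or 1, or None when either operand is NULL."""
    if is_null(left) or is_null(right):
        return None
    if isinstance(left, str) and isinstance(right, str):
        # Both operands are strings, so no numeric conversion takes place.
        return (left > right) - (left < right)
    left_num, right_num = to_number(left), to_number(right)
    if left_num is not None and right_num is not None:
        return (left_num > right_num) - (left_num < right_num)
    # Mixed types with an unconvertible string: the string is greater.
    return 1 if isinstance(left, str) else -1


def less_than(left, right):
    """A comparison involving NULL is never true."""
    return compare(left, right) == -1
```

For example, less_than(10, '2') is false because '2' converts to a number, while less_than(10, 'Crank It Up') is true because the string cannot be converted.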
NULL
Unknown, untyped value (SQL-92 semantics)
[no field provided]
NVFIX: a=<SOH>
JSON: {"a":null}
XML: <a/>
Boolean
True (1) or false (0)
JSON: {"e":true}
Integer
Signed 64-bit integer or unsigned 64-bit integer for values > LONG_MAX
NVFIX: b=24
JSON: {"b":24}
XML: <b>24</b>
Floating Point Number
64-bit floating point number
NVFIX: c=24.0
JSON: {"c":24.0}
XML: <c>24.0</c>
String
Arbitrary sequence of bytes of a specific length
An empty string is considered to be NULL
NVFIX: d=Grilled cheese sandwich<SOH>
JSON: {"d":"Grilled cheese sandwich"}
XML: <d>Grilled cheese sandwich</d>
\a     Alert
\b     Backspace
\t     Horizontal tab
\n     Newline
\f     Form feed
\r     Carriage return
\xHH   Hexadecimal digit where H is (0..9,a..f,A..F)
\OOO   Octal digit (0..7)
TRUE AND NULL yields NULL
FALSE AND NULL yields FALSE
NULL AND NULL yields NULL
NULL AND TRUE yields NULL
NULL AND FALSE yields NULL

TRUE OR NULL yields TRUE
FALSE OR NULL yields NULL
NULL OR NULL yields NULL
NULL OR TRUE yields NULL
NULL OR FALSE yields NULL
/outer/middle/inner: 5

/outer/array: ['a1', 'a2', 'a3']
Elements in the array can be referred to directly with subscript notation. For example: /outer/array[0] is 'a1'.
/outer/compound/A: 'middle-A'
/outer/compound/B: 'middle-B'
/outer/compound/C/C1: ['first-C1', 'second-C1']
Elements in the array can be referred to directly with subscript notation. For example: /outer/compound/C/C1[0] is 'first-C1'.
/outer/compound/C/D1: ['first-D1', 'second-D1']
Elements in the array can be referred to directly with subscript notation. For example: /outer/compound/C/D1[0] is 'first-D1'.
AMPS contains support for a ternary conditional IF operator, which evaluates a Boolean condition to true or false and returns one of two parameters accordingly. The general format of the IF statement is:
In this example, the BOOLEAN_CONDITIONAL will be evaluated, and if the result is true, the VALUE_TRUE value will be returned; otherwise, the VALUE_FALSE value will be returned.
IF
Conditional expression
Value to return if conditional expression is true
Value to return if conditional expression is false
Evaluate the conditional expression and return one of the two input values based on the results of the expression.
The AMPS expression engine can conditionally evaluate the terms provided to the IF statement in version 5.3.4 and greater.
In previous versions of AMPS, all expressions provided to the IF statement were fully evaluated before the IF statement was evaluated.
For example:
The above example returns a count of the total number of orders that have been placed where the symbol is MSFT and the order contains a quantity more than 500.
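The counting pattern described above, summing 1 for each matching order and 0 otherwise, can be modeled in Python. The order records and the field names symbol and qty are hypothetical. The helper if_ mirrors the three-argument form of IF, and like any Python function call it evaluates both value arguments before choosing one, which resembles the behavior described for versions before 5.3.4.

```python
orders = [
    {"symbol": "MSFT", "qty": 600},
    {"symbol": "MSFT", "qty": 100},
    {"symbol": "IBM", "qty": 900},
    {"symbol": "MSFT", "qty": 501},
]


def if_(condition, value_true, value_false):
    """Return value_true when the condition holds, otherwise value_false."""
    return value_true if condition else value_false


# Sum 1 for every MSFT order with a quantity above 500, and 0 for all others.
large_msft_orders = sum(
    if_(order["symbol"] == "MSFT" and order["qty"] > 500, 1, 0)
    for order in orders
)
```

Here large_msft_orders is 2: the orders with quantities 600 and 501 are counted, while the small MSFT order and the IBM order each contribute 0.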
The IF operator can also be used to evaluate results to determine if results are NULL or NaN. This is useful for calculating aggregates where some values may be NULL or NaN. The NULL and NaN values are discussed in more detail in the AMPS Data Types section.
For example:
AMPS supports the arithmetic operators +, -, *, /, %, and MOD in expressions. The result of arithmetic operators where one of the operands is NULL is undefined and evaluates to NULL.
AMPS distinguishes between floating point and integral types. When an arithmetic operator uses two different types, AMPS will convert the integral type to a floating point value as described in Numeric Types and Literals.
Examples of filter expressions using arithmetic operators:
AMPS numeric types are signed, and the AMPS arithmetic operators correctly handle negative numbers. The MOD and % operators preserve the sign of the first argument to the operator. That is, -5 % 3 produces a result of -2, while 5 % -3 produces a result of 2.
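The sign rule is worth checking when porting logic between languages, because not every language follows it. The Python sketch below models the documented behavior for integers. The function name amps_mod is hypothetical, and Python's own % operator is shown for contrast since it takes the sign of the second operand instead.

```python
def amps_mod(left, right):
    """Remainder that keeps the sign of the first operand (integers only)."""
    remainder = abs(left) % abs(right)
    return -remainder if left < 0 else remainder


documented_first = amps_mod(-5, 3)    # -2, matches the first documented example
documented_second = amps_mod(5, -3)   # 2, matches the second documented example

# Python's built-in operator follows the sign of the second operand.
python_first = -5 % 3                 # 1
python_second = 5 % -3                # -1
```

An application that precomputes values with a different sign convention and then filters on them in AMPS should account for this difference.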
When using mathematical operators in conjunction with filters, be careful about the placement of the operator. Some operators are used in the XPath expression as well as for mathematical operation (for example, the '/' operator in division). Therefore, it is important to separate mathematical operators with white space to prevent interpretation as an XPath expression.
This section describes the functions installed by default in the AMPS server.
Additional functions that ship with the AMPS server are provided in auxiliary modules, as described in the section on Optional Functions.
This section describes general performance considerations for the AMPS expression language and content filters. The considerations here are aspects of AMPS performance to be aware of in the general case. However, since the AMPS expression language operates on specific data, the structure and size of the messages that your application uses may have more effect on overall performance than the specific expressions used. For example, parsing and filtering a 20MB XML document is inherently more expensive than parsing and filtering a 400 byte BFlat document.
When clauses in an expression are joined by OR, AMPS will only evaluate the right side of an OR expression if the left side of the expression is false.
When constructing an expression, this means that there can be a performance advantage to having relatively less expensive clauses on the left hand side of the OR. For example, in the following clause:
The regular expression comparison is only evaluated if the comparison /code = 'restricted' is false. If the comparison is true, then the overall clause is true and there is no need to evaluate the regular expression.
AMPS does not reorder or recombine complex expressions. Where feasible, your application can save work at the server by combining expressions. In particular, if an application is constructing a filter by reading options from various sources, performance can be improved by combining the queries.
For example, in a filter like the following:
The comparison against '12345' will be evaluated three times in cases where the value of /id does not match any of the values in the filter.
This filter is equivalent to:
This version produces the same results, but evaluates the /id field against a given value only one time.
The LIKE operator offers access to full Perl-Compatible Regular Expressions within the AMPS expression language. This flexibility allows for very precise filtering, and the PCRE engine performs well.
However, for comparisons for which AMPS provides a named function, the named function is highly optimized and will perform somewhat better than the general-purpose regular expression engine.
For example, given a choice between two equivalent expressions:
and
The version that uses BEGINS WITH will typically perform slightly better than the version that uses the regular expression.
This doesn't mean that regular expressions or the LIKE operator perform poorly. The LIKE operator can efficiently match patterns that would be difficult or impossible to match using the other operators. However, for very simple comparisons where AMPS provides a dedicated operator, that operator typically performs slightly better than a regular expression.
The following table shows some examples of regular expressions and the AMPS operator equivalent.
Most AMPS message types have the ability to partially parse messages. That is, rather than parsing the entire message, the message type can simply find the identifiers that will be used, and stop the parsing process as soon as those identifiers are found.
This optimization is most useful for larger messages. For example, if the SOW key for a topic is based on the /id field of a message and there are active content filters that use both the /id field and the /code field, while no other field is being indexed, then, considering the message below:
The AMPS parser can stop parsing after processing only the /id and the /code fields. In this case, halting the parsing after processing these two fields avoids the expense of parsing the remaining parts of the message.
Notice that this optimization will only improve performance in cases where AMPS doesn't need to parse the entire message. For example, if there is a delta_subscribe active for the topic, or if the command being processed is a delta_publish, AMPS will parse the message completely to be able to calculate the deltas. Likewise, if any filter refers to a field that doesn't appear in the message, AMPS will parse the message completely to be able to determine that the field does not appear in the message.
Queries over topics in the State of the World (SOW) have additional performance considerations. AMPS maintains indexes over SOW topics to help locate messages in response to a query.
Queries over a topic in the SOW can use SOW topic indexes. Where possible, use exact string matches in queries and create a hash index on the fields being matched, so that the query can take advantage of the hash index.
When a query is submitted with an XPath identifier for which no index exists, AMPS will create and populate a memo index for that XPath identifier. This can add to the amount of time a query takes the first time a given XPath identifier is queried. You can specify that AMPS creates a memo index for a given identifier by using the Index configuration item in the Topic definition. Once an index is created, AMPS will continue to search for that XPath identifier in incoming messages for that topic to keep the index up to date.
Notice that SOW topic indexes are only used for sow commands and during the sow portion of a sow_and_subscribe (or sow_and_delta_subscribe) command. Once the subscription to current updates begins, the subscription does not use a SOW topic index because there is no need to locate a message. During a subscription, filters are run against the current message.
AMPS supports filters that operate on arrays in messages. There are two simple principles behind how AMPS treats arrays:
Binary operators that yield true or false (for example, =, <, LIKE) are array aware, as is the IN operator. These operators work on arrays as a whole, and evaluate every element in the array.
Arithmetic operators, functions, user-defined functions, and other scalar operators are not array aware, and use the first element in the array.
With these simple principles, you can predict how AMPS will evaluate an expression that uses an array. For any operator, an empty array evaluates to NULL.
Let's look at some examples. For the purposes of this section, we will consider the following JSON document:
While these arrays are presented using JSON format for simplicity, the same principles apply to arrays in other message formats.
Here are some examples of ways to use an array in an AMPS filter:
To determine this, you provide the identifier for the array, and use a comparison operator.
To determine this, use the subscript operator [] on the XPath identifier to specify the position, and use the equality operator to check the value at that position.
These patterns and principles hold regardless of the original representation of the array in a document.
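The array-aware behavior of comparison operators can be modeled in Python as "true if any pairing of elements satisfies the operator". This is a sketch of the stated principles and not AMPS code. The helper names are hypothetical, and the document is a hypothetical message chosen to be consistent with the /data and /other examples in this section. The NULL result for an empty array is not modeled (the sketch returns False).

```python
import operator
import re

message = {"data": [1, 2, 3, "zebra", 5], "other": [14, 34, 23, 5]}


def as_list(value):
    """Treat a scalar as a one-element array."""
    return value if isinstance(value, list) else [value]


def array_compare(op, left, right):
    """True when any pairing of elements from both sides satisfies op."""
    return any(op(l, r) for l in as_list(left) for r in as_list(right))


def like(value, pattern):
    """Regular expression match against the string form of a value."""
    return re.search(pattern, str(value)) is not None


data, other = message["data"], message["other"]

contains_zebra = array_compare(operator.eq, data, "zebra")      # /data = 'zebra'
has_non_zebra = array_compare(operator.ne, data, "zebra")       # /data != 'zebra'
contains_42 = array_compare(operator.eq, data, 42)              # /data = 42
any_over_30 = array_compare(operator.gt, other, 30)             # /other > 30
any_over_50 = array_compare(operator.gt, other, 50)             # /other > 50
shares_value = array_compare(operator.eq, data, other)          # /data = /other
matches_z = array_compare(like, data, "z")                      # /data LIKE 'z'
fourth_is_zebra = data[3] == "zebra"                            # /data[3] = "zebra"
```

Notice that /data = 'zebra' and /data != 'zebra' are both true in this model, because each operator only needs one element that satisfies it.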
See the section on for details.
When creating an expression that uses a field in a compound value, keep in mind that AMPS represents compound values as described in the section on .
^something         BEGINS WITH('something')
something$         ENDS WITH ('something')
something          INSTR(/field, 'something') != 0
(?i)something      INSTR_I(/field, 'something') != 0
(?i)^something$    STREQUAL_I(/field, 'something') != 0
^a$                = 'a'
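The equivalences listed above can be checked with a small Python sketch that pairs each pattern with a plain string operation. Python's re module stands in for PCRE here, which is adequate for these simple patterns, and the Python string methods are stand-ins for the AMPS operators rather than their implementation. The sample strings are arbitrary and contain no newlines, since Python's $ also matches before a trailing newline.

```python
import re

# Each pattern is paired with a plain string test that should agree with it.
EQUIVALENTS = [
    (r"^something",      lambda s: s.startswith("something")),
    (r"something$",      lambda s: s.endswith("something")),
    (r"something",       lambda s: "something" in s),
    (r"(?i)something",   lambda s: "something" in s.lower()),
    (r"(?i)^something$", lambda s: s.lower() == "something"),
]

SAMPLES = ["something", "Something else", "not something", "SOMETHING", "nothing"]


def agree(pattern, predicate, samples=SAMPLES):
    """True when the regular expression and the string test match the same samples."""
    return all((re.search(pattern, s) is not None) == predicate(s)
               for s in samples)


all_agree = all(agree(pattern, predicate) for pattern, predicate in EQUIVALENTS)
```

Because both forms select the same strings, the choice between them is a matter of performance and readability.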
/data = 1                                 TRUE, /data contains 1
/data = 'zebra'                           TRUE, /data contains 'zebra'
/data != 'zebra'                          TRUE, /data contains an element that is not 'zebra'
/data = 42                                FALSE, /data does not contain 42
/data LIKE 'z'                            TRUE, a member of /data matches 'z'
/other > 30                               TRUE, a member of /other is > 30
/other > 50                               FALSE, no member of /other is > 50

/data[0] = 1                              TRUE, first element of /data is 1
/data[3] = "zebra"                        TRUE, fourth element of /data is 'zebra'
/data[1] != 1                             TRUE, second element of /data is not 1
/other[1] LIKE '4'                        TRUE, second element of /other matches '4'

/data = /other                            TRUE, a value in /data equals a value in /other
/data != /other                           TRUE, a value in /data does not equal a value in /other
3 IN (/data)                              TRUE, 3 is a member of /data
/data IN (1, 2, 3)                        TRUE, a member of /data is in (1, 2, 3)
/data IN ("zebra", "antelope", "lion")    TRUE, a member of /data is in ("zebra", "antelope", "lion")
Regular expression matching provides precision, power, and flexibility for matching patterns. AMPS supports regular expression matching on topics and within content filters. Regular expressions are implemented in AMPS using the Perl-Compatible Regular Expressions (PCRE) library. For a complete definition of the supported regular expression syntax, please refer to:
http://perldoc.perl.org/perlre.html
To use regular expressions for topic matching, provide a regular expression pattern where you would normally provide a topic name.
To use regular expressions in content filtering, compare strings to regular expressions using the LIKE operator. The syntax of the LIKE operator is:
In this context, a string is any expression that provides a string and pattern is a literal regular expression pattern.
This chapter presents a brief overview of regular expressions in AMPS. However, this chapter is not exhaustive. For more information on regular expression matching, see the PCRE site mentioned above.
Here is an example of a content filter for messages that will match any message meeting the following criteria:
Regular expression match of symbols of 2 or 3 characters starting with “IB”
Regular expression match of prices starting with “90”
Numeric comparison of prices less than 91
The corresponding content filter would be:
The tables below contain a brief summary of special characters and constructs available within regular expressions.
Here are more examples of using regular expressions within AMPS:
Use (?i) to enable case-insensitive regular expression searching. For example, the following filter will be true regardless of whether /client/country contains “US” or “us”.
To match messages where tag 55 has a TRADE suffix, use the following filter:
To match messages where tag 109 has a US prefix and a TRADE suffix, with case-insensitive matching, use the following filter:
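As a rough illustration, patterns of the kind these examples describe can be tried with Python's re module. The patterns and sample values below are assumptions written for this sketch, not filters taken from AMPS, and Python's re engine is not PCRE, although the constructs used here ((?i), ^, $, and .*) behave the same way in both.

```python
import re


def like(value, pattern):
    """Rough stand-in for `value LIKE pattern`: an unanchored regex search."""
    return re.search(pattern, value) is not None


# Case-insensitive match: (?i) makes "US" match "us" as well.
country_pattern = "(?i)US"

# Suffix match: $ anchors the pattern to the end of the value.
suffix_pattern = "TRADE$"

# Prefix and suffix together, case-insensitive.
prefix_suffix_pattern = "(?i)^US.*TRADE$"

print(like("us", country_pattern))                        # True
print(like("NYSE_TRADE", suffix_pattern))                 # True
print(like("TRADE_NYSE", suffix_pattern))                 # False
print(like("us_equity_trade", prefix_suffix_pattern))     # True
print(like("eu_equity_trade", prefix_suffix_pattern))     # False
```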
AMPS recognizes the following regular expression metacharacters:
^     Beginning of string
$     End of string
.     Any character except a newline
*     Match previous 0 or more times
?     Match previous 0 or 1 times
()    Grouping of expression
[]    Set of characters
{}    Repetition modifier
\     Escape for special characters
AMPS recognizes the following repetition constructs:
a*        Zero or more a's
a?        Zero or one a's
a{m}      Exactly m a's
a{m,}     At least m a's
a{m,n}    At least m, but no more than n a's
The table below lists some of the modifiers AMPS recognizes:
i    Case insensitive search
m    Multi-line search
s    Any character (including newlines) can be matched by a . character
x    Unescaped white space is ignored in the pattern
A    Constrain the pattern to only match the beginning of a string
U    Make the quantifiers non-greedy by default (the quantifiers are greedy and try to match as much as possible by default)
AMPS additionally provides support for raw strings, which are strings prefixed by an 'r' or 'R' character. Raw strings use different rules for how a backslash escape sequence is interpreted by the parser. When a string literal is provided as a raw string, the characters in the raw string are matched exactly, even when those characters are special characters for a regular expression.
In the example below, the raw string, noted by the r prefix on the string literal in the second operand of the LIKE predicate, causes AMPS to search for the literal characters ++ without requiring those characters to be escaped. In this example we are querying for a string that contains the name of the programming language C++. In a regular string, the '+' character must be escaped, since a regular expression uses it to mean “match previous 1 or more times”. With a raw string, we can use r'C++' to search for the string without escaping the special '+' character.
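The difference between escaped and literal matching can be reproduced in Python, with one caveat: a Python raw string only changes how backslashes are read by the Python parser, and the regular expression engine still treats '+' as special. The closest Python analogue to the AMPS raw string described here is re.escape, which is the assumption this sketch relies on.

```python
import re

text = "Languages: C, C++, Rust"

# Unescaped, '+' is a quantifier: 'C+' means "one or more C", so it matches
# text that contains no plus sign at all.
matches_without_plus = re.search("C+", "ABC") is not None

# Escaped by hand, the pattern matches the literal characters C++.
escaped = r"C\+\+"
found = re.search(escaped, text).group()

# re.escape builds the same escaped pattern from the literal text, which is
# the role the AMPS raw string plays in the description above.
literal = re.escape("C++")

print(matches_without_plus)  # True
print(found)                 # C++
print(literal == escaped)    # True
```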
An expression using the raw string capability would look like the following:
This can be simpler and easier to read than the escaped equivalent, shown below:
As mentioned previously, AMPS supports regular expression filtering for topics, in addition to content filters. Regular expressions use the same grammar described in content filtering. Regular expression matching for topics is enabled in an AMPS instance by default.
Subscriptions or queries that use a regular expression for the topic name provide all matching records from every AMPS topic whose name matches the regular expression. For example, if your AMPS configuration has three SOW topics, Topic_A, Topic_B and Topic_C, and you wish to search all of them for records where the Name field is equal to “Bob”, you could use a sow command with a topic of ^Topic_.* and a filter of /FIXML/@Name='Bob'. This returns the messages that match the filter from all of the topics that match the topic regular expression.
Notice that, as with the LIKE expression, a regular expression will match at any position in the topic name. To anchor the match to the beginning of the string, use the ^ directive at the beginning of the regular expression. To anchor the match to the end of the string, use the $ directive at the end of the regular expression.
For example, to match a topic with "order" anywhere in the topic name, you could use the regular expression order.* (the ending .* matches zero or more characters, but lets AMPS know to interpret this as a regular expression). To match only topics that start with order, you would use the regular expression ^order. To match topics that end with order, you would use the regular expression order$.
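The effect of anchoring can be sketched with Python's re module, using re.search to model a match at any position. The topic names below are hypothetical, and this is an illustration of the matching rules only, not of how AMPS resolves topics internally.

```python
import re

# Hypothetical topic names for illustration.
topics = ["orders", "order_history", "new_order", "preorder_queue", "quotes"]


def matching_topics(pattern):
    """Topics whose name matches the pattern at any position."""
    return [t for t in topics if re.search(pattern, t)]


print(matching_topics("order.*"))  # "order" anywhere in the name
print(matching_topics("^order"))   # names that start with "order"
print(matching_topics("order$"))   # names that end with "order"
```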
AMPS includes several types of string comparison operators:
Case-Sensitive Exact Matches - The IN, =, BEGINS WITH, ENDS WITH, and INSTR operators do literal matching on the contents of a string. These operators are case-sensitive.
Case-Insensitive Exact Matches - AMPS also provides two case-insensitive operators: INSTR_I, a case-insensitive version of INSTR, and a case-insensitive equality operator, STREQUAL_I.
Regular Expression Matches - AMPS also provides full regular expression matching using the LIKE operator, described in Regular Expressions.
The =
operator tests whether a field exactly matches the literal string provided.
BEGINS WITH
and ENDS WITH
test whether a field begins or ends with the literal string provided. The operators return TRUE
or FALSE
.
AMPS allows you to use set comparisons with BEGINS WITH
and ENDS WITH
. In this case, the filter matches if the string in the field begins or ends with any of the strings in the set.
The INSTR operator allows you to check whether one string occurs within another string. For this operator, you provide two string values. If the second string occurs within the first string, INSTR returns the position at which the second string starts. Otherwise, INSTR returns 0. Notice that positions are counted from 1 (not 0), so a match at the first character of the string returns 1. For example, the expression below tests whether the string critical occurs within the /eventLevels field.
AMPS also provides INSTR_I
and STREQUAL_I
functions for performing case-insensitive comparisons.
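The return values described above can be emulated in a few lines of Python. This is a sketch of the documented semantics, not AMPS code: the function names simply mirror the AMPS names, and Python's lower() is Unicode-aware, whereas the AMPS functions are documented as not being so. The results agree only for ASCII text.

```python
def instr(haystack, needle):
    """1-based position of needle in haystack, or 0 when it does not occur."""
    return haystack.find(needle) + 1  # str.find returns -1 when absent


def instr_i(haystack, needle):
    """Case-insensitive variant of instr (ASCII input assumed)."""
    return haystack.lower().find(needle.lower()) + 1


def strequal_i(left, right):
    """Case-insensitive equality (ASCII input assumed)."""
    return left.lower() == right.lower()


print(instr("warning,critical", "critical"))    # 9
print(instr("warning,info", "critical"))        # 0
print(instr_i("Warning,CRITICAL", "critical"))  # 9
print(strequal_i("Critical", "CRITICAL"))       # True
```

Because a match returns a nonzero position, a filter typically compares the result with 0, as in the INSTR(/field, 'something') != 0 form used elsewhere in this chapter.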
The following table lists the string comparison functions and operators in AMPS:
=
The string to be compared
The string to compare
Case-sensitive
Returns true if the string to be compared is identical to the string to compare.
BEGINS WITH
The string to be compared
A list of strings to compare
Case-sensitive
Returns true if the string to be compared begins with any of the strings in the list.
ENDS WITH
The string to be compared
A list of strings to compare
Case-sensitive
Returns true if the string to be compared ends with any of the strings in the list.
INSTR
The string to be compared
The string to compare
Case-sensitive
Returns the position at which the second string starts, or 0 if the second string does not occur within the first string.
This function is not unicode-aware.
INSTR_I
The string to be compared
The string to compare
Case-insensitive
Returns the position at which the second string starts, or 0 if the second string does not occur within the first string.
This function is not unicode-aware.
STREQUAL_I
The string to be compared
The string to compare
Case-insensitive
Returns true if, when both strings are transformed to the same case, the string to be compared is identical to the string to compare.
This function is not unicode-aware.
LENGTH
The string to be counted
Returns the length of the provided string.
In the AMPS expression language, a function can be used in any place an identifier or literal value can be used.
All AMPS functions return a single value. During evaluation of an AMPS expression, AMPS calls the functions in the expression and uses the results to evaluate the expression. A function may perform type conversion as needed to evaluate the expression.
The results of a call to an AMPS function can be used as the parameter to an AMPS function. For example, the following is a valid expression:
In this case, AMPS first evaluates the SUBSTR
function, which requests the subset of the string fandango
, starting at position 5
. That function returns ango
. AMPS then uses the string ango
as the input to the REVERSE
function, which returns the result ogna
.
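The evaluation order described here, innermost function first, can be traced with a small Python emulation. The helper functions are stand-ins written for this sketch and cover only the case used in the example (a positive starting position).

```python
def substr(text, start):
    """Portion of text from a 1-based, positive starting position to the end."""
    return text[start - 1:]


def reverse(text):
    """The characters of text in reverse order."""
    return text[::-1]


inner = substr("fandango", 5)  # evaluated first
outer = reverse(inner)         # then used as the input to the outer function

print(inner)  # ango
print(outer)  # ogna
```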
The following table lists the available functions by category:
The AMPS server distinguishes between functions that produce a consistent value for the same message (deterministic functions) and functions that may produce a different value each time they are called, even if the function has the same input and is called for the update to the same message (non-deterministic functions).
There are no restrictions on the use of deterministic functions, since each time that they are called for a given message (or a given update to a message), they will return a consistent result.
Some features of AMPS rely on being able to evaluate an expression in a consistent way for a given message. A function that can produce a different value each time that it is called cannot be used in those situations: otherwise, AMPS could produce incorrect (or meaningless) results.
In practice, this means that a non-deterministic function:
Cannot be used in the filter of a subscription that requests out of focus (oof) notifications.
Cannot be used in the filter of an aggregated subscription (although a non-deterministic function is allowed in the filter of an aggregated query, since the filter will only be evaluated once per message).
Cannot be used in an aggregate function (aggregate functions are available in views, aggregated subscriptions, and aggregated queries).
Cannot be used in the filter for a sow_and_subscribe
command that uses pagination (that is, a command that specifies top_n
/skip_n
/OrderBy
).
Cannot be used in the filter for a queue or the barrier expression for a queue.
Cannot be used in the filter for a view or conflated topic.
Cannot be used in a replication filter.
In this release, LAST_READ
, UNIX_TIMESTAMP
and VALUE_LOOKUP
are non-deterministic. The other functions provided with AMPS (both built in and provided through auxiliary modules) are deterministic.
String
Converting Arrays to Strings
(see ARRAY_TO_STRING
)
Date and Time
Array Reduce
Geospatial
Numeric
Checksum
Message
Client
Working with NULL Values
AMPS Information
Typed Value Creation
AMPS provides the CONCAT function, which can be used for constructing strings. The CONCAT function takes any number of parameters and returns a string constructed from those parameters. The function can accept both XPath identifiers and literal values.
The CONCAT function can be used in any AMPS expression that uses a string. For example, you could use CONCAT in a filter as follows:
CONCAT
can be combined with other expressions, including conditional expressions. A mailingAddressName
field in a view could be constructed as follows:
AMPS provides a pair of functions, REPLACE
and REGEXP_REPLACE
, that replace text within strings. The REPLACE
function does a literal match of the string to be replaced, while REGEXP_REPLACE
uses a PCRE pattern to find the string to be replaced.
The following expressions all evaluate as true:
REPLACE
string to transform, string to match, replacement text
Returns the input string, with all occurrences of the string to match replaced with the replacement text.
REGEXP_REPLACE
string to transform, pattern to match, replacement text
Returns the input string, with all occurrences of the pattern to match replaced with the replacement text.
AMPS provides the UPPER
and LOWER
functions to produce a string in a specific case. This can be useful when constructing fields, or when an expression needs case-insensitive comparisons against a group of values using the IN
clause.
As described above in String Comparison Functions, AMPS provides INSTR_I
and STREQUAL_I
functions for performing case-insensitive comparisons. In some cases, particularly when using strings with the IN
clause, it is more efficient to simply convert the string to a known case.
The UPPER and LOWER functions are not Unicode-aware; these functions will not produce the correct data when used with multibyte characters.
For example, you might compare an incoming field of unknown case to a set of known values as follows:
UPPER
The string to transform
Returns the input string, transformed to uppercase.
This function is not unicode aware.
LOWER
The string to transform
Returns the input string, transformed to lowercase.
This function is not unicode aware.
AMPS includes functions for working with date and time values. This section covers functions loaded into AMPS by default. AMPS also includes functions for working with date and time in the Legacy Messaging Compatibility layer.
STRFTIME
format string, timestamp
Produces a string that contains a representation of the provided timestamp
, formatted as specified in the provided format string
. The format string uses the same format specifiers as the standard strftime(3)
function.
This function also supports the additional format specifier %f
to format microseconds, and the format specifier %03f
to format milliseconds.
The length of the string produced for the time is limited to 128 bytes.
STRPTIME
time string, format string
This function interprets the time string
provided as a timestamp, with the format string
specifying how to interpret the time string
.
This function returns a double.
The format string uses the same format specifiers as the standard strptime(3)
function.
This function also supports the additional format specifier %f
to parse microseconds, and the format specifier %03f
to parse milliseconds.
UNIX_TIMESTAMP
none
Returns the current timestamp as a double, represented in seconds (including parts of a second as a decimal).
Notice that a UNIX timestamp is seconds elapsed since 00:00 on January 1, 1970 in UTC and is independent of the timezone of the local system.
The underlying system call used for this function has microsecond resolution, subject to any hardware or host limitations.
This function is non-deterministic, and cannot be used in contexts that require a deterministic function.
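The relationship between a timestamp expressed as a double and a formatted time string can be sketched with Python's datetime module, which also accepts %f for microseconds. This is an analogy rather than AMPS behavior: the helper names are hypothetical, and treating the timestamp as UTC is an assumption of the sketch.

```python
from datetime import datetime, timezone

FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def format_timestamp(fmt, timestamp):
    """Format seconds since the UNIX epoch as a string (UTC assumed)."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime(fmt)


def parse_timestamp(time_string, fmt):
    """Parse a time string into seconds since the UNIX epoch, as a float."""
    parsed = datetime.strptime(time_string, fmt).replace(tzinfo=timezone.utc)
    return parsed.timestamp()


formatted = format_timestamp(FORMAT, 86400.25)
print(formatted)                           # 1970-01-02 00:00:00.250000
print(parse_timestamp(formatted, FORMAT))  # 86400.25
```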
AMPS includes a set of functions designed to operate over an array element in a message and produce a value. These functions take an array within a single message as input, and reduce that array to a single value as output.
ARRAY_COUNT
array
Returns the number of elements in the array.
ARRAY_MAX
array
Returns the largest value in the array, using the standard AMPS >
comparison.
ARRAY_MIN
array
Returns the minimum value in the array, using the standard AMPS <
comparison.
ARRAY_SUM
array
Returns a number produced by adding all of the elements in the array.
NULL
values in the array are ignored.
ARRAY_TO_STRING
array, delimiter, null_replacement
Returns a string comprised of the elements of the array, separated by the provided delimiter.
NULL
values in the array are replaced with the provided null_replacement value.
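The NULL handling described for ARRAY_SUM and ARRAY_TO_STRING can be emulated in Python, with None standing in for NULL. This is a sketch of the documented behavior only. How AMPS formats individual numbers inside the resulting string is not covered here, so the str() conversion is an assumption.

```python
def array_count(values):
    """Number of elements in the array."""
    return len(values)


def array_sum(values):
    """Sum of the elements, ignoring NULL (None) entries."""
    return sum(v for v in values if v is not None)


def array_to_string(values, delimiter, null_replacement):
    """Join the elements with the delimiter, substituting for NULL entries."""
    return delimiter.join(
        null_replacement if v is None else str(v) for v in values
    )


sample = [1, None, 2.5, "x"]
print(array_count(sample))                  # 4
print(array_sum([1, None, 2.5]))            # 3.5
print(array_to_string(sample, ",", "n/a"))  # 1,n/a,2.5,x
```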
AMPS provides the SUBSTR function, which can be used for returning a subset of a string. There are two forms of this function.
The first form takes the source string and the position at which to begin the substring. You can use a negative number to count backward from the end of the string. AMPS returns a string that starts at the specified position and goes to the end of the string. If the provided position is before the beginning of the string, AMPS starts at the beginning of the string, returning the full string. If the provided position is past the end of the string, AMPS returns a zero-length string, which evaluates to NULL.
For example, the following expressions are all TRUE
:
The second form of SUBSTR
takes the source string, the position at which to begin the substring, and the length of the substring. Notice that SUBSTR
considers the first character in the string to be position 1
(rather than position 0
), as demonstrated below. AMPS will not return a string larger than the source string. As with the two-argument form, if the starting position is before the beginning of the string, AMPS starts at the beginning of the string. If the starting position is after the end of the source string, AMPS returns an empty string which evaluates to NULL
.
For example, the following expressions are all true:
AMPS also provides simplified forms of SUBSTR
, which simply take the leftmost or rightmost characters from a string. For example, the following expressions all evaluate as true:
AMPS provides a set of functions that work with whitespace or other delimiter characters. For example, the following expressions are all true:
These functions accept an optional second parameter that specifies the delimiters to remove:
The REVERSE
function simply reverses the input string:
SUBSTR
string to process, starting position, [length]
Returns a portion of the input string, starting at the starting position and ending after the specified length.
If the length is not provided, returns the portion of the string from the starting position to the end of the string.
TRIM
string to transform, [characters to trim]
Returns the input string, with all leading and trailing characters in the set of characters to trim removed.
The characters to trim parameter is optional.
When not provided, the parameter defaults to " "
(that is, a space character).
LTRIM
string to transform, [characters to trim]
Returns the input string, with all leading characters in the set of characters to trim removed.
The characters to trim parameter is optional.
When not provided, the parameter defaults to " "
(that is, a space character).
RTRIM
string to process, [characters to trim]
Returns the input string, with all trailing characters in the set of characters to trim removed.
The characters to trim parameter is optional.
When not provided, the parameter defaults to " "
(that is, a space character).
LEFT
string to process, number of characters
Returns the leftmost number of characters from the provided string.
RIGHT
string to process, number of characters
Returns the rightmost number of characters from the provided string.
REVERSE
string to process
Returns the provided string in reverse.
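The string functions in this table can be emulated in Python to make the position rules concrete. This is a sketch under stated assumptions, not AMPS code: a negative position is read as counting characters back from the end of the string, a position of 0 is clamped to the start, and the empty string stands in for the zero-length result that AMPS evaluates as NULL.

```python
def substr(text, start, length=None):
    """Emulate SUBSTR: 1-based start, optional length, clamped to the string."""
    size = len(text)
    if start < 0:
        index = max(size + start, 0)  # count backward from the end
    else:
        index = max(start - 1, 0)     # convert 1-based to 0-based
    if index >= size:
        return ""                     # past the end: zero-length result
    end = size if length is None else index + max(length, 0)
    return text[index:end]


def trim(text, chars=" "):
    """Remove leading and trailing characters found in chars."""
    return text.strip(chars)


def ltrim(text, chars=" "):
    return text.lstrip(chars)


def rtrim(text, chars=" "):
    return text.rstrip(chars)


def left(text, count):
    return text[:max(count, 0)]


def right(text, count):
    return text[-count:] if count > 0 else ""


print(substr("fandango", 5))       # ango
print(substr("fandango", -3))      # ngo
print(substr("fandango", 1, 3))    # fan
print(repr(trim("  fandango  ")))  # 'fandango'
print(trim("--fandango--", "-"))   # fandango
print(left("fandango", 3), right("fandango", 3))  # fan ngo
```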
AMPS includes the following functions for working with numbers.
ABS
number
Returns the absolute value of a number.
For example, the following filter will be TRUE when the difference between /a
and /b
is greater than 5, regardless of whether /a
or /b
is larger.
GREATEST
list of numbers to compare
Returns the largest of the provided numbers, or NaN
if no argument is a number.
LEAST
list of numbers to compare
Returns the smallest of the provided numbers, or NaN
if no argument is a number.
CEILING
number to round
Returns the value rounded upward to the next greatest integer. Returns an integer unchanged.
FLOOR
number to round
Returns the value rounded downward to the next lower integer. Returns an integer unchanged.
EXP
exponent to use
Returns e raised to the power of the provided exponent.
LN
number
Returns the natural logarithm of the provided number.
LOG2
number
Returns the base-2 logarithm of the provided number.
LOG10
number
Returns the base-10 logarithm of the provided number.
POWER
base, exponent
Returns the value of base raised to the power of exponent.
SQRT
number
Returns the square root of the provided number.
COS
number
Returns the cosine of the provided number.
ACOS
number
Returns the arccosine of the provided number.
SIN
number
Returns the sine of the provided number.
SINH
number
Returns the hyperbolic sine of the provided number.
ASIN
number
Returns the arcsine of the provided number.
COSH
number
Returns the hyperbolic cosine of the provided number.
COT
number
Returns the cotangent of the provided number.
ATAN
number
Returns the arctangent of the provided number.
ATAN2
number, number
Returns the arctangent of the provided numbers.
TAN
number
Returns the tangent of the provided number.
TANH
number
Returns the hyperbolic tangent of the provided number.
MD5
string
Returns the MD5 checksum of the provided string.
RADIANS
number
Returns the provided number converted from degrees to radians.
DEGREES
number
Returns the provided number converted from radians to degrees.
SIGN
number
Returns the sign of the provided number.
If the number is less than 0, returns -1. If the number is greater than 0, returns 1. Otherwise, the number is 0 and the function returns 0.
ROUND
number, [number of decimal places]
Returns a number rounded to the specified number of decimal places.
The number of decimal places is optional. When not provided, the number defaults to 0.
The number of decimal places can be positive or negative. When the number is positive, the number specifies the number of digits to the right of the decimal place to round at. When the number is negative, the number specifies the number of digits to the left of the decimal place to round at.
For example, you could use the following expression in a view to limit the precision of the /price
field of the source topic to 2 decimal places.
WIDTH_BUCKET
expression, min, max, bucket count
The bucket count argument specifies the number of buckets to create over the range defined by min and max. min is inclusive, while max is not.
The value from expression is assigned to a bucket, and the function returns a corresponding bucket number.
When expression falls outside the range of buckets, the function returns either 0
or max + 1
, depending on whether expression is lower than min or greater than or equal to max.
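A few of the numeric functions above can be sketched in Python to show how the arguments interact. The sketch makes several assumptions: Python's round() resolves exact halves to the nearest even digit, which may differ from AMPS, so the examples avoid ties. The bucket calculation assumes equal-width buckets numbered from 1, and out-of-range values are not modeled.

```python
def sign(number):
    """-1 for negative numbers, 1 for positive numbers, 0 for zero."""
    return (number > 0) - (number < 0)


def round_to(number, places=0):
    """Round to a number of decimal places; negative places round leftward."""
    return round(number, places)


def width_bucket(value, low, high, buckets):
    """Bucket number for value in [low, high), with equal-width buckets."""
    if not (low <= value < high):
        # Out-of-range results are described in the text, not modeled here.
        raise ValueError("value is outside the range covered by this sketch")
    return int((value - low) / (high - low) * buckets) + 1


print(sign(-12.5), sign(0), sign(3))   # -1 0 1
print(round_to(1234.567, 2))           # 1234.57
print(round_to(1234.567, -2))          # 1200.0
print(width_bucket(35, 0, 100, 10))    # 4
```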
AMPS includes functions that compute a CRC checksum over a string. These functions are useful for creating a numeric identifier from a string representation. This is commonly used to create a shortened representation of the string, or to provide input for a MOD calculation.
CRC32
string
Returns an integer calculated as a checksum of the provided string.
This function returns a 32-bit integer.
If a NULL value is provided, this function returns a constant value.
This function uses CRC32-C to create the result.
For details on the exact parameters, contact 60East.
CRC64
string
Returns an integer calculated as a checksum of the provided string.
This function returns a 64-bit integer.
If a NULL value is provided, this function returns a constant value.
This function uses a polynomial of 0x95AC9329AC4BC9B5
to create the result.
For details on the exact parameters, contact 60East.
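The idea of turning a string into a numeric identifier can be sketched with a bitwise CRC-32C (Castagnoli) implementation in Python. The parameters used here are the commonly published ones (reflected polynomial 0x82F63B78, initial value and final XOR of 0xFFFFFFFF). Whether AMPS uses exactly these parameters is an assumption, so the values produced by this sketch should not be expected to match CRC32 results from AMPS.

```python
def crc32c(data):
    """Bitwise CRC-32C over a bytes object, using the common parameters."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit, applying the polynomial when a 1 falls off.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def partition_for(key, partitions):
    """Assign a string key to one of a fixed number of partitions."""
    return crc32c(key.encode("ascii")) % partitions


print(hex(crc32c(b"123456789")))     # 0xe3069283 (standard check value)
print(partition_for("order-42", 4))  # a value from 0 through 3
```

The second function shows the MOD use mentioned above: the same key always maps to the same partition number.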
AMPS includes functions that return information about the currently connected client. As with the message functions, these functions return information about the client that prompted the operation, if one is present.
CLIENT_NAME
(none)
Returns the name of the currently connected client.
Subscriptions to a topic or conflated topic
Enrichment and preprocessing
USER
(none)
Returns the user ID of the currently connected client.
Subscriptions to a topic or conflated topic
Enrichment and preprocessing
REMOTE_ADDRESS
(none)
Returns the remote address of the currently connected client.
Subscriptions to a topic or conflated topic
Enrichment and preprocessing
CLIENT_VERSION
(none)
Returns the version string reported by the currently connected client.
Subscriptions to a topic or conflated topic
Enrichment and preprocessing
AMPS includes a function for calculating the distance from a signed latitude and longitude.
GEO_DISTANCE
first_latitude, first_longitude, second_latitude, second_longitude
Returns a double that contains the distance, in meters, between the point identified by first_latitude, first_longitude and the point identified by second_latitude, second_longitude.
For example, given a home point and a message containing /lat
and /long
fields, you could use the following expression to calculate the distance from home.
AMPS uses the haversine formula when computing distances.
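For reference, the haversine formula can be written in Python as follows. This is a generic sketch of the formula, not the AMPS implementation. The Earth radius constant is an assumption (a commonly used mean radius), so results may differ slightly from those returned by GEO_DISTANCE.

```python
import math

EARTH_RADIUS_METERS = 6_371_000.0  # assumed mean radius; AMPS may differ


def geo_distance(first_lat, first_lon, second_lat, second_lon):
    """Great-circle distance in meters between two signed lat/long points."""
    phi1 = math.radians(first_lat)
    phi2 = math.radians(second_lat)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(second_lon - first_lon)
    a = (
        math.sin(delta_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_METERS * math.asin(math.sqrt(a))


# One degree of latitude along a meridian, a little over 111 km.
print(round(geo_distance(0.0, 0.0, 1.0, 0.0)))
```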
AMPS includes functions for explicitly constructing constant values of various types.
FALSE_VALUE
none
Returns a boolean false value.
This function is most useful for constructing values in message types that have a distinct type for boolean values.
In the AMPS expression language, false is equivalent to a literal 0.
TRUE_VALUE
none
Returns a boolean true value.
This function is most useful for constructing values in message types that have a distinct type for boolean values.
In the AMPS expression language, true is typically represented with a literal 1.
NAN_VALUE
none
Returns a NaN (not a number) value.
CHAR_VALUE
integer (0-255)
Returns the character (byte) for the integer provided.
This function is most useful for constructing values in message types that have a distinct type for char values.
In the AMPS expression language, a single character value is equivalent to a string constructed with an escape, and constructing a string literal is more efficient. That is, '\x01'
is more efficient in a filter or field construction than CHAR_VALUE(1)
.
However, to construct a character based on a field, use CHAR_VALUE
. For example: CHAR_VALUE(/code)
.
AMPS includes functions that can be used to refer to the current message being processed.
When used in view construction or aggregate definition, these functions refer to the incoming message that is prompting the update to the view or aggregate, not to the constructed message that is the result of the update. For example, a Field
like this in a view projection:
will return the topic name of the topic that prompted the update to the view, not the name of the view itself.
MESSAGE_SIZE
(none)
Returns the size of the payload of the current message, in bytes.
All messages
CORRELATION_ID
(none)
Returns the correlation ID of the current message as a string.
Returns NULL if there is no correlation ID for the current message.
All messages
LAST_UPDATED
(none)
Returns a timestamp for the last time that a message in the SOW was updated, as a double.
For a subscription (including the subscription part of a sow_and_subscribe
command), the LAST_UPDATED
value will be the current timestamp.
This function is most useful for queries of a topic in the SOW.
Notice that this field is set based on when the local instance has updated the message.
For replicated topics, this means that a given message will have different values on different instances.
Queries of a SOW topic
BOOKMARK
(none)
Returns the bookmark for the current message, if one is available.
Notice that messages retrieved from a SOW topic using a query return NULL for BOOKMARK
, since the SOW does not store the bookmark of a message.
Bookmarks are assigned using a combination of an identifier derived from the client name and a sequence number.
When working with bookmarks, 60East recommends treating bookmarks as opaque identifiers. In particular, bookmarks are not guaranteed to sort in any particular order between different publishers.
AMPS only assigns bookmarks when a message is stored in the transaction log. Messages that are not in the transaction log do not have bookmarks assigned.
Subscriptions to a transaction-logged topic
Bookmark subscriptions
Subscriptions to a message queue
Replication filters
TOPIC_NAME
(none)
Returns the topic name for the message currently being processed. When used in a filter for a message being delivered from a queue that has multiple underlying topics, returns the name of the underlying topic.
All messages
SOW_KEY
(none)
Returns the SOW key for the message currently being processed, if one exists.
This function is designed for use in enrichment. In a query, subscription, or delete command, using the SowKeys
header with the key or keys of interest is more efficient.
Although the function will return a value when used in a filter, using SowKeys
is recommended.
Queries or enrichment of a SOW topic
Subscriptions to a SOW topic
SOW_KEY_HASH
(none)
Returns a hash value of the SOW key for the message currently being processed. If AMPS generated the SOW key, this value will be the same value as the SOW key.
For topics where the publisher provides the SOW key, this will be a hash of the value provided by the publisher.
This function is designed for use in enrichment.
Queries or enrichment of a SOW topic
Subscriptions to a SOW topic
LAST_READ
(none)
Returns a timestamp for the last time that this message was read from the SOW, as a double.
This function only returns a value for messages in a topic in the SOW.
The LAST_READ
time for a message resets when AMPS restarts.
Notice that this field is set based on when the local instance processes a read of the message.
For replicated topics, this means that a given message will have different values on different instances.
This function is non-deterministic, and cannot be used in contexts that require a deterministic function.
Queries of a SOW topic
LAST_LEASED
(none)
For a message in a queue, returns a timestamp for the last time this message was leased from this instance, as a double.
Returns NULL
for a message that is not in a queue.
Notice that this timestamp is set based on when the local instance leased the message.
This counter is reset when the instance restarts.
Queries of a message queue
Subscription to a message queue
SOW delete by filter for a message queue
LEASE_COUNT
(none)
For a message in a queue, returns the number of times the message has been leased from this instance as a double.
Returns NULL
for a message that is not in a queue.
Notice that this counter is set based on leases from the local instance.
This counter is reset when the instance restarts, and does not track leases from other instances.
Queries of a message queue
Subscription to a message queue
SOW delete by filter for a message queue
AMPS includes a function that accepts any number of arguments and returns the first argument that is not NULL.
COALESCE
value (any number of values may be provided)
Returns the first value that is not NULL
.
If all values are NULL
, returns NULL
.
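The behavior of COALESCE can be emulated in one line of Python, with None standing in for NULL. This is an illustration of the documented semantics only.

```python
def coalesce(*values):
    """Return the first value that is not None, or None when all are None."""
    return next((v for v in values if v is not None), None)


print(coalesce(None, "fallback", "ignored"))  # fallback
print(coalesce(None, 0, 5))                   # 0
print(coalesce(None, None))                   # None
```

Notice that 0 is returned in the second call: only NULL is skipped, not values that evaluate as false.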
AMPS includes a pair of functions that provide the instance name and group name of the current instance.
AMPS_INSTANCE_NAME
none
Returns the instance name of this AMPS instance.
AMPS_GROUP_NAME
none
Returns the group name of this AMPS instance.
AMPS provides a set of aggregation functions that can be used in a Field
constructor for a view and in the projection
option of an aggregated subscription. These functions return a single value for each distinct group of messages, as identified by distinct combinations of values in the Grouping
clause.
These functions produce an aggregation over a literal value, an identifier directing AMPS to extract the value from the message, or the result of a function.
For example, given a set of messages like the following:
With a view definition that has a Projection
clause and Grouping
clause like the following:
AMPS will produce the following record:
Notice that the first SUM() function simply extracts the value of /qty from each message, while the second SUM() function uses the output of the IF statement for each message.
Since aggregate functions operate over groups of messages, these functions are only available when constructing fields for aggregate purposes, either in a view or an aggregated subscription. The functions described in this section are not available to filters, and are not available for constructing fields during SOW topic enrichment.
The set of functions provided in AMPS have been chosen to be efficient to compute over high volumes of rapidly changing data.
AVG
Average over an expression.
Returns the mean value of the values specified by the expression.
ANY
Returns one of the set of values in the expression.
COUNT
Count of values in an expression.
Returns the number of values specified by the expression.
COUNT_DISTINCT
Count of the number of distinct values in an expression, ignoring NULL
.
Returns the number of distinct values in the expression. AMPS type conversion rules apply when determining distinct values.
GROUP_CONCAT
Create a list of the distinct values in the expression specified, using the second argument as the delimiter. If no second argument is provided, the delimiter defaults to ,
(a comma).
For example, to create a list of the distinct values in the /names
column for the group delimited by a |
character, you would use:
GROUP_CONCAT(/names, '|')
This function does not guarantee the order of the values within the string produced. AMPS type conversion rules apply when determining distinct values.
This function returns a string, regardless of the types of the values in the expression.
MIN
Minimum value.
Returns the minimum out of the values specified by the expression.
MAX
Maximum value.
Returns the maximum out of the values specified by the expression.
STDDEV_POP
Population standard deviation of an expression.
Returns the calculated standard deviation.
STDDEV_SAMP
Sample standard deviation of an expression.
Returns the calculated standard deviation.
SUM
Summation over an expression.
Returns the total value of the values specified by the expression.
UNIQUE
Determine if all of the values in a given field match within the group.
If all of the values match, returns the value. Otherwise, returns NULL
.
As in ANSI SQL, NULL values are not included in aggregate expressions in AMPS. COUNT counts only non-NULL values, SUM adds only non-NULL values, AVG averages only non-NULL values, and MIN and MAX ignore NULL values.
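As a rough model of this NULL handling, the following Python sketch treats None as NULL and drops it before aggregating. The helper names are hypothetical, and returning None for the average of an empty group is an assumption of the sketch, not documented AMPS behavior.

```python
def non_null(values):
    """Drop NULL (None) values before aggregating."""
    return [v for v in values if v is not None]


def agg_count(values):
    return len(non_null(values))


def agg_sum(values):
    return sum(non_null(values))


def agg_avg(values):
    present = non_null(values)
    # Assumption: an empty group yields None rather than raising an error.
    return sum(present) / len(present) if present else None


# Hypothetical /qty values for one group; two messages have no /qty.
qty = [10, None, 30, None, 20]
```

Note that the average divides by the number of non-NULL values (3 here), not by the number of messages in the group (5).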
MIN
and MAX
can operate on either numbers or strings, or a combination of the two. AMPS compares values using the principles described for comparison operators. For MIN
and MAX
, AMPS determines order based on these rules:
Numbers sort in numeric order.
String values sort in ASCII order.
When comparing a number to a string, convert the string to a number, and use a numeric comparison. If that is not successful, the value of the string is higher than the value of the number.
For example, given a field that has the following values across a set of messages:
MIN
will return 1.3
, MAX
will return 'cat'
. Notice that different message types may have different support for converting strings to numeric values: AMPS relies on the parsing done by the message type to determine the numeric value of a string.
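The three ordering rules above can be sketched as a comparison function. This is an illustrative Python model, not the AMPS implementation: the sample values are hypothetical, and Python's float() stands in for the numeric parsing that the message type performs in AMPS.

```python
from functools import cmp_to_key


def to_number(text):
    """Try to convert a string to a number; return None on failure."""
    try:
        return float(text)
    except ValueError:
        return None


def compare(a, b):
    """Return -1, 0, or 1 following the ordering rules described above."""
    a_is_str, b_is_str = isinstance(a, str), isinstance(b, str)
    if a_is_str == b_is_str:
        # Two numbers sort numerically; two strings sort by character code.
        return (a > b) - (a < b)
    # Mixed comparison: identify which side is the number.
    number, text, sign = (a, b, 1) if b_is_str else (b, a, -1)
    converted = to_number(text)
    if converted is None:
        # A string that is not numeric is higher than any number.
        return -1 * sign
    return ((number > converted) - (number < converted)) * sign


# Hypothetical field values seen across a set of messages.
values = [24, 1.3, "apple", "cat"]
lowest = min(values, key=cmp_to_key(compare))
highest = max(values, key=cmp_to_key(compare))
```

With these sample values, the lowest value is the number 1.3 and the highest is the string 'cat', since non-numeric strings rank above numbers and 'cat' sorts after 'apple'.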
For views, aggregated subscriptions, and SOW topic enrichment, AMPS allows you to construct new fields based on existing data.
When you construct a field, there are two components required:
A source expression that produces a value. This expression can include XPath identifiers that extract values from a message, literal values, operators, and functions.
A destination identifier that specifies the identifier where the message type will serialize the value produced by the source expression.
The source expression and the destination identifier are separated by the AS
keyword. The format for a field construction expression is as follows:
For example, to create a field in a view that calculates the total value of an order by multiplying the /price field by the /qty field, construct the field as shown below:
This constructs a field using /price * /qty
as the source expression. Both /price
and /qty
are taken from the incoming message. When the result of this expression is computed, the value will be produced with the XPath identifier /total
as the destination. That value will then be serialized to a message (with the exact format and syntax determined by the message type).
Notice that the grammar for constructing fields does not specify precisely how the field is represented in the message. AMPS constructs the value and provides the XPath identifier to the message type. The message type itself is responsible for serializing the value into the correct representation and structure for that message type.
All of the AMPS operators and functions that are available for filters are available to use in source expressions, including any user-defined functions loaded into the instance.
Depending on the context for field construction, there are additional capabilities available when constructing fields, as described in the following sections.
Preprocessing field constructors operate on a single message and construct fields based on that message. The results of the preprocessing field constructor are merged into the incoming message. Any field in the source message that is not changed or removed during preprocessing is left unchanged, so it is not necessary to include all fields in the message in the Preprocessing
block.
Since preprocessing fields apply to a specific message, preprocessing fields cannot specify the topic or message type in an XPath identifier. All identifiers in the source expression are evaluated as identifiers in the message being preprocessed. Preprocessing fields are evaluated during the preprocessing phase, so they cannot refer to the previous state of a message.
Preprocessing can be used to remove fields from a message. By default, AMPS serializes any field that has an empty string or NULL
value after preprocessing. Preprocessing fields can include a directive that specifies that a field that contains a NULL
value should be removed from the set of fields rather than serialized with a NULL
value. The directive HINT OPTIONAL
applied to the XPath identifier specifies that if the result of the source expression is NULL
, AMPS does not provide the value for the message type to serialize. For example, the following field constructor removes the /source
field from the message if the value provided is not in a specific list of values:
By default, AMPS considers the results of field construction (the processed message) to be distinct from the current message. AMPS rewrites the current message after preprocessing is completed. This means that, by default, the results of fields constructed during preprocessing are not available to other fields within preprocessing. The HINT SET_CURRENT
option immediately inserts or updates values in the current message, which makes the new value available to all subsequent Field
declarations.
In the sample below, AMPS enriches the message by performing an expensive operation (implemented as a user-defined function) on two input fields, and immediately updates the current message with the output of that operation. AMPS then sets other fields in the processed message using the updated value in the current message.
Notice that using HINT SET_CURRENT
requires AMPS to process Field
declarations in order, which may prevent future optimizations.
Hints can be combined as follows:
In this case, if the projected field would be NULL
, the field is removed from the current message.
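The interaction between the two hints can be modeled with a short Python sketch. This is an illustration of the semantics described above, not AMPS code: the preprocess function, the tuple representation of a field constructor, and the sample field names are all assumptions made for the example.

```python
def preprocess(message, fields):
    """Apply (source, destination, hints) field constructors to a message.

    source is a callable that reads the current message; hints is a set
    that may contain "OPTIONAL" and "SET_CURRENT".
    """
    current = dict(message)    # what source expressions read from
    processed = dict(message)  # what is serialized once preprocessing ends
    for source, destination, hints in fields:
        value = source(current)
        if value is None and "OPTIONAL" in hints:
            # OPTIONAL: a NULL result removes the field instead of
            # serializing it with a NULL value.
            processed.pop(destination, None)
            if "SET_CURRENT" in hints:
                current.pop(destination, None)
            continue
        processed[destination] = value
        if "SET_CURRENT" in hints:
            # SET_CURRENT: later field constructors can see this value.
            current[destination] = value
    return processed


ALLOWED_SOURCES = {"web", "phone"}  # hypothetical list of accepted values

fields = [
    (lambda m: m["a"] * m["b"], "product", {"SET_CURRENT"}),
    (lambda m: m.get("product", 0) + 1, "next", set()),
    (lambda m: m["source"] if m["source"] in ALLOWED_SOURCES else None,
     "source", {"OPTIONAL"}),
]

result = preprocess({"a": 2, "b": 3, "source": "fax"}, fields)
```

In this model, the second constructor sees the product computed by the first only because the first carries SET_CURRENT. Without that hint, the second constructor would read the original message, where no product field exists.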
Enrichment field constructors operate on a single message and construct fields based on that message. Enrichment expressions operate on the current message and change the current message. The results of the enrichment directives are merged into the incoming message. Any field in the source message that is not changed or removed during enrichment is left unchanged, so it is not necessary to include all fields in the message in the Enrichment directive.
Since enrichment fields apply to a specific message, enrichment fields cannot specify the topic or message type in an XPath identifier. All identifiers in the source expression are evaluated as identifiers in the message being enriched.
Enrichment fields are constructed during the enrichment phase, so enrichment fields can refer to the previous state of a message. Within an enrichment expression, AMPS provides two special modifiers for XPath identifiers that specify whether an XPath identifier refers to the current incoming message or the previous state of the message. These modifiers apply only to the source expression, and cannot be used in the destination identifier. The modifiers are as follows:
OF CURRENT
Specify that the XPath identifier refers to the incoming message.
OF PREVIOUS
Specify that the XPath identifier refers to the previous state of the message in the SOW.
If there is no record in the SOW for this message, all identifiers that specify OF PREVIOUS
return NULL
.
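As a rough model of these two modifiers, the Python sketch below computes a field from both the incoming message and the previous state. The enrich function, the price_change field, and the sample values are hypothetical; the sketch only illustrates that every reference to the previous state yields NULL (None here) when the SOW holds no record for the message.

```python
def field_of(state, name):
    """Return a field from a message state, or None if the state or field is absent."""
    if state is None:
        return None
    return state.get(name)


def enrich(previous, current):
    """Add a price_change field computed from previous and current state."""
    enriched = dict(current)
    prior_price = field_of(previous, "price")  # models /price OF PREVIOUS
    new_price = field_of(current, "price")     # models /price OF CURRENT
    if prior_price is None:
        # No record in the SOW yet: the previous value is NULL.
        enriched["price_change"] = None
    else:
        enriched["price_change"] = new_price - prior_price
    return enriched


first = enrich(None, {"orderId": 2, "price": 120})
second = enrich({"orderId": 2, "price": 120}, {"orderId": 2, "price": 95})
```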
Enrichment can be used to remove fields from a message. By default, AMPS serializes any field that has an empty string or NULL
value after enrichment. Enrichment Field
elements can include a directive that specifies that a field that contains a NULL
value should be removed from the message rather than serialized with a NULL
value. The directive HINT OPTIONAL
applied to the XPath identifier specifies that if the result of the source expression is NULL
, AMPS does not provide the value for the message type to serialize. For example, the following field constructor removes the /source
field from the message if the value provided is not in a specific list of values:
By default, AMPS considers the results of field construction (the enriched message) to be distinct from the current message. AMPS rewrites the current message after enrichment is completed. This means that, by default, the results of fields constructed during enrichment are not available to other fields within enrichment. The HINT SET_CURRENT
option immediately inserts or updates values in the current message, which makes the new value available to all subsequent Field
declarations.
In the sample below, AMPS enriches the message by performing an expensive operation (implemented as a user-defined function) on two input fields, and immediately updates the current message with the output of that operation. AMPS then sets other fields in the enriched message using the updated value in the current message.
Notice that using HINT SET_CURRENT
requires AMPS to process Field
declarations in order, which may prevent future optimizations.
Hints can be combined as follows:
In this case, if the projected field would be NULL
, the field is removed from the current message.
View field constructors operate over groups of messages, and construct a single output message for each distinct group, as specified by the Grouping
element in the View
configuration.
When constructing a field in a view, all identifiers used in the source expression must be in one of the underlying topics for the view. When the view uses a Join
, the identifiers must include the topic identifier. If the topics in the Join
are of different message types, the identifiers must include both the message type and the topic identifier.
For example, the following Field
definition multiplies the /quantity
from the NVFIX topic orders
by the /price
from the JSON topic items
, and projects the result into the /total
field of the view.
One of the core features of AMPS is the ability to persist the most recent update for each distinct message published to a given topic. The State of the World (SOW) can be thought of as a database where messages published to AMPS are filtered into topics, and where the topics store the latest update to each distinct message. The SOW gives subscribers the ability to quickly resolve any differences between their data and updated data in the SOW by querying the current state of a topic or any set of messages inside a topic. Topics recorded in the SOW are also used for caching data, providing "point in time" snapshots of active data flows, providing key/value stores over data flows, and so on. Topics recorded in the SOW are the underlying sources for AMPS aggregation and analytics capabilities, and the ability to store the previous state of a message is the foundation of advanced messaging features such as delta messaging and out of focus notifications.
AMPS also provides the ability to keep historical snapshots of the contents of the SOW, which allows subscribers to query the contents of the SOW at a particular point in time and replay changes from that point in time.
AMPS can maintain the SOW for a topic in a persistent file, which will be available across restarts of the AMPS server. The SOW can also be transient, in which case the state of the SOW does not persist across server restarts.
Topics do not keep the current values in the SOW by default. To provide this capability for a topic, you must configure AMPS to maintain the topic in the SOW by adding a definition for the Topic
to the SOW
section of the AMPS configuration file.
Much like tables in a relational database, topics in the AMPS SOW persist the most recent update for each message. AMPS identifies a message by using a unique key for the message. The SOW key for a given message is similar to the primary key in a relational database: each value of the key is a unique message. The first time a message is received with a particular SOW key, AMPS adds the message to the SOW. Subsequent messages with the same SOW key value update the message.
There are several ways to create a SOW key for a message:
Most applications specify that AMPS assigns a SOW key based on the content of the message. The fields to use for the key are specified in the SOW topic definition, and consist of one or more XPath expressions. AMPS finds the specified fields in the message and computes a SOW key based on the name of the topic and the values in these fields. 60East recommends this approach unless an application has a specific need for a different approach.
A topic can also be configured to require that a publisher provide a SOW key for each message when publishing the message to AMPS.
AMPS also supports the ability for custom SOW key generation logic to be defined in an AMPS module, which will be invoked to generate the SOW key for each message. While these SOW keys are generated automatically by AMPS, rather than being provided by the publisher, the logic to generate these keys is provided by the module, and the configuration required (if any) is determined by the module.
The following diagrams demonstrate how the SOW works, using a SOW topic that is configured to have AMPS determine the SOW key based on the /orderId
field within the message. As each message comes in, AMPS uses the contents of the /orderId
field to generate a SOW key for the message. The SOW key is used to identify unique records in the SOW, so AMPS will store a distinct record for each distinct /orderId
value published to this topic. The calculated SOW key will be returned in the SowKey
header of messages received from the topic in the SOW.
In the previous diagram, two messages are published, and neither message has a matching key in the ORDERS topic. Both messages are inserted as new records.
Some time after these messages are processed, an update comes in for the order with an orderId
of 2
. This message changes the price from 120 to 95. Since the incoming message has an orderId
of 2, this matches an existing record and overwrites the existing message for the same SOW key, as seen in the diagram below. AMPS replaces the entire record with the contents of the update.
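The insert-then-update sequence described here can be modeled with a dictionary keyed by /orderId. This is a minimal sketch, not AMPS code: the publish helper and the symbol field are hypothetical, and the raw orderId value stands in for the SOW key that AMPS would compute.

```python
def publish(sow, message, key_field="orderId"):
    """Store the message under its key; report whether it was new."""
    key = message[key_field]
    is_new = key not in sow
    # The whole record is replaced, not merged with the earlier version.
    sow[key] = dict(message)
    return "insert" if is_new else "update"


orders = {}
results = [
    publish(orders, {"orderId": 1, "symbol": "IBM", "price": 20}),
    publish(orders, {"orderId": 2, "symbol": "MSFT", "price": 120}),
    publish(orders, {"orderId": 2, "symbol": "MSFT", "price": 95}),
]
```

After the third publish, the topic still holds two records, and the record for order 2 carries the new price of 95.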
Although the SOW key is derived from the content of the message in many cases, the SOW key is distinct from the content of the message. Each record in a SOW topic has a distinct SOW key, which is stored with the record. The SOW stores the full message in the message type format for performance. There is no re-serialization required to send a message to subscribers.
By default, a topic recorded in the SOW is persistent. For these topics, AMPS stores the contents of the SOW for that topic in a dedicated, memory-mapped file. This means that the total SOW does not need to fit into memory, and that the contents of the SOW database are maintained across server restarts. You can also define a transient SOW topic, which does not store the contents of the SOW to a persisted file.
The SOW file is separate from the transaction log, and you do not need to configure a transaction log to use a SOW. When a transaction log is present that covers the SOW topic, on restart AMPS uses the transaction log to keep the SOW up to date. When the latest transaction in the SOW is more recent than the last transaction in the transaction log (for example, if the transaction log has been deleted), AMPS takes no action. If the transaction log has newer transactions than the SOW, AMPS replays those transactions into the SOW to bring the SOW file up to date. If the SOW file is missing or damaged, AMPS rebuilds the SOW by replaying the transaction log from the beginning of the log.
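The restart cases above can be summarized as a small decision function. This Python sketch is only a model of the described behavior: the function name, the use of integers as transaction positions, and the returned strings are assumptions, and the handling of an instance with no transaction log is inferred from the statement that a transaction log is not required.

```python
def recovery_action(sow_last_tx, log_last_tx):
    """Pick a startup action for a persistent SOW topic.

    sow_last_tx: latest transaction in the SOW file, or None if the file
                 is missing or damaged.
    log_last_tx: latest transaction in the transaction log, or None if no
                 transaction log covers the topic.
    """
    if log_last_tx is None:
        # Assumption: with no transaction log there is nothing to replay.
        return "no action"
    if sow_last_tx is None:
        return "rebuild from start of log"
    if log_last_tx > sow_last_tx:
        return "replay newer transactions"
    # The SOW is at least as recent as the transaction log.
    return "no action"
```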
When a SOW topic is persistent, each Topic must be stored in a separate file. Only one instance of AMPS can access a given file; the same copy of the SOW file cannot be used by multiple instances of AMPS.
When the SOW for a Topic is transient, AMPS does not store the SOW for this topic across restarts. In this case, AMPS will synchronize the SOW with the transaction log when the server starts to restore the state of the topic. By default, this recovery processes the entire transaction log. You can use the RecoveryPoint
configuration option to specify that the topic should have only new publishes or should recover from a specific point in time (for example, you could use an environment variable to provide a timestamp to the RecoveryPoint
so that AMPS recovers only the last 24 hours of messages).
This section describes AMPS SOW keys in detail, including information on how AMPS generates SOW keys and considerations for applications that generate SOW keys. An individual SOW topic may use either AMPS-generated SOW keys or user-generated SOW keys. Every message in the SOW must use the same type of key generation.
Regardless of how the SOW key is generated, AMPS creates an opaque value from the SOW key and uses this value for efficient lookup internally. For SOW keys that AMPS generates, this opaque value is returned in the message header for SOW messages and is used in commands that reference SOW keys. When the SOW key is provided with a message, AMPS returns the original value in the SOW key header, and the original value is used in commands that reference SOW keys.
For topics that have a SOW key (including views and conflated topics), commands that directly use the SOW for a topic (for example, sow
, sow_and_subscribe
, sow_delete
) can provide a SOW key, or a set of SOW keys with the command. When a set of SOW keys is provided with one of these commands, the command will only operate on messages that have a SOW key in the provided set.
AMPS-generated SOW keys are often the easiest and most reliable way to define the SOW key for a message. The advantages of this approach are that AMPS handles all of the mechanics of generating the key, the key will always match the data in the message, and there is no need for a publisher to be concerned with how AMPS assigns the key. The publisher simply publishes messages, and AMPS handles all of the details.
AMPS generates SOW keys based on the message content when you define one or more Key
fields in the SOW configuration. For example, if your SOW tracks unique orders that are identified by an orderId
field in the message, you could provide the following Key
element in your SOW configuration:
This configuration item tells AMPS to use that field of the message to generate SOW keys. AMPS supports composite SOW keys when multiple Key
elements are provided. For example, the following configuration specifies that every unique combination of /orderId
and /customerId
is a unique record in the SOW:
When AMPS generates a key, it creates the key based on the key domain (which is the name of the topic by default) and the values of the fields specified as SOW keys. AMPS concatenates these values together with a unique separator and then calculates a checksum over the value. This ensures that different values create different keys, and ensures that records in different topics have different keys.
In some cases, you may need AMPS to calculate consistent SOW key values for identical messages even when the messages are published to different topics. The SOW topic definition allows you to set an explicit key domain in the configuration, which AMPS will use instead of the topic name when generating SOW keys. For example, if your application uses the orderId
field of a message as a SOW key in both a ShippingStatus
topic and an OpenOrders
topic, having AMPS generate a consistent key for the same orderId
value may make it easier to correlate messages from those topics in your application. By setting the same KeyDomain
value in the Topic configuration for those SOW topics, you can ensure that AMPS generates consistent SOW keys for the same order ID across topics.
An application should treat SOW keys generated with the AMPS default SOW key generator as opaque tokens. The value of a generated SOW key is guaranteed to be consistent for the same fields, values, and key domain. However, an application should not make assumptions as to the specific value that the AMPS default key generator will produce from a given set of values. If an application requires a specific value for the SOW key, the application should generate a SOW key, as described in the following section.
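The effect of the key domain on generated keys can be illustrated with a short Python sketch. This is not the AMPS key generator: the separator character and the use of SHA-256 are stand-ins chosen for the example, and the actual values AMPS produces will differ. The sketch only shows why the same field values yield different keys under different key domains and the same key under a shared one.

```python
import hashlib

SEPARATOR = "\x1f"  # hypothetical separator; the real one is internal to AMPS


def generate_sow_key(key_domain, message, key_fields):
    """Join the key domain and key field values, then hash the result."""
    parts = [key_domain] + [str(message[field]) for field in key_fields]
    joined = SEPARATOR.join(parts)
    # SHA-256 stands in for whatever checksum AMPS uses internally.
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


order = {"orderId": 2, "customerId": "C17"}

# Default behavior: the key domain is the topic name.
shipping_key = generate_sow_key("ShippingStatus", order, ["orderId"])
open_key = generate_sow_key("OpenOrders", order, ["orderId"])

# A shared, explicitly configured key domain for both topics.
shared_a = generate_sow_key("Orders", order, ["orderId"])
shared_b = generate_sow_key("Orders", order, ["orderId"])

# A composite key over two fields.
composite = generate_sow_key("Orders", order, ["orderId", "customerId"])
```

As the surrounding text advises, an application should rely only on the consistency of generated keys, never on a particular computed value.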
The preprocessor phase of AMPS enrichment occurs before AMPS generates SOW keys for a message. You can use this phase of enrichment to construct fields that are then used to generate the SOW key for a message.
AMPS allows you to customize how the server generates SOW keys for a topic. To customize SOW key generation, you implement a SOW key generator module and specify that the module should be used to generate keys for that SOW topic.
To use a custom SOW key generator, you first load the module in the Modules
section of the configuration file, then specify the module as the KeyGenerator
for the SOW topic.
For information on implementing a custom SOW key generator, contact 60East support for the AMPS Server SDK.
AMPS allows applications to explicitly generate and assign SOW keys. In this case, the publisher calculates the SOW key for the message and includes that key in the message when it is published. AMPS does not interpret the data in the message to decide whether the message is unique: AMPS uses only the value of the SOW key.
When using a user-generated SOW key, applications should consider the following:
All publishers should use a consistent method for generating SOW keys.
SOW keys must contain only characters that are valid in Base64 encoding.
The application must ensure that messages intended to be logically different do not receive the same SOW key.
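A publisher might check candidate keys against the character restriction above before publishing. The following Python sketch assumes the standard Base64 alphabet (letters, digits, +, /, and = for padding); whether AMPS accepts other variants, such as the URL-safe alphabet, is not stated here. Both helper names are hypothetical.

```python
import base64
import re

# Assumption: the standard Base64 alphabet plus the '=' padding character.
_BASE64_CHARS = re.compile(r"[A-Za-z0-9+/=]+")


def is_valid_sow_key(key):
    """Return True if the key is non-empty and uses only Base64 characters."""
    return bool(_BASE64_CHARS.fullmatch(key))


def encode_sow_key(natural_key):
    """Derive a Base64-safe key from an arbitrary application identifier."""
    return base64.b64encode(natural_key.encode("utf-8")).decode("ascii")
```

Encoding an application identifier this way keeps distinct identifiers distinct, which helps satisfy the requirement that logically different messages never share a SOW key.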
User-generated SOW keys are particularly useful for the binary
message type. For this message type, AMPS does not parse the message, so providing an explicit SOW key allows you to create a SOW that contains only binary
messages.
To require publishers to provide the SOW key for each message published to a topic, omit both the Key fields and the KeyGenerator from the Topic configuration for that topic.
State of the World topics are used for several different purposes:
At any point in time, applications can issue SOW queries to retrieve all of the messages that match a given topic and content filter. When a query is executed, AMPS will test each message in the SOW against the content filter specified and all messages matching the filter will be returned to the client. The topic can be a literal topic name or a regular expression pattern. For more information on issuing queries, please see the section on Querying the State of the World (SOW).
If the application needs to receive updates or needs the ongoing state of the topic rather than running a one-time query, the application can query the State of the World and simultaneously subscribe to updates to the topic. This is typically much more efficient than running repeated queries of the topic.
This command, sow_and_subscribe
, is described in the Query and Subscribe topic in the Querying the State of the World (SOW) section.
Because the State of the World maintains a record of the current state of messages, it enables several of the advanced messaging features provided by AMPS.
A subscription to a topic can optionally request to be notified when a message is removed or no longer matches the subscription when the topic is recorded in the State of the World. See the Out-of-Focus Messages section for details.
AMPS can optionally enrich messages when they are published to a topic that is recorded in the State of the World. The enrichment can include logic based on the previous state of the message. See the State of the World Message Enrichment section for details.
Because a topic stored in the State of the World maintains the current value of a message, applications do not need to republish the full message when making updates to a message. See the Incremental Message Updates section for details.
Since a State of the World topic maintains a complete set of current values for a topic, a State of the World topic is the foundation of analysis and aggregation of the messages published to a topic. See the Aggregation and Analytics section for more details.
When a message in the State of the World is replaced or updated, AMPS can determine which fields (if any) have changed from the previous values. A subscriber can optionally request to be delivered only fields that have changed from the previous values. See Receiving Only Updated Fields for details.
The topics titled When Should I Store a Topic in the SOW and Scenario and Feature Reference in the Introduction to AMPS provide an introduction to some of the application scenarios that can benefit from using a State of the World topic.
AMPS maintains indices over SOW topics, views, and conflated topics to improve query efficiency.
There are two types of indices available:
Memo indices are created automatically when AMPS needs to use a particular field for a query. These indices maintain the value of a key, and can be used for any type of query, including regular expression queries, range queries, and comparisons such as less than or greater than. You can also request that AMPS pre-create an index of this type with the Index
directive of the SOW topic configuration.
Hash indices are defined by the configuration for the topic, view or conflated topic. These indices maintain a hash derived from the values provided for the fields in the key. When the topic is configured so that AMPS generates the SOW key, AMPS automatically creates a hash index that contains all of the fields in the SOW Key. You can create any number of hash indices for a SOW topic, with any combination of fields. Hash index queries are significantly faster than queries using memo indices.
Both types of indices are maintained in memory. The section on Estimating AMPS Instance Memory Usage has more details.
A hash index can be created using any XPath Identifier in the message. For example, if you are using a composite-local
message type, you can create a hash index using fields from any part of the message. If you are using an xml
message, you can create a hash index that uses the XML attributes.
The values of hash indices are always evaluated as strings. Hash indices are only used for exact matches on the value of the fields or with the IN
operator, and only for queries that use the exact set of fields in the hash index. Other operators or functions (for example, LIKE
, !=
, BETWEEN
, IS NULL
, IS NOT NULL
, and so on) cannot use the hash index. To use a hash index, the comparison must use a literal string for comparison to specify that the comparison uses an exact string comparison and not a numeric comparison.
For example, if your configuration specifies a hash index that uses the fields /address/postalCode
and /customerType
, a filter such as /address/postalCode = '04109' AND /customerType = 'retail'
will use the hash index. A filter such as /address/postalCode = '04109' AND /customerType LIKE 'retail|remainder'
will not use the hash index, since this filter uses the LIKE
operator rather than exact matching. Likewise, a comparison such as /address/postalCode = 04109
will not use the hash index, since the expression requests a numeric comparison rather than a string comparison.
Starting with AMPS 5.3.1.0, AMPS will also use a hash index for a compound filter if the first clause in the filter is an IN
operator that can use a hash index and the other comparisons in the filter are evaluated using the AND
operator. In this case, AMPS evaluates the IN
clause first, and executes the rest of the expression against the results of the IN
clause. For example, a filter like /id IN ('jon', 'jim', 'joy') AND /price > 50
will use a hash index to find matching records for /id
and then compare the matching records to the rest of the filter (in this case, a numeric comparison on /price
). (Notice that this optimization is not available if the other comparisons use the OR
operator.)
AMPS uses a hash index for filters where possible. If the filter does not meet the requirements for using a hash index, AMPS uses memo indices for the fields in the filter if those are available. If one or more of the required memo indices is not available, AMPS creates the missing indices during the query.
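The eligibility rules for hash indices can be summarized as a small checker. This Python sketch is a simplified model, not the AMPS query planner: a filter is represented as a list of (field, operator, literal) clauses joined by AND, which is a hypothetical representation chosen for the example.

```python
def _is_string_literal(value):
    """True for a string, or for a list or tuple containing only strings."""
    if isinstance(value, (list, tuple)):
        return all(isinstance(item, str) for item in value)
    return isinstance(value, str)


def can_use_hash_index(clauses, index_fields):
    """Check an AND-ed list of (field, operator, literal) clauses."""
    fields = [field for field, _, _ in clauses]
    return (
        # The filter must use exactly the set of fields in the index.
        set(fields) == set(index_fields)
        and len(fields) == len(set(fields))
        # Only exact matches or IN, and only against string literals.
        and all(op in ("=", "IN") and _is_string_literal(literal)
                for _, op, literal in clauses)
    )


def leading_in_uses_hash_index(clauses, index_fields):
    """Model the compound case: a leading IN clause, then AND-ed clauses."""
    return (
        bool(clauses)
        and clauses[0][1] == "IN"
        and can_use_hash_index(clauses[:1], index_fields)
    )


index = ["/address/postalCode", "/customerType"]
exact = [("/address/postalCode", "=", "04109"), ("/customerType", "=", "retail")]
like = [("/address/postalCode", "=", "04109"),
        ("/customerType", "LIKE", "retail|remainder")]
numeric = [("/address/postalCode", "=", 4109), ("/customerType", "=", "retail")]
compound = [("/id", "IN", ("jon", "jim", "joy")), ("/price", ">", 50)]
```

In this model, only the exact-match filter with string literals qualifies for the two-field index, while the compound filter qualifies through its leading IN clause against an index on /id.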
If your application frequently uses queries for an exact match on a specific set of fields (for example, retrieving a set of customers by the /address/postalCode
field), creating a hash index can significantly improve the speed of those queries.
AMPS allows applications to explicitly remove records from a SOW topic using the sow_delete
command.
When removing records from a SOW, there are three different ways to indicate which message, or messages, will be deleted:
Using a content filter. AMPS will delete all messages in the SOW that match the content filter. To delete every message in the SOW, use the special filter 1=1
to indicate that the filter is true for every message, regardless of the contents of the message. (In essence, AMPS runs a query to locate the records to be deleted, and then deletes the matching records.)
Using the SOW key assigned to the message. AMPS accepts a list of SOW keys, and will remove the messages indicated by those SOW keys.
Using message data. The application provides message data with the sow_delete
command. AMPS parses the message data to determine the SOW key for the record that would be updated if the command were a publish
, and deletes that record (if one exists). Notice that if the topic is configured so that publishers must provide the SOW key, the key cannot be derived from the data, which means that using message data to delete messages may not produce the expected results.
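The three ways of selecting records for deletion can be sketched against a dictionary that stands in for a SOW topic. This is an illustrative Python model only: the function names are hypothetical, a tuple of key field values stands in for the SOW key, and no AMPS client calls are shown.

```python
def sow_key(message, key_fields):
    """Stand-in for the SOW key: a tuple of the key field values."""
    return tuple(message[field] for field in key_fields)


def delete_by_filter(sow, predicate):
    """Delete every record that matches the filter; return the count."""
    doomed = [key for key, message in sow.items() if predicate(message)]
    for key in doomed:
        del sow[key]
    return len(doomed)


def delete_by_keys(sow, keys):
    """Delete the records identified by a list of SOW keys."""
    removed = 0
    for key in keys:
        if sow.pop(key, None) is not None:
            removed += 1
    return removed


def delete_by_data(sow, message, key_fields):
    """Derive the key from message data, then delete that record if present."""
    return delete_by_keys(sow, [sow_key(message, key_fields)])


KEY_FIELDS = ["orderId"]
sow = {
    sow_key(m, KEY_FIELDS): m
    for m in (
        {"orderId": 1, "price": 20},
        {"orderId": 2, "price": 95},
        {"orderId": 3, "price": 300},
    )
}
```

Deleting by data only works in this model because the key can be derived from the message, which matches the caveat above for topics where publishers supply the SOW key.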
When a record is removed from the SOW, AMPS sends an out-of-focus (OOF) message to any subscriptions that have requested OOF notifications. AMPS also updates any views that use the SOW topic, and the record will be removed from conflated topics at the next conflation interval.
When the SOW is configured with the History
option to enable historical queries, the sow_delete
command removes the message from the current set of messages in the SOW. The command does not remove previously saved versions of the message: the historical state of the SOW is unaffected by the sow_delete
.
The most efficient way to delete a specific message or specific set of messages is to use the SOW key that AMPS assigns, when that key is available. You can provide these keys in the SowKeys
header (a delete by keys), or by providing a filter expression that will be evaluated as a query on the primary key or a hash index. See the Indexing SOW Topics section for details on how AMPS determines if a hash index or primary key can be used for a filter.
When the SOW delete provides an example message to be deleted, AMPS parses that message to determine the SOW key and then uses that key to delete the message, which is also relatively efficient.
Deleting a message from the SOW means that AMPS can reuse the space that the message consumed, but AMPS does not reduce the size of the storage for the topic when a message is removed. SOW topics in production typically reach a steady state based on the number of messages present at any given time, so it is most efficient to simply make the space available for new messages.
To reduce the size of a file used to persist a topic in the SOW after messages are removed, use the Compact SOW Topic action. Running this operation will typically reduce throughput to the topic being compacted during the process of compacting the topic, so this should only be done during a maintenance window or when reducing (or pausing) throughput to the topic would have less impact on the application than leaving the SOW file at its current size.
Removing a message from a Topic
in the State of the World removes the message from that Topic
and notifies any View
or ConflatedTopic
that depends on this topic that the message has been removed (see Aggregation and Analytics for details on creating a View
, see Conflated Topics for details on creating a conflated topic). Removing a message from a Topic
adds the delete command to the transaction log, but does not remove messages stored in the transaction log (see Record and Replay Messages).
If the Topic
contains History
, the sow_delete
affects the current value of the Topic
but does not remove previous state. AMPS will remove records that have not been current for longer than the retention Window
, as described in the Historical SOW Topic Queries section.
Applications that store topics in the SOW must consider the ongoing storage needs and file management for the SOW.
There are two aspects to SOW maintenance:
Ensuring that the host system has enough capacity to efficiently store and manage the topics in the SOW. Capacity planning guidelines are discussed in the Capacity Planning section in the operations section of this guide.
Setting and implementing a data retention policy for the contents of each topic in the SOW.
The data retention policy for a topic in the SOW is determined by the needs of your application.
Consider the following questions:
Does the topic have a data set that tends to stay at a consistent size? If so, there may be no need to explicitly manage data retention. Many AMPS applications have topics that fall into this category.
For example, an application that uses a SOW topic to track the current price of a specific set of ticker symbols has little need to set a data retention policy. The SOW will always contain the same number of records (one for each ticker symbol), and those records will always contain data of a consistent size. The application may choose to remove a record when a symbol is removed from the set, but otherwise rely on publishers to keep the data current.
Is the data only valid for a specific period of time after the data is published? If so, SOW expiration may be a good way to manage the SOW.
For example, an application that needs to ensure that quotes are removed from the system 10 minutes after the quote is published could use SOW expiration to remove records after 10 minutes.
Is the data valid until a certain condition becomes true? If so, having the application remove records from the SOW or using AMPS actions may be a good way to manage the SOW.
For example, an application that needs to clear the state of the SOW every 24 hours during a maintenance window could use an action to remove those records. An application that can determine when a record is no longer needed can remove the record immediately, which means that the topic only contains data that the application needs at any given time.
Regardless of the approach an application takes, 60East recommends that every application that uses a SOW consider capacity and explicitly consider the data retention needs of each topic and the application.
For many applications, messages need to be removed at specific times (for example, at the end of a trading day, or when the message reaches a certain state) rather than having a message-specific time to live.
These applications typically configure a scheduled maintenance plan using AMPS actions to manage the SOW and remove unneeded information.
For full details on AMPS actions, see the Automating AMPS with Actions section in the user guide and the Actions section in the configuration reference.
Below is an example of a configuration section for a SOW topic definition, where records will need to be removed when they have reached a state of closed
and been inactive for more than 24 hours. Since this is intended to manage the size of the saved state of the topic, it isn't necessary for messages to be removed precisely when they reach that state. Removing messages once a day, before activity begins for that day, is enough.
To create this maintenance plan, we configure an AMPS action that runs at 02:00
local time and removes the messages that the topic no longer needs.
Notice that there are two parts to this action. The On
element specifies when the action should run -- in this case, every day at 02:00
local time. The Do
element directs AMPS to delete messages from the ORDERS
topic (of message type nvfix
) where the /status
is closed
, and where the last update for the message is earlier than 24 hours (86400 seconds) from the time the action was started. At the scheduled time, AMPS internally runs a sow_delete
command that removes the specified messages. This command is also written to the transaction log and replicated to other instances.
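The selection rule described above can be modeled in a few lines of Python. This is only a sketch of the logic, not AMPS configuration or AMPS filter syntax: the `Record` type, its field names, and `select_for_delete` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

DAY_SECONDS = 86400  # 24 hours, as in the text above


@dataclass
class Record:
    order_id: str
    status: str
    last_update: float  # seconds since the epoch


def select_for_delete(records, action_start, max_idle=DAY_SECONDS):
    """Return the records that are closed and idle for more than max_idle seconds."""
    cutoff = action_start - max_idle
    return [r for r in records if r.status == "closed" and r.last_update < cutoff]


now = 1_000_000.0
records = [
    Record("A", "closed", now - 2 * DAY_SECONDS),  # closed and stale: selected
    Record("B", "closed", now - 60),               # closed but recent: kept
    Record("C", "open", now - 3 * DAY_SECONDS),    # stale but still open: kept
]
doomed = select_for_delete(records, action_start=now)
print([r.order_id for r in doomed])
```

Both conditions must hold for a record to be selected, so a recently closed order survives until a later run of the action.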
With this configuration, AMPS can efficiently maintain the SOW topic based on the needs of the application.
By default, a topic in the SOW stores all distinct records until a record is explicitly deleted. For scenarios where message persistence needs to be limited in duration, AMPS provides the ability to set a time limit on the lifespan of SOW topic messages. This limit on duration is known as message expiration and can be thought of as a "Time to Live" feature for messages stored in a SOW topic.
Expiration on SOW topics is disabled by default. For AMPS to expire messages in a SOW topic, you must explicitly enable expiration on the SOW topic.
There are two ways message expiration time can be set. First, a topic recorded in the SOW can specify a default lifespan for all messages stored for that topic. Second, each message can provide an expiration as part of the message header.
AMPS stores the expiration time for each message individually, as a property of the message in the SOW. The expiration for a given message is first determined based on the message expiration specified in the message header. If a message has no expiration specified in the header, the message inherits the expiration configured for the topic. If there is no message expiration and no topic expiration, the message does not expire. An expiration of 0 in the message header indicates that AMPS should not expire this message.
AMPS configuration supports the ability to specify a default message expiration for all messages in a single SOW topic. Below is an example of a configuration section for a SOW topic definition with an expiration. The SOW/Topic section of the AMPS Configuration Guide has more detail on how to configure the SOW topic.
In this case, messages with no lifetime specified on the message have a 30 second lifetime in the SOW. When a message arrives and that message has an expiration set, the message expiration on the publish overrides the default expiration for the topic. Each publish or delta publish that arrives, including an update to an existing message, updates the expiration time.
AMPS also allows you to enable expiration on a SOW topic, but to only expire messages that have message-level expiration set:
With this configuration file, expiration is enabled for the topic. The message lifetime is specified on each individual message. When expiration is disabled for a SOW topic, AMPS preserves any message expiration set on an individual message but does not expire messages.
AMPS processes expirations during startup when SOW expiration is enabled. This means that any record in the SOW which needs to be expired will be expired as AMPS starts. Notice that if expiration has been disabled in the configuration file, AMPS will not process expiration for the topic.
When expiration is enabled for a topic in the SOW, each message published to that topic expires at the configured time by default.
An individual message can specify its own expiration. When an expiration time is provided on a message, that value overrides the default expiration set for the topic. For example, the SOW configuration for a topic might specify an expiration of 5 minutes for a pending order. For large orders, however, a publisher might explicitly prevent messages from expiring by providing a 0
for the expiration time when publishing the message.
AMPS does not process expiration for any messages in a topic recorded in the SOW unless expiration is enabled for the topic. When expiration is not configured for a topic, messages published to that topic do not expire, regardless of the expiration setting on an individual message.
When a message arrives, AMPS calculates the expiration time for the message and stores a timestamp at which the message expires in the SOW with the message. When the message contains an expiration time, AMPS uses that time to create the timestamp. When the message does not include an expiration time, but the topic contains an expiration time, AMPS uses the topic expiration for the message. Otherwise, there is no expiration set on the message, and AMPS records a timestamp value that indicates no expiration.
Messages in the SOW topic can receive updates before expiration. When a message is updated, the message’s expiration lifespan is reset. For example, a message is first published to a SOW topic with an expiration of 45 seconds. The message is updated 15 seconds after publication of the initial message, and the update resets the expiration to a new 45 second lifespan. This process can continue for the entire lifespan of the message, causing a new 45 second lifespan renewal for the message with every update.
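The precedence rules and the reset-on-update behavior can be summarized in a short Python model. This is a sketch under stated assumptions, not AMPS code: it assumes expiration is enabled for the topic, treats durations as seconds, and uses the hypothetical names `expiration_timestamp` and `NO_EXPIRATION`.

```python
from typing import Optional

NO_EXPIRATION = None  # stand-in for the stored "never expires" marker


def expiration_timestamp(arrival: float,
                         message_expiration: Optional[float],
                         topic_expiration: Optional[float]) -> Optional[float]:
    """Return the time at which a message expires, or NO_EXPIRATION.

    Durations are in seconds; None means "not specified".
    """
    if message_expiration is not None:
        if message_expiration == 0:
            return NO_EXPIRATION  # 0 on the message opts out of expiration
        return arrival + message_expiration
    if topic_expiration is not None:
        return arrival + topic_expiration
    return NO_EXPIRATION


# Topic default of 45 seconds, no expiration on the message itself.
first = expiration_timestamp(arrival=0, message_expiration=None, topic_expiration=45)
# An update 15 seconds later goes through the same calculation,
# so the 45 second lifespan starts again.
renewed = expiration_timestamp(arrival=15, message_expiration=None, topic_expiration=45)
# A publisher opts a message out of expiration by sending 0.
pinned = expiration_timestamp(arrival=15, message_expiration=0, topic_expiration=45)
print(first, renewed, pinned)
```

Because every publish recomputes the timestamp from its own arrival time, a steadily updated message never expires as long as updates arrive within the lifespan.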
If a message expires, then the message is deleted from the SOW topic. This event will trigger delete processing to be executed for the message, similar to the process of executing a sow_delete
command on a message stored in a SOW topic.
When using message expiration, one common scenario is that the message has an expiration set, but the AMPS instance is shut down during the lifetime of the message.
To handle such a scenario, AMPS calculates and stores a timestamp for the expiration, as described above. If the AMPS instance is shut down, then upon recovery the engine checks which messages have expired since the shutdown. Any expired messages will be removed from the topic as soon as possible.
Notice that, because the timestamp is stored with each message, changing the default expiration of a SOW topic does not affect the lifetime of messages already in the SOW. Those timestamps have already been calculated, and AMPS does not recalculate them when the instance is restarted or when the defaults on the SOW topic change. If expiration is not enabled for the topic after the configuration change, AMPS does not process expirations for that topic and messages will not expire.
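A minimal Python model of recovery illustrates why only the stored timestamps matter. The function name, the record keys, and the choice to treat a timestamp equal to the current time as expired are assumptions made for this sketch.

```python
def surviving_records(stored_expiry, now, expiration_enabled):
    """Return the keys still present after startup processing.

    stored_expiry maps a record key to its stored expiration timestamp,
    or to None when the record never expires.
    """
    if not expiration_enabled:
        # Expiration is not processed at all for this topic.
        return sorted(stored_expiry)
    return sorted(key for key, expires_at in stored_expiry.items()
                  if expires_at is None or expires_at > now)


# Timestamps were computed when each record was published.
stored = {"q1": 130.0, "q2": 400.0, "q3": None}

after_restart = surviving_records(stored, now=200.0, expiration_enabled=True)
expiry_disabled = surviving_records(stored, now=200.0, expiration_enabled=False)
print(after_restart, expiry_disabled)
```

The topic default does not appear anywhere in the function: changing it after the records are stored has no effect on which records survive.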
Because the expiration time is stored as an attribute of each individual message, that expiration time is replicated with the message. A downstream instance that receives the message via replication does not reset or change the expiration time that is stamped on the message.
Expiration processing happens on each individual instance. The fact that a message has expired is not replicated (this is not necessary, since the message expiration is stored as a part of the message, so each individual instance can manage expiration locally).
When SOW topics are configured inside an AMPS instance, clients can issue SOW queries to AMPS to retrieve all of the messages matching a given topic and content filter. When a query is executed, AMPS will test each message in the SOW against the content filter specified and all messages matching the filter will be returned to the client. The topic can be a straight topic or a regular expression pattern.
Topics in the State of the World can also be configured to include historical snapshots of messages, which allows subscribers to retrieve the contents of the topic at a particular point in time.
As with simple queries, a client can issue a query by sending AMPS a sow
command and specifying an AMPS topic. For a historical query, the client also adds a timestamp that includes the point in time for the query in the Bookmark
header of the command. A filter can be used to further refine the query results based on the message content.
Use a historical SOW query when it is important to get a snapshot of the state of messages in a topic as they existed at a specific point in time (that is, if it is important for an application to be able to query the state of the world at a point in time).
If an application needs to replay the exact sequence of messages delivered to a topic, but does not need to be able to query the values that were current at a specific point in time, record the topic in the transaction log and replay from the transaction log.
If an application needs to both retrieve a snapshot of the values that were current at a specific point in time and replay the exact sequence of messages from that point forward, use a historical SOW query and record the topic in the transaction log.
By default, AMPS does not maintain history for a topic in the State of the World. To enable history (and historical query) for the topic, add the History
element to the Topic
configuration. This element configures how much information AMPS stores for enabling historical queries.
There are two options that control how AMPS stores data for historical queries:
The Window
option sets the amount of time that AMPS will retain historical versions of messages. AMPS will remove the historical state of the message from the SOW topic once that historical state is older than the specified window. (If the message has been deleted, and the delete command is older than the specified window, AMPS may remove the message from the SOW topic entirely). AMPS always retains the most current state of a message, even if that state was published earlier than the specified Window
.
In other words, a given version of a message is eligible for removal after it has no longer been the most current update to that message for longer than the specified Window
.
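As a worked illustration of this rule, the following Python sketch decides which versions of a single message are eligible for removal. It is a model of the rule as stated here, not of how AMPS stores history; the function name and the use of plain numbers for times are assumptions.

```python
def removable_versions(version_times, now, window):
    """Return the publish times of versions eligible for removal.

    version_times is the sorted list of publish times for one message.
    A version stops being current when the next version is published.
    """
    removable = []
    for published, superseded_at in zip(version_times, version_times[1:]):
        if now - superseded_at > window:
            removable.append(published)
    # The last entry is the current version and is never considered.
    return removable


# Versions published at t=0, t=100 and t=500; the Window is 600 seconds.
eligible = removable_versions([0, 100, 500], now=1000, window=600)
print(eligible)
```

The version published at `t=0` was superseded 900 seconds ago and is eligible. The version published at `t=100` was superseded only 500 seconds ago and is kept, and the current version is always kept.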
The Granularity
option sets the interval at which AMPS retains a historical copy of a message in the SOW. For example, if the Granularity
is set to 10m
, AMPS stores a historical copy of the message no more frequently than every 10 minutes, regardless of how many times the message is updated in that 10 minute interval. AMPS stores the copies when a new message arrives to update the SOW. This means that AMPS always returns a valid SOW state that reflects a published message, but -- as with a conflated topic -- the SOW may not reflect all of the states that a message passes through. This also means that AMPS uses SOW space efficiently. If no updates have arrived for a message since the last time a historical copy was saved, AMPS has no need to save another copy of the message.
When a message is deleted from a topic that maintains history, AMPS saves the fact that the message has been deleted, and queries as of that point in time will not return the message. However, previously saved states of the message within the Window
are still present and can still be queried.
Likewise, if an application queries at a point in time earlier than the Window
, AMPS will return an empty result set (even if messages had actually been present in the topic at that point), since the SOW state is only retained for the period in the Window
.
The Granularity
for a topic is always specified as a duration. If your application requires that a query be able to return the exact state of the SOW exactly as AMPS would have represented it at that time (with no tolerance for the granularity), you can specify that AMPS keep every message during the Window
by specifying a Granularity
of 0s
. Notice that this is not required to replay every message after a point-in-time query, since replay is delivered from the transaction log rather than the stored State of the World.
When a historical SOW and Subscribe query is entered, and the topic is covered by a transaction log, AMPS returns the state of the SOW adjusted to the next oldest granularity, then replays messages from that point. In other words, AMPS returns the same results as a historical SOW query, then replays the full sequence of messages from that point forward.
The transaction log and the SOW topic are maintained separately and have separate views of history. When a version of the message is removed from the SOW topic (because it is older than the specified Window
), the message remains in the transaction log, but will not be returned by a SOW query.
The message sequence flow is the same as a simple SOW query flow. Once AMPS has transmitted the messages that were in the SOW as of the timestamp of the query, the query ends. Notice that the query will include messages that have been subsequently deleted from the SOW, but which were the current state of the message as of that timestamp.
Topics that maintain History
in the SOW support paginated queries from a point in time. When the topic is also covered by the transaction log, the sow_and_subscribe
command also supports paginated subscriptions from a point in time. See Paginated SOW and Subscribe for details.
A client can issue a query by sending AMPS a sow
command and specifying an AMPS topic. Optionally a filter can be used to further refine the query results. AMPS also allows you to restrict the query to a specific set of messages identified by a set of SowKeys. When AMPS receives the sow
command request, it will validate the filter and start executing the query. When returning a query result back to the client, AMPS will package the sow
results into a sow
record group by first sending a group_begin
message followed by the matching SOW records, if any, and finally indicating that all records have been sent by terminating with a group_end
message. AMPS returns the results for a SOW query in a single, atomic operation. Any messages for the client that arrive during the SOW query are delivered after the SOW results.
AMPS treats queries as a single, atomic operation. All results from a query are sent to a client before the results of any subsequent commands. Use care when issuing queries that return a result set large enough to take several seconds or more to transmit over the network.
When planning for large queries, please see the information on how AMPS handles a situation where messages are produced faster than the client or network can consume them. This is discussed in the section on Slow Client mitigation.
The sequence diagram below illustrates the message flow for a SOW query.
For purposes of correlating a query request to its result, each query command can specify a QueryId
. The QueryId
specified will be returned as part of the response that is delivered back to the client. The group_begin
and group_end
messages will have the QueryId
attribute set to the value provided by the client. The client specified QueryId
is what the client can use to correlate query commands and responses coming from the AMPS engine.
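The shape of a query response can be sketched as a Python generator. The dictionary field names are hypothetical and do not reflect the AMPS wire format or any client library API. The sketch tags only the group messages with the query identifier, because those are the messages this section describes as carrying it.

```python
def sow_query_response(query_id, sow_records, matches):
    """Yield the messages for one SOW query, in delivery order."""
    yield {"command": "group_begin", "query_id": query_id}
    for record in sow_records:
        if matches(record):
            yield {"command": "sow", "data": record}
    yield {"command": "group_end", "query_id": query_id}


orders = [
    {"id": 1, "status": "open"},
    {"id": 2, "status": "closed"},
    {"id": 3, "status": "open"},
]
response = list(sow_query_response("q-42", orders, lambda r: r["status"] == "open"))
commands = [m["command"] for m in response]
print(commands)
```

A query that matches nothing still produces a `group_begin` followed immediately by a `group_end`, so a client can always detect the end of the result set.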
AMPS does not allow a sow
command on topics that do not have a SOW enabled. If a client queries a topic that does not have a SOW enabled, AMPS returns an error.
The ordering of records returned by a SOW query is undefined by default. You can include an OrderBy
parameter on the query to specify a particular ordering based on the contents of the messages.
AMPS has a special command that will execute a query and place a subscription at the same time to prevent a gap between the query and subscription where messages can be lost. Without a command like this, it is difficult to reproduce the SOW state locally on a client without creating complex code to reconcile incoming messages and state.
For example, this command is useful for recreating part of the SOW in a local cache and keeping it up to date. Without a special command to place the query and subscription at the same moment, a client is left with two options:
Issue the query request, process the query results, and then place the subscription, which misses any records published between the time when the query and subscription were placed;
or
Place the subscription and then issue the query request, which could deliver messages published between the subscription and the query twice.
Instead of requiring every program to work around these options, the AMPS sow_and_subscribe
command allows clients to place a query and get the streaming updates to matching messages in a single command.
In a sow_and_subscribe
command, AMPS behaves as if the SOW command and subscription are placed at the exact same moment. The SOW query will be sent before any messages from the subscription are sent to the client. Additionally, any new publishes that come into AMPS that match the sow_and_subscribe
filtering criteria and come in after the query started will be sent after the query finishes (and the query will not include those messages). As with a simple SOW query, any other messages that arrive for the client while the SOW query is running will also be delivered after the query results.
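The difference between the two workarounds and the atomic command can be seen in a small Python simulation. The timeline, keys, and helper names are invented for illustration, and the model treats the SOW as a simple last-value-per-key dictionary.

```python
def sow_at(publishes, t):
    """Last value per key among publishes that arrived at or before time t."""
    state = {}
    for when, key, value in publishes:
        if when <= t:
            state[key] = value
    return state


def updates_after(publishes, t):
    """Publishes delivered to a subscription placed at time t."""
    return [(key, value) for when, key, value in publishes if when > t]


def client_view(query_result, updates):
    """Local cache after applying the query result and then the updates."""
    view = dict(query_result)
    for key, value in updates:
        view[key] = value
    return view


publishes = [(1, "a", 10), (2, "b", 20), (3, "a", 11), (4, "b", 21)]

# Query at t=2, then subscribe at t=3: the publish at t=3 is never seen.
gap = client_view(sow_at(publishes, 2), updates_after(publishes, 3))

# Subscribe at t=2, then query at t=3: the publish at t=3 arrives twice.
query_late = sow_at(publishes, 3)
duplicated = [u for u in updates_after(publishes, 2) if query_late.get(u[0]) == u[1]]

# Query and subscription placed at the same moment (t=2).
atomic = client_view(sow_at(publishes, 2), updates_after(publishes, 2))
print(gap, duplicated, atomic)
```

In the first case the cache ends with a stale value for `a`. In the second the client must detect and discard the duplicate. Only the atomic placement yields the correct state with each update delivered once.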
AMPS allows a sow_and_subscribe
command on topics that do not have a SOW enabled. In this case, AMPS simply returns no messages between group_begin
and group_end
.
The sequence diagram below illustrates the message flow for sow_and_subscribe
commands:
For topics that have History
configured, AMPS SOW Query and Subscribe also allows you to begin the subscription with a historical SOW query. For historical SOW queries, the subscription begins at the point of the query with the results of the SOW query. The subscription then replays messages from the transaction log. Once messages from the transaction log have been replayed, the subscription then provides messages as AMPS publishes them.
In effect, a SOW Query and Subscribe with a historical query allows you to recreate the client state and processing as though the client had issued a SOW Query and Subscribe at the point in time of the historical query.
A historical SOW and Subscribe requires that the SOW topic is recorded in the transaction log and that history is enabled on the SOW. If history is not enabled for the topic, a sow_and_subscribe
command returns the current state of the SOW and the subscription begins atomically at the point in time when AMPS processes the command.
AMPS allows you to control the results returned by a SOW query by including the following options and header on the query:
For details on how to submit these options with a SOW query, see the documentation for the AMPS client library your application uses.
When replacing a subscription that uses top_n
, skip_n
, or OrderBy
, any of these options specified on the original command must be provided on the replacement command. In other words, a sow_and_subscribe
command that specifies top_n=10,skip_n=20
must provide both top_n
and skip_n
on a replacement command.
When top_n
and skip_n
are specified on a sow_and_subscribe
command, AMPS creates a paginated subscription. (Both top_n
and skip_n
must be provided to create a paginated subscription.)
With a paginated subscription, AMPS maintains a list of the set of results for the SOW query, and delivers only the records in the pagination window: the window begins with the first record after the number of records given by skip_n
and contains at most the number of records given by top_n
. This allows applications that only need a subset of the results returned by a filter to work with only those results. This is commonly used for interactive applications, where a user interface shows a small number of records at a time in the interface.
When the subscription specifies an OrderBy
, that header specifies the order in which records are sorted within the paginated subscription. If no OrderBy
is specified, the results are sorted by the SowKey
generated by AMPS (effectively, an arbitrary but stable order).
From a subscriber point of view, paginated subscriptions behave as though only the messages in the pagination window are present in AMPS. For example, when out-of-focus notifications are enabled and a message in the topic is deleted, subscribers receive an oof
notification only if the deleted message was in the pagination window. Likewise, if a message that was previously in the pagination window falls outside of the window due to an insert or delete, the message that is now outside of the window will be out of focus, and will generate an oof
notification.
For example, consider the following topic in the SOW, where the topic uses the /id
field as a key.
With a top_n
of 2
, a skip_n
of 1
, and an OrderBy
of /id
, the results for the subscription will include the records with id
of 2
and id
of 5
.
Now a new message is published with an id
of 4
, as shown below:
Since the new message falls within the pagination window, the message is published to the subscriber. Given that the message with the id
of 5
is no longer within the pagination window, the subscriber will receive an oof
message for the message with an id
of 5
if the subscriber has requested out-of-focus notifications.
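The example can be reproduced with a short Python model of the pagination window. The starting set of ids (1, 2, 5, and 8) is an assumption chosen to be consistent with the results described above, and `page` is a hypothetical helper rather than part of any AMPS API.

```python
def page(ids, top_n, skip_n):
    """Return the ids inside the pagination window, ordered by id."""
    ordered = sorted(ids)
    return ordered[skip_n:skip_n + top_n]


topic = [1, 2, 5, 8]                       # assumed contents of the topic
before = page(topic, top_n=2, skip_n=1)    # window before the publish

topic.append(4)                            # a new message with id 4 arrives
after = page(topic, top_n=2, skip_n=1)     # window after the publish

delivered = [i for i in after if i not in before]  # newly in the window
oof = [i for i in before if i not in after]        # pushed out of the window
print(before, after, delivered, oof)
```

The record with id 5 is still in the topic after the publish. It goes out of focus only because the insert moved it past the end of the window.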
While a paginated subscription is active, AMPS maintains a list of the messages that match the subscription in memory (but does not, as of version 5.3.2, maintain the entire sorted result set in memory). For efficiency, when more than one subscription uses the same topic, these subscriptions will use the same result set in memory. The memory used counts as part of the configured MessageMemoryLimit
. Each connection that uses the result set is counted as consuming a portion of the memory retained. For example, if 5 connections use the same result set, each of those connections is counted as using 1/5 of the memory for the result set.
In addition, each paginated subscription requires that AMPS maintain state for the window for that subscription: this memory is not shared and is counted for that client.
AMPS provides the ability to aggregate the results of a SOW query. The results of an aggregated SOW query are the same as the results of querying a View
with the same definition.
To request an aggregated SOW query, provide the grouping
and projection
options with the sow
query.
AMPS provides a way to easily define a set of related SOW topics by specifying a Pattern
element in the Topic
configuration. When this element is present, AMPS creates a container SOW topic that can include a number of SOW topics in one physical file. A publish to a topic name that matches the Pattern
will be treated as an individual SOW topic within the container topic that defines the pattern. The definition of each individual topic (for example, the Key
values defined, the hash indexes defined, and so on) is defined by the container Topic
, and is the same for every individual topic within the container SOW topic.
Using this approach creates a single physical topic (that is, the container is a single in-memory SOW topic and, when the topic is persisted, a single file) that contains records for any number of individual topic names. The topics within the container maintain the last value of each individual record within each of the topics. Publishers and subscribers can use these topics as though the topics were each configured individually as a Topic
in the AMPS configuration file (with some minor behavioral differences resulting from all of the topics and messages being stored in the same data structure, as described in the following sections).
Although AMPS treats every topic within the container SOW topic as a distinct topic for the purposes of publishing and subscribing, AMPS manages those topics as records within a single SOW object. When the overall SOW topic is persisted, every message for an individual topic is stored within the same file. Likewise, the overall SOW topic is treated as a single topic in memory (including for monitoring and statistics purposes). In cases where an application has a large number of topics and each topic has a small number of messages (typically, in cases where each topic has only a single message), using a Pattern
can use considerably less memory than individual Topic
entries for the same number of topics and messages.
60East recommends using a Pattern
for a topic in situations where an existing system uses topic names rather than content filtering, and it is not practical to adjust the system. For example, when migrating a legacy system that distinguishes orders for different customers using different topic names rather than using the content of the message (such as using topic names /orders/customerA
and /orders/customerB
rather than including a customer
field on the message), creating a SOW topic using a Pattern
of ^/orders/
might be the most straightforward way to adapt the system to AMPS.
For a small number of topics, or cases where an individual topic would have a large number of entries, 60East recommends using individual topics rather than specifying a Pattern
.
The Pattern
element allows you to define a large number of SOW topics that will hold a small number of records (typically, only one record per topic) while minimizing the memory and storage overhead for each topic. This can be especially helpful when migrating a system that uses topic-based routing to AMPS, since you can easily create a large number of topics (hundreds or thousands) without having to explicitly specify each one in the AMPS configuration file in cases where it is important to query the last value of each topic.
Consider using the Pattern
element in cases where:
You need to be able to query the current value of a record (or topic). If you do not need to query current values, there is no need to define topics in the SOW at all (consider using ad hoc topics instead).
The information that determines whether a given message is unique is not contained in the message itself. If that information is already present in the data, it is more efficient to use a Topic
with the unique property configured as a Key
.
The messages have the same structure and are the same logical type of message. Messages that are different types, or that have different structures, would typically be represented in different topics.
Your application requires a large number of topics, or you do not know the topics in advance, such that it is impractical to define the topics using individual Topic
declarations.
Each unique topic will have a small number of messages (ideally, only one message per topic).
Your application does not require historical point-in-time queries or enrichment of the messages.
All of the topics to be managed together have the same general set of permissions. AMPS does not support applying different entitlements to individual topics within an overall SOW topic (some limited workarounds are available through content filters).
If any of the above considerations are not true, consider using a set of Topic
declarations rather than using the Pattern
element in a single Topic
.
Container topics are most commonly used when adapting a system that did not support content filtering (content-based routing) to an AMPS-based application in cases where the message data itself does not contain enough information to support content-based routing. Applications designed for AMPS most frequently use a Topic
and content filtering rather than specifying a Pattern
for a Topic
and providing routing information in the topic name.
For most purposes, topics that use a Pattern
work just like any other topic defined using the Topic
directive. However, there are some differences in behavior, as outlined below:
When an application issues a sow
or sow_and_subscribe
that uses a regular expression for the topic name, messages from topics within a topic that uses Pattern
are delivered between a single group_begin
and group_end
pair. Messages from any topic name within the topic may be delivered in any order within the query results. Each message will indicate which topic within the topic it originated from.
A topic that uses Pattern
cannot be the underlying topic for a view.
A topic that uses Pattern
can be the underlying topic for a conflated topic, but the conflated topic must be configured to use such a topic.
All of the topic names within the topic must have the same permissions.
AMPS allows you to define a standalone topic, view, queue, or conflated topic with a Name
that matches the Pattern
of the Topic
. To do so, however, that definition must appear in the configuration file before the definition of the topic that uses the Pattern
. The topic, view, queue, or conflated topic will be configured as though the topic that defined the Pattern
is not present.
For example, the following SOW
configuration creates a Topic
named /orders/specialHandling
and a Topic
with a Pattern
that matches ^/orders/
. The /orders/specialHandling
topic adds preprocessing, and could also, in principle, have different permissions than the topic names that are matched by the Pattern
.
With these definitions, a publish to the following topic names would produce the following results:
The transaction log can be used to record publishes to a physical topic that contains multiple logical topics.
To do this, the transaction log specification must contain a Topic
directive that matches the physical topic Name
. This will capture all of the logical topics in the transaction log. Notice that only the physical topic Name
is considered in this case.
AMPS considers permissions for all of the logical topics within the physical SOW topic to be identical. When checking permissions for the topic with an entitlement module, AMPS requests that the module provide permissions for the Pattern
specified in the topic. Any topic name included in the container will use the permissions, entitlement filter, and entitlement select list specified by the module for that Pattern
.
If it becomes necessary to restrict access to individual topics within the physical topic, there are two approaches that you can take:
Create a new topic with a Pattern
that specifies the topics that require different permissions, and apply the permissions to that topic.
Provide an entitlement filter that uses the TOPIC_NAME()
function to restrict access to specific topic names; for example, TOPIC_NAME() IN ('/orders/RHAT', '/orders/MSFT', '/orders/IBM')
. Using this method is less efficient than providing permissions for those topics (either as standalone topics, or for a regular expression topic containing exactly those three topics), but this approach can be a good option in cases where subscribers typically subscribe only to the topics they are entitled to, different subscribers have substantially different sets of entitlements, or there are no logical or convenient groupings that can be used to separate the topics into several regular expression topic declarations.
A sow_and_subscribe
command can include the conflation interval and conflation key options for server side conflation (as described in ), just as a regular subscription can. When the command requests conflation, the results of the SOW query are not conflated. Conflation only applies to the subscription.
As described in , AMPS allows you to replace an existing subscription. When the subscription is entered with the sow_and_subscribe
command, AMPS will re-run the SOW query delivering the messages that are in scope with the new filter but which were not previously delivered. If the subscription requests out-of-focus (OOF) messages, AMPS will deliver out of focus messages for messages that matched the previous filter but do not match the new filter. As with the initial Query and Subscribe, AMPS guarantees to deliver any changes to the SOW that match the filter and occur after the point of the query.
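The effect of replacing the filter on a sow_and_subscribe subscription can be pictured as set arithmetic over SOW keys. The following is a minimal sketch in plain Python, not the AMPS client API. The function name replace_filter, the record layout, and the use of Python callables in place of AMPS content filters are all assumptions made for illustration, and the sketch assumes that every record matching the old filter had already been delivered.

```python
def replace_filter(sow, old_filter, new_filter, oof_requested=True):
    """Model which records a subscriber receives when a filter is replaced.

    sow maps a SOW key to the current record (a dict). The filters are
    ordinary Python predicates standing in for AMPS content filters.
    """
    previously_in_scope = {k for k, rec in sow.items() if old_filter(rec)}
    now_in_scope = {k for k, rec in sow.items() if new_filter(rec)}

    # Records newly in scope are delivered by the re-run SOW query.
    deliver = sorted(now_in_scope - previously_in_scope)
    # Records that fell out of scope produce OOF messages, if requested.
    oof = sorted(previously_in_scope - now_in_scope) if oof_requested else []
    return deliver, oof


# Hypothetical SOW contents.
sow = {
    "k1": {"symbol": "IBM", "qty": 100},
    "k2": {"symbol": "MSFT", "qty": 500},
    "k3": {"symbol": "RHAT", "qty": 900},
}
deliver, oof = replace_filter(
    sow,
    old_filter=lambda r: r["qty"] <= 500,  # matched k1 and k2
    new_filter=lambda r: r["qty"] >= 500,  # matches k2 and k3
)
print("delivered by re-run query:", deliver)
print("out of focus:", oof)
```

In this model, k2 matches both filters, so it is neither redelivered nor reported as out of focus.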
top_n
(option)
Limits the results returned to the number of messages specified.
When a skip_n
option is also provided for a subscription, AMPS creates a paginated subscription. Otherwise, this option applies only to the SOW query part of a sow_and_subscribe
or sow_and_delta_subscribe
command.
skip_n
(option)
Skips the number of messages specified before returning results.
A command that provides this option must also provide a top_n
option.
OrderBy
(command header)
Orders the results returned as specified.
Requires a comma-separated list of identifiers of the form:
The ASC
directive specifies that AMPS sort the results in ascending order (the default). DESC
specifies that AMPS sort the results in descending order.
The TEXT
hint specifies that AMPS will sort the column according to the textual representation of the column. This can be helpful in cases where the column represents a string value, but where some values could be interpreted as numeric values.
For example, to sort in descending order by orderDate
so that the most recent orders are first, and ascending order by customerName
for orders with the same date, you might use a specifier such as:
As another example, the following specifier will sort the orderId
field as a string, with the updateTimestamp
sorted in descending order for orders with the same orderId
.
If no sort order is specified for an identifier, AMPS defaults to ascending order. If no type hint is specified for an identifier, AMPS defaults to using a mixed-type sort.
grouping=[keys]
For use with aggregated SOW queries.
The format of this option is a comma-delimited list of XPath identifiers within brackets. For example, to aggregate entries based on their /description
(producing one record in the aggregation for each distinct value in /description
), you would use the following option:
When this option is provided, a projection
must also be provided.
When the topic has History
enabled, this option can be used with a bookmark to aggregate the historical state of the SOW.
projection=[fields]
For use with aggregated SOW queries.
Specifies a comma-delimited set of fields to project, within brackets. Each entry has the format described in the AMPS User Guide.
This option must contain an entry for every field in the aggregated message. If there is no entry for a field in this option, that field will not appear in the aggregated message, even if the field is in the underlying message.
There is no default for this option. When this option is provided, a grouping
must also be provided.
When the topic has History
enabled, this option can be used with a bookmark to aggregate the historical state of the SOW.
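To make the semantics of these query options concrete, the following sketch models ordering, pagination, and grouped projection over an in-memory list of records. It is plain Python rather than AMPS syntax: the helper names order_by, page, and aggregate, the sample records, and the representation of a projection as a dictionary of Python functions are assumptions for illustration only.

```python
from collections import defaultdict


def order_by(records, keys):
    """keys is a list of (field, descending) pairs, most significant first."""
    result = list(records)
    # Python's sort is stable, so apply the least significant key first.
    for field, descending in reversed(keys):
        result.sort(key=lambda r: r[field], reverse=descending)
    return result


def page(records, top_n, skip_n=0):
    """Skip skip_n records, then return at most top_n records."""
    return records[skip_n:skip_n + top_n]


def aggregate(records, group_field, projection):
    """Produce one output record per distinct value of group_field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_field]].append(record)
    # Only fields named in the projection appear in the output.
    return [{name: fn(rows) for name, fn in projection.items()}
            for rows in groups.values()]


# Hypothetical sample data.
orders = [
    {"orderId": 1, "orderDate": "2024-03-02", "customerName": "Baker", "description": "widget", "qty": 5},
    {"orderId": 2, "orderDate": "2024-03-01", "customerName": "Adams", "description": "gadget", "qty": 2},
    {"orderId": 3, "orderDate": "2024-03-02", "customerName": "Adams", "description": "widget", "qty": 1},
    {"orderId": 4, "orderDate": "2024-03-01", "customerName": "Clark", "description": "widget", "qty": 4},
]

# Most recent orders first, then customer name ascending within a date.
by_date = order_by(orders, [("orderDate", True), ("customerName", False)])
second_page = page(by_date, top_n=2, skip_n=2)
totals = aggregate(orders, "description", {
    "description": lambda rows: rows[0]["description"],
    "totalQty": lambda rows: sum(r["qty"] for r in rows),
})
print([o["orderId"] for o in by_date], [o["orderId"] for o in second_page], totals)
```

Notice that the aggregated records contain only the two projected fields, which mirrors the requirement that a projection name every field that should appear in the result.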
Message Published to Topic: /orders/specialHandling
Results: Matches the Topic definition. Stored in the Topic. The Preprocessing directive runs and creates the /orderId from the /customerName and /customerSerialNumber if there is no /orderId already present.

Message Published to Topic: /orders/RHAT
Results: Matches the regular expression topic definition, stored in the regular expression topic.

Message Published to Topic: /orders/specialHandling/oops
Results: Matches the regular expression topic definition, stored in the regular expression topic. Notice that a Topic definition is an exact match on the topic name, not a pattern match.

Message Published to Topic: /customer/orders/timothy_someone
Results: Does not match either the Topic or the regular expression topic. Not included in the SOW.
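The matching results listed above can be reproduced with a short sketch. This is an illustrative model in Python, not AMPS code: the resolve function and the two module-level constants are hypothetical stand-ins for the exact-name Topic and the Pattern topic described in the text.

```python
import re

# Hypothetical stand-ins for the two definitions described in the text.
EXACT_TOPICS = {"/orders/specialHandling"}
PATTERN = re.compile(r"^/orders/")


def resolve(topic_name):
    """Return which definition would store a publish to topic_name."""
    # A Topic with a Name is an exact match and takes precedence.
    if topic_name in EXACT_TOPICS:
        return "standalone topic"
    # Otherwise, the topic name is tested against the Pattern.
    if PATTERN.search(topic_name):
        return "regular expression topic"
    # No definition matches: the message is not included in the SOW.
    return None


for name in ("/orders/specialHandling", "/orders/RHAT",
             "/orders/specialHandling/oops", "/customer/orders/timothy_someone"):
    print(name, "->", resolve(name))
```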
AMPS allows a publisher to update and add fields within a message that is stored in a State-of-the-World Topic
using the delta_publish
command. In high-performance messaging, it can be important to conserve bandwidth by sending the smallest possible update over the network.
An incremental update may improve performance in environments where bandwidth is at a premium. Since an incremental update requires that AMPS parse, merge, and re-serialize messages, an incremental update can consume somewhat more CPU on the AMPS server than a simple publish, particularly for large messages with a complex structure (such as deeply-nested documents).
To be able to incrementally update a message, the message type for the Topic
must support delta messages. All of the included AMPS message types, except for binary
and struct
, support delta messages, with the limitations described in each section below. For custom message types, contact the message type implementer to determine whether delta support is provided.
AMPS also supports the ability of a subscriber to receive only the changed parts of a message, described in the section on Receiving Only Updated Fields.
While these features are often used together, the features are independent. For example, a subscriber can request a regular subscription even if a publisher is publishing deltas. Likewise, a subscriber can request a delta subscription even if a publisher is publishing full messages.
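As a rough illustration of why a delta publish saves bandwidth, the sketch below merges a small update into a stored record and compares serialized sizes. It is a simplified model for a flat JSON message, not the AMPS merge implementation: the apply_delta function and the sample order are assumptions, as is the choice to carry the orderId field in the delta so the record can be identified. Merge rules for nested documents depend on the message type and are not modeled here.

```python
import json


def apply_delta(stored, delta):
    """Merge a flat delta into the stored record, field by field."""
    merged = dict(stored)
    merged.update(delta)  # fields in the delta are added or overwritten
    return merged


# Hypothetical stored record and update.
stored = {"orderId": 42, "symbol": "MSFT", "qty": 100, "status": "open"}
delta = {"orderId": 42, "status": "filled"}

merged = apply_delta(stored, delta)
print(merged)
print("bytes sent as delta:", len(json.dumps(delta)))
print("bytes sent as full message:", len(json.dumps(merged)))
```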
Topics recorded to the State of the World (SOW) can provide inline message enrichment for messages published to the topic. This capability is especially useful for applications that do consistent, simple transformations on incoming data. For example, you can use this capability to automatically add a calculated price to an incoming order, to map abbreviated data such as status codes to easier-to-understand values, or even to compute the value of a field used for a SOW key.
AMPS provides two distinct stages of message enrichment: preprocessing and enrichment. The preprocessing stage occurs before AMPS calculates the SOW key for the message. Fields that are added or updated in the preprocessing stage can be used as the SOW key for the message. Given that this stage occurs before the SOW key is generated, this stage does not have access to the previous state of the message in the SOW. The enrichment stage occurs after AMPS calculates the SOW key. Enrichment performed at this stage has access to the previous state of the SOW.
If entitlement for the instance uses content filters for publish entitlements, these filters are applied to the incoming message before either enrichment stage runs. For more details on the steps involved in enrichment, see the sequence of operations in SOW Update and Enrichment Processing.
Message enrichment only affects the message data, not the metadata on the message. In other words, while enrichment can change any field in the data, you cannot change metadata properties such as the topic the message was published to, the acknowledgments requested on the message, or the authenticated username for the publish command.
Message enrichment rewrites the message before the messages are stored in AMPS or delivered to subscribers. AMPS also provides the ability to aggregate or analyze messages while preserving the original state of the message, as described in the chapter on Aggregation and Analysis. If a subscriber only needs a subset of data in a message, AMPS provides the ability for that subscriber to provide a select list to retrieve only the needed data.
The preprocessing stage of AMPS enrichment allows you to alter a message before the SOW key is calculated. This gives you the ability to easily add or transform fields that are used in the SOW key. Use this stage to enrich messages when the enriched field should be used as part of the SOW key. To specify preprocessing for a topic, you add a Preprocessing
directive to the Topic
configuration for the SOW topic.
Use Preprocessing
when you need to change the value of a field that is part of the Key
for the topic. Otherwise, use Enrichment
.
Preprocessing field directives operate on a single message and construct fields based on that message. The results of the preprocessing expression are merged into the incoming message. Any field in the source message that is not changed or removed during preprocessing is left unchanged, so it is not necessary to include all fields in the message in the Preprocessing
block.
Since preprocessing fields apply to a specific message, preprocessing fields cannot specify the topic or message type in an XPath identifier.
By default, AMPS serializes fields with a NULL value in the preprocessing result. Preprocessing fields can include a directive that specifies that if a field contains a NULL value, it should be removed from the set of fields rather than serialized. The directive HINT OPTIONAL
applied to the XPath identifier specifies that if the result of the source expression is NULL
, AMPS does not provide the value for the message type to serialize. For example, use the following directive to remove a /source
field if the value provided is not in a specific list of values:
For more information on constructing preprocessing fields, see Constructing Preprocessing Fields.
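The difference between serializing a NULL result and removing the field can be sketched as follows. This is a Python model of the behavior described above, not AMPS directive syntax: the allowed-source list, the source_expression function, and the merge_field helper with its optional flag are hypothetical names chosen for illustration.

```python
import json

ALLOWED_SOURCES = {"web", "phone", "branch"}  # hypothetical list of values


def source_expression(message):
    """Return the /source value if it is in the list, otherwise NULL (None)."""
    value = message.get("source")
    return value if value in ALLOWED_SOURCES else None


def merge_field(message, name, value, optional=False):
    """Merge one constructed field into a copy of the message."""
    result = dict(message)
    if value is None and optional:
        result.pop(name, None)  # removed rather than serialized
    else:
        result[name] = value    # None is serialized as JSON null
    return result


incoming = {"id": 1, "source": "fax"}
value = source_expression(incoming)
default_result = merge_field(incoming, "source", value)
optional_result = merge_field(incoming, "source", value, optional=True)
print(json.dumps(default_result))
print(json.dumps(optional_result))
```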
AMPS enrichment operates on a message after the SOW key is computed, but before an incoming delta publish is merged to an existing message, or the incoming message is written to the transaction log, stored to the SOW, used to update views, or delivered to subscribers. Use this enrichment stage when the enrichment process depends on the previous values of the message, or when the updated fields will not be used in the SOW key. To specify enrichment for a topic, you add an Enrichment
directive to the configuration for the SOW topic.
Enrichment field directives operate on a single message and construct fields based on that message. Enrichment expressions operate on the current message and change the current message. The results of the enrichment directives are merged into the incoming message. Any field in the source message that is not changed or removed during enrichment is left unchanged, so it is not necessary to include all fields in the message in the Enrichment
directive.
Since enrichment fields apply to a specific message, enrichment fields cannot specify the topic or message type in an XPath identifier.
Within an enrichment expression, AMPS provides two special modifiers for XPath identifiers that specify whether an XPath identifier refers to the current incoming message or the previous state of the message. These modifiers apply only to the source expression, and cannot be used in special modifiers. They are:
OF CURRENT
Specify that the XPath identifier refers to the incoming message.
OF PREVIOUS
Specify that the XPath identifier refers to the previous state of the message in the SOW.
If there is no record in the SOW for this message, all identifiers that specify OF PREVIOUS
return NULL
.
By default, AMPS serializes fields with a NULL value during enrichment. Enrichment fields can include a directive that specifies that if a field contains a NULL value, it should be removed from the set of fields rather than serialized. The directive HINT OPTIONAL
applied to the XPath identifier specifies that if the result of the source expression is NULL
, AMPS does not include the value in the set of XPath identifiers for the message type to serialize. For example, use the following directive to remove a /source
field if the value provided is not in a specific list of values:
For more information on constructing enrichment fields, see Constructing Enrichment Fields.
The following diagram presents a simplified, high-level view of the update process for an individual message. For the purposes of this diagram, views and conflated topics can be considered listeners on the SOW topic, while applications that connect to AMPS and the on-publish
and on-deliver
actions can be considered subscribers.
It's important to keep in mind the following aspects of the SOW update sequence:
If the publish is disallowed due to topic-based entitlements or the publish filter specified for entitlements, there is no change to the state of the SOW. The entitlement filter (if one exists) is applied to the incoming message before preprocessing, enrichment, or delta merge occurs.
AMPS records the enriched message in the transaction log and SOW file. When AMPS is configured for enrichment or your application performs a delta publish, the transaction log and SOW do not preserve a record of the original message received by AMPS. Instead, they record the enriched and merged message.
Content filtering for subscriptions, views, and so forth is done on the final enriched and merged message, not on the original message as published.
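The ordering of these stages can be summarized in a small simulation. The sketch below is a simplified Python model of the sequence described in this section, not AMPS itself: the dictionary used as the SOW, the function names, and the sample fields (total, updateCount) are assumptions, and delta merge, the transaction log, and delivery to subscribers are omitted for brevity.

```python
def preprocess(message):
    """Preprocessing stage: runs before the SOW key is calculated."""
    result = dict(message)
    if result.get("orderId") is None:
        # Construct the key field from two other fields of the same message.
        result["orderId"] = f"{result['customerName']}-{result['customerSerialNumber']}"
    return result


def enrich(current, previous):
    """Enrichment stage: runs after the SOW key is known.

    previous is the stored record (modeling OF PREVIOUS), or None when
    there is no record in the SOW for this key.
    """
    result = dict(current)
    result["total"] = current["qty"] * current["price"]
    previous_count = previous["updateCount"] if previous else 0
    result["updateCount"] = previous_count + 1
    return result


def publish(sow, message, publish_filter=None):
    # 1. The entitlement publish filter sees the message as published.
    if publish_filter is not None and not publish_filter(message):
        return None  # disallowed: no change to the SOW
    # 2. Preprocessing, then 3. SOW key calculation.
    message = preprocess(message)
    key = message["orderId"]
    # 4. Enrichment, with access to the previous state of the record.
    message = enrich(message, sow.get(key))
    # 5. Only the enriched message is stored; the original is not kept.
    sow[key] = message
    return message


sow = {}
order = {"customerName": "Ada", "customerSerialNumber": 7, "qty": 3, "price": 2.5}
first = publish(sow, order)
second = publish(sow, order)
rejected = publish(sow, {**order, "qty": 0}, publish_filter=lambda m: m["qty"] > 0)
print(first, second, rejected)
```

In this model, the key field produced by preprocessing is available when the key is calculated, while the update counter can only be computed during enrichment because it depends on the previous state of the record.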
When processing a SOW query, AMPS has the ability to combine messages into batches for more efficient network usage. The maximum number of messages in a batch is determined by the BatchSize
parameter on the SOW query command. AMPS defaults to a BatchSize
value of 1, meaning AMPS sends one message per batch in the response. The BatchSize
is the maximum number of records that will be returned within a single response payload. Each AMPS response for the query contains a BatchSize
value in its header to indicate the number of messages in the batch. This number will be anywhere from 1 to BatchSize
.
Current versions of the AMPS client libraries set a batch size of 10 when no other batch size is specified.
Notice that the format of messages returned from AMPS may be different depending on the message type requested. However, the information contained in the messages is the same for all message types.
Using a BatchSize
greater than 1 can yield greater performance, particularly when querying a large number of small records. In general, 60East recommends using a BatchSize
that provides good network utilization without consuming excessive server memory. Most applications that use small messages set a batch size designed to create batches that fit well into the maximum transmission unit (MTU) for the network. AMPS reports an error if an application requests a batch size larger than 10,000 records (this value is orders of magnitude larger than the typical BatchSize
used by applications).
For applications that return a large number of messages that are larger than the MTU, 60East recommends testing performance with a variety of batch sizes. Because the client libraries parse the AMPS headers common to each message once per batch, a batch size larger than 1
can improve processing performance on the client side, particularly if the client message handling is efficient. Likewise, because the AMPS server only has to serialize the common headers once per batch, a batch size larger than 1
can improve performance at the server side (as well as reduce the overall bandwidth for a group of messages). At the same time, the server will hold a batch of messages until the batch can be transmitted together (or until the query is complete), so providing large values for the batch size can introduce latency in receiving results, and reduce performance if the total size of the batch is very large.
In general, the default client value is a good compromise for many application patterns if the messages are larger than will fit into the MTU of the network. For smaller messages, or if it is important to tune performance, 60East recommends testing with a variety of batch sizes.
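The batching behavior can be illustrated with a short sketch. This is a Python model of how query results are grouped, not the AMPS wire protocol: the make_batches function and the dictionary used as a batch header are assumptions for illustration, while the default of 1 and the 10,000-record limit are taken from the text above.

```python
MAX_BATCH_SIZE = 10_000  # limit stated in the text


def make_batches(records, batch_size=1):
    """Group query results into batches of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch size must be at least 1")
    if batch_size > MAX_BATCH_SIZE:
        raise ValueError("batch size larger than 10,000 records")
    result = []
    for start in range(0, len(records), batch_size):
        chunk = records[start:start + batch_size]
        # The header reports the number of messages actually in this batch.
        result.append(({"BatchSize": len(chunk)}, chunk))
    return result


records = [{"id": n} for n in range(25)]
sizes = [header["BatchSize"] for header, _ in make_batches(records, batch_size=10)]
print(sizes)
```

With 25 records and a batch size of 10, the final batch is smaller than the requested size, which is why each response reports its own count.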
For more information on executing queries, please see the Developer Guide for the AMPS client of your choice, available from the 60East documentation site at http://docs.crankuptheamps.com/.
One of the more difficult problems in messaging is knowing when a record that previously matched a subscription has been updated so that the record no longer matches the subscription. AMPS solves this problem by providing an out-of-focus, or OOF, message to let subscribers know that a record they have previously received no longer matches the subscription. The OOF messages help subscribers easily maintain state and remove records that are no longer relevant.
OOF notification is optional. A subscriber must explicitly request that AMPS provide out-of-focus messages for a subscription.
When OOF notification has been requested, AMPS produces an oof
message for any record that has previously been received by the subscription at the point at which:
The record is deleted
The record expires
The record no longer matches the filter criteria
The record is no longer within the pagination window (for paginated subscriptions), or
The subscriber is no longer entitled to view the new state of the record.
AMPS produces an oof
message for each record that no longer matches the subscription. The oof
message is sent as part of processing the update that caused the record to no longer match. Each oof
message contains information the subscriber can use to identify the record that has gone out of focus and the reason that the record is now out of focus.
Since AMPS must maintain the current state of a record to know when to produce an oof
message, these messages are only supported for SOW topics, conflated topics, and views. The oof
option is not supported for bookmark replays or for subscriptions that do not include a SOW query.
When AMPS returns an OOF message, the data contained in the body of the message represents the updated state of the message (except as described below). This will allow the client to make a determination as to how to handle the data, be it to remove the data from the client view or to change their query to broaden the filter thresholds. This enables a client to take a different action depending on why the message no longer matches. For example, an application may present a different icon for an order that moves to a status of completed
than it would present for an order that moves to a status of cancelled
.
When a delta_publish
message causes the SOW record to go out of focus, AMPS returns the merged record.
When there is no updated message to send, AMPS sends the state of the record before the change that produced the oof
. This can occur when the message has been deleted, when the message has expired, or when an update causes the client to no longer have permission to receive the record.
For a conflated view or a subscription that uses conflation, the data included in the oof
message will be the last data that the subscriber received. When both an update to a message and a change that would cause the message to go out of focus happen in the same conflation interval, the subscriber receives an oof
notification with the previously-received state of the message. Likewise, if a change that causes a message to go out of focus and a change that causes the message to come back into focus occur within the same conflation interval, the subscriber receives the state of the message at the end of the conflation interval. The subscriber does not receive an indication that the message had gone out of focus during the conflation interval and then come back into focus.
An out of focus message returns the reason that the message was produced in the reason
field of the message header.
deleted: The message was deleted from the topic. Message data: previous message.
expired: The message was removed from the topic due to expiration. Message data: previous message.
match: The message no longer matches the content filter or has moved outside of the record set requested. Message data: updated message that no longer matches subscription criteria.
entitlement: The message has changed such that the user is not entitled to see the updated message. Message data: previous message.
Consider the following scenario where AMPS is configured with the following SOW key for the buyer topic:
When the following message is published, it is persisted in the SOW topic:
A client issues a sow_and_subscribe
request for the topic buyer
with the filter /buyer/loc="NY"
and the oof
option set on the request. The client will be sent the message as part of the SOW query result.
Subsequently, the following message is published to update the loc
tag to LN:
The original message in the SOW cache is updated. The client does not receive the second publish message, because that message does not match the filter (/buyer/loc="NY"
). This is problematic. The client has a message that is no longer in the SOW cache and that no longer matches the current state of the record. Since the oof
option was set on the subscription, however, the AMPS engine sends an oof
message to let these clients know that the message that they hold is no longer in the SOW cache. The following is an example of what's returned.
The header of the message will contain the following fields to help the application identify the reason for the oof
message and which message no longer matches:
Command: oof
Topic: buyer
Reason: match
SowKey: 6387219447538349146
The message data will contain the updated message, as shown following.
Had the message been deleted, the message data for the OOF notification would contain the deleted message, and the reason would be deleted.
An easy way to think about the situations where AMPS sends an OOF notification is to consider what would happen if the client re-issued the original sow request after the above message was published. The /buyer/loc="NY" expression no longer matches the message in the SOW cache and, as a result, this message would not be returned.
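The decision AMPS makes for each update in the scenario above can be modeled in a few lines. The following is an illustrative Python simulation, not AMPS code: the process_update function, the dictionary message shape, the key "buyer-100", and the sample record are all hypothetical, and only the deleted and match reasons are modeled.

```python
def process_update(sow, delivered, matches, key, new_record):
    """Return what a subscriber with the oof option would receive.

    sow maps SOW keys to records, delivered is the set of keys the
    subscriber currently holds, and matches stands in for the content
    filter. Passing new_record=None models a delete.
    """
    previous = sow.get(key)
    if new_record is None:
        sow.pop(key, None)
        if key in delivered:
            delivered.discard(key)
            # No updated message exists, so the previous state is sent.
            return {"command": "oof", "reason": "deleted", "sow_key": key, "data": previous}
        return None
    sow[key] = new_record
    if matches(new_record):
        delivered.add(key)
        return {"command": "publish", "sow_key": key, "data": new_record}
    if key in delivered:
        delivered.discard(key)
        # The updated state is sent so the client can see why it left focus.
        return {"command": "oof", "reason": "match", "sow_key": key, "data": new_record}
    return None


sow, delivered = {}, set()
in_ny = lambda record: record["buyer"]["loc"] == "NY"
first = process_update(sow, delivered, in_ny, "buyer-100", {"buyer": {"id": 100, "loc": "NY"}})
moved = process_update(sow, delivered, in_ny, "buyer-100", {"buyer": {"id": 100, "loc": "LN"}})
print(first["command"], moved["command"], moved["reason"], moved["data"])
```

A record that never matched the filter produces no message at all in this model, since the subscriber holds nothing that needs to be removed.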
To help reinforce the concept of OOF messages, and how OOF messaging can be used in AMPS, consider a scenario where there is a GUI application whose requirement is to display all open orders of a client. There are several possible solutions to ensure that the GUI client data is constantly updated as information changes, some of which are examined below; however, the goal of this section is to build up a sow_and_subscribe
message to demonstrate the power that OOF notifications add to AMPS.
First, consider an approach that sends a sow_and_subscribe
message on the topic orders
using the filter /Client="Adam".
AMPS completes the sow
portion of this call by sending all matching messages from the orders
SOW topic. AMPS then places a subscription whereby all future messages that match the filter get sent to the subscribing GUI client.
As the messages come in, the GUI client will be responsible for determining the state of the order. It does this by examining the State
field and determining if the state is equal to “Open” or not, and then updating the GUI based on the information returned.
This approach puts the burden of work on the GUI and, in a high volume environment, has the potential to make the client GUI unresponsive due to the potential load that this filtering can place on a CPU. If a client GUI becomes unresponsive, AMPS will queue the messages to ensure that the client is given the opportunity to catch up. The specifics of how AMPS handles slow clients are covered in the section discussing Slow Client Management.
The next step is to add an additional ’AND’ clause to the filter. In this scenario we can let AMPS do the filtering work that was previously handled on the client. This is accomplished by modifying our original sow_and_subscribe
to use the following filter:
Similar to the above case, this sow_and_subscribe
will first send all messages from the orders
SOW topic that have a Client
field matching "Adam" and a State
field matching "Open". Once all of the SOW topic messages have been sent to the client, the subscription will ensure that all future messages matching the filter will be sent to the client.
There is a less obvious issue with this approach to maintaining the client state. The problem with this solution is that, while it initially will yield all open orders for client "Adam", this scenario is unable to stay in sync with the server. For example, when the order for Adam is filled, the State
changes to State=Filled
. This means that, inside AMPS, the order on the client will no longer match the initial filter criteria. The client will continue to display and maintain these out-of-sync records. Since the client is not subscribed to messages with a State
of “Filled,” the GUI client would never be updated to reflect this change.
The final solution is to implement the same sow_and_subscribe
query which was used in the first scenario. This time, the filter requests only the State that we're interested in, and we add the oof option to the command so that the subscriber receives OOF messages.
AMPS will respond immediately with the query results, exactly as it does with a sow_and_subscribe
command that does not use the oof
option.
This approach provides the following advantage: for all future messages in which the same Open
order is updated, such that its status is no longer Open
, AMPS will send the client an OOF
message specifying that the record which previously matched the filter criteria has fallen out of focus. AMPS will not send any further information about the message unless another incoming AMPS message causes that message to come back into focus.
In the following diagram, the Publisher publishes a message stating that Adam’s order for MSFT has been fulfilled. When AMPS processes this message, it will notify the GUI client with an oof
message that the original record no longer matches the filter criteria. The oof
message will include a Reason
field with it in the message header, defining the reason for the message to lose focus. In this case the Reason
field will state match
since the record no longer matches the filter.
AMPS will also send oof
messages when a message is deleted or has expired from the SOW topic.
We see the power of the oof message when a client application wants to have a local cache that is a subset of the SOW. This is best managed by first issuing a filtered sow_and_subscribe, which populates the GUI, with the oof option enabled. AMPS then informs our application when records that originally matched no longer do, at which time the program can remove them.
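On the client side, maintaining such a local cache reduces to a small message handler. The sketch below is a hypothetical Python model rather than code written against the AMPS client library: the OpenOrderCache class and the dictionary message shape (command, sow_key, data) are assumptions, and a real handler would read the equivalent values from the message object that the client library provides.

```python
class OpenOrderCache:
    """Local cache of records that currently match the subscription."""

    def __init__(self):
        self.records = {}

    def on_message(self, message):
        command = message["command"]
        if command in ("sow", "publish"):
            # Query results and later updates add or refresh a record.
            self.records[message["sow_key"]] = message["data"]
        elif command == "oof":
            # The record no longer matches: remove it from the display.
            self.records.pop(message["sow_key"], None)
        # Other messages, such as group_begin and group_end, are ignored.


cache = OpenOrderCache()
for message in [
    {"command": "group_begin"},
    {"command": "sow", "sow_key": "1", "data": {"Client": "Adam", "Symbol": "MSFT", "State": "Open"}},
    {"command": "sow", "sow_key": "2", "data": {"Client": "Adam", "Symbol": "IBM", "State": "Open"}},
    {"command": "group_end"},
    {"command": "oof", "sow_key": "1", "reason": "match",
     "data": {"Client": "Adam", "Symbol": "MSFT", "State": "Filled"}},
]:
    cache.on_message(message)
print(cache.records)
```

After the oof message for the filled MSFT order, only the IBM order remains in the cache, so the display stays in sync without the client filtering on the State field itself.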
Figure captions: sow_and_subscribe example; /State filter in a sow_and_subscribe; sow_and_subscribe with oof enabled; oof message.