Indexing SOW Topics

AMPS maintains indices over SOW topics, views, and conflated topics to improve query efficiency.

There are two types of indices available:

  1. Memo indices are created automatically when AMPS needs to use a particular field for a query. These indices maintain the value of a key, and can be used for any type of query, including regular expression queries, range queries, and comparisons such as less than or greater than. You can also request that AMPS pre-create an index of this type with the Index directive of the SOW topic configuration.

  2. Hash indices are defined by the configuration for the topic, view or conflated topic. These indices maintain a hash derived from the values provided for the fields in the key. When the topic is configured so that AMPS generates the SOW key, AMPS automatically creates a hash index that contains all of the fields in the SOW Key. You can create any number of hash indexes for a SOW topic, with any combination of fields. Hash index queries are significantly faster than queries using memo indexes.

Both types of indices are maintained in memory. The section on Estimating AMPS Instance Memory Usage has more details.

A hash index can be created using any XPath Identifier in the message. For example, if you are using a composite-local message type, you can create a hash index using fields from any part of the message. If you are using an xml message, you can create a hash index that uses the XML attributes.

The values of hash indices are always evaluated as strings. Hash indices are only used for exact matches on the value of the fields or with the IN operator, and only for queries that use the exact set of fields in the hash index. Other operators or functions (for example, LIKE, !=, BETWEEN, IS NULL, IS NOT NULL, and so on) cannot use the hash index. To use a hash index, the comparison must use a literal string for comparison to specify that the comparison uses an exact string comparison and not a numeric comparison.

For example, if your configuration specifies a hash index that uses the fields /address/postalCode and /customerType, a filter such as /address/postalCode = '04109' AND /customerType = 'retail' will use the hash index. A filter such as /address/postalCode = '04109' AND /customerType LIKE 'retail|remainder' will not use the hash index, since this filter uses the LIKE operator rather than exact matching. Likewise, a comparison such as /address/postalCode = 04109 will not use the hash index, since the expression requests a numeric comparison rather than a string comparison.

Starting with AMPS 5.3.1.0, AMPS will also use a hash index for a compound filter if the first clause in the filter is an IN operator that can use a hash index and the other comparisons in the filter are evaluated using the AND operator. In this case, AMPS evaluates the IN clause first, and executes the rest of the expression against the results of the IN clause. For example, a filter like /id IN ('jon', 'jim', 'joy') AND /price > 50 will use a hash index to find matching records for /id and then compare the matching records to the rest of the filter (in this case, a numeric comparison on /price). (Notice that this optimization is not available if the other comparisons use the OR operator.)

AMPS uses a hash index for filters where possible. If the filter does not meet the requirements for using a hash index, AMPS uses memo indices for the fields in the filter if those are available. If one or more of the required memo indices is not available, AMPS creates the indexes during the query.

If your application frequently uses queries for an exact match on a specific set of fields (for example, retrieving a set of customers by the /address/postalCode field), creating a hash index can significantly improve the speed of those queries.

Last updated

Copyright 2013-2024 60East Technologies, Inc.