Performance Considerations
This section describes general performance considerations for the AMPS expression language and content filters. The considerations here are aspects of AMPS performance to be aware of in the general case. However, since the AMPS expression language operates on specific data, the structure and size of the messages that your application uses may have more effect on overall performance than the specific expressions used. For example, parsing and filtering a 20MB XML document is inherently more expensive than parsing and filtering a 400-byte BFlat document.
Use Short-Circuiting
When clauses in an expression are joined by OR, AMPS will only evaluate the right side of an OR expression if the left side of the expression is false.
When constructing an expression, this means that there can be a performance advantage to placing relatively less expensive clauses on the left-hand side of the OR. For example, in a clause that joins the comparison /code = 'restricted' to a regular expression comparison with OR, the regular expression comparison is only evaluated if the comparison /code = 'restricted' is false. If the comparison is true, then the overall clause is true and there is no need to evaluate the regular expression.
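The effect of clause ordering can be illustrated outside AMPS. The following is a minimal Python sketch, not AMPS code: it relies on Python's own short-circuiting `or`, and the message fields, the regular expression pattern, and the function names are all hypothetical. It counts how often the expensive comparison actually runs.

```python
import re

# Counts how many times the expensive comparison is evaluated.
calls = {"regex": 0}

def code_is_restricted(message):
    # Cheap comparison: a single string equality check.
    return message.get("code") == "restricted"

def description_matches(message):
    # More expensive comparison: a regular expression search.
    # The pattern is a hypothetical example.
    calls["regex"] += 1
    return re.search(r"confidential|secret", message.get("description", "")) is not None

def matches(message):
    # The cheap clause is on the left, so the regular expression is
    # only evaluated when the equality check is false.
    return code_is_restricted(message) or description_matches(message)

restricted = {"code": "restricted", "description": "a secret plan"}
open_msg = {"code": "open", "description": "a secret plan"}

first = matches(restricted)
regex_calls_after_first = calls["regex"]   # left side was true: regex skipped

second = matches(open_msg)
regex_calls_after_second = calls["regex"]  # left side was false: regex ran once
```

In this sketch, both messages match, but the regular expression runs only for the message whose code is not 'restricted'.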
Avoid Redundant Expressions
AMPS does not reorder or recombine complex expressions. Where feasible, your application can save work at the server by combining expressions. In particular, if an application is constructing a filter by reading options from various sources, performance can be improved by combining the queries.
For example, in a filter that repeats the comparison of /id against '12345' in three places, the comparison will be evaluated three times in cases where the value of /id does not match any of the values in the filter.
Such a filter can be rewritten in an equivalent form that produces the same results but evaluates the /id field against a given value only once.
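As a rough illustration of the saving, the following Python sketch models a redundant filter and a combined one as plain functions. It is not AMPS code, and the shape of both filters is an assumption made for the example (the /code values 'a', 'b', and 'c' are hypothetical). It counts evaluations of the /id comparison for a message whose /id does not match.

```python
# Counts evaluations of the /id comparison.
evaluations = {"id": 0}

def id_equals(message, value):
    evaluations["id"] += 1
    return message["id"] == value

def redundant_filter(m):
    # Hypothetical filter that repeats the /id comparison in every clause.
    return ((id_equals(m, "12345") and m["code"] == "a")
            or (id_equals(m, "12345") and m["code"] == "b")
            or (id_equals(m, "12345") and m["code"] == "c"))

def combined_filter(m):
    # Equivalent filter that tests /id once, then the alternatives.
    return id_equals(m, "12345") and m["code"] in ("a", "b", "c")

non_matching = {"id": "99999", "code": "a"}

redundant_result = redundant_filter(non_matching)
redundant_count = evaluations["id"]

evaluations["id"] = 0
combined_result = combined_filter(non_matching)
combined_count = evaluations["id"]

# Check that the two forms agree on a small set of messages.
samples = [{"id": i, "code": c} for i in ("12345", "99999") for c in ("a", "b", "c", "z")]
equivalent = all(redundant_filter(m) == combined_filter(m) for m in samples)
```

Because the sketch does no reordering of its own, the redundant form repeats the /id comparison in each clause when /id does not match, while the combined form performs it once.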
Use Specialized Operators for Simple Comparisons, Use LIKE when Necessary
The LIKE operator offers access to full Perl-Compatible Regular Expressions within the AMPS expression language. This flexibility allows for very precise filtering, and the PCRE engine performs well.
However, for comparisons for which AMPS provides a named function, the named function is highly optimized and will perform somewhat better than the general-purpose regular expression engine.
For example, given a choice between two equivalent expressions, one that uses LIKE with the regular expression ^something and one that uses BEGINS WITH('something'), the version that uses BEGINS WITH will typically perform slightly better than the version that uses the regular expression.
This doesn't mean that regular expressions or the LIKE operator perform poorly. The LIKE operator can efficiently match patterns that would be difficult or impossible to match using the other operators. However, for very simple comparisons where AMPS provides a dedicated operator, that operator typically performs slightly better than a regular expression.
The following table shows some examples of regular expressions and the AMPS operator equivalent.
| Regular expression | AMPS equivalent |
| --- | --- |
| ^something | BEGINS WITH('something') |
| something$ | ENDS WITH ('something') |
| something | INSTR(/field, 'something') != 0 |
| (?i)something | INSTR_I(/field, 'something') != 0 |
| (?i)^something$ | STREQUAL_I(/field, 'something') != 0 |
| ^a$ | = 'a' |
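The same kind of equivalence can be checked in a general-purpose language. The sketch below uses Python's `re` module and built-in string methods as stand-ins. These are Python analogues chosen for illustration, not the AMPS operators themselves, and the sample strings are hypothetical.

```python
import re

# Each entry pairs a regular expression with a plain-string check that is
# intended to give the same answer. The samples contain no trailing
# newlines, which "$" would otherwise treat specially.
EQUIVALENTS = [
    (r"^something", lambda s: s.startswith("something")),      # prefix match
    (r"something$", lambda s: s.endswith("something")),        # suffix match
    (r"something", lambda s: "something" in s),                # substring
    (r"(?i)something", lambda s: "something" in s.lower()),    # substring, ignoring case
    (r"(?i)^something$", lambda s: s.lower() == "something"),  # equality, ignoring case
    (r"^a$", lambda s: s == "a"),                              # exact equality
]

SAMPLES = ["something", "Something else", "not something", "SOMETHING", "a", "ab", ""]

def disagreements():
    # Collect every (pattern, sample) pair where the two forms differ.
    return [(pattern, s)
            for pattern, check in EQUIVALENTS
            for s in SAMPLES
            if bool(re.search(pattern, s)) != check(s)]

mismatches = disagreements()
```

An empty `mismatches` list means each simple check agrees with its regular expression on every sample. This shows that the dedicated form can replace the pattern in these cases without changing results.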
Optimize for Partial Parsing
Most AMPS message types have the ability to partially parse messages. That is, rather than parsing the entire message, the message type can simply find the identifiers that will be used, and stop the parsing process as soon as those identifiers are found.
This optimization is most useful for larger messages. For example, suppose that the SOW key for a topic is based on the /id field of a message, that there are active content filters that use both the /id field and the /code field, and that no other field is being indexed.
For a message in which the /id and /code fields appear before the rest of the content, the AMPS parser can stop parsing after processing only the /id and the /code fields. In this case, halting the parsing after processing these two fields avoids the expense of parsing the remaining parts of the message.
Notice that this optimization will only improve performance in cases where AMPS doesn't need to parse the entire message. For example, if there is a delta_subscribe active for the topic, or if the command being processed is a delta_publish, AMPS will parse the message completely to be able to calculate the deltas. Likewise, if any filter refers to a field that doesn't appear in the message, AMPS will parse the message completely to be able to determine that the field does not appear in the message.
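The idea of stopping early can be sketched with a toy parser. The Python below is an illustration only: the `key=value;` message format and the `parse_fields` function are hypothetical and do not reflect how any AMPS message type is implemented. The parser scans fields left to right and stops as soon as every needed field has been found.

```python
def parse_fields(raw, needed):
    """Scan `key=value` pairs separated by ';', stopping once all needed keys are found.

    Returns the values found and the number of fields that were scanned.
    """
    remaining = set(needed)
    found = {}
    scanned = 0
    pos = 0
    while pos < len(raw) and remaining:
        end = raw.find(";", pos)
        if end == -1:
            end = len(raw)
        key, _, value = raw[pos:end].partition("=")
        scanned += 1
        if key in remaining:
            found[key] = value
            remaining.discard(key)
        pos = end + 1
    return found, scanned

# A message with the needed fields first, followed by 1000 other fields.
message = "id=12345;code=restricted;" + ";".join(f"field{i}=x" for i in range(1000))

# Both needed fields are at the front: scanning stops after two fields.
early_found, early_scanned = parse_fields(message, ["id", "code"])

# One needed field never appears: the whole message has to be scanned.
full_found, full_scanned = parse_fields(message, ["id", "absent"])
```

The second call mirrors the caveat above: when a needed field is absent, the parser cannot know that until it has read the entire message.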
SOW Queries and Indexing
Queries over topics in the State of the World (SOW) have additional performance considerations. AMPS maintains indexes over SOW topics to help locate messages in response to a query.
Queries over a topic in the SOW can use SOW topic indexes. Where possible, use an exact string match in the query and create a hash index on the fields involved, so that the query can be answered from the hash index.
When a query is submitted with an XPath identifier for which no index exists, AMPS will create and populate a memo index for that XPath identifier. This can add to the amount of time a query takes the first time a given XPath identifier is queried. You can specify that AMPS creates a memo index for a given identifier by using the Index configuration item in the Topic definition. Once an index is created, AMPS will continue to search for that XPath identifier in incoming messages for that topic to keep the index up to date.
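The behavior described above, building an index on first use and then maintaining it as messages arrive, can be modeled in a few lines. The following Python sketch is a toy model, not the AMPS implementation: the `SowTopic` class, its methods, and the use of a dictionary of sets as the index are all assumptions made for illustration.

```python
from collections import defaultdict

class SowTopic:
    """Toy model of a topic that builds an index the first time a field is queried."""

    def __init__(self):
        self.records = {}      # SOW key -> message (a dict of field -> value)
        self.indexes = {}      # field -> {value -> set of SOW keys}
        self.index_builds = 0  # how many times an index was built from scratch

    def publish(self, key, message):
        old = self.records.get(key)
        self.records[key] = message
        # Keep every existing index up to date with the incoming message.
        for field, index in self.indexes.items():
            if old is not None and field in old:
                index[old[field]].discard(key)
            if field in message:
                index[message[field]].add(key)

    def query_equal(self, field, value):
        if field not in self.indexes:
            # First query on this field: scan all records to populate the index.
            index = defaultdict(set)
            for key, message in self.records.items():
                if field in message:
                    index[message[field]].add(key)
            self.indexes[field] = index
            self.index_builds += 1
        return sorted(self.indexes[field].get(value, set()))

topic = SowTopic()
topic.publish("k1", {"id": "1", "code": "restricted"})
topic.publish("k2", {"id": "2", "code": "open"})

first = topic.query_equal("code", "restricted")   # builds the index for "code"
topic.publish("k2", {"id": "2", "code": "restricted"})
second = topic.query_equal("code", "restricted")  # reuses the maintained index
builds = topic.index_builds
```

In this model only the first query pays the cost of scanning every record. Later queries on the same field are dictionary lookups, at the price of extra work on each publish.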
Notice that SOW topic indexes are only used for sow commands and during the sow portion of a sow_and_subscribe (or sow_and_delta_subscribe) command. Once the subscription to current updates begins, the subscription does not use a SOW topic index because there is no need to locate a message. During a subscription, filters are run against the current message.
See the section on Indexing for State of the World topics for details.