Batching Query Results

When processing a SOW query, AMPS can combine messages into batches for more efficient network usage. The BatchSize parameter on the SOW query command sets the maximum number of messages in a batch, that is, the maximum number of records returned within a single response payload. AMPS defaults to a BatchSize value of 1, meaning AMPS sends one message per batch in the response. Each AMPS response for the query contains a BatchSize value in its header to indicate the number of messages in that batch. This number will be anywhere from 1 to the requested BatchSize.
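As a rough illustration of the upper bound described above, the following sketch splits a result set into groups of at most a given batch size. It is a plain Python simulation, not AMPS client code: the `batches` helper and the record names are hypothetical, and a real server may send batches smaller than the maximum at any point in the response.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches(records: Sequence[T], batch_size: int = 1) -> Iterator[List[T]]:
    """Yield consecutive groups of at most batch_size records.

    Hypothetical helper: it models only the BatchSize upper bound,
    not how the AMPS server actually assembles a response.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


# 25 made-up records, requested with a batch size of 10.
results = [f"record-{i}" for i in range(25)]
sizes = [len(batch) for batch in batches(results, batch_size=10)]
print(sizes)  # [10, 10, 5]
```

With the default batch size of 1, the same 25 records would arrive as 25 separate batches.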

The BatchSize parameter only applies to the results of a SOW query. In all other cases, AMPS does not delay a message once it is ready to be sent to a subscriber.

Current versions of the AMPS client libraries set a batch size of 10 when no other batch size is specified.

Note that the format of messages returned from AMPS may differ depending on the message type requested. However, the information contained in the messages is the same for all message types.

When an application issues a sow_and_subscribe command, AMPS returns the query results between a group_begin message and a group_end message before beginning the live subscription portion of the query.

This is also true when a sow_and_subscribe command is issued against a non-SOW topic. When the topic is not in the State of the World, no messages will be delivered between the group_begin and group_end messages.
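A minimal sketch of how an application might use the group_begin and group_end markers to separate the initial query results from the live subscription follows. It assumes each message is represented as a plain `(command, data)` tuple rather than the client library's message type, and the `split_results` helper and the `sow` and `publish` command names used in the sample streams are illustrative assumptions.

```python
from typing import Iterable, List, Tuple

# Assumption: a message is modeled as (command, data), not a client library object.
Message = Tuple[str, str]


def split_results(messages: Iterable[Message]) -> Tuple[List[str], List[str]]:
    """Separate records inside the group_begin/group_end segment from live messages."""
    snapshot: List[str] = []
    live: List[str] = []
    in_group = False
    for command, data in messages:
        if command == "group_begin":
            in_group = True          # query results start after this marker
        elif command == "group_end":
            in_group = False         # live subscription messages follow
        elif in_group:
            snapshot.append(data)
        else:
            live.append(data)
    return snapshot, live


# Topic with SOW content: two records, then one live update.
sow_topic = [("group_begin", ""), ("sow", "a"), ("sow", "b"),
             ("group_end", ""), ("publish", "c")]
print(split_results(sow_topic))      # (['a', 'b'], ['c'])

# Topic without SOW content: the group is empty, then live messages arrive.
non_sow_topic = [("group_begin", ""), ("group_end", ""), ("publish", "x")]
print(split_results(non_sow_topic))  # ([], ['x'])
```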

Using a BatchSize greater than 1 can yield greater performance, particularly when querying a large number of small records. In general, 60East recommends using a BatchSize that provides good network utilization without consuming excessive server memory. Most applications that use small messages set a batch size designed to create batches that fit well into the maximum transmission unit (MTU) for the network. AMPS reports an error if an application requests a batch size larger than 10,000 records (this value is orders of magnitude larger than the typical BatchSize used by applications).
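One way to pick a starting point for small messages is to estimate how many of them fit into a single MTU. The sketch below is only that kind of estimate: the `estimate_batch_size` function is hypothetical, the 1500-byte MTU is the common Ethernet value, and the 100-byte allowance for protocol and AMPS headers is a placeholder assumption that should be replaced with measured values. When no message fits into the MTU, the sketch falls back to the client library default of 10 as a value to begin testing from.

```python
MAX_BATCH_SIZE = 10_000  # AMPS reports an error above this many records


def estimate_batch_size(avg_message_bytes: int,
                        mtu_bytes: int = 1500,
                        overhead_bytes: int = 100,
                        default_batch_size: int = 10) -> int:
    """Estimate how many average-sized messages fit into one MTU.

    overhead_bytes is an assumed allowance for headers, not a figure
    published for AMPS; measure it for your own message type and network.
    """
    if avg_message_bytes <= 0:
        raise ValueError("avg_message_bytes must be positive")
    usable = mtu_bytes - overhead_bytes
    fit = usable // avg_message_bytes
    if fit < 1:
        # Messages are larger than the MTU: start from the client default.
        return default_batch_size
    return min(fit, MAX_BATCH_SIZE)


print(estimate_batch_size(64))    # 21
print(estimate_batch_size(4000))  # 10
```

The result is a first guess to confirm with performance testing, not a recommended setting.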

For applications that return a large number of messages that are larger than the MTU, 60East recommends testing performance with a variety of batch sizes. Because the client libraries parse the AMPS headers common to each message once per batch, a batch size larger than 1 can improve processing performance on the client side, particularly if the client message handling is efficient. Likewise, because the AMPS server only has to serialize the common headers once per batch, a batch size larger than 1 can improve performance at the server side (as well as reduce the overall bandwidth for a group of messages). At the same time, the server will hold a batch of messages until the batch can be transmitted together (or until the query is complete), so providing large values for the batch size can introduce latency in receiving results, and reduce performance if the total size of the batch is very large.
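Testing with a variety of batch sizes can be organized with a small timing harness such as the sketch below. The harness itself uses only the Python standard library. The `run_query` callable is a hypothetical stand-in for application code that issues the SOW query with the given batch size and consumes every result, and `fake_query` exists only so the example runs without a server.

```python
import time
from typing import Callable, Dict, Iterable


def time_batch_sizes(run_query: Callable[[int], object],
                     batch_sizes: Iterable[int],
                     repeats: int = 3) -> Dict[int, float]:
    """Return the best elapsed time in seconds observed for each batch size."""
    timings: Dict[int, float] = {}
    for size in batch_sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(size)  # application code: run the query, drain the results
            best = min(best, time.perf_counter() - start)
        timings[size] = best
    return timings


def fake_query(batch_size: int) -> int:
    """Stand-in for a real query: counts the batches needed for 10,000 records."""
    total_records = 10_000
    return -(-total_records // batch_size)  # ceiling division


timings = time_batch_sizes(fake_query, [1, 10, 100, 1000])
for size, seconds in sorted(timings.items()):
    print(f"batch size {size:>5}: {seconds:.6f} s")
```

Taking the best of several repeats reduces noise from unrelated activity. For latency-sensitive applications, the time until the first result arrives is worth recording as well.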

In general, the default client value is a good compromise for many application patterns when messages are too large to fit into the MTU of the network. For smaller messages, or if it is important to tune performance, 60East recommends testing with a variety of batch sizes.

Using an appropriate BatchSize parameter can help achieve the maximum query performance with a large number of messages when many messages will fit into the MTU for your network. For larger messages, tune the batch size based on performance testing with a variety of batch sizes.

For more information on executing queries, please see the Developer Guide for the AMPS client of your choice, available from the 60East documentation site at http://docs.crankuptheamps.com/.

