Capacity Planning

Sizing an AMPS deployment can be a complicated process that depends on many factors, including the configuration parameters used for AMPS, the data used within the deployment, and how the deployment will be used. This section presents guidelines that you can use to size the host environment for an AMPS deployment, covering the four components that need to be taken into account: Memory, Storage, CPU and Network.

Capacity planning is one of the most important aspects of ensuring that an AMPS deployment can meet the needs of the application.

The capacity planning formulas in this section are intended to help you size a system to run an instance of AMPS. The actual resource consumption will vary based on usage and configuration.

System Goals and Requirements

When planning the capacity for a system, the most important questions to answer are the purpose of the instance and the Service Level Agreement (SLA) offered by the instance. For example, is this a server for use by a development team for early exploration of ideas, or will this instance be core infrastructure for a major application? Is it important that the instance has the absolute minimum latency possible, or is the most important aspect of the system query response time for a 1TB topic in the SOW?

Since AMPS efficiently uses the system hardware, the limits of an AMPS instance are typically a result of the limitations of the underlying host system. Proper capacity planning (and operating system tuning) can mean the difference between an instance that performs well and handles increased traffic without incident and an instance that constantly pushes the hardware to the limit and becomes less responsive when traffic increases.

Single-Tenant or Multi-Tenant

AMPS performs well in both single-tenant and multi-tenant installations. When choosing whether to host more than one AMPS instance on a given system, it is important to plan for the highest level of traffic expected on all instances simultaneously. In a business setting, it is common for a sudden increase in traffic to affect a number of systems in the business, rather than being isolated to just one system. When planning capacity for a multi-tenant system, provision a host that exceeds the total maximum capacity required for all AMPS instances on the system at peak load.

For multi-tenant installations, disable AMPS-level NUMA tuning in the configuration file.

Physical Server or Virtual Machines

Although AMPS is designed to be highly adaptive to the hardware on which it runs, AMPS does not require a dedicated physical server. AMPS can be successfully deployed on either physical hardware or virtual machines. In either deployment model, 60East recommends tuning Linux for best performance rather than accepting the distribution defaults (which are typically tuned for interactive use rather than for a high performance server).

Typically, installations that require the highest level of performance and lowest levels of latency deploy on physical hardware (with a single AMPS instance per server). Installations that are willing to trade predictable performance for ease and flexibility of deployment often use virtual machines.

When deployed on a virtual machine, disable AMPS-level NUMA tuning in the configuration file.

60East does not recommend over-committing the underlying hardware. When deploying on a virtual machine, it is important to consider the capacity of both the virtual machine itself and the underlying host hardware. In other words, the total memory needed by all virtual machines -- with all applications hosted by those machines running at peak traffic simultaneously -- should not exceed the physical memory of the hardware. Likewise, the total number of CPUs specified in all of the virtual machines on the host should not exceed the number of CPUs on the host hardware, the network bandwidth needed should not exceed the bandwidth allocated to the host, the traffic to the storage device should not exceed the throughput that the storage device is capable of, and so on. In an enterprise environment, it is not unusual for a wide variety of applications to all see peak loads at the same time, so the system should be provisioned to provide enough capacity that every hosted application can meet peak throughput requirements at the same time.

Develop a plan for monitoring the physical hardware as well as the virtualized host environment. If possible, the monitoring plan should include a method for correlating the activity on the virtual machine to the activity on the physical host (for example, it would be important to be able to correlate CPU saturation on the virtual machine to CPU saturation on the physical host).

60East does not recommend using virtualization systems that dynamically move running virtual machines for load-balancing purposes in an application that requires low latency or predictable response times. Although these systems work well for their intended purpose, a machine migration typically takes an extended period of time (for example, a target maximum of 1s) to finalize. During that time, the virtual machine (and therefore, AMPS) is temporarily paused. A pause of this length is typically orders of magnitude longer than a low-latency system can tolerate for a service interruption, and is effectively a temporary service outage during the migration.

Although this sort of migration is typically unworkable for load-balancing, a migration involves much less downtime than stopping and restarting AMPS would require. These systems could be a good choice for reducing downtime for hardware-level or network-level maintenance.

Memory

AMPS is designed for high performance, and uses memory as needed to improve performance and reduce latency. One of the most important aspects of managing an AMPS instance is being sure that the instance has enough physical memory available to perform well.

This section contains general guidelines for creating an approximate sizing estimate for the AMPS process itself. An estimate on total memory capacity for a server would include the estimate for the AMPS process itself and estimates for any other processes running on the system (including monitoring software, security software, other applications, update and maintenance tasks, and so on). Notice that it is possible for AMPS to maintain quantities of data much larger than physical memory (for example, terabytes of SOW data). For instances that have this requirement, contact 60East support for tuning and sizing guidance.

Estimating AMPS Instance Memory Usage

The best way to estimate the memory usage for an AMPS instance is to simulate, as closely as possible, the traffic and usage pattern for the instance and collect statistics that show the amount of memory that the instance uses.

If actual numbers aren't available, you can use the formulas in this section to come up with a working approximation of the amount of memory to make available to AMPS for a given amount of data, number of clients, and so on. The AMPS server will use memory as necessary for performance, so the formulas here offer general estimates for system sizing purposes rather than precise predictions.

AMPS needs less than 1GB for its own binary image and initial start-up state for most configurations. For production instances, we estimate 5GB as a typical working memory footprint for an active installation.

As a general estimate, because of indexing for queries, AMPS may need up to twice the size of messages stored in a topic in the SOW to fully index that topic (the same sizing applies to messages in views and conflated topics). AMPS maintains a copy of the latest journal file in memory for quick access, and maintains a small amount of metadata for each message in an AMPS queue. The MessageMemoryLimit configured for the instance (or the total of all MessageMemoryLimit settings for each Transport in the instance) specifies the total amount of memory devoted to buffering messages for clients, including conflated subscriptions, aggregated subscriptions, and paginated subscriptions.

This puts a general estimate of the amount of memory to be available for the AMPS server itself at:

5GB + SowSizeEstimate
    + ( C * 4096 bytes)
    + ( TMemLimit ) + (J * 2) + (Q * 250 bytes) [ + (QA * 20 bytes) ]

where:

SowSizeEstimate = Estimates for SOW topic size, as described below (in bytes)

C = Number of Clients

TMemLimit = Total of all MessageMemoryLimit settings in the instance

J = JournalSize setting

Q = Total number of active unacknowledged messages in the queues for the instance

QA = Total number of acknowledgments received for messages that are not yet in the queue

By default, all unacknowledged messages in the instance will be active in the queue. When a queue specifies a TargetQueueDepth, the total number of active unacknowledged messages for the queue will, in most cases, be limited to the TargetQueueDepth.

When acknowledgment messages are received for messages that are not currently active in the queue, AMPS must track those acknowledgments to be able to efficiently prevent those messages from entering the queue. Not every application consumption pattern can produce this situation; however, if it arises, this calculation can help estimate the amount of memory required to maintain information about these acknowledgments until the message enters the queue.

To calculate the SowSizeEstimate (the memory footprint required for topics, views and conflated topics in the SOW), use the following formula for each Topic, View and ConflatedTopic:

( 2 * (S + 128 bytes) * M ) + ((16 bytes * M) * H)

where:

S = Average message size for the Topic, View or ConflatedTopic in the SOW (in bytes)

M = Maximum expected number of messages for the Topic, View or ConflatedTopic

H = Number of hash indexes for the Topic, View or ConflatedTopic

Estimating topic-by-topic generally gives a more precise estimate. However, if that data is not available, you can also use overall message sizes and message count for the instance.

If more configuration detail is available, it may be possible to create a more precise estimate. For example, if the SlabSize configured for a SOW topic is not an exact fit for the message size + header, it is possible to estimate the amount of free space remaining in each slab.
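
To make the arithmetic concrete, the two formulas above can be expressed as a short calculation. The following Python sketch simply encodes the estimates described in this section; it is an illustration, not an AMPS tool, and the function and parameter names are invented for clarity:

GB = 1_000_000_000

def sow_topic_estimate(avg_msg_size, max_messages, hash_indexes):
    # SowSizeEstimate for one Topic, View or ConflatedTopic (bytes):
    #   ( 2 * (S + 128) * M ) + ((16 * M) * H)
    return ((2 * (avg_msg_size + 128) * max_messages)
            + (16 * max_messages * hash_indexes))

def instance_memory_estimate(sow_estimates, clients, message_memory_limit,
                             journal_size, queued_messages,
                             early_acknowledgments=0):
    # Overall estimate for the AMPS server (bytes):
    #   5GB + SowSizeEstimate + (C * 4096) + TMemLimit
    #       + (J * 2) + (Q * 250) [+ (QA * 20)]
    return (5 * GB
            + sum(sow_estimates)
            + clients * 4096
            + message_memory_limit
            + journal_size * 2
            + queued_messages * 250
            + early_acknowledgments * 20)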

As a simple example, a general estimate of the amount of memory that should be left available to run an instance of AMPS might be:

5GB + [ ( 2 * (1024+128) * 4,750,000 ) + (16 * (4,750,000 * 2) ) +
        ( 2 * ( 512+128) * 3,000,000 ) + ( 0 ) +
        ( 2 * (1024+128) * 8,000,000 ) + (16 * (8,000,000 * 4) ) ]
    + ( 200 * 4096) + ( 10,000,000,000) +
      ( 1,000,000,000 * 2 )  + ( 750,000 * 250)

where:

For the SowSizeEstimate, the instance will have two Topics and a View.

For the first topic:

S = 1024

M = 4,750,000

H = 2

For a view over the first topic (the view uses no HashIndexes):

S = 512

M = 3,000,000

H = 0

For the second topic:

S = 1024

M = 8,000,000

H = 4

For the overall AMPS estimate:

C = 200

TMemLimit = 10,000,000,000 (10GB)

J = 1,000,000,000 (1GB)

Q = 750,000

This shows sizing for an AMPS deployment with the following characteristics:

  • Three topics in the SOW (including one view):

    • One topic has a message size of 1024 bytes, will hold 4.75 million messages and configures 2 hash indexes.

    • One view has a message size of 512 bytes, will hold 3 million messages and does not configure a hash index.

    • One topic has a message size of 1024 bytes, will hold 8 million messages and configures 4 hash indexes.

  • A maximum of 10GB of memory for in-flight messages and working state (for aggregated subscriptions, pagination sets and so on)

  • A journal size setting configured to 1GB

  • A maximum of 750,000 total unacknowledged messages at a time across all message queues

  • No more than 200 clients connected simultaneously

  • No external modules loaded

This estimate suggests that no less than 52GB of physical memory on the server should be available for the AMPS instance itself while AMPS is processing the expected volume of messages. When AMPS first starts, or if traffic is light, AMPS may consume less than the estimated amount. AMPS may also consume more than this amount of memory during memory-intensive operations in some cases.
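
Plugging the example values into the Python sketch above reproduces this estimate:

# Values from the worked example above.
sow = [sow_topic_estimate(1024, 4_750_000, 2),  # first topic
       sow_topic_estimate(512, 3_000_000, 0),   # view over the first topic
       sow_topic_estimate(1024, 8_000_000, 4)]  # second topic

total = instance_memory_estimate(sow, clients=200,
                                 message_memory_limit=10 * GB,
                                 journal_size=1 * GB,
                                 queued_messages=750_000)
print(f"{total / GB:.1f} GB")  # approximately 51.1 GB, rounded up to 52GB above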

The formula in this section is a general estimate designed to produce a recommended minimum amount of physical memory to have available for AMPS. It is intended as a guideline when actual measurements are not available. For more accurate estimates, use measurements of the expected workload. A given instance of AMPS may not match these estimates at any particular time, based on usage, precise configuration, traffic, client activity, and so forth.

Estimating Overall System Capacity

The AMPS instance memory usage is one component of estimating the needs of the overall system. In addition, operating system tasks, management and maintenance software (including monitoring, security and management tools), and any other applications running on the system must also be taken into consideration.

Further, Linux memory management is most efficient when the operating system has 10-20% headroom.

For best performance and a lower risk of problems related to an unexpected spike in message volume, 60East recommends factoring in all of the components that will consume memory on the system, and then sizing the overall physical memory to handle 200% of the estimated capacity while still retaining 10-20% of physical RAM free. Note that these are rough guidelines. An especially critical system, or a system that has in the past seen larger volumes, might size memory to 350% or more, while a less critical system might allocate less than 200% of the estimated capacity. A VM on a developer desktop might be sized at or below the capacity estimate, since the system is completely under the control of a single user and is not intended to handle production loads.

For example, in the estimate above, the system should reserve a minimum of 52GB of free RAM for the AMPS process itself. Suppose that the monitoring, access control and server management software are very lightweight and only consume 3GB of memory under production load. The following estimates would be reasonable:

  • Production server with strict SLA and tolerance for usage variation - 128GB

    This estimate accounts for 200% of the estimated required capacity while still allowing 18GB (between 10 and 20% of the physical memory) free for efficient memory management. (Calculation: 52GB for AMPS, 3GB for the monitoring software = 55GB. Multiply by 2 = 110GB.) For a server that needs to be available during periods of heavy activity, this sizing could be a good option.

  • Production server with stable usage or variable SLA - 96GB

    This estimate covers the expected capacity of the AMPS server and monitoring software and leaves enough headroom for the operating system, but does not allow for large growth in volume or changes in usage patterns. For a server with predictable usage and volumes, or a server where some performance impact is acceptable if volume or usage increases and where the general estimates for AMPS capacity are known to be very precise, this server size could be a good option.

  • Shared development server with minimal performance SLA - 64GB

    This minimal estimate covers the expected capacity of the AMPS server and monitoring system, but does not leave enough excess capacity for the server to absorb unexpected traffic. This server would be expected to have periodic performance degradation, and possibly to exit due to out-of-memory conditions, if the volume of traffic increases or the usage pattern changes. This could be a good sizing for an instance that is used only for development, where the instance does not need to provide any particular performance guarantees and it is acceptable for the server to be temporarily unavailable if there is an unexpected increase in load.
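
The sizing approach illustrated by these examples amounts to a simple calculation. The following hypothetical Python helper (the function name and defaults are invented for illustration) applies the 200% growth factor and the 10-20% headroom guideline:

def recommended_physical_memory_gb(amps_estimate_gb, other_software_gb,
                                   growth_factor=2.0, headroom=0.15):
    # Cover growth_factor times the estimated working set while leaving
    # roughly 'headroom' (10-20%) of physical RAM free for the OS.
    working_set = (amps_estimate_gb + other_software_gb) * growth_factor
    return working_set / (1.0 - headroom)

print(recommended_physical_memory_gb(52, 3))  # ~129; the 128GB sizing above
                                              # leaves ~14% headroom over the
                                              # 110GB working set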

Storage

AMPS needs enough space to store its own binary images, configuration files, SOW persistence files, log files, transaction log journals, and slow client offline storage, if any. Not every deployment configures a SOW or transaction log, so the storage requirements are largely driven by the configuration.

AMPS Log Files

Log file sizes vary depending on the log level and how the engine is used. For example, in the worst case (trace level logging), AMPS will need at least enough storage for every message published into AMPS and every message sent out of AMPS, plus 20%.

For info level logging, a good estimate of AMPS log file sizes would be 2MB per 10 million messages published.

Logging space overhead can be capped by implementing a log rotation strategy which uses the same file name for each rotation. This strategy effectively truncates the file when it reaches the log rotation threshold to prevent it from growing larger.

SOW Topics

When calculating the amount of storage to reserve for topics in the SOW, there are several factors to keep in mind: the average size of messages stored in the SOW, the number of messages stored in the SOW, and the SlabSize defined in the configuration file for each Topic. Using these values, it is possible to estimate the minimum and maximum storage requirements for the SOW.

A rough estimate of the minimum size for a SOW topic is as follows:

Min = ( MsgSize * MsgCount ) + ( Cores * SlabSize )

where:

Min = Minimum SOW size

MsgSize = Average SOW message size

MsgCount = Number of SOW messages

Cores = Number of processor cores in the system

SlabSize = Slab Size for the SOW

A rough estimate of the maximum size for a SOW topic is as follows:

Max = ( ( MsgCount / ( ( SlabSize / MsgSize ) / 2 ) ) * SlabSize ) + ( Cores * SlabSize )

where:

Max = Maximum SOW size

MsgCount = Number of SOW messages

SlabSize = Slab size for the SOW

MsgSize = Average SOW message size

Cores = Number of CPU cores in the system

The actual storage requirement should fall between the two values above; however, it is still possible for the SOW to consume additional storage based on the unused capacity configured for each SOW topic.

Notice that, as suggested in this calculation, AMPS reserves the configured SlabSize for each processor core in the system the first time a thread running on that core writes to the SOW.

For example, in an AMPS configuration file with the SlabSize set to 1MB, the SOW for this topic will consume 1MB per processor core even with no messages stored in the SOW. Pre-allocating SOW capacity in chunks, as each chunk is needed, is more efficient for the operating system and storage devices, and helps amortize the SOW extension costs over more messages.
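
As an illustration, both bounds can be computed together. This Python sketch encodes the two formulas above; the function name and the example values (including the 16-core host) are for illustration only:

def sow_topic_storage_bounds(msg_size, msg_count, slab_size, cores):
    # Minimum: ( MsgSize * MsgCount ) + ( Cores * SlabSize )
    minimum = msg_size * msg_count + cores * slab_size
    # Maximum assumes each slab is, on average, only half-filled:
    #   ( MsgCount / ((SlabSize / MsgSize) / 2) ) * SlabSize + Cores * SlabSize
    msgs_per_slab = (slab_size / msg_size) / 2
    maximum = (msg_count / msgs_per_slab) * slab_size + cores * slab_size
    return minimum, maximum

# 1KB messages, 1 million messages, 1MB SlabSize, on a 16-core host:
low, high = sow_topic_storage_bounds(1024, 1_000_000, 1_000_000, 16)
print(f"{low / 1e9:.2f} GB to {high / 1e9:.2f} GB")  # about 1.04 GB to 2.06 GB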

It is also important to be aware of the maximum message size that AMPS guarantees the SOW can hold. The maximum message size is calculated in the following manner:

Max = SlabSize - 64 bytes

where:

Max = Maximum message size that can be stored in the SOW

SlabSize = The configured SlabSize for the SOW

This calculation says that the maximum message size that can be stored in the SOW in a single message is the SlabSize minus 64 bytes for the record header information.

Transaction Logs

Transaction logs are used for message replay, replication and to ensure consistency in environments where each message is critical. Transaction logs are optional in AMPS (though some features require them), and transaction logs can be configured to record specific topics.

When planning for transaction logs, there are three main considerations:

  1. The total size needed for the transaction log, including in disaster recovery scenarios

  2. The size to allow for each file that makes up the transaction log

  3. How many files to preallocate

You can calculate the approximate total size of the transaction log once the system reaches steady state as follows:

Capacity = ( S + 512 bytes ) * N  + ( J ) + ( TI )

where:

Capacity = Estimated storage capacity required for transaction log

S = Average message size

N = Number of messages to retain

J = Journal file size

TI = Topic Index (if configured), the larger of 200MB or (64 bytes * number of messages indexed)
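
For illustration, this estimate can also be written as a short Python sketch (the function name and parameters are invented; the formula is the one above):

def transaction_log_capacity(avg_msg_size, messages_retained,
                             journal_file_size, indexed_messages=0):
    # Capacity = ( S + 512 ) * N + J + TI
    # TI (if a topic index is configured) is the larger of 200MB
    # or 64 bytes per indexed message.
    topic_index = max(200_000_000, 64 * indexed_messages) if indexed_messages else 0
    return ((avg_msg_size + 512) * messages_retained
            + journal_file_size
            + topic_index)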

Size your journal files to match the aging policy for the transaction log data. To remove data from the transaction log, use AMPS actions to remove the journal files that are no longer needed; sizing your files appropriately makes this easier. For example, if your application typically generates 100GB a day of transaction log, you could size your files in 25GB units to make it easier to remove data in 100GB increments.

AMPS allows you to preallocate files for the transaction log. For applications that are very latency-sensitive, preallocation can help provide consistent latency. We recommend that those applications preallocate files, if storage capacity and retention policy permit. For example, an application that sees heavy throughput during a working day might preallocate enough files so that there is no need for additional allocation within the working day.

Notice that, if your application uses replication, the AMPS transaction log maintenance actions will not delete unreplicated messages that this instance is responsible for replicating. This means that, when calculating the maximum storage space required, the recovery window for a failure is also important. For example, many systems have a policy of not restarting a failed system until a scheduled maintenance window: if one server in a replicated set of servers could, potentially, be offline for up to 8 hours, then the other servers must be able to store a minimum of 8 hours of journals, even in cases where the normal retention period would be shorter.

File-Backed Queue Metadata

AMPS can, optionally, persist queue metadata to the filesystem to allow metadata to be paged out of memory and potentially improve recovery time.

For any queue that uses the FileBackedMetadata option, the following formula can be used to estimate the storage space that AMPS may use for each queue:

GREATER OF ( (MaxMsgCount * 250 bytes ) or 4MB )

where:

MaxMsgCount = Maximum number of messages active in the queue

The maximum number of messages active is the largest number of unacknowledged messages in the queue since the instance was started, as measured by the maximum value of the queue_depth metric for the queue. Notice that if the queue also uses the TargetQueueDepth option, the active message count will typically be the TargetQueueDepth unless AMPS has temporarily expanded this depth to avoid halting queue delivery.

AMPS preallocates files of approximately 4MB for the metadata cache, then grows the file if needed to maintain metadata. The size of the file does not shrink while the instance is running. AMPS may reduce the size of the file during recovery. As with all capacity estimates, this formula is intended to provide a working approximation of the amount of disk space needed, and does not mean that the file will be precisely the size the formula indicates.
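
This estimate reduces to a one-line calculation, sketched here in Python for illustration:

def queue_metadata_file_estimate(max_active_messages):
    # GREATER OF ( (MaxMsgCount * 250 bytes) or 4MB )
    return max(max_active_messages * 250, 4_000_000)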

Choosing Storage Devices

The previous sections discuss how much storage to provision; however, there are scenarios where the performance of the storage devices must also be taken into consideration.

In cases where messages are persisted (to the transaction log, to a topic in the SOW, or both), overall throughput of the instance can be limited by the performance of the storage device. It is important that the storage device be able to keep up with the peak rate at which the instance will receive messages.

Different aspects of the AMPS server have different patterns of access to storage.

For each of these uses, ensure that the underlying system has enough I/O bandwidth to meet the needs of an instance. For example, publishing to a topic that is in the SOW and also recorded in the transaction log will write the message to both the SOW and the transaction log, and may (depending on logging settings) also generate a write to the error and event log.

Consider a case where an instance is recording messages in the transaction log at a high incoming message rate. If performance greater than 50MB/second is required for the AMPS transaction log, experience has demonstrated that flash storage (or better) is recommended: magnetic hard disks cannot sustain rates above this level with a consistent latency profile.

For applications that require high performance and persist state, 60East recommends separate storage for the SOW, the transaction log and the error and event log where practical.

CPU

SOW queries with content filtering make heavy use of CPU-based operations and, as such, CPU performance directly impacts the content filtering performance and rates at which AMPS processes messages. The number of cores within a CPU largely determines how quickly SOW queries execute.

AMPS contains optimizations which are only enabled on recent 64-bit x86 CPUs. To achieve the highest level of performance, consider deploying on a CPU which includes support for the SSE 4.2 instruction set.

To give an idea of AMPS performance, repeated testing has demonstrated that a moderate query filter with 5 predicates can be executed against 1KB messages at more than 1,000,000 messages per second, per core, on an Intel i7 3GHz CPU. This applies to both subscription-based content filtering and SOW queries. Actual messaging rates will vary based on matching ratios and network utilization.

Network

When capacity planning a network for AMPS, the requirements for messaging traffic are largely dependent on the following factors:

  • Average message size

  • The rate at which publishers will publish messages to AMPS

  • The number of publishers and the number of subscribers

AMPS requires sufficient network capacity to service inbound publishing as well as outbound messaging requirements. In most deployments, outbound messaging to subscribers and query clients has the highest bandwidth requirements, due to the increased likelihood of a “one to many” relationship in which a single published message matches subscriptions or queries for many clients.

Estimating network capacity requires knowledge about several factors, including but not limited to: the average message size published to the AMPS instance, the number of messages published per second, the average expected match ratio per subscription, the number of subscriptions, and the background query load. Once these key metrics are known, the necessary network capacity can be calculated as:

R * Sz * ( 1 + M * Sb ) + Q

where:

R = Rate

Sz = Average Message Size

M = Match Ratio

Sb = Number of Subscribers

Q = Query Load

where “Query Load” is defined as:

Mq * S * Qs

where:

Mq = Messages per Query

S = Average Message Size

Qs = Queries per Second

Consider a deployment required to process published messages at a rate of 5000 messages per second, with each message having an average size of 600 bytes, where the expected match rate per subscription is 2% (or 0.02) and there are 100 subscriptions. The deployment is also expected to process 5 queries per minute (or 1/12 of a query per second), with each query expected to return 1000 messages.

5000 * 600 B * ( 1 + 0.02 * 100 ) + ( 1000 * 600 B * 1/12 ) ~ 9 MB / s ~ 72 Mb / s

Based on these requirements, this deployment would need at least 72Mb/s of network capacity to achieve the desired goals, which by itself would fit within a 100Mb/s class network. It is important to note that this analysis does not examine any other network-based activity which may exist on the host; as such, networking infrastructure with a larger capacity than 100Mb/s would likely be required.
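
As a quick check, the calculation above can be reproduced with a short Python sketch (the function and parameter names are for illustration):

def network_capacity_bytes_per_sec(rate, avg_msg_size, match_ratio, subscribers,
                                   msgs_per_query=0, queries_per_sec=0.0):
    # R * Sz * ( 1 + M * Sb ) + Q, where Q = Mq * S * Qs
    query_load = msgs_per_query * avg_msg_size * queries_per_sec
    return rate * avg_msg_size * (1 + match_ratio * subscribers) + query_load

bps = network_capacity_bytes_per_sec(5000, 600, 0.02, 100,
                                     msgs_per_query=1000, queries_per_sec=5 / 60)
print(f"{bps / 1e6:.2f} MB/s ~ {bps * 8 / 1e6:.0f} Mb/s")  # 9.05 MB/s ~ 72 Mb/s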

Replication Network Bandwidth

For replication connections, the general recommendation is to estimate bandwidth needs as though each outgoing replication destination is a subscriber that subscribes to all of the replicated topics, and each incoming destination is a publisher that fully publishes the replicated topics. Although AMPS replication connections support compression, the general recommendation is to provision enough network capacity to support the full replication stream, and then to use compression to save capacity.

Additional Network Considerations

When calculating the available bandwidth for an instance, it is also important to take into account any other use of the network. For example, if the system uses network attached storage, and the traffic to that storage is not isolated from messaging traffic, bandwidth to the storage device should also be taken into account when planning the network capacity available to the instance.

Likewise, any other process that consumes bandwidth, such as monitoring applications or log collection processes, should be considered when planning the overall bandwidth capacity available.

NUMA Considerations

AMPS is designed to take advantage of non-uniform memory access (NUMA). For the lowest latency in networking, we recommend that you install your NIC in the slot closest to NUMA node 0. When AMPS NUMA tuning is enabled, AMPS runs critical threads on node 0, so positioning the NIC closest to that node provides the shortest path from processor to NIC.

When a single instance of AMPS is deployed on the system (physical host), as is the case with most critical production systems, 60East recommends leaving AMPS NUMA tuning enabled (this is the default).

If more than one instance of AMPS is running on the same physical host, or if other CPU-intensive processes are running on the same physical host, 60East recommends disabling AMPS NUMA tuning in the AMPS configuration file and relying on the operating system NUMA management. Likewise, if a mechanism is used to restrict AMPS to specific processors, AMPS NUMA tuning should be disabled.

When AMPS is deployed on a virtual machine, 60East recommends disabling the AMPS level NUMA tuning in the configuration file.
