Storing Multiple Logical Topics in One Physical Topic

AMPS provides a way to define a set of related SOW topics by specifying a Pattern element in the Topic configuration. When this element is present, AMPS creates a container SOW topic that can hold a number of SOW topics in one physical file. A publish to a topic name that matches the Pattern is treated as a publish to an individual SOW topic within the container topic that defines the pattern. The container Topic determines the definition of each individual topic (for example, the Key values and the hash indexes), and that definition is the same for every individual topic within the container SOW topic.

Using this approach creates a single physical topic (that is, the container is a single in-memory SOW topic and, when the topic is persisted, a single file) that contains records for any number of individual topic names. The topics within the container maintain the last value of each individual record within each of the topics. Publishers and subscribers can use these topics as though the topics were each configured individually as a Topic in the AMPS configuration file (with some minor behavioral differences resulting from all of the topics and messages being stored in the same data structure, as described in the following sections).
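The last-value behavior described above can be pictured with a small Python model. This is only a sketch and not how AMPS is implemented: the `ContainerSow` class, its methods, and the choice to identify a record by the pair (topic name, key value) are assumptions made for illustration.

```python
import re


class ContainerSow:
    """Toy model of a container SOW topic: one store, many topic names."""

    def __init__(self, pattern, key_field):
        self.pattern = re.compile(pattern)
        self.key_field = key_field
        # One shared store; a record is identified by (topic name, key value).
        self.records = {}

    def publish(self, topic, message):
        """Store the message if the topic name matches; return True if stored."""
        if not self.pattern.search(topic):
            return False
        self.records[(topic, message[self.key_field])] = message
        return True

    def query(self, topic):
        """Return the current records for one exact logical topic name."""
        return [m for (t, _), m in self.records.items() if t == topic]


sow = ContainerSow(r"^/orders/", key_field="orderId")
sow.publish("/orders/customerA", {"orderId": 1, "qty": 10})
sow.publish("/orders/customerA", {"orderId": 1, "qty": 25})  # replaces the record
sow.publish("/orders/customerB", {"orderId": 1, "qty": 7})   # same key, other topic
stored = sow.publish("/customers/customerA", {"orderId": 1, "qty": 1})  # no match
```

In this model, a query for /orders/customerA returns only the most recent message for order 1, while /orders/customerB keeps its own record even though the key value is the same.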

Although AMPS treats every topic within the container SOW topic as a distinct topic for the purposes of publishing and subscribing, AMPS manages those topics as records within a single SOW object. When the overall SOW topic is persisted, every message for an individual topic is stored within the same file. Likewise, the overall SOW topic is treated as a single topic in memory (including for monitoring and statistics purposes). In cases where an application has a large number of topics and each topic has a small number of messages (typically, in cases where each topic has only a single message), a Pattern can require considerably less memory than individual Topic entries for the same number of topics and messages.

60East recommends using a Pattern for a topic in situations where an existing system uses topic names rather than content filtering, and it is not practical to adjust the system. For example, when migrating a legacy system that distinguishes orders for different customers using different topic names rather than using the content of the message (such as using topic names /orders/customerA and /orders/customerB rather than including a customer field on the message), creating a SOW topic using a Pattern of ^/orders/ might be the most straightforward way to adapt the system to AMPS.

For a small number of topics, or cases where an individual topic would have a large number of entries, 60East recommends using individual topics rather than specifying a Pattern.

When to Use a Pattern in a Topic

The Pattern element allows you to define a large number of SOW topics that will hold a small number of records (typically, only one record per topic) while minimizing the memory and storage overhead for each topic. This can be especially helpful when migrating a system that uses topic-based routing to AMPS, since you can easily create a large number of topics (hundreds or thousands) without having to explicitly specify each one in the AMPS configuration file in cases where it is important to query the last value of each topic.

Consider using the Pattern element in cases where:

  • You need to be able to query the current value of a record (or topic). If you do not need to query current values, there is no need to define topics in the SOW at all (consider using ad hoc topics instead).

  • The information that determines whether a given message is unique is not contained in the message itself. If that information is already present in the data, it is more efficient to use a Topic with the unique property configured as a Key.

  • The messages have the same structure and are the same logical type of message. Messages that are different types, or that have different structures, would typically be represented in different topics.

  • Your application requires a large number of topics, or you do not know the topics in advance, such that it is impractical to define the topics using individual Topic declarations.

  • Each unique topic will have a small number of messages (ideally, only one message per topic).

  • Your application does not require historical (point-in-time) queries or enrichment of the messages.

  • All of the topics to be managed together have the same general set of permissions. AMPS does not support applying different entitlements to individual topics within an overall SOW topic (some limited workarounds are available through content filters).

If any of the above considerations are not true, consider using a set of Topic declarations rather than using the Pattern element in a single Topic.

Container topics are most commonly used when adapting a system that did not support content filtering (content-based routing) to an AMPS-based application in cases where the message data itself does not contain enough information to support content-based routing. Applications designed for AMPS most frequently use a Topic and content filtering rather than specifying a Pattern for a Topic and providing routing information in the topic name.

Limitations When Storing Multiple Logical Topics in a Physical Topic

For most purposes, topics that use a Pattern work just like any other topic defined using the Topic directive. However, there are some differences in behavior, as outlined below:

  • When an application issues a sow or sow_and_subscribe that uses a regular expression for the topic name, messages from the topic names within a topic that uses Pattern are delivered between a single group_begin and group_end pair. Messages from any topic name within the topic may be delivered in any order within the query results. Each message indicates the topic name within the container that it originated from.

  • A topic that uses Pattern cannot be the underlying topic for a view.

  • A topic that uses Pattern can be the underlying topic for a conflated topic, but the conflated topic must be configured to use such a topic.

  • All of the topic names within the topic must have the same permissions.
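Because a regular expression query returns messages for many topic names in a single group and in no guaranteed order, a client that needs per-topic results has to regroup them using the topic name carried on each message. The Python sketch below shows one way to do that. It does not use the AMPS client library: the `(command, topic, data)` tuples are a simplified stand-in for received messages, and `group_by_topic` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_topic(results):
    """Collect the 'sow' messages of one result group, keyed by topic name."""
    by_topic = defaultdict(list)
    in_group = False
    for command, topic, data in results:
        if command == "group_begin":
            in_group = True
        elif command == "group_end":
            in_group = False
        elif command == "sow" and in_group:
            by_topic[topic].append(data)
    return dict(by_topic)


# Hypothetical result stream for a query on the regular expression "^/orders/".
results = [
    ("group_begin", None, None),
    ("sow", "/orders/MSFT", {"orderId": 7}),
    ("sow", "/orders/RHAT", {"orderId": 3}),
    ("sow", "/orders/MSFT", {"orderId": 9}),
    ("group_end", None, None),
]
grouped = group_by_topic(results)
```

The helper relies only on the per-message topic name, so it makes no assumption about the order in which messages arrive within the group.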

Configuration File Precedence

AMPS allows you to define a standalone topic, view, queue, or conflated topic with a Name that matches the Pattern of the Topic. To do so, however, that definition must appear in the configuration file before the definition of the topic that uses the Pattern. The topic, view, queue, or conflated topic will be configured as though the topic that defined the Pattern is not present.

For example, the following SOW configuration creates a Topic named /orders/specialHandling and a Topic with a Pattern that matches ^/orders/. The /orders/specialHandling topic adds preprocessing, and could also, in principle, have different permissions than the topic names that are matched by the Pattern.

<SOW>
  <Topic>
     <Name>/orders/specialHandling</Name>
     <MessageType>json</MessageType>
     <Key>/orderId</Key>
     <Preprocessing>
       <Field>COALESCE(/orderId,
                  CONCAT(/customerName, /customerSerialNumber)) as /orderId</Field>
     </Preprocessing>
     <FileName>./sow/%n.sow</FileName>
  </Topic>

  <!-- Regular expression topic since this definition
       has a Pattern directive. -->

  <Topic>
     <Name>RegexOrders</Name>
     <Pattern>^/orders/</Pattern>
     <MessageType>json</MessageType>
     <Key>/orderId</Key>
     <FileName>./sow/%n.sow</FileName>
  </Topic>
</SOW>

With these definitions, a publish to the following topic names would produce the following results:

| Message Published to Topic | Results |
| --- | --- |
| /orders/specialHandling | Matches Topic definition. Stored in the Topic. The Preprocessing directive runs and creates the /orderId from the /customerName and /customerSerialNumber if there is no /orderId already present. |
| /orders/RHAT | Matches the regular expression topic definition, stored in the regular expression topic. |
| /orders/specialHandling/oops | Matches the regular expression topic definition, stored in the regular expression topic. Notice that a Topic definition is an exact match on the topic name, not a pattern match. |
| /customer/orders/timothy_someone | Does not match either the Topic or the regular expression topic. Not included in the SOW. |
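The results above can be reproduced with a short Python model of the matching rules: a Name is compared for exact equality, a Pattern is applied as a regular expression, and definitions are considered in configuration-file order. This is a sketch of the documented outcome, not AMPS internals. The `DEFINITIONS` list and the `resolve` function are hypothetical, and the sketch does not attempt to represent what AMPS does if the standalone topic is declared after the Pattern topic.

```python
import re

# Definitions in configuration-file order: (label, kind, value).
DEFINITIONS = [
    ("/orders/specialHandling", "name", "/orders/specialHandling"),
    ("RegexOrders", "pattern", r"^/orders/"),
]


def resolve(topic, definitions=DEFINITIONS):
    """Return the label of the first definition that accepts the topic name."""
    for label, kind, value in definitions:
        if kind == "name" and topic == value:  # Name: exact match only
            return label
        if kind == "pattern" and re.search(value, topic):  # Pattern: regex match
            return label
    return None  # not stored by either definition


published = [
    "/orders/specialHandling",
    "/orders/RHAT",
    "/orders/specialHandling/oops",
    "/customer/orders/timothy_someone",
]
routing = {topic: resolve(topic) for topic in published}
```

Here `routing` maps /orders/specialHandling to the standalone topic, the next two names to RegexOrders, and the last name to `None`.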

Entitlements for Logical Topics

AMPS considers permissions for all of the logical topics within the physical SOW topic to be identical. When checking permissions for the topic with an entitlement module, AMPS requests that the module provide permissions for the Pattern specified in the topic. Any topic name included in the container will use the permissions, entitlement filter, and entitlement select list specified by the module for that Pattern.

If it becomes necessary to restrict access to individual topics within the physical topic, there are two approaches that you can take:

  1. Create a new topic with a Pattern that specifies the topics that require different permissions, and apply the permissions to that topic.

  2. Provide an entitlement filter that uses the TOPIC_NAME() function to restrict access to specific topic names; for example, TOPIC_NAME() IN ('/orders/RHAT', '/orders/MSFT', '/orders/IBM'). Using this method is less efficient than providing permissions for those topics (either as standalone topics, or for a regular expression topic containing exactly those three topics), but this approach can be a good option in cases where subscribers typically subscribe only to the topics they are entitled to, different subscribers have substantially different sets of entitlements, or there are no logical or convenient groupings that can be used to separate the topics into several regular expression topic declarations.
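To compare the two approaches, the Python sketch below checks that a narrow pattern covering exactly the three example topic names admits the same names as the TOPIC_NAME() IN (...) filter. The pattern `^/orders/(RHAT|MSFT|IBM)$` is an assumed example, not taken from the AMPS documentation, and the functions model only the access decision, not how AMPS evaluates entitlements.

```python
import re

ENTITLED = {"/orders/RHAT", "/orders/MSFT", "/orders/IBM"}

# Approach 1: a hypothetical Pattern for a separate topic with its own permissions.
RESTRICTED_PATTERN = re.compile(r"^/orders/(RHAT|MSFT|IBM)$")


def pattern_allows(topic_name):
    """Model of approach 1: the name belongs to the narrower Pattern topic."""
    return RESTRICTED_PATTERN.search(topic_name) is not None


def filter_allows(topic_name):
    """Model of approach 2: TOPIC_NAME() IN ('/orders/RHAT', ...)."""
    return topic_name in ENTITLED


samples = [
    "/orders/RHAT",
    "/orders/MSFT",
    "/orders/IBM",
    "/orders/AAPL",
    "/orders/IBM/extra",
]
decisions = {name: (pattern_allows(name), filter_allows(name)) for name in samples}
```

Both checks accept the same three names in this sketch. The difference described above is where the decision is made: once per topic for a separate Pattern topic, or per message when an entitlement filter is evaluated.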

Copyright 2013-2024 60East Technologies, Inc.