Protobuf Message Types

Protocol buffers, or protobuf for short, are an efficient, automated mechanism for serializing structured data. AMPS supports Google protobuf messages (version 2 and version 3) as a message format.

Because Google protocol buffers use a fixed format for messages, you must configure AMPS with the definition of the messages that AMPS will process. To do this, define a MessageType. AMPS cannot parse protobuf messages without one.

60East recommends that the .proto files used with AMPS explicitly declare the protocol buffer syntax version used. If there is no explicit declaration, AMPS assumes the file uses protocol buffer 2 syntax.

The AMPS engine is message-type agnostic. Except for the limitations described in this section, there is no difference to the AMPS engine between message types that use protocol buffers and other message types such as JSON, XML, or FIX.

Configuring Protobuf Message Types

To use a protobuf message, you must first edit the configuration file to include a new MessageType. Then, specify the path to the protobuf file and the name of the protobuf file itself inside the MessageType. Below is a sample configuration of a protobuf message type:

...

<MessageType>
    <Name>my-protobuf-messages</Name>
    <Module>protobuf</Module>
    <ProtoPath>proto-archive;/mnt/shared/protofiles</ProtoPath>
    <ProtoFile>proto-archive/person.proto</ProtoFile>
    <Type>MyNamespace.Message</Type>
</MessageType>

...

Each message type references a ProtoFile and specifies a single top-level type from that file. The ProtoFile may include other files through the standard protocol buffer import mechanism. Likewise, the top-level type may be any valid protocol buffer definition, including definitions that contain other types.

Once the protocol buffer MessageType is created as described above, you must do one of the following: create a Transport that specifies exactly that message type, or create a Transport that accepts any known message type and ensure that the client specifies the new message type (in the example case, my-protobuf-messages) in the connect string.
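As a minimal sketch of the second option, the snippet below assembles a connect string that ends with the message type name from the sample configuration. The helper name, host, and port are hypothetical, and the layout (transport, host, port, protocol, message type) is an assumption based on the usual form of AMPS connect strings; check the documentation for your client library before relying on it.

```python
# Hypothetical helper: builds a connect string whose final path segment
# is the Name of the MessageType configured on the server.
def build_connect_uri(host: str, port: int, message_type: str,
                      transport: str = "tcp", protocol: str = "amps") -> str:
    return f"{transport}://{host}:{port}/{protocol}/{message_type}"

# Host and port are placeholders; the message type matches the <Name>
# element in the sample configuration.
uri = build_connect_uri("localhost", 9007, "my-protobuf-messages")
print(uri)
```

If the Transport is configured for exactly one message type, the final segment may not be needed; that depends on how the Transport is set up.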

When creating a protobuf message type, you must provide the parameters shown in the sample configuration above: Name, Module, ProtoPath, ProtoFile, and Type.

Filtering with Protobuf Messages

To filter protobuf messages, there are a couple of conventions you must remember. AMPS XPath identifiers begin at the outermost message, so you can simply use member names for that message. If you have nested messages, you use the name of the nested message and the member name when creating an XPath identifier.

For example, suppose you have the following definition in a .proto file:

message person {
    required string name = 1;
    required int32 personID = 2;
}

To access the personID data member, you simply use the name of the data member as the XPath identifier. An example filter that verifies that a personID is greater than 1000 would be:

/personID > 1000

If you have nested messages, you simply provide the path to the nested message you want to access.

Let's assume that the person message from the above example was nested inside another message with the name of record. The example filter below shows how to access the nested person message, and then filter to the personID:

/person/personID > 1000

In this case, the first part of the identifier (/person) specifies the sub-message. The second part of the identifier (/personID) specifies the field within that sub-message. Notice that, as always, there is no need to specify the name of the message for the outermost message.
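The way an identifier maps onto a message can be modeled with nested dictionaries. This is only an illustration of the path convention, not how AMPS evaluates filters; the resolve and matches helpers and the sample values are hypothetical.

```python
from typing import Any, Optional

def resolve(message: dict, identifier: str) -> Optional[Any]:
    """Walk a nested dict using an identifier such as '/person/personID'."""
    current: Any = message
    for part in identifier.strip("/").split("/"):
        if not isinstance(current, dict) or part not in current:
            return None  # absent field: treated as NULL in this sketch
        current = current[part]
    return current

def matches(message: dict, identifier: str, threshold: int) -> bool:
    """Model of a filter of the form '<identifier> > <threshold>'."""
    value = resolve(message, identifier)
    return value is not None and value > threshold

# Outermost message is `person`: its own name is not part of the identifier.
person = {"name": "Ada", "personID": 1234}
# Outermost message is `record`, which contains a nested `person`.
record = {"person": {"name": "Ada", "personID": 1234}}

print(matches(person, "/personID", 1000))
print(matches(record, "/person/personID", 1000))
```

Note that /personID alone does not find anything in the record message, because personID is not a member of the outermost message there.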

Working with Multiple Protocol Buffer Types

Some applications require messages of different types: for example, an inventory management system may work with customer records, inventory records, and shipping order records.

When using protocol buffers, each of these messages would use a different .proto file, and therefore would be a different message type. Unlike a self-describing format such as JSON or XML, the serialized form of a protocol buffer message type does not automatically contain any information about the type of message or the fields that the message contains. Therefore, each protocol buffer message type is best considered as a completely distinct type. For example, the parser created for an order record and the parser created for a customer record are different. Unlike self-describing formats, it is not possible to use a single parser for these types, or for a parser to correctly handle a previously-unknown message structure.
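To see why the serialized form cannot identify its own type, consider two messages that each declare a string field numbered 1, such as the customer_id fields of the Order and Payment definitions used later in this section. The sketch below hand-encodes that one field using the protocol buffer wire format (a tag byte, a length, then the payload). It is a simplified encoder for illustration only, and the helper names are hypothetical.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protocol buffer varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode one string field: tag, length, UTF-8 payload."""
    payload = text.encode("utf-8")
    tag = (field_number << 3) | 2  # wire type 2 = length-delimited
    return encode_varint(tag) + encode_varint(len(payload)) + payload

order_bytes = encode_string_field(1, "42")    # Order.customer_id = 1
payment_bytes = encode_string_field(1, "42")  # Payment.customer_id = 1

# The bytes carry a field number and a value, but no message or field name.
print(order_bytes == payment_bytes)
```

Both messages serialize to the same bytes, so only the .proto definition tells a parser which type it is reading.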

There are two approaches to working with multiple protocol buffer types in an AMPS application:

  1. Keep the message types distinct. Each message type requires a separate connection to AMPS. The advantage of this approach is that the .proto files can be maintained and updated separately. Each connection has a distinct type and only needs to handle messages of that type. The disadvantage of this approach is that the application must make a connection to AMPS for each type of message received.

  2. Create a "container" type that can optionally contain any of the needed message types. The advantage of this approach is that this requires only a single connection to AMPS. Since there is a single "container" type, a topic can hold this "container" type and have heterogeneous actual contents. The disadvantage to this approach is that it requires a consumer to understand the "container" type and changes to the contained types may need to be carefully managed across the consumers that use the container. A "container" type is typically a oneof of the contained types.

For example, you might define a container as follows:

message Container {
    // A oneof must be named; "contents" is a placeholder name.
    oneof contents {
      Order      order_type = 1;
      Payment    payment_type = 2;
    }
}

message Order {
    required string customer_id = 1;
    ...
}

message Payment {
    required string customer_id = 1;
    ...
}

In this case, the container type will include either an Order or a Payment.
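On the consumer side, handling a container means checking which member is set before processing the message. The sketch below models that with plain Python dataclasses rather than generated protobuf classes; the class layout and the describe function are hypothetical and only mirror the field names in the definition above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Order:
    customer_id: str

@dataclass
class Payment:
    customer_id: str

@dataclass
class Container:
    # Models the oneof: at most one of these may be set.
    order_type: Optional[Order] = None
    payment_type: Optional[Payment] = None

    def __post_init__(self) -> None:
        if self.order_type is not None and self.payment_type is not None:
            raise ValueError("a oneof holds at most one member")

def describe(container: Container) -> str:
    """Dispatch on whichever member of the container is set."""
    if container.order_type is not None:
        return f"order for customer {container.order_type.customer_id}"
    if container.payment_type is not None:
        return f"payment from customer {container.payment_type.customer_id}"
    return "empty container"

print(describe(Container(order_type=Order("42"))))
print(describe(Container(payment_type=Payment("7"))))
```

Every consumer of the topic needs this kind of dispatch, which is the coordination cost described for the container approach.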

Union Types

When using a protocol buffer message type that contains a union, you can navigate the union using the names defined in the top-level element. For example, given the union defined below:

message MyUnion {
    optional Order      order_type = 1;
    optional Payment    payment_type = 2;
}

message Order {
    required string customer_id = 1;
    ...
}

message Payment {
    required string customer_id = 1;
    ...
}

Providing a filter of /order_type IS NOT NULL will return all of the MyUnion messages that contain an Order, while providing a filter of /payment_type/customer_id = '42' will return only the MyUnion messages that contain a Payment message with a customer_id of 42.
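The effect of these two filters can be illustrated with plain dictionaries standing in for decoded MyUnion messages. This models the expected results only; the sample data is invented and the list comprehensions are not how AMPS evaluates filters.

```python
# Three hypothetical MyUnion messages, shown as decoded dictionaries.
messages = [
    {"order_type": {"customer_id": "42"}},
    {"payment_type": {"customer_id": "42"}},
    {"payment_type": {"customer_id": "7"}},
]

# Model of: /order_type IS NOT NULL
with_order = [m for m in messages if m.get("order_type") is not None]

# Model of: /payment_type/customer_id = '42'
payment_42 = [
    m for m in messages
    if (m.get("payment_type") or {}).get("customer_id") == "42"
]

print(len(with_order), len(payment_42))
```

The order message for customer 42 does not match the second filter, because the identifier names the payment_type member explicitly.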

Limitations of the Protobuf Message Type

Because the protobuf message type requires a specific, fixed definition for messages, AMPS does not support operations that construct messages that may contain arbitrary values. In particular, protobuf message types do not support:

  • Creating a View with a protobuf type as the MessageType. AMPS allows you to aggregate protobuf messages and project the results as another type, but the destination MessageType for a View cannot be a protobuf message type.

  • Creating an aggregated subscription for a topic that contains messages of a protobuf message type.

  • Subscriptions to AMPS internal topics. Protobuf message types do not support creating messages for AMPS internal topics, such as /AMPS/ClientStatus.

  • Enriching or preprocessing protobuf message types. AMPS does not support enrichment or preprocessing of protobuf messages.

Protocol buffer version 3 messages provide fixed default values for omitted fields. This means that there is no reliable way for AMPS to determine if a missing field has been intentionally left out of the message, or simply contains the fixed default value. The result is an additional limitation for protocol buffer version 3 message types:

  • Protocol buffer version 3 message types do not support delta publish or delta subscribe.

Protocol buffer version 2 message types can require that specific fields are provided in a message (that is, fields can be marked required). The result is an additional limitation for protocol buffer version 2 message types:

  • Protocol buffer version 2 message types do not support providing a subset of fields in a message by specifying a select list.

There are no other limitations in working with protocol buffer message types.

Working with Optional Default Values

Google protocol buffers provide the ability for a message to have fields that are both optional, so they need not be provided in the serialized message, and defaulted, so that there is a specific value interpreted when there is no value provided.

When no value is provided in the serialized message for an optional default value, AMPS interprets the message differently depending on the context:

  • For most uses, AMPS interprets the message as though the value is present and set to the default value. This means that you can filter on optional default values, use them as SOW keys, and aggregate optional default values regardless of whether a value is present in the serialized message.

  • For delta messaging with protocol buffer version 2, AMPS treats an optional default value as though there is no value present. AMPS does not provide the default value. This means that a delta update must provide the default value explicitly in the serialized message to set the field to the default value. This also means that, if the value present in the message is not the default value, but was not changed on the current update, AMPS will not emit that value in messages to delta subscribers. (Since delta messaging is not supported with protocol buffer version 3, this issue does not arise with that version.)
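The two interpretations can be contrasted with a small model. The schema, field names, and functions below are hypothetical; dictionaries stand in for protocol buffer version 2 messages, and a key that is missing from a dictionary represents a field that is not present in the serialized message.

```python
from typing import Any, Dict

# Hypothetical schema: field name -> declared default value.
DEFAULTS: Dict[str, Any] = {"status": "active", "priority": 5}

def read_for_filter(message: Dict[str, Any], field: str) -> Any:
    """Filter, SOW key, or aggregate context: a missing field reads as its default."""
    return message.get(field, DEFAULTS.get(field))

def apply_delta(stored: Dict[str, Any], delta: Dict[str, Any]) -> Dict[str, Any]:
    """Delta publish context: only fields present in the delta change."""
    merged = dict(stored)
    merged.update(delta)
    return merged

def delta_for_subscriber(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Any]:
    """Delta subscribe context: only fields whose value changed are sent."""
    return {name: value for name, value in new.items() if old.get(name) != value}

stored = {"status": "closed", "priority": 9}

# Omitting `status` from the delta leaves it unchanged; it is not reset.
after_omit = apply_delta(stored, {"priority": 1})
# To set the field back to its default, send the default explicitly.
after_explicit = apply_delta(stored, {"priority": 1, "status": "active"})

print(read_for_filter({}, "status"))
print(after_omit["status"], after_explicit["status"])
print(delta_for_subscriber(stored, after_omit))
```

In this model, a delta subscriber that receives only the priority field cannot conclude that status holds its default value; the field was left out because it did not change.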

Copyright 2013-2024 60East Technologies, Inc.