Protobuf Message Types
Protocol buffers, or protobufs for short, are an efficient, automated mechanism for serializing structured data. AMPS supports Google protobuf messages (version 2 and version 3) as a message format.
Because Google protocol buffers use a fixed format for messages, you must configure AMPS with the definition of the messages it will process before you can use protobuf. You do this by defining a MessageType: without one, AMPS cannot parse protobuf messages.
60East recommends that the .proto files used with AMPS explicitly declare the protocol buffer syntax version used. If there is no explicit declaration, AMPS assumes the file uses protocol buffer 2 syntax.
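For instance, a .proto file can declare its version on the first non-comment line. The sketch below assumes a proto3 file; the message name and field are hypothetical and are included only so that the file is complete.

```protobuf
// Explicit syntax declaration: must be the first non-comment,
// non-empty line of the file.
syntax = "proto3";

// Hypothetical message, present only to make the file complete.
message Example {
  int32 id = 1;
}
```

A file that uses version 2 syntax would declare `syntax = "proto2";` instead.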
The AMPS engine is message-type agnostic. Except for the limitations described in this section, there is no difference to the AMPS engine between message types that use protocol buffers and other message types such as JSON, XML, or FIX.
Configuring Protobuf Message Types
To use a protobuf message, you must first edit the configuration file to include a new MessageType. Then, specify the path to the protobuf file and the name of the protobuf file itself inside the MessageType. Below is a sample configuration of a protobuf message type:
Each message type references a ProtoFile and specifies a single top-level type from the file. The ProtoFile may include other files through the standard protocol buffer import mechanism. Likewise, the top-level type may be any valid protocol buffer definition, including definitions that contain other types.
Once the protocol buffer MessageType is created as described above, you must either create a Transport that specifies that message type exactly, or create a Transport that can accept any known message type and ensure that the client specifies the new message type (in the example case, my-protobuf-message) in the connect string.
When creating a protobuf message type, you must provide the following parameters:
Parameter | Description |
---|---|
Name | The name of the new, customized message type. The rest of the configuration file will use this name to refer to the message type. |
Module | The module that contains the message type. Use protobuf for protobuf message types. |
ProtoPath | The path in which to search for .proto files, optionally preceded by an alias. The alias provides a short identifier to use when searching for .proto files. The full path is the path that is substituted for that identifier. A configuration may omit the alias and simply provide the path. You may specify any number of ProtoPath elements. |
ProtoFile | The name of the .proto file. To use an alias, prefix the name of the file with the alias, as shown in the example above. |
Type | The name of the type inside the ProtoFile. AMPS requires a single type. |
Filtering with Protobuf Messages
To filter protobuf messages, there are a couple of conventions you must remember. AMPS XPath identifiers begin at the outermost message, so you can simply use member names for that message. If you have nested messages, you use the name of the nested message and the member name when creating an XPath identifier.
For example, suppose you have the following definition in a .proto file:
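A minimal sketch of such a definition follows. Only the message name person and the field personID come from the discussion in this section; the syntax version, field types, field numbers, and the name field are assumptions made for illustration.

```protobuf
syntax = "proto2";

// Outermost message: AMPS identifiers start inside this message,
// so its own name never appears in a filter.
message person {
  optional int32 personID = 1;   // type and field number are assumptions
  optional string name = 2;      // hypothetical extra field
}
```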
To access the personID data member, you simply use the name of the data member as the XPath identifier. An example filter that verifies that a personID is greater than 1000 would be:

/personID > 1000
If you have nested messages, you simply provide the path to the nested message you want to access.
Let's assume that the person message from the above example was nested inside another message with the name of record. The example filter below shows how to access the nested person message, and then filter to the personID:

/person/personID > 1000
In this case, the first part of the identifier (/person) specifies the sub-message. The second part of the identifier (/personID) specifies the field within that sub-message. Notice that, as always, there is no need to specify the name of the message for the outermost message.
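A definition matching this nested case might look like the following sketch. Only the names record, person, and personID come from the text; the syntax version, field types, and field numbers are assumptions.

```protobuf
syntax = "proto2";

message person {
  optional int32 personID = 1;   // type and field number are assumptions
}

// Outermost message. Its name (record) does not appear in the identifier.
message record {
  // The field name (person) is the first step of /person/personID.
  // The leading dot fully qualifies the type name, so it cannot be
  // confused with the field of the same name.
  optional .person person = 1;
}
```

The identifier follows field names rather than type names, so renaming the field in record would change the first step of the path even if the person type kept its name.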
Working with Multiple Protocol Buffer Types
Some applications require messages of different types: for example, an inventory management system may work with customer records, inventory records, and shipping order records.
When using protocol buffers, each of these messages would use a different .proto file, and therefore would be a different message type. Unlike a self-describing format such as JSON or XML, the serialized form of a protocol buffer message type does not automatically contain any information about the type of message or the fields that the message contains. Therefore, each protocol buffer message type is best considered as a completely distinct type. For example, the parser created for an order record and the parser created for a customer record are different. Unlike self-describing formats, it is not possible to use a single parser for these types, or for a parser to correctly handle a previously-unknown message structure.
There are two approaches to working with multiple protocol buffer types in an AMPS application:
- Keep the message types distinct. Each message type requires a separate connection to AMPS. The advantage of this approach is that the .proto files can be maintained and updated separately. Each connection has a distinct type and only needs to handle messages of that type. The disadvantage of this approach is that the application must make a connection to AMPS for each type of message received.
- Create a "container" type that can optionally contain any of the needed message types. The advantage of this approach is that it requires only a single connection to AMPS. Since there is a single "container" type, a topic can hold this "container" type and have heterogeneous actual contents. The disadvantage of this approach is that it requires a consumer to understand the "container" type, and changes to the contained types may need to be carefully managed across the consumers that use the container. A "container" type is typically a oneof of the contained types.
For example, you might define a container as follows:
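One possible shape for such a container is sketched here. The type names Order and Payment come from the surrounding text; the container name, the oneof name, and every field are hypothetical.

```protobuf
syntax = "proto3";

message Order {
  string order_id = 1;      // hypothetical field
}

message Payment {
  string payment_id = 1;    // hypothetical field
}

// Hypothetical container: at most one member of the oneof is set
// in any given message.
message Container {
  oneof content {
    Order order = 1;
    Payment payment = 2;
  }
}
```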
In this case, the container type will include either an Order or a Payment.
Union Types
When using a protocol buffer message type that contains a union, you can navigate the union using the names defined in the top-level element. For example, given the union defined below:
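The sketch below shows a union that is consistent with the filters discussed in this section. The names MyUnion, Order, Payment, order_type, payment_type, and customer_id come from the text; the syntax version, the oneof name, the field numbers, the order_id field, and the string type of customer_id (assumed because the filter compares against the quoted value '42') are assumptions.

```protobuf
syntax = "proto3";

message Order {
  string order_id = 1;       // hypothetical field
}

message Payment {
  string customer_id = 1;    // string type is an assumption
}

message MyUnion {
  // The oneof name does not appear in filters; the member field
  // names (order_type, payment_type) do.
  oneof contents {
    Order order_type = 1;
    Payment payment_type = 2;
  }
}
```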
Providing a filter of /order_type IS NOT NULL will return all of the MyUnion messages that contain an Order, while providing a filter of /payment_type/customer_id = '42' will return only the MyUnion messages that contain a Payment message with a customer_id of 42.
Limitations of the Protobuf Message Type
Because the protobuf message type requires a specific, fixed definition for messages, AMPS does not support operations that construct messages that may contain arbitrary values. In particular, protobuf does not support:
- Creating a View with a protobuf type as the MessageType. AMPS allows you to aggregate protobuf messages and project the results as another type, but the destination MessageType for a View cannot be a protobuf message type.
- Creating an aggregated subscription for a topic that contains messages of a protobuf message type.
- Subscriptions to AMPS internal topics. Protobuf message types do not support creating messages for AMPS internal topics, such as /AMPS/ClientStatus.
- Enriching or preprocessing protobuf message types. AMPS does not support enrichment or preprocessing of protobuf messages.
Protocol buffer version 3 messages provide fixed default values for omitted fields. This means that there is no reliable way for AMPS to determine if a missing field has been intentionally left out of the message, or simply contains the fixed default value. The result is an additional limitation for protocol buffer version 3 message types:
- Protocol buffer version 3 message types do not support delta publish or delta subscribe.
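To see why, consider a hypothetical proto3 definition (the message and field names are illustrative and do not come from AMPS). For a plain scalar field such as quantity, proto3 does not write the field to the wire when it holds its default value of 0, so a serialized message with quantity set to 0 is indistinguishable from one in which quantity was never set.

```protobuf
syntax = "proto3";

// Hypothetical message used only to illustrate proto3 defaults.
message InventoryRecord {
  string sku = 1;       // default is the empty string
  int32 quantity = 2;   // default is 0; a value of 0 is not serialized
}
```

A delta update relies on telling "unchanged" apart from "changed to this value", which this encoding cannot express for default values.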
Protocol buffer version 2 message types can require that specific fields are provided in a message (that is, fields can be marked required). The result is an additional limitation for protocol buffer version 2 message types:
- Protocol buffer version 2 message types do not support providing a subset of fields in a message by specifying a select list.
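As an illustration of the constraint, the sketch below marks a field as required. The message and field names are hypothetical. Under this definition, a message that carried only quantity would be missing order_id and would not be a valid Order, which is presumably why a select list that returns a subset of fields cannot be honored.

```protobuf
syntax = "proto2";

// Hypothetical message used only to illustrate required fields.
message Order {
  required string order_id = 1;   // must be present in every serialized message
  optional int32 quantity = 2;
}
```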
There are no other limitations in working with protocol buffer message types.
Working with Optional Default Values
Google protocol buffers provide the ability for a message to have fields that are both optional, so they need not be provided in the serialized message, and defaulted, so that there is a specific value interpreted when there is no value provided.
When no value is provided in the serialized message for an optional default value, AMPS interprets the message differently depending on the context:
- For most uses, AMPS interprets the message as though the value is present and set to the default value. This means that you can filter on optional default values, use them as SOW keys, and aggregate optional default values regardless of whether a value is present in the serialized message.
- For delta messaging with protocol buffer version 2, AMPS treats an optional default value as though there is no value present. AMPS does not provide the default value. This means that a delta update must provide the default value explicitly in the serialized message to set the field to the default value. This also means that, if the value present in the message is not the default value, but was not changed on the current update, AMPS will not emit that value in messages to delta subscribers. (Since delta messaging is not supported with protocol buffer version 3, this issue does not arise with that version.)
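For reference, a proto2 field that is both optional and defaulted is declared as in the sketch below. The message and field names are hypothetical.

```protobuf
syntax = "proto2";

// Hypothetical message used only to illustrate an optional default value.
message Order {
  optional string order_id = 1;
  // Optional and defaulted: when the field is absent from the
  // serialized message, readers see the value 1.
  optional int32 quantity = 2 [default = 1];
}
```

Following the rules above, a filter such as /quantity = 1 would match a message that omits quantity, while a delta publish that needs to set quantity back to 1 would have to include the value explicitly.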