Cpp OTF User Guide

Todd L. Montgomery edited this page Jan 6, 2016 · 10 revisions

Some applications, such as network sniffers, need to process messages dynamically and thus have to use the Intermediate Representation to decode the messages on-the-fly (OTF). An example of using the OTF API can be found here.

The C++ OTF decoder follows the design principles of the generated codec stubs. It is thread safe, so a single decoder can be reused concurrently across multiple threads, which saves memory.

Note: Because OTF decoding is driven dynamically by the IR, the stubs generated by the SBE compiler will generally perform better than the OTF decoder.

Getting Started

Before messages can be decoded it is necessary to retrieve the IR for the schema describing the messages and types. This can be done by reading the encoded IR from a file or from a buffer.

    IrDecoder irDecoder;

    if (irDecoder.decode("generated/example-schema.sbeir") != 0)
    {
        std::cerr << "could not load SBE IR\n";
        return -1;
    }

Once the IR is decoded you can then create the OTF decoder for the message header:

    std::shared_ptr<std::vector<Token>> headerTokens = irDecoder.header();

    OtfHeaderDecoder headerDecoder(headerTokens);

You are now ready to decode messages as they arrive. This can be done by first reading the message header then looking up the appropriate template to decode the message body.

    std::uint64_t templateId = headerDecoder.getTemplateId(buffer);
    std::uint64_t actingVersion = headerDecoder.getSchemaVersion(buffer);
    std::uint64_t blockLength = headerDecoder.getBlockLength(buffer);
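
To make the header step concrete, here is a minimal sketch of what reading those fields amounts to. It assumes the commonly used 8-byte message header of four little-endian uint16 fields in the order blockLength, templateId, schemaId, version. The real layout is defined by the schema, which is why OtfHeaderDecoder is constructed from the header tokens rather than hard-coding offsets. HeaderSketch, readUint16LE and decodeHeaderSketch are hypothetical names for illustration and are not part of the SBE API.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical holder for the header fields; not an SBE type.
struct HeaderSketch
{
    std::uint16_t blockLength;
    std::uint16_t templateId;
    std::uint16_t schemaId;
    std::uint16_t version;
};

// Read an unsigned 16-bit little-endian value byte by byte, so the result
// does not depend on the byte order of the host.
inline std::uint16_t readUint16LE(const char *buffer, std::size_t offset)
{
    const auto lo = static_cast<std::uint16_t>(static_cast<unsigned char>(buffer[offset]));
    const auto hi = static_cast<std::uint16_t>(static_cast<unsigned char>(buffer[offset + 1]));
    return static_cast<std::uint16_t>(lo | (hi << 8));
}

// Assumed layout: blockLength, templateId, schemaId, version (2 bytes each).
inline HeaderSketch decodeHeaderSketch(const char *buffer)
{
    return HeaderSketch{
        readUint16LE(buffer, 0),
        readUint16LE(buffer, 2),
        readUint16LE(buffer, 4),
        readUint16LE(buffer, 6)};
}
```

With a schema that uses a different header composite, the offsets and widths above would not apply, so treat this only as a picture of the idea.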

Once you have decoded the header you can look up the IR for the appropriate message body and then begin decoding.

    // buffer holds the encoded message, header first.
    // sz is assumed to be the number of bytes that were read into buffer.
    char buffer[2048];
    ExampleTokenListener tokenListener;

    // The message body starts immediately after the header.
    const char *messageBuffer = buffer + headerDecoder.encodedLength();
    std::uint64_t length = sz - headerDecoder.encodedLength();
    std::uint64_t templateId = headerDecoder.getTemplateId(buffer);
    std::uint64_t actingVersion = headerDecoder.getSchemaVersion(buffer);
    std::uint64_t blockLength = headerDecoder.getBlockLength(buffer);

    std::shared_ptr<std::vector<Token>> messageTokens = irDecoder.message(templateId, actingVersion);

    const std::size_t result =
        OtfMessageDecoder::decode(messageBuffer, length, actingVersion, blockLength, messageTokens, tokenListener);

The eagle-eyed will have noticed the TokenListener. If you are wondering what this is, read on.

Decoding Messages

As messages are decoded a number of callback events will be generated as the structural elements of the message are encountered. The callbacks are received by the TokenListener parameter of the decode function. If you only want to receive some of the callbacks then inherit from BasicTokenListener.
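
The idea of overriding only the callbacks you care about can be sketched without the SBE headers. The classes below are hypothetical stand-ins: NoOpListenerSketch plays the role of a base class with empty default callbacks, and its callback names and signatures are simplified for illustration rather than copied from the SBE API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a listener base class whose callbacks all
// default to doing nothing. Names and signatures are illustrative only.
class NoOpListenerSketch
{
public:
    virtual ~NoOpListenerSketch() = default;

    virtual void onBeginMessage(const std::string &) {}
    virtual void onEndMessage(const std::string &) {}
    virtual void onField(const std::string &, std::int64_t) {}
};

// A listener interested only in field values overrides one callback and
// inherits the empty defaults for the rest.
class FieldCollectorSketch : public NoOpListenerSketch
{
public:
    void onField(const std::string &name, std::int64_t value) override
    {
        fields.push_back(name + "=" + std::to_string(value));
    }

    std::vector<std::string> fields;
};
```

A decoder that calls all three callbacks through the base class would leave FieldCollectorSketch holding only the field entries, since the begin and end callbacks fall through to the empty defaults.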

Decoding Primitive Fields

Primitive fields are the most common data element to be decoded. These are simple types such as integers, floating point numbers, or characters. Primitive field encodings can be a single value or a fixed length array of the same type. To receive primitive values override the following method:

    virtual void onEncoding(
        Token& fieldToken,
        const char *buffer,
        Token& typeToken,
        std::uint64_t actingVersion)
    {
        printScope();
        std::cout << fieldToken.name() << "=" << asString(typeToken, buffer) << "\n";
    }

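where asString is something like the following. Note that this example prints CHAR values only for arrays, and unsigned integer and floating point values only when the field holds a single value.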

    std::string asString(const Token& token, const char *buffer)
    {
        const Encoding& encoding = token.encoding();
        const PrimitiveType type = encoding.primitiveType();
        const std::uint64_t length =
            (token.isConstantEncoding()) ?
                encoding.constValue().size() :
                static_cast<std::uint64_t>(token.encodedLength());
        std::ostringstream result;

        std::uint64_t num = length / lengthOfType(type);

        switch (type)
        {
            case PrimitiveType::CHAR:
            {
                if (num > 1)
                {
                    if (token.isConstantEncoding())
                    {
                        buffer = encoding.constValue().getArray();
                    }

                    result << std::string(buffer, length);
                }
                break;
            }
            case PrimitiveType::INT8:
            case PrimitiveType::INT16:
            case PrimitiveType::INT32:
            case PrimitiveType::INT64:
            {
                if (num > 1)
                {
                    const char *separator = "";

                    for (size_t i = 0; i < num; i++)
                    {
                        result << separator << Encoding::getInt(type, encoding.byteOrder(), buffer + (i * lengthOfType(type)));
                        separator = ", ";
                    }
                }
                else
                {
                    if (token.isConstantEncoding())
                    {
                        result << encoding.constValue().getAsInt();
                    }
                    else
                    {
                        result << encoding.getAsInt(buffer);
                    }
                }
                break;
            }
            case PrimitiveType::UINT8:
            case PrimitiveType::UINT16:
            case PrimitiveType::UINT32:
            case PrimitiveType::UINT64:
            {
                if (num == 1)
                {
                    if (token.isConstantEncoding())
                    {
                        result << encoding.constValue().getAsUInt();
                    }
                    else
                    {
                        result << encoding.getAsUInt(buffer);
                    }
                }
                break;
            }
            case PrimitiveType::FLOAT:
            case PrimitiveType::DOUBLE:
            {
                if (num == 1)
                {
                    result.setf(std::ios::fixed);
                    result << std::setprecision(1) << encoding.getAsDouble(buffer);
                }
                break;
            }
            default:
            {
                break;
            }
        }

        return result.str();
    }

The above code will output the values as strings to the console.

Note: Constant and optional fields are handled by using the metadata provided in the typeToken.

Decoding Enums

Enums are encoded on the wire as simple integers or characters. To interpret the wire-encoded value, it is necessary to look it up among the enum's valid values in the metadata tokens.

    virtual void onEnum(
        Token& fieldToken,
        const char *buffer,
        std::vector<Token>& tokens,
        std::size_t fromIndex,
        std::size_t toIndex,
        std::uint64_t actingVersion)
    {
        const Token& typeToken = tokens.at(fromIndex + 1);
        const Encoding& encoding = typeToken.encoding();

        printScope();
        std::cout << fieldToken.name() << "=";

        for (size_t i = fromIndex + 1; i < toIndex; i++)
        {
            const Token &token = tokens.at(i);
            const PrimitiveValue constValue = token.encoding().constValue();

            if (typeToken.isConstantEncoding())
            {
                std::cout << token.name();
                break;
            }

            if (encoding.primitiveType() == PrimitiveType::CHAR)
            {
                if (encoding.getAsInt(buffer) == constValue.getAsInt())
                {
                    std::cout << token.name();
                    break;
                }
            }
            else if (encoding.primitiveType() == PrimitiveType::UINT8)
            {
                if (encoding.getAsUInt(buffer) == constValue.getAsUInt())
                {
                    std::cout << token.name();
                    break;
                }
            }
        }

        std::cout << "\n";
    }
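
The lookup itself is a search for the valid value whose constant equals the value read from the wire. A minimal sketch, using a hypothetical ValidValuesSketch table in place of the metadata tokens:

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the valid-value tokens of one enum: each entry
// pairs a name from the schema with its encoded constant.
using ValidValuesSketch = std::vector<std::pair<std::string, std::int64_t>>;

// Return the name whose constant equals the wire value, or an empty string
// when the wire value matches none of the valid values.
inline std::string enumNameSketch(const ValidValuesSketch &validValues, std::int64_t wireValue)
{
    for (const auto &entry : validValues)
    {
        if (entry.second == wireValue)
        {
            return entry.first;
        }
    }

    return std::string();
}
```

Returning an empty string for an unmatched value is a choice made for this sketch; a real listener might instead report the raw value so that unknown or null encodings remain visible.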

Decoding BitSets

BitSets are represented on the wire as an integer with a bit set in the position indicating true or false for the choice value.

    virtual void onBitSet(
        Token& fieldToken,
        const char *buffer,
        std::vector<Token>& tokens,
        std::size_t fromIndex,
        std::size_t toIndex,
        std::uint64_t actingVersion)
    {
        const Token& typeToken = tokens.at(fromIndex + 1);
        const Encoding& encoding = typeToken.encoding();
        const std::uint64_t value = encoding.getAsUInt(buffer);

        printScope();
        std::cout << fieldToken.name() << ":";

        for (size_t i = fromIndex + 1; i < toIndex; i++)
        {
            const Token &token = tokens.at(i);
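            // Each choice's constant value is its bit position within the encoded integer.
            const std::uint64_t bitPosition = token.encoding().constValue().getAsUInt();

            std::cout << " " << token.name() << "=";
            if (value & (static_cast<std::uint64_t>(1) << bitPosition))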
            {
                std::cout << "true";
            }
            else
            {
                std::cout << "false";
            }
        }

        std::cout << "\n";
    }

A little bitwise manipulation is required to determine whether each choice is true or false, as in the example above.
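
The bit test can be isolated in a small helper. This sketch assumes that each choice's constant is a zero-based bit position within the encoded integer; isChoiceSetSketch is a hypothetical name, not an SBE function.

```cpp
#include <cstdint>

// Return true when the bit at bitPosition is set in the encoded value.
// Assumes each choice's constant is a zero-based bit position.
inline bool isChoiceSetSketch(std::uint64_t encodedValue, std::uint64_t bitPosition)
{
    // Shifting a 64-bit value by 64 or more is undefined, so treat an
    // out-of-range position as not set.
    if (bitPosition >= 64)
    {
        return false;
    }

    return ((encodedValue >> bitPosition) & 1u) != 0;
}
```

For an encoded value of 0x05, positions 0 and 2 are set and position 1 is not.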

Decoding Composites

A composite is a reusable collection of fields to simplify the assembly of messages. The collection of fields usually has a semantic significance. Fields within a composite are decoded just like normal fields. Composites are signalled via callbacks to indicate the beginning and end of the composite. In the example TokenListener, the begin and end callbacks push and pop a scope stack so that fields inside the composite are printed with the composite's name as a prefix.

    virtual void onBeginComposite(
        Token& fieldToken,
        std::vector<Token>& tokens,
        std::size_t fromIndex,
        std::size_t toIndex)
    {
        scope.push_back(fieldToken.name() + ".");
    }

    virtual void onEndComposite(
        Token& fieldToken,
        std::vector<Token>& tokens,
        std::size_t fromIndex,
        std::size_t toIndex)
    {
        scope.pop_back();
    }
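
The listener snippets call printScope() and use a scope member without showing them. One way they could be implemented is sketched below; ScopeSketch and its methods are hypothetical and only illustrate the push and pop idea.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical scope tracker: a stack of name prefixes that is pushed when a
// composite (or group) begins and popped when it ends.
class ScopeSketch
{
public:
    void push(const std::string &name) { scope.push_back(name + "."); }

    void pop()
    {
        if (!scope.empty())
        {
            scope.pop_back();
        }
    }

    // Concatenate the prefixes from outermost to innermost.
    std::string prefix() const
    {
        std::string result;
        for (const std::string &part : scope)
        {
            result += part;
        }
        return result;
    }

    // Write the current prefix so that a field name can follow it.
    void print(std::ostream &out) const { out << prefix(); }

private:
    std::vector<std::string> scope;
};
```

After pushing "Car" and then "engine", the prefix is "Car.engine.", so a field named capacity would be printed as Car.engine.capacity.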

Decoding Repeating Groups

Fields can be semantically bound into a repeating group. On the wire the repeating group has a header that defines the size in bytes of the block of fields and a count of how many times the block will repeat. Repeating groups are signalled by callbacks to indicate the beginning and end of each block of fields, with counters giving the index of the current iteration and the total number of repetitions.

    virtual void onGroupHeader(
        Token& token,
        std::uint64_t numInGroup)
    {
        printScope();
        std::cout << token.name() << " Group Header: numInGroup=" << numInGroup << "\n";
    }

    virtual void onBeginGroup(
        Token& token,
        std::uint64_t groupIndex,
        std::uint64_t numInGroup)
    {
        scope.push_back(token.name() + ".");
    }

    virtual void onEndGroup(
        Token& token,
        std::uint64_t groupIndex,
        std::uint64_t numInGroup)
    {
        scope.pop_back();
    }

Note: Repeating groups can nest, so a listener must be prepared to handle this scope recursively.
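
How the group header drives the iteration can be sketched for the simplest case: a flat group with no nested groups and no variable length data. The sketch assumes a 4-byte group header of a little-endian uint16 blockLength followed by a little-endian uint16 numInGroup; the real dimension type is defined by the schema. All names below are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of where one group entry starts in the buffer.
struct GroupEntrySketch
{
    std::uint64_t groupIndex;
    std::uint64_t numInGroup;
    std::size_t offset;
};

// Read an unsigned 16-bit little-endian value independent of host byte order.
inline std::uint16_t groupUint16LE(const char *buffer, std::size_t offset)
{
    const auto lo = static_cast<std::uint16_t>(static_cast<unsigned char>(buffer[offset]));
    const auto hi = static_cast<std::uint16_t>(static_cast<unsigned char>(buffer[offset + 1]));
    return static_cast<std::uint16_t>(lo | (hi << 8));
}

// Walk a flat group. Assumed header: uint16 blockLength, uint16 numInGroup.
// Each returned entry corresponds to one begin/end pair of group callbacks.
inline std::vector<GroupEntrySketch> walkGroupSketch(const char *buffer, std::size_t length)
{
    std::vector<GroupEntrySketch> entries;
    const std::size_t headerLength = 4;

    if (length < headerLength)
    {
        return entries;
    }

    const std::size_t blockLength = groupUint16LE(buffer, 0);
    const std::uint64_t numInGroup = groupUint16LE(buffer, 2);
    std::size_t offset = headerLength;

    for (std::uint64_t i = 0; i < numInGroup; i++)
    {
        // Stop early if the buffer is too short to hold the next block.
        if (blockLength > length - offset)
        {
            break;
        }

        entries.push_back(GroupEntrySketch{i, numInGroup, offset});
        offset += blockLength;
    }

    return entries;
}
```

Nested groups would repeat the same walk inside each entry after its block of fields, which is why a listener that tracks scope needs a stack rather than a single current name.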

Decoding Variable Length Data

At the end of a message it is possible to encode variable length strings or binary blobs. Strings are binary data that use a schema-defined character encoding.

    virtual void onVarData(
        Token& fieldToken,
        const char *buffer,
        std::uint64_t length,
        Token& typeToken)
    {
        printScope();
        std::cout << fieldToken.name() << "=" << std::string(buffer, length) << "\n";
    }
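
Because variable length data may be a binary blob rather than text, writing it to the console as a string can produce unreadable output. A listener could pass the bytes through a helper such as the hypothetical printableSketch below, which keeps printable ASCII and escapes everything else as hexadecimal. This is an illustrative addition and not part of the SBE example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Render a byte range for display: printable ASCII is kept as is, and every
// other byte (and the backslash itself) is written as \xNN.
inline std::string printableSketch(const char *buffer, std::size_t length)
{
    static const char hexDigits[] = "0123456789abcdef";
    std::string result;

    for (std::size_t i = 0; i < length; i++)
    {
        const unsigned char byte = static_cast<unsigned char>(buffer[i]);

        if (std::isprint(byte) && byte != '\\')
        {
            result += static_cast<char>(byte);
        }
        else
        {
            result += "\\x";
            result += hexDigits[byte >> 4];
            result += hexDigits[byte & 0x0F];
        }
    }

    return result;
}
```

Inside onVarData, the call would take the place of std::string(buffer, length) in the output statement.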