matiu2 edited this page Nov 18, 2011 · 3 revisions

Yet Another JSON Parser

This is a fun project experimenting with state machines.

We use Ragel to generate a state machine that parses JSON in C++.

A similar thing was done for Ruby's JSON parser: https://github.com/flori/json/blob/master/ext/json/ext/parser/parser.rl

This new JSON parser has (or hopes to have) the following features/advantages:

  • Is fun for learning state machines
  • Maps well to existing C++ objects
  • Super duper fast
  • Well tested
  • Well documented

State machines

Briefly, to understand the diagrams below:

  • Each circle is a state
  • Every time a character of JSON is read, depending on the character, we'll make a transition to a new state
  • Some transitions and states have actions associated with them
  • If the machine is in a final state (double circle) when the input ends, the JSON was parsed correctly

In the diagrams below, each transition (arrow) shows the character range that triggers it. If a character is read that is not in any of the expected ranges, an error action is executed and the parser tries to recover by switching its state to 'findNextElement'.

The word after the slash is the name of the action we execute when a character triggers that transition.

Every time a character is read, a transition occurs and the machine moves to its next state (which may be the same state it was already in).
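To make the ideas above concrete, here is a minimal hand-written sketch of a state machine with states, transitions, actions, a final state and error recovery. It is not the Ragel-generated code: the names parseNumberList, ParseResult and State are made up for illustration, the input format (a comma-separated list of non-negative integers) is chosen only because it is small, and the behaviour of FindNextElement (skip ahead to the next comma) is an assumption about what 'findNextElement' might do.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// All names here are hypothetical; this is not the Ragel-generated code.
enum class State { Start, InNumber, FindNextElement };

struct ParseResult {
    std::vector<unsigned> numbers;   // values produced by the "emit" action
    std::size_t errors = 0;          // number of times the error action ran
    bool endedInFinalState = false;  // true if input ended in InNumber
};

// Reads a comma-separated list of non-negative integers such as "12,7,345".
// Overflow of very long digit runs is not handled in this sketch.
ParseResult parseNumberList(const std::string& input) {
    ParseResult result;
    State state = State::Start;
    unsigned current = 0;

    for (char c : input) {
        const bool isDigit = (c >= '0' && c <= '9');
        switch (state) {
        case State::Start:
            if (isDigit) {                       // '0'..'9' / startNumber
                current = static_cast<unsigned>(c - '0');
                state = State::InNumber;
            } else {                             // anything else / error
                ++result.errors;
                // A comma already marks the start of the next element.
                state = (c == ',') ? State::Start : State::FindNextElement;
            }
            break;
        case State::InNumber:
            if (isDigit) {                       // '0'..'9' / addDigit
                current = current * 10 + static_cast<unsigned>(c - '0');
            } else if (c == ',') {               // ',' / emitNumber
                result.numbers.push_back(current);
                state = State::Start;
            } else {                             // anything else / error
                ++result.errors;                 // the partial number is dropped
                state = State::FindNextElement;
            }
            break;
        case State::FindNextElement:
            if (c == ',') {                      // ',' ends the broken element
                state = State::Start;
            }
            break;
        }
    }

    // InNumber is the only final state: input must end right after a digit.
    if (state == State::InNumber) {
        result.numbers.push_back(current);
        result.endedInFinalState = true;
    }
    return result;
}
```

For the input "12,x7,345" this sketch emits 12 and 345, runs the error action once (on the 'x') and ends in the final state. Ragel generates this kind of switch (or an equivalent table) from a much shorter description, which is the main reason to use it.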

Here's the state machine for parsing a number:

[Diagram: Number State Machine]
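For readers without the diagram to hand, the following is a rough hand-written equivalent. It is a sketch only: it follows the standard JSON number grammar (optional minus, integer part without leading zeros, optional fraction, optional exponent) rather than the exact states of the project's diagram, and the names NumState, step, isFinal and isJsonNumber are hypothetical. It only recognises a number; it does not run any actions to build a value.

```cpp
#include <string>

// Hypothetical state names; the project's diagram may label them differently.
enum class NumState {
    Start, Minus, Zero, Int, Dot, Frac, ExpMark, ExpSign, Exp, Error
};

// Final states are the "double circles": input may legally end here.
bool isFinal(NumState s) {
    return s == NumState::Zero || s == NumState::Int ||
           s == NumState::Frac || s == NumState::Exp;
}

// One transition: given the current state and a character, return the next state.
NumState step(NumState s, char c) {
    const bool digit = (c >= '0' && c <= '9');
    const bool exponent = (c == 'e' || c == 'E');
    switch (s) {
    case NumState::Start:
        if (c == '-') return NumState::Minus;
        if (c == '0') return NumState::Zero;
        if (digit) return NumState::Int;
        break;
    case NumState::Minus:
        if (c == '0') return NumState::Zero;
        if (digit) return NumState::Int;
        break;
    case NumState::Zero:                 // a leading zero cannot be followed by digits
        if (c == '.') return NumState::Dot;
        if (exponent) return NumState::ExpMark;
        break;
    case NumState::Int:
        if (digit) return NumState::Int;
        if (c == '.') return NumState::Dot;
        if (exponent) return NumState::ExpMark;
        break;
    case NumState::Dot:                  // at least one digit must follow the '.'
        if (digit) return NumState::Frac;
        break;
    case NumState::Frac:
        if (digit) return NumState::Frac;
        if (exponent) return NumState::ExpMark;
        break;
    case NumState::ExpMark:
        if (c == '+' || c == '-') return NumState::ExpSign;
        if (digit) return NumState::Exp;
        break;
    case NumState::ExpSign:
        if (digit) return NumState::Exp;
        break;
    case NumState::Exp:
        if (digit) return NumState::Exp;
        break;
    case NumState::Error:
        break;
    }
    return NumState::Error;              // character not in any expected range
}

// True if the whole text is a single valid JSON number.
bool isJsonNumber(const std::string& text) {
    NumState s = NumState::Start;
    for (char c : text) {
        s = step(s, c);
        if (s == NumState::Error) return false;
    }
    return isFinal(s);
}
```

With this sketch, "-12.5e+3" and "0" are accepted, while "01", "1." and "1e" are rejected because the machine either hits a character outside the expected ranges or stops in a non-final state.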

Here's the state machine (work in progress) for parsing a JSON string:

[Diagram: String parsing state machine]

C++ object mapping

There are two layers in the parsing library:

  1. The lower level parses simple strings and numbers
  2. The higher level tells the parser what we want

The higher level lets the user write a template member function that maps an object's attributes to JSON names. For example:

#include <string>

struct Person {
   std::string name;
   unsigned short age;

   // Tells the mapper which JSON name corresponds to which member.
   template <typename T>
   void jsonMapClass(T& mapper) {
      mapper.map("name", name);
      mapper.map("age", age);
   }
};

Then, if you want to read a list of people from JSON, you pass the parser a std::vector<Person> and it fills it in for you, calling jsonMapClass on each element.
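The sketch below shows one way the mapping idea could work end to end. Everything except Person, jsonMapClass and the mapper.map(name, member) call shape is an assumption: FakeMapper, Fields and readAll are hypothetical names, and instead of real JSON text the mapper reads from key/value pairs that are assumed to have been extracted already by the lower level. Person is repeated so that the example compiles on its own.

```cpp
#include <map>
#include <string>
#include <vector>

struct Person {
    std::string name;
    unsigned short age = 0;

    template <typename T>
    void jsonMapClass(T& mapper) {
        mapper.map("name", name);
        mapper.map("age", age);
    }
};

// Hypothetical: one JSON object, already split into name -> raw value text.
using Fields = std::map<std::string, std::string>;

// Hypothetical stand-in for the parser's mapper. The real mapper would pull
// values from the state machine; this one looks them up in a Fields map.
class FakeMapper {
public:
    explicit FakeMapper(const Fields& fields) : fields_(fields) {}

    // Overload for string members: copy the raw text.
    void map(const std::string& key, std::string& target) const {
        auto it = fields_.find(key);
        if (it != fields_.end()) target = it->second;
    }

    // Overload for unsigned short members: convert the raw text to a number.
    // std::stoul throws if the text is not numeric; not handled in this sketch.
    void map(const std::string& key, unsigned short& target) const {
        auto it = fields_.find(key);
        if (it != fields_.end()) {
            target = static_cast<unsigned short>(std::stoul(it->second));
        }
    }

private:
    Fields fields_;  // stored by value so the mapper owns its data
};

// Fills `out` with one T per object, letting each T describe its own members.
template <typename T>
void readAll(const std::vector<Fields>& objects, std::vector<T>& out) {
    for (const Fields& fields : objects) {
        T item;
        FakeMapper mapper(fields);
        item.jsonMapClass(mapper);
        out.push_back(item);
    }
}
```

Because jsonMapClass is a template, Person never needs to know the mapper's concrete type. The same function could serve a reading mapper like this one or, in principle, a writing mapper that emits JSON, as long as both offer a map(name, member) call.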