Tag Archives: JSON

JSON# – Tutorial #5: Deserialising Complex Objects

Fork on Github
Download the Nuget package

The previous tutorial focused on deserialising simple JSON objects. This tutorial describes the process of deserialising a more complex object using JSON#.

Let’s use the ComplexObject class that we’ve leveraged in earlier tutorials:

class ComplexObject : IHaveSerialisableProperties {
    public string Name { get; set; }
    public string Description { get; set; }
    public List<ComplexArrayObject> ComplexArrayObjects { get; set; }
    public List<double> Doubles { get; set; }

    public SerialisableProperties GetSerializableProperties() {
        return new SerialisableProperties("complexObject", new List<JsonProperty> {
            new StringJsonProperty {
                Key = "name",
                Value = Name
            },
            new StringJsonProperty {
                Key = "description",
                Value = Description
            }
            }, new List<JsonSerialisor> {
                    new ComplexJsonArraySerialisor("complexArrayObjects",
                        ComplexArrayObjects.Select(c => c.GetSerializableProperties())),
                    new JsonArraySerialisor("doubles",
                        Doubles.Select(d => d.ToString(CultureInfo.InvariantCulture)), JsonPropertyType.Numeric)
            });
        }
    }

Let’s instantiate this with some values, and serialise to JSON. I won’t bloat this post with details on how to serialise, covered in previous posts. Here is our serialised ComplexObject instance:

{"complexObject":{"name":"Complex Object","description":"A complex object","complexArrayObjects":[{"name":"Array Object #1","description":"The 1st array object"},{"name":"Array Object #2","description":"The 2nd array object"}],"doubles":[1,2.5,10.8]}}

Notice that we have 2 collections. A simple collection of Doubles, and a more complex collection of ComplexArrayObjects. Let’s start with those.

First, create a new class, ComplexObjectDeserialiser, and implement the required constructor and Deserialise method.

Remember this method from the previous tutorial?

    var properties = jsonNameValueCollection.Parse(mergeArrayValues);

This effectively parses the JSON and loads each element into a NameValueCollection. This is fine for simple properties, however collection-based properties would cause the deserialiser to load each collection-element as a separate item in the returned NameValueCollection, which may be somewhat cumbersome to manage:

Flattened JSON collection

Flattened JSON collection

This is where the Boolean parameter mergeArrayValues comes in. It will concatenate all collection-based values in a comma-delimited string, and load this value into the returned NameValueCollection. This is much more intuitive, and allows consuming code to simply split the comma-delimited values and iterate as required.

Compressed JSON Collection

Compressed JSON Collection

Here is the complete code-listing:

class ComplexObjectDeserialiser : Deserialiser<ComplexObject> {
        public ComplexObjectDeserialiser(JsonNameValueCollection jsonNameValueCollection) : base(jsonNameValueCollection) {}

        public override ComplexObject Deserialise(bool mergeArrayValues = false) {
            var properties = jsonNameValueCollection.Parse(mergeArrayValues);

            var complexObject = new ComplexObject {
                Name = properties["complexObject.name"],
                Description = properties["complexObject.description"],
                ComplexArrayObjects = new List<ComplexArrayObject>(),
                Doubles = new List<double>()
            };

            var complexArrayObjectNames = properties["complexObject.complexArrayObjects.name"].Split(',');
            var complexArrayObjectDescriptions = properties["complexObject.complexArrayObjects.description"].Split(',');

            for (var i = 0; i &lt; complexArrayObjectNames.Length; i++) {
                var complexArrayObjectName = complexArrayObjectNames[i];

                complexObject.ComplexArrayObjects.Add(new ComplexArrayObject {
                    Name = complexArrayObjectName,
                    Description = complexArrayObjectDescriptions[i]
                });
            }

            var complexArrayObjectDoubles = properties["complexObject.doubles"].Split(',');

            foreach (var @double in complexArrayObjectDoubles)
                complexObject.Doubles.Add(Convert.ToDouble(@double));

            return complexObject;
        }
    }

As before, we deserialise as follows:

    Json.Deserialise(new ComplexObjectDeserialiser(new StandardJsonNameValueCollection("<JSON string...>")), true);

JSON# does most of the work, loading each serialised element into a NameValueCollection. We simply read from that collection, picking and choosing each element to map to an associated POCO property.
For collection-based properties, we simply retrieve the value associated with the collection’s key, split that value into an array, and loop through the array, loading a new object for each item in the collection, building our ComplexObject step-by-step.

This is the final JSON# post covering tutorial-based material. I’m working on a more thorough suite of performance benchmarks, and will publish the results, as well as offering a more in-depth technical analysis of JSON# in future posts. Please contact me if you would like to cover a specific topic.

The next posts will feature a tutorial series outlining Object Oriented, Test Driven Development in C# and Java. Please follow this blog to receive updates.

Connect with me:

RSSGitHubTwitter
LinkedInYouTubeGoogle+

JSON# – Tutorial #4: Deserialising Simple Objects

Fork on Github
Download the Nuget package

The previous tutorial focused on serialising complex JSON objects. This tutorial describes the process of deserialisation using JSON#.

The purpose of JSON# is to allow memory-efficient JSON processing. At the root of this is the ability to dissect large JSON files and extract smaller structures from within, as discussed in Tutorial #1.

One thing, among many, that JSON.NET achieves, is allow memory-efficient JSON-parsing using the JsonTextReader component. The last thing that I want to do is reinvent the wheel; to that end, JSON# also provides a simple means of deserialisation, which wraps JsonTextReader. Using JsonTextReader alone, requires quite a lot of custom code, so I’ve provided a wrapper to make things simpler and allow reusability.

At the root of the deserialization process lies the StandardJsonNameValueCollection class. This is an implementation of the JsonNameValueCollection, from which custom implementations can be derived, if necessary, in a classic bridge design, leveraged from the Deserialiser component. Very simply, it reads JSON node values and stores them as key-value pairs; they key is the node’s path from root, and the value is the node’s value. This allows us to cache the JSON object and store it in a manner that provides easy access to its values, without traversing the tree:

class StandardJsonNameValueCollection : JsonNameValueCollection {
    public StandardJsonNameValueCollection(string json) : base(json) {}

    public override NameValueCollection Parse() {
    var parsed = new NameValueCollection();

    using (var reader = new JsonTextReader(new StringReader(json))) {
        while (reader.Read()) {
            if (reader.Value != null &amp;&amp; !reader.TokenType.Equals(JsonToken.PropertyName))
                parsed.Add(reader.Path, reader.Value.ToString());
            }
            return parsed;
        }
    }
}

Let’s work through an example using our SimpleObject class from previous tutorials:

class SimpleObject : IHaveSerialisableProperties {
    public string Name { get; set; }
    public int Count { get; set; }

    public virtual SerialisableProperties GetSerializableProperties() {
        return new SerialisableProperties("simpleObject", new List {
            new StringJsonProperty {
                Key = "name",
                Value = Name
            },
            new NumericJsonProperty {
                Key = "count",
                Value = Count
            }
        });
    }
}

Consider this class represented as a JSON object:

{
    "simpleObject": {
        "name": "SimpleObject",
        "count": 1
    }
}

The JSON# JsonNameValueCollection implementation will read each node in this JSON object and return the values in a NameValueCollection. Once stored, we need only provide a mechanism to instantiate a new SimpleObject POCO with the key-value pairs. JSON# provides the Deserialiser class as an abstraction to provide this functionality. Let’s create a class that accepts the JsonNameValueCollection and uses it to populate an associated POCO:

class SimpleObjectDeserialiser : Deserialiser {
    public SimpleObjectDeserialiser(JsonNameValueCollection parser) : base(parser) {}

    public override SimpleObject Deserialise() {
        var properties = jsonNameValueCollection.Parse();
        return new SimpleObject {
            Name = properties.Get("simpleObject.name"),
            Count = Convert.ToInt32(properties.Get("simpleObject.count"))
        };
    }
}

This class contains a single method designed to map properties. As you can see from the code snippet above, we read each key as a representation of corresponding node’s path, and then bind the associated value to a POCO property.
JSON# leverages the SimpleObjectDeserialiser as a bridge to deserialise the JSON string:

const string json = "{\"simpleObject\":{\"name\":\"SimpleObject\",\"count\":1}}";
var simpleObjectDeserialiser = new SimpleObjectDeserialiser(new StandardJsonNameValueCollection(json));
var simpleObject = Json.Deserialise(_simpleObjectDeserialiser);

So, why do this when I can just bind my objects dynamically using JSON.NET:

JsonConvert.DeserializeObject(json);

There is nothing wrong with deserialising in the above manner. But let’s look at what happens. First, it will use Reflection to look up CLR metadata pertaining to your object, in order to construct an instance. Again, this is ok, but bear in mind the performance overhead involved, and consider the consequences in a web application under heavy load.

Second, it requires that you decorate your object with serialisation attributes, which lack flexibility in my opinion.

Thirdly, if your object is quite large, specifically over 85K in size, you may run into memory issues if your object is bound to the Large Object Heap.

Deserialisation using JSON#, on the other hand, reads the JSON in stream-format, byte-by-byte, guaranteeing a small memory footprint, nor does it require Reflection. Instead, your custom implementation of Deserialiser allows a greater degree of flexibility when populating your POCO, than simply decorating properties with attributes.

I’ll cover deserialising more complex objects in the next tutorial, specifically, objects that contain complex properties such as arrays.

Connect with me:

RSSGitHubTwitter
LinkedInYouTubeGoogle+

JSON# – Tutorial #3: Serialising Complex Objects

Fork on Github
Download the Nuget package

The last tutorial focused on serialising simple JSON objects. This tutorial contains a more complex example.

Real-world objects are generally more complex than typical “Hello, World” examples. Let’s build such an object; and object that contains complex properties, such as other objects and collections. We’ll start by defining a sub-object:

class SimpleSubObject: IHaveSerialisableProperties {
    public string Name { get; set; }
    public string Description { get; set; }

    public SerialisableProperties GetSerializableProperties() {
        return new SerialisableProperties(&quot;simpleSubObject&quot;, new List&lt;JsonProperty&gt; {
            new StringJsonProperty {
                Key = &quot;name&quot;,
                Value = Name
            },
            new StringJsonProperty {
                Key = &quot;description&quot;,
                Value = Description
            }
        });
    }
}

This object contains 2 simple properties; Name and Description. As before, we implement the IHaveSerialisableProperties interface to allow JSON# to serialise the object. Now let’s define an object with a property that is a collection of SimpleSubObjects:

class ComplexObject: IHaveSerialisableProperties {
    public string Name { get; set; }
    public string Description { get; set; }

    public List&lt;SimpleSubObject&gt; SimpleSubObjects { get; set; }
    public List&lt;double&gt; Doubles { get; set; }

    public SerialisableProperties GetSerializableProperties() {
        return new SerialisableProperties(&quot;complexObject&quot;, new List&amp;lt;JsonProperty&amp;gt; {
            new StringJsonProperty {
                Key = &quot;name&quot;,
                Value = Name
            },
            new StringJsonProperty {
                Key = &quot;description&quot;,
                Value = Description
            }
        }, 
        new List&lt;JsonSerialisor&gt; {
            new ComplexJsonArraySerialisor(&quot;simpleSubObjects&quot;,
                SimpleSubObjects.Select(c =&amp;gt; c.GetSerializableProperties())),
            new JsonArraySerialisor(&quot;doubles&quot;,
                Doubles.Select(d =&amp;gt; d.ToString(CultureInfo.InvariantCulture)), JsonPropertyType.Numeric)
        });
    }
}

This object contains some simple properties, as well as 2 collections; the first, a collection of Double, the second, a collection of SimpleSubObject type.

Note the GetSerializableProperties method in ComplexObject. It accepts a collection parameter of type JsonSerialisor, whichrepresents the highest level of abstraction in terms of the core serialisation components in JSON#. In order to serialise our collection of SimpleSubObjects, we leverage an implementation of JsonSerialisor called ComplexJsonArraySerialisor, designed specifically to serialise collections of objects, as opposed to primitive types. Given that each SimpleSubObject in our collection contains an implementation of GetSerializableProperties, we simply pass the result of each method to the ComplexJsonArraySerialisor constructor. It will handle the rest.

We follow a similar process to serialise the collection of Double, in this case leveraging JsonArraySerialisor, another implementation of JsonSerialisor, specifically designed to manage collections of primitive types. We simply provide the collection of Double in their raw format to the serialisor.

Let’s instantiate a new instance of ComplexObject:

var complexObject = new ComplexObject {
    Name = &quot;Complex Object&quot;,
    Description = &quot;A complex object&quot;,

    SimpleSubObjects = new List&lt;SimpleSubObject&gt; {
        new SimpleSubObject {
            Name = &quot;Sub Object #1&quot;,
            Description = &quot;The 1st sub object&quot;
        },
            new SimpleSubObject {
            Name = &quot;Sub Object #2&quot;,
            Description = &quot;The 2nd sub object&quot;
        }
    },
    Doubles = new List&lt;double&gt; {
        1d, 2.5d, 10.8d
    }
};

As per the previous tutorial, we serialise as follows:

var writer = new BinaryWriter(new MemoryStream(), new UTF8Encoding(false));
var serialisableProperties = complexObject.GetSerializableProperties();

using (var serialisor = new StandardJsonSerialisationStrategy(writer))
    Json.Serialise(serialisor, new JsonPropertiesSerialisor(serialisableProperties));

Note the use of StandardJsonSerialisationStrategy here. This is the only implementation of JsonSerialisationStrategy, one of the core serialisation components in JSON#. The abstraction exists to provide extensibility, so that different strategies might be applied at runtime, should specific serialisation rules vary across requirements.

In the next tutorial I’ll discuss deserialising objects using JSON#.

Connect with me:

RSSGitHubTwitter
LinkedInYouTubeGoogle+

JSON# – Tutorial #2: Serialising Simple Objects

Fork on Github
Download the Nuget package

The last tutorial focused on parsing embedded JSON objects. This time, we’ll focus on serialising simple objects in C#.

Object serialisation using JSON# is 25 times to several hundred times faster than serialisation using JSON.NET, on a quad-core CPU with 16GB RAM. The source code is written in a BDD-manner, and the associated BDD features contain performance tests that back up these figures.

Let’s start with a basic class in C#:

class SimpleObject {
    public string Name { get; set; }
    public int Count { get; set; }
}

Our first step is to provide serialisation metadata to JSON#. Traditionally, most frameworks use Reflection to achieve this. While this works very well, it requires the component to know specific assembly metadata that describes your object. This comes with a slight performance penalty.

Ideally, when leveraging Reflection, the optimal design is a solution that reads an object’s assembly metadata once, and caches the result for the duration of the application’s run-time. This is generally not achievable with stateless HTTP calls. Using Reflection, we will likely query the object’s assembly during each HTTP request when serialising or de-serialising an object, suffering the associated performance-overhead for each request.

JSON# allows us to avoid that overhead by exposing serialisation metadata in the class itself:

class SimpleObject : IHaveSerialisableProperties {
    public string Name { get; set; }
    public int Count { get; set; }

    public virtual SerialisableProperties GetSerializableProperties() {
        return new SerialisableProperties(&quot;simpleObject&quot;, 
        new List&lt;JsonProperty&gt; {
            new StringJsonProperty {
                Key = &quot;name&quot;,
                Value = Name
            },
            new NumericJsonProperty {
                Key = &quot;count&quot;,
                Value = Count
            }
        });
    }
}

First, we need to implement the IHaveSerialisableProperties interface, allowing JSON# to serialise our object. Notice the new method, GetSerializableProperties, that returns a SerialisableProperties object, which looks like this:

public class SerialisableProperties {
   public string ObjectName { get; set; }
       public IEnumerable&lt;JsonProperty&gt; Properties { get; private set; }
       public IEnumerable&lt;JsonSerialisor&gt; Serialisors { get; set; }

       public SerialisableProperties(IEnumerable&lt;JsonProperty&gt; properties) {
           Properties = properties;
       }

       public SerialisableProperties(IEnumerable&lt;JsonSerialisor&gt; serialisors) {
           Serialisors = serialisors;
       }

       public SerialisableProperties(string objectName,
           IEnumerable&lt;JsonProperty&gt; properties) : this(properties) {
           ObjectName = objectName;
       }

       public SerialisableProperties(string objectName,
           IEnumerable&lt;JsonSerialisor&gt; serialisors) : this(serialisors) {
           ObjectName = objectName;
       }

       public SerialisableProperties(IEnumerable&lt;JsonProperty&gt; properties,
            IEnumerable&lt;JsonSerialisor&gt; serialisors) : this(properties) {
            Serialisors = serialisors;
        }

        public SerialisableProperties(string objectName,
            IEnumerable&lt;JsonProperty&gt; properties, IEnumerable&lt;JsonSerialisor&gt; serialisors)
            : this(properties, serialisors) {
            ObjectName = objectName;
        }
    }
}

This object is essentially a mapper that outlines how an object should be serialised. Simple types are stored in the Properties property, while more complex types are retrieved through custom JsonSerialisor objects, which I will discuss in the next tutorial. The following code outlines the process involved in serialising a SimpleObject instance:

First, we initialise our object

    var simpleObject = new SimpleObject {Name = &quot;Simple Object&quot;, Count = 10};

Now initialise a BinaryWriter, setting the appropriate Encoding. This will be used to build the object’s JSON-representation, under-the-hood.

var writer = new BinaryWriter(new MemoryStream(), new UTF8Encoding(false));

Now we use our Json library to serialise the object

var serialisableProperties = simpleObject.GetSerializableProperties();
byte[] serialisedObject;

using (var serialisor = new StandardJsonSerialisationStrategy(writer)) {
    Json.Serialise(serialisor, new JsonPropertiesSerialisor(serialisableProperties));
    serialisedObject = serialisor.SerialisedObject;
}

Below is the complete code-listing:

var simpleObject = new SimpleObject {Name = &quot;Simple Object&quot;, Count = 10};

var writer = new BinaryWriter(new MemoryStream(), new UTF8Encoding(false));
var serialisableProperties = simpleObject.GetSerializableProperties();

byte[] serialisedObject;

using (var serialisor = new StandardJsonSerialisationStrategy(writer)) {
    Json.Serialise(serialisor, new JsonPropertiesSerialisor(serialisableProperties));
    serialisedObject = serialisor.SerialisedObject;
}

Now our serialisedObject variable contains a JSON-serialised representation of our SimpleObject instance, as an array of raw bytes. We’ve achieved this without Reflection, by implementing a simple interface, IHaveSerialisableProperties in our SimpleObject class, and have avoided potentially significant performance-overhead; while a single scenario involving reflection might involve very little performance-overhead, consider a web application under heavy load, leveraging Reflection. We can undoubtedly support more concurrent users per application tier if we avoid Reflection. JSON# allows us to do just that.

In the next tutorial, I’ll discuss serialising complex objects.

Connect with me:

RSSGitHubTwitter
LinkedInYouTubeGoogle+

JSON# – Tutorial #1: Returning Embedded-Objects

Fork on Github
Download the Nuget package

I’ve previously blogged about the premise behind JSON#. For a full explanation of the theory behind the code, check out this post.

Now, let’s dive into an example…

Let’s say that we have a JSON object that represents a real-world object, like a classroom full of students. It might look something like this:

{
    "classroom": {
        "teachers": [
            {
                "name": "Pablo",
                "age": 33
            },
            {
                "name": "John",
                "age": 28
            }
        ],
        "blackboard": {
            "madeOf": "wood",
            "height": "100",
            "width": "500"
        }
    }
}

This doesn’t look like it presents any great challenge to parse on any platform. But what if we expand it further to describe a school:

{
    "school": {
        "classrooms": [
            {
                "name": "Room #1",
                "teachers": [
                    {
                        "name": "Pablo",
                        "age": 33
                    },
                    {
                        "name": "John",
                        "age": 28
                    }
                ],
                "blackboard": {
                    "madeOf": "wood",
                    "height": "100",
                    "width": "500"
                }
            },
            {
                "name": "Room #2",
                "teachers": [
                    {
                        "name": "David",
                        "age": 33
                    },
                    {
                        "name": "Mary",
                        "age": 28
                    }
                ],
                "blackboard": {
                    "madeOf": "metal",
                    "height": "200",
                    "width": "600"
                }
            }
        ]
    }
}

Notice that our school object contains 2 classrooms, each of which contain similar objects, such as “blackboard”. Imagine that our school object needs to represent every school in the country. For argument’s sake, let’s say that we need to retrieve details about the blackboard in every classroom of every school. How would we go about that?

Well, we could refer to one of numerous JSON-parsing tools available. But how do these tools actually operate? Firstly, our massive JSON object will likely end up a large object. I’ve mentioned in previous blogs that objects greater than 85KB can significantly impact performance. So, immediately we’re potentially in trouble.

We can always cache the JSON object as a Stream, and read from it byte-by-byte. Tools like JSON.net offer capabilities like this, using components such as the JsonTextReader. So we’ve overcome the performance overhead associated with storing big strings in memory. But now we have another problem – we’re drilling into a massive JSON file, and searching for metadata that’s spread widely. We’re going to have to implement a lot of logic in order to draw the “blackboard” objects out.

What if requirements change, and we no longer need the “blackboard” objects? Now we just need the height of each blackboard. Well, we’ll have to throw out a lot of code, which is essentially wasted effort. Requirements change again, and we no longer need “blackboard” objects at all – now we need “teacher” objects. We need to rewrite all of our logic. Not the most flexible solution.

Let’s do this instead:

First, download the JSON# library from Github (MIT license).

Now let’s get those “blackboard” objects:

    const string schoolMetadata = @"{ "school": {...";
    var jsonParser = new JsonObjectParser();

    using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(schoolMetadata))) {
        Json.Parse(jsonParser, stream, "blackboard");
    }

This will return all “blackboard” objects from the JSON file. Let’s say that our requirements change. Now we need all “teacher” objects instead. Simply change the code as follows:

    const string schoolMetadata = @"{ "school": {...";
    var jsonParser = new JsonObjectParser();

    using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(schoolMetadata))) {
        Json.Parse(jsonParser, stream, "teachers");
    }

Such a change would have required significant effort, had we implemented our own custom logic using JSON.net’s JsonTextReader, or a similar component. Using JSON#, we achieve this by changing a single word. Now we’ve:

  • Optimised performance
  • Reduced our application’s memory-footprint
  • Avoided the Large Object Heap
  • Reduced development-time

The next tutorial outlines how to serialise objects using JSON#.

Connect with me:

RSSGitHubTwitter
LinkedInYouTubeGoogle+

Ultrafast JSON Parsing

Fork on Github
Download the Nuget package

I previously blogged about parsing JSON using JSON.NET’s JsonTextReader, during which I touched on a key point; the Large Object Heap, and why to avoid it.
We’ve all been asked at some point or another to fix a buckled project while consuming minimal cost in terms of development effort. This generally happens when a project grinds to a halt due to poor performance as a result of bad design. Nobody wants to acknowledge that they’ve commissioned one of these, so at that point, you’re called in, and expected to deliver this, within the constraints of this.

I’ve had to privilege of taking part in such projects several times, and one in particular comes to mind. The project marshalled objects from one tier to another in JSON-format. These objects were unusually large, over 1MB in most cases and overall performance was poor. The interesting thing was that only certain segments of the JSON file were necessary for the various HTTP endpoints to process. Pruning the JSON structure wasn’t an option, so I started to look at optimisation.

The first thing to come to mind was the Large Object Heap in .NET. Any .NET object over 85K in size is considered a large object, and can cause performance issues. Prior to Garbage Collection, all threads apart from the thread that triggered Garbage Collection are suspended to allow the various Generations to be released. Releasing LOH objects can introduce performance bottlenecks. In a nutshell, it takes time to release such objects (170,000 cycles, at least), and can result in excessive Garbage Collection. You may have experienced this before, when for example, IIS inexplicably hangs, though that’s not always as a result of Garbage Collection.

In any case, it occurred to me that every time a large JSON object was returned during a HTTP request, it was cached in a string variable, and from there straight onto the LOH. I mentioned that a great deal of this JSON was superfluous, so I thought about a way to parse the large JSON object and extract the necessary embedded objects. The key to this is to avoid strings, and instead deal with raw bytes through streaming. Streamed objects are processed in small chunks, where each chunk will occupy a small section of memory. Processing the large JSON structure chunk by chunk until I find what I’m looking for, sounds like a good way to avoid the LOH.

My initial post on this provided a nice and simple class to achieve this. Since then, given the potential complexity of large JSON objects, I’ve put together a library to cover more complex scenarios. The library provides the following capabilities:

  • Parse JSON 7+ times faster than JSON.NET
  • Serialise JSON several hundred times faster than JSON.NET
  • Remove specific section(s) from large JSON structures while avoiding the LOH
  • Provide a simple interface to serialise and deserialise while avoiding reflection

Here are some example scenarios:

Retrieving an Embedded object, or Series of Objects from a Large JSON Structure

Let’s say you have a large JSON object that contains among other things, 1 or more embedded objects of the following structure:

"simpleObject": {
  "name": "Simple Object",
  "count": 1
}

You can retrieve all instances of these objects with the following code:

    byte[] largeJson = GetLargeJson();
    var parser = new JsonObjectParser();

    Json.Parse(parser, new MemoryStream(largeJson), "simpleObject");
    var embeddedObjects = parser.Objects;

We’ve provided a large JSON structure in byte-format, and instructed the API to retrieve all embedded objects called “simpleObject”. The Parse method contains a reference to JsonObjectParser. This is one of two types of parser. Its counterpart is JsonArrayParser. You can alternate between the two depending on the nature of the JSON you’re parsing, whether it be object or array.

Serialising an Object without Reflection

Traditional serialisation techniques that leverage custom Attributes involve Reflection. This comes with a performance overhead. While this may be nominal, we can avoid it altogether to serialise our objects. The API provides a way to achieve this through the IHaveSerialisableProperties interface. The following object extends this interface and explicitly exposes serialisation metadata:

class ComplexArrayObject : IHaveSerialisableProperties {
  public string Name { get; set; }
  public string Description { get; set; }

  public SerialisableProperties GetSerializableProperties() {
    return new SerialisableProperties("complexArrayObject", new List {
      new StringJsonProperty {
        Key = "name",
        Value = Name
      },
      new StringJsonProperty {
        Key = "description",
        Value = Description
      }
    });
  }
}

Now that the class exposes its serialisation metadata, we can serialise it as follows:

var writer = new BinaryWriter(new MemoryStream(), new UTF8Encoding(false));
var serialisableProperties = myObject.GetSerializableProperties();

using (var serialisor = new StandardJsonSerialisationStrategy(writer)) {
  Json.Serialise(serialisor, new JsonPropertiesSerialisor(serialisableProperties));
  return serialisor.SerialisedObject;
}

We use a BinaryWriter to serialise the object, and provide its serialiable properties. We use the StandardJsonSerialisationStrategy class to effect the serialisation. This is a Builder class. Its abstraction allows us to modify the serialisation process if necessary. The returned object is serialised to a byte array. The end result is a serialised object, processed without reflection. There is a full suite of BDD specifications included with the API, including some speed tests. Incidentally, the included serialisation speed-test indicates an overall processing time of at least 20, to several hundred times faster than JSON.NET.

Deserialising an Object without Reflection

The API provides a means to traverse large JSON structures in order to extract embedded objects, very quickly. Once our object(s) are extracted, we likely want to deserialise them. From a practical perspective, I’m assuming that the extracted objects are of a reasonable size. If not, they likely need to be segmented further by extracting their contents. Assuming that the resulting object(s) are ready for serialisation, we are ready to do so. Again, let’s avoid reflection in order to optimise performance. JSON.NET offers an efficient way to do this using the JsonTextReader class. Rather than reinvent the wheel, I’ve wrapped this class in a manner that reads serialised properties into a NameValueCollection, and allows you to extract them to a POCO as follows:

Let’s say we have the following POCO:

class SimpleObject {
  public string Name { get; set; }
  public int Count { get; set; }
}

We create an associated class to facilitate deserialization:

class SimpleObjectDeserialiser : Deserialiser {
  public SimpleObjectDeserialiser(SimpleJSONParser parser) : base(parser) {}

  public override SimpleObject Deserialise() {
    var properties = parser.Parse();

    return new SimpleObject {
      Name = properties.Get("simpleObject.name"),
      Count = Convert.ToInt32(properties.Get("simpleObject.count"))
    };
  }
}

Here, we inherit from an abstraction that parses our JSON object. The parsed results are loaded into a NameValueCollection, in dot-notation format to signify hierarchy. E.g.:

  • simpleObject.name
  • simpleObject.count

We override the default Deserialise method and map the parsed properties to our POCO.
Here is an example using the above classes:

    var simpleObjectDeserialiser = new SimpleObjectDeserialiser(new SimpleJSONParser(myJson));
    var simpleObject = Json.Deserialise(simpleObjectDeserialiser);

The end result is a POCO instantiated from our JSON object.

Summary

Sometimes you can’t avoid dealing with large JSON objects – but you can avoid falling prey to memory-related performance issues.

The purpose of this API is to provide a means of segmenting JSON files, and to serialise and deserialise JSON objects in a performance-optimised manner. Leveraging this API, you can

  • Avoid the Large Object Heap
  • Avoid reflection
  • Segment unmanageable JSON into manageable chunks
  • All at exceptionally fast speeds

“Should I only use the library if I deal with big JSON objects?”

“No! Even on smaller JSON objects, this library is faster than JSON.NET – there’s a performance-win either way. This API deals with raw bytes, and leverages a concept called deferred-execution. I’m happy to talk about these, and other elements of the design, in more detail if you contact me directly.”

For a step-by-step tutorial, check out the next post.

Connect with me:

RSSGitHubTwitter
LinkedInYouTubeGoogle+

JSON Parsing Using JsonTextReader

Download the code on GitHub
Install with NuGet.

JSON.net is the de facto standard in terms of ASP.NET JSON parsing. Recently I began performance tuning an ASP.NET Web API application. Most of the work involved identifying bottlenecks and resolving them by leveraging the async and await operators in C#5, optimising IIS thread-management, etc., then I started looking at deserialisation.

James Newton King mentions that the fastest possible method of deserialising JSON is to leverage the JsonTextReader. Before I talk about that, let’s look at how a typical implementation works:

var proxy = WebRequest.Create("http://somefeed.com");

var response = proxy.GetResponse();

var stream = response.GetResponseStream();

Note that we’re pulling the request back as an IO.Stream, rather than a string. The problem with caching the response in a string is that in .NET, any object larger than 85KB is automatically assigned to the Large Object Heap. These objects require the Garbage Collector to suspend all threads in IIS in order to destroy them, which has major implications from a performance perspective. If the returned feed is reasonably large, and you cache it in a string, you’ll potentially introduce significant overhead in your web application. Caching to an IO.Stream avoids this issue, because the feed will be chunked and read in smaller portions as you parse it.

Now, let’s say our feed returns a list of people in JSON format:

[
    {
        firstName: "Paul",
        surname: "Mooney"
    },
    {
	    firstName: "Some",
        surname: "OtherGuy"
    }
]

We can parse this with the following:

var result = JsonConvert.DeserializeObject(stream.ReadToEnd());

Assuming that we have a C# class as follows:

class Person {
	public string FirstName { get; set; }
	public string Surname { get; set; }
}

JSON.net deserialises this under the hood using Reflection, a technique which involves reading the classes metadata and mapping corresponding JSON tags to each property, which is costly from a performance perspective. Another downside is the fact that if our JSON objects are embedded in parent objects from another proprietary system, or oData for example, the above method will fail on the basis that the JSON tags don’t match. In other words, our JSON feed needs to match our C# class verbatim.

JSON.net provides a handy mechanism to overcome this: Object Parsing. Instead of using reflection to automatically construct and bind our C# classes, we can parse the entire feed to a JObject, and then drill into this using LINQ, for example, to draw out the desired classes:

var json = JObject.Parse(reader.ReadToEnd());

var results = json["results"]
	.SelectMany(s => s["content"])
        .Select(person => new Person {
            FirstName = person["firstName"].ToString(),
	    Surname = person["surname"].ToString()
	});

Very neat. The problem with this is that we need to parse the entire feed to draw back a subset of data. Consider that if the feed is quite large, we will end up parsing much more than we need.

To go back to my original point, the quickest method of parsing JSON, using JSON.net, is to us the JsonTextReader. Below, you can find an example of a class I’ve put together which reads from a JSON feed and parses only the metadata that we require, ignoring the rest of the feed, without using Reflection:

public abstract class JsonParser<TParsable>; where TParsable : class, new() {
        private readonly Stream json;
        private readonly string jsonPropertyName;

        public List<T> Result { get; private set; }

        protected JsonParser(Stream json, string jsonPropertyName) {
            this.json = json;
            this.jsonPropertyName = jsonPropertyName;

            Result = new List<TParsable>();
        }

        protected abstract void Build(TParsable parsable, JsonTextReader reader);

        protected virtual bool IsBuilt(TParsable parsable, JsonTextReader reader) {
            return reader.TokenType.Equals(JsonToken.None);
        }

        public void Parse() {
            using (var streamReader = new StreamReader(json)) {
                using (var jsonReader = new JsonTextReader(streamReader)) {
                    do {
                        jsonReader.Read();
                        if (jsonReader.Value == null || !jsonReader.Value.Equals(jsonPropertyName)) continue;

                        var parsable = new TParsable();

                        do {
                            jsonReader.Read();
                        } while (!jsonReader.TokenType.Equals(JsonToken.PropertyName) && !jsonReader.TokenType.Equals(JsonToken.None));

                        do {
                            Build(parsable, jsonReader);
                            jsonReader.Read();
                        } while (!IsBuilt(parsable, jsonReader));

                        Result.Add(parsable);
                    } while (!jsonReader.TokenType.Equals(JsonToken.None));
                }
            }
        }
    }

This class is an implementation of the Builder pattern.

In order to consume it, you need only extend the class with a concrete implementation:

public class PersonParser : JsonParser
    {
        public PersonParser(Stream json, string jsonPropertyName) : base(json, jsonPropertyName) { }

        protected override void Build(Person parsable, JsonTextReader reader)
        {
            if (reader.Value.Equals("firstName"))
            {
                reader.Read();
                parsable.FirstName = (string)reader.Value;
            }
            else if (reader.Value.Equals("surname"))
            {
                reader.Read();
                parsable.Surname = (string)reader.Value;
            }
        }

        protected override bool IsBuilt(Person parsable, JsonTextReader reader)
        {
            var isBuilt = parsable.FirstName != null &amp;&amp; parsable.Surname != null;
            return isBuilt || base.IsBuilt(parsable, reader);
        }
    }

Here, we’re overriding two methods; Build and IsBuilt. The first tells the class how to map the JSON tags to our C# object. The second, how to determine when our object is fully built.

I’ve stress-tested this; worst case result was 18.75 times faster than alternative methods. Best case was 45.6 times faster, regardless of the size of the JSON feed returned (in my case, large – about 450KB).

Leveraging this across applications can massively reduce thread-consumption and overhead for each feed.

The JsonParser class accepts 2 parameters. First, the JSON stream returned from the feed, deliberately in stream format for performance reasons. Streams are chucked by default, so that we read them one section at a time, whereas strings will consume memory of equivalent size to the feed itself, potentially ending up in the Large Object Heap. Second, the jsonPropertyName, which tells the parser to target a specific serialised JSON object.

These classes are still in POC stage. I’ll be adding more functionality over the next few days. Any feedback welcome.

Connect with me:

RSSGitHubTwitter
LinkedInYouTubeGoogle+