Monthly Archives: June 2013

JSON Parsing Using JsonTextReader

Download the code on GitHub
Install with NuGet.

JSON.net is the de facto standard for JSON parsing in ASP.NET. Recently I began performance tuning an ASP.NET Web API application. Most of the work involved identifying bottlenecks and resolving them by leveraging the async and await keywords in C# 5, optimising IIS thread management, and so on. Then I started looking at deserialisation.

James Newton-King, the author of JSON.net, mentions that the fastest possible method of deserialising JSON is to use the JsonTextReader directly. Before I talk about that, let’s look at how a typical implementation works:

var proxy = WebRequest.Create("http://somefeed.com");

var response = proxy.GetResponse();

var stream = response.GetResponseStream();

Note that we’re pulling the response back as a System.IO.Stream, rather than a string. The problem with caching the response in a string is that in .NET, any object of roughly 85KB (85,000 bytes) or more is automatically allocated on the Large Object Heap, and because strings are stored as UTF-16, a feed of a little over 42,000 characters is enough to cross that line. Large Object Heap memory is only reclaimed during a full (generation 2) garbage collection, the most expensive kind, which can pause the application’s threads while it runs and so has major implications from a performance perspective. If the returned feed is reasonably large, and you cache it in a string, you’ll potentially introduce significant overhead in your web application. Reading from a Stream avoids this issue, because the feed is buffered and read in smaller portions as you parse it.
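
To see the Large Object Heap behaviour for yourself, here is a minimal sketch. The class name and string sizes are purely illustrative; the only API used is GC.GetGeneration, which reports Large Object Heap allocations as generation 2.

```csharp
using System;

public static class LargeObjectHeapDemo
{
    public static void Main()
    {
        // .NET strings are UTF-16, so each char occupies two bytes.
        var small = new string('a', 1000);    // roughly 2KB
        var large = new string('a', 100000);  // roughly 200KB, well past the threshold

        // A freshly allocated small object normally starts life in generation 0.
        Console.WriteLine("small string generation: " + GC.GetGeneration(small));

        // Large Object Heap allocations are reported as generation 2,
        // so they are only reclaimed by a full collection.
        Console.WriteLine("large string generation: " + GC.GetGeneration(large));
    }
}
```

The exact numbers printed may vary with runtime and GC settings, but the large string should be reported in a higher generation than the small one.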

Now, let’s say our feed returns a list of people in JSON format:

[
    {
        "firstName": "Paul",
        "surname": "Mooney"
    },
    {
        "firstName": "Some",
        "surname": "OtherGuy"
    }
]

We can parse this with the following:

var reader = new StreamReader(stream);
var result = JsonConvert.DeserializeObject<List<Person>>(reader.ReadToEnd());

Assuming that we have a C# class as follows:

class Person {
	public string FirstName { get; set; }
	public string Surname { get; set; }
}

JSON.net deserialises this under the hood using reflection, a technique which involves reading the class’s metadata and mapping the corresponding JSON property names to each property, which is costly from a performance perspective. Another downside is that if our JSON objects are embedded in parent objects from another proprietary system, or in an OData envelope for example, the above method will not populate our objects, because the structure of the JSON doesn’t match the class. In other words, the shape of our JSON feed needs to mirror our C# class.
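
As an aside, the reflection-based approach doesn’t have to go through a string at all. Here is a minimal sketch, assuming the Person class from above and using a MemoryStream as a stand-in for the real response stream; the class and method names are my own. It hands a JsonTextReader to JsonSerializer.Deserialize, so the feed is never buffered into one large string, although the binding itself still relies on reflection.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using Newtonsoft.Json;

public class Person
{
    public string FirstName { get; set; }
    public string Surname { get; set; }
}

public static class StreamDeserialiseDemo
{
    // Deserialises straight from the stream rather than from a string.
    public static List<Person> ReadPeople(Stream stream)
    {
        using (var streamReader = new StreamReader(stream))
        using (var jsonReader = new JsonTextReader(streamReader))
        {
            var serializer = new JsonSerializer();
            return serializer.Deserialize<List<Person>>(jsonReader);
        }
    }

    public static void Main()
    {
        // Hypothetical sample standing in for the real feed.
        const string sample = "[{\"firstName\":\"Paul\",\"surname\":\"Mooney\"}]";

        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(sample)))
        {
            foreach (var person in ReadPeople(stream))
            {
                Console.WriteLine(person.FirstName + " " + person.Surname);
            }
        }
    }
}
```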

JSON.net provides a handy mechanism to overcome this: object parsing, which JSON.net calls LINQ to JSON. Instead of using reflection to automatically construct and bind our C# classes, we can parse the entire feed to a JObject, and then drill into this using LINQ, for example, to draw out the desired classes:

var json = JObject.Parse(reader.ReadToEnd());

var results = json["results"]
    .SelectMany(s => s["content"])
    .Select(person => new Person {
        FirstName = person["firstName"].ToString(),
        Surname = person["surname"].ToString()
    });

Very neat. The problem is that JObject.Parse has to parse the entire feed, and hold the result in memory, just to draw back a subset of data. If the feed is quite large, we end up parsing much more than we need.

To go back to my original point, the quickest method of parsing JSON using JSON.net is to use the JsonTextReader. Below, you can find an example of a class I’ve put together which reads from a JSON feed and parses only the data that we require, ignoring the rest of the feed, without using reflection:

using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public abstract class JsonParser<TParsable> where TParsable : class, new() {
        private readonly Stream json;
        private readonly string jsonPropertyName;

        public List<TParsable> Result { get; private set; }

        protected JsonParser(Stream json, string jsonPropertyName) {
            this.json = json;
            this.jsonPropertyName = jsonPropertyName;

            Result = new List<TParsable>();
        }

        protected abstract void Build(TParsable parsable, JsonTextReader reader);

        protected virtual bool IsBuilt(TParsable parsable, JsonTextReader reader) {
            return reader.TokenType.Equals(JsonToken.None);
        }

        public void Parse() {
            using (var streamReader = new StreamReader(json)) {
                using (var jsonReader = new JsonTextReader(streamReader)) {
                    do {
                        jsonReader.Read();
                        if (jsonReader.Value == null || !jsonReader.Value.Equals(jsonPropertyName)) continue;

                        var parsable = new TParsable();

                        do {
                            jsonReader.Read();
                        } while (!jsonReader.TokenType.Equals(JsonToken.PropertyName) && !jsonReader.TokenType.Equals(JsonToken.None));

                        do {
                            Build(parsable, jsonReader);
                            jsonReader.Read();
                        } while (!IsBuilt(parsable, jsonReader));

                        Result.Add(parsable);
                    } while (!jsonReader.TokenType.Equals(JsonToken.None));
                }
            }
        }
    }

This class is a loose implementation of the Builder pattern: Parse drives the read loop, and subclasses supply the Build and IsBuilt steps.
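
The Parse loop depends on the order in which JsonTextReader emits tokens. Here is a minimal sketch that simply dumps the token stream for a small, made-up feed; the sample JSON and class name are assumptions for illustration. Property names arrive as PropertyName tokens whose Value is the name, while structural tokens such as StartObject and EndObject carry a null Value. That is why Parse checks for null before comparing against jsonPropertyName.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;

public static class TokenDumpDemo
{
    public static void Main()
    {
        // Hypothetical feed with a single wrapped person.
        const string sample =
            "{\"content\":{\"firstName\":\"Paul\",\"surname\":\"Mooney\"}}";

        using (var jsonReader = new JsonTextReader(new StringReader(sample)))
        {
            // Read() returns false once the end of the JSON is reached.
            while (jsonReader.Read())
            {
                // Value is null for structural tokens, so nothing prints after the colon.
                Console.WriteLine(jsonReader.TokenType + ": " + jsonReader.Value);
            }
        }
    }
}
```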

In order to consume it, you need only extend the class with a concrete implementation:

public class PersonParser : JsonParser<Person>
    {
        public PersonParser(Stream json, string jsonPropertyName) : base(json, jsonPropertyName) { }

        protected override void Build(Person parsable, JsonTextReader reader)
        {
            if ("firstName".Equals(reader.Value)) // null-safe: Value is null for structural tokens
            {
                reader.Read();
                parsable.FirstName = (string)reader.Value;
            }
            else if ("surname".Equals(reader.Value))
            {
                reader.Read();
                parsable.Surname = (string)reader.Value;
            }
        }

        protected override bool IsBuilt(Person parsable, JsonTextReader reader)
        {
            var isBuilt = parsable.FirstName != null && parsable.Surname != null;
            return isBuilt || base.IsBuilt(parsable, reader);
        }
    }

Here, we’re overriding two methods: Build and IsBuilt. The first tells the class how to map the JSON property names to our C# object. The second tells it how to determine when our object is fully built.

I’ve stress-tested this; worst case result was 18.75 times faster than alternative methods. Best case was 45.6 times faster, regardless of the size of the JSON feed returned (in my case, large – about 450KB).

Leveraging this across applications can massively reduce thread-consumption and overhead for each feed.

The JsonParser class accepts two parameters. First, the JSON stream returned from the feed, deliberately in stream format for performance reasons. Streams are read in chunks, one section at a time, whereas a string consumes memory at least equal to the size of the feed itself, potentially ending up in the Large Object Heap. Second, the jsonPropertyName, which tells the parser to target a specific serialised JSON object by the name of the property that contains it.
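
For a feed whose objects sit in a root-level array, like the sample list of people earlier, there is no wrapping property name to target. Here is a rough sketch of the same forward-only technique for that shape. It is a variation for illustration, not part of the JsonParser class, and it assumes each person is a flat object inside the root array. The names ArrayFeedDemo and ReadPeople are my own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public class Person
{
    public string FirstName { get; set; }
    public string Surname { get; set; }
}

public static class ArrayFeedDemo
{
    public static List<Person> ReadPeople(TextReader text)
    {
        var people = new List<Person>();

        using (var reader = new JsonTextReader(text))
        {
            Person current = null;

            while (reader.Read())
            {
                switch (reader.TokenType)
                {
                    case JsonToken.StartObject:
                        current = new Person();
                        break;

                    case JsonToken.PropertyName:
                        var name = (string)reader.Value;
                        if (current != null && name == "firstName")
                        {
                            reader.Read(); // advance to the property's value
                            current.FirstName = reader.Value as string;
                        }
                        else if (current != null && name == "surname")
                        {
                            reader.Read();
                            current.Surname = reader.Value as string;
                        }
                        else
                        {
                            reader.Skip(); // ignore properties we don't need
                        }
                        break;

                    case JsonToken.EndObject:
                        if (current != null) people.Add(current);
                        current = null;
                        break;
                }
            }
        }

        return people;
    }

    public static void Main()
    {
        const string sample =
            "[{\"firstName\":\"Paul\",\"surname\":\"Mooney\"}," +
            "{\"firstName\":\"Some\",\"surname\":\"OtherGuy\"}]";

        foreach (var person in ReadPeople(new StringReader(sample)))
        {
            Console.WriteLine(person.FirstName + " " + person.Surname);
        }
    }
}
```

In a real application you would pass a StreamReader over the response stream instead of the StringReader used here.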

These classes are still in POC stage. I’ll be adding more functionality over the next few days. Any feedback welcome.
