Tag Archives: distributed design

JSON# – Tutorial #5: Deserialising Complex Objects

Fork on Github
Download the Nuget package

The previous tutorial focused on deserialising simple JSON objects. This tutorial describes the process of deserialising a more complex object using JSON#.

Let’s use the ComplexObject class that we’ve leveraged in earlier tutorials:

class ComplexObject : IHaveSerialisableProperties {
    public string Name { get; set; }
    public string Description { get; set; }
    public List<ComplexArrayObject> ComplexArrayObjects { get; set; }
    public List<double> Doubles { get; set; }

    public SerialisableProperties GetSerializableProperties() {
        return new SerialisableProperties("complexObject", new List<JsonProperty> {
            new StringJsonProperty {
                Key = "name",
                Value = Name
            },
            new StringJsonProperty {
                Key = "description",
                Value = Description
            }
            }, new List<JsonSerialisor> {
                    new ComplexJsonArraySerialisor("complexArrayObjects",
                        ComplexArrayObjects.Select(c => c.GetSerializableProperties())),
                    new JsonArraySerialisor("doubles",
                        Doubles.Select(d => d.ToString(CultureInfo.InvariantCulture)), JsonPropertyType.Numeric)
            });
        }
    }

Let’s instantiate this with some values, and serialise to JSON. I won’t bloat this post with details on how to serialise, covered in previous posts. Here is our serialised ComplexObject instance:

{"complexObject":{"name":"Complex Object","description":"A complex object","complexArrayObjects":[{"name":"Array Object #1","description":"The 1st array object"},{"name":"Array Object #2","description":"The 2nd array object"}],"doubles":[1,2.5,10.8]}}

Notice that we have 2 collections. A simple collection of Doubles, and a more complex collection of ComplexArrayObjects. Let’s start with those.

First, create a new class, ComplexObjectDeserialiser, and implement the required constructor and Deserialise method.

Remember this method from the previous tutorial?

    var properties = jsonNameValueCollection.Parse(mergeArrayValues);

This effectively parses the JSON and loads each element into a NameValueCollection. This is fine for simple properties, however collection-based properties would cause the deserialiser to load each collection-element as a separate item in the returned NameValueCollection, which may be somewhat cumbersome to manage:

Flattened JSON collection

Flattened JSON collection

This is where the Boolean parameter mergeArrayValues comes in. It will concatenate all collection-based values in a comma-delimited string, and load this value into the returned NameValueCollection. This is much more intuitive, and allows consuming code to simply split the comma-delimited values and iterate as required.

Compressed JSON Collection

Compressed JSON Collection

Here is the complete code-listing:

class ComplexObjectDeserialiser : Deserialiser<ComplexObject> {
        public ComplexObjectDeserialiser(JsonNameValueCollection jsonNameValueCollection) : base(jsonNameValueCollection) {}

        public override ComplexObject Deserialise(bool mergeArrayValues = false) {
            var properties = jsonNameValueCollection.Parse(mergeArrayValues);

            var complexObject = new ComplexObject {
                Name = properties["complexObject.name"],
                Description = properties["complexObject.description"],
                ComplexArrayObjects = new List<ComplexArrayObject>(),
                Doubles = new List<double>()
            };

            var complexArrayObjectNames = properties["complexObject.complexArrayObjects.name"].Split(',');
            var complexArrayObjectDescriptions = properties["complexObject.complexArrayObjects.description"].Split(',');

            for (var i = 0; i &lt; complexArrayObjectNames.Length; i++) {
                var complexArrayObjectName = complexArrayObjectNames[i];

                complexObject.ComplexArrayObjects.Add(new ComplexArrayObject {
                    Name = complexArrayObjectName,
                    Description = complexArrayObjectDescriptions[i]
                });
            }

            var complexArrayObjectDoubles = properties["complexObject.doubles"].Split(',');

            foreach (var @double in complexArrayObjectDoubles)
                complexObject.Doubles.Add(Convert.ToDouble(@double));

            return complexObject;
        }
    }

As before, we deserialise as follows:

    Json.Deserialise(new ComplexObjectDeserialiser(new StandardJsonNameValueCollection("<JSON string...>")), true);

JSON# does most of the work, loading each serialised element into a NameValueCollection. We simply read from that collection, picking and choosing each element to map to an associated POCO property.
For collection-based properties, we simply retrieve the value associated with the collection’s key, split that value into an array, and loop through the array, loading a new object for each item in the collection, building our ComplexObject step-by-step.

This is the final JSON# post covering tutorial-based material. I’m working on a more thorough suite of performance benchmarks, and will publish the results, as well as offering a more in-depth technical analysis of JSON# in future posts. Please contact me if you would like to cover a specific topic.

The next posts will feature a tutorial series outlining Object Oriented, Test Driven Development in C# and Java. Please follow this blog to receive updates.

Connect with me:

RSSGitHubTwitter
LinkedInYouTubeGoogle+

Load Balancing a RabbitMQ Cluster

The Problem

Given a cluster of RabbitMQ nodes, we want to achieve effective load-balancing.

The Solution

First, let’s look at the feature at the core of RabbitMQ – the Queue itself.
RabbitMQ Queues are singular structures that do not exist on more than one Node. Let’s say that you have a load-balanced, HA (Highly Available) RabbitMQ cluster as follows:

RabbitMQ Cluster with Load Balancer

RabbitMQ Cluster with Load Balancer

Nodes 1 – 3 are replicated across each other, so that snapshots of all HA-compliant Queues are synchronised across each node. Suppose that we log onto the RabbitMQ Admin Console and create a new HA-configured Queue. Our Load Balancer is configured in a Round Robin manner, and in this instance, we are directed to Node #2, for argument’s sake. Our new Queue is created on Node #2. Note: it is possible to explicitly choose the node that you would like your Queue to reside on, but let’s ignore that for the purpose of this example.

Now our new Queue, “NewQueue”, exists on Node #2. Our HA policy kicks in, and the Queue is replicated across all nodes. We start adding messages to the Queue, and those messages are also replicated across each node. Essentially, a snapshot of the Queue is taken, and this snapshot is replicated across each node after a non-determinable period of time has elapsed (it actually occurs as an asynchronous background task, when the Queue’s state changes).

We may look at this from an intuitive perspective and conclude that each node now contains an instance of our new Queue. Thus, when we connect to RabbitMQ, targeting “NewQueue”, and are directed to an appropriate node by our Load Balancer, we might assume that we are connected to the instance of “NewQueue” residing on that node. This is not the case.

I mentioned that a RabbitMQ Queue is a singular structure. It exists only on the node that it was created, regardless of HA policy. A Queue is always its own master, and consists of 0…N slaves. Based on the above example, “NewQueue” on Node #2 is the Master-Queue, because this is the node on which the Queue was created. It contains 2 Slave-Queues – it’s counterparts on nodes #1 and #3. Let’s assume that Node #2 dies, for whatever reason; let’s say that the entire server is down. Here’s what will happen to “NewQueue”.

  1. Node #2 does not return a heartbeat, and is considered de-clustered
  2. The “NewQueue” master Queue is no longer available (it died with Node #2)
  3. RabbitMQ promotes the “NewQueue” slave instance on either Node #1 or #3 to master

This is standard HA behaviour in RabbitMQ. Let’s look at the default scenario now, where all 3 nodes are alive and well, and the “NewQueue” instance on Node #2 is still master.

  1. We connect to RabbitMQ, targeting “NewQueue”
  2. Our Load Balancer determines an appropriate Node, based on round robin
  3. We are directed to an appropriate node (let’s say, Node #3)
  4. RabbitMQ determines that the “NewQueue” master node is on Node #2
  5. RabbitMQ redirects us to Node #2
  6. We are successfully connected to the master instance of “NewQueue”

Despite the fact that our Queues are replicated across each HA node, there is only one available instance of each Queue, and it resides on the node on which it was created, or in the case of failure, the instance that is promoted to master. RabbitMQ is conveniently routing us to that node in this case:

RabbitMQ Cluster Exhibiting Extra Network-hop

RabbitMQ Cluster Exhibiting Extra Network-hop

Unfortunately for us, this means that we suffer an extra, unnecessary network hop in order to reach our intended Queue. This may not seem a major issue, but consider that in the above example, with 3 nodes and an evenly-balanced Load Balancer, we are going to incur that extra network hop on approximately 66% of requests. Only one in every three requests (assuming that in any grouping of three unique requests we are directed to a different node) will result in our request being directed to the correct node.

“Does this mean that we can’t implement round robin load-balancing?”
“No, but if we do, it will result in a less than optimal solution.”
“So, what’s the alternative?”

Well, in order to ensure that each request is routed to the correct node, we have two choices:

  • Explicitly connect to the node on which the target Queue resides
  • Spread Queues evenly as possible across nodes

Both solutions immediately introduce problems. In the first instance, our client application must be aware of all nodes in our RabbitMQ cluster, and must also know where each master Queue resides. If a Node goes down, how will our application know? Not to mention that this design breaks the Single Responsibility principle, and increases the level of coupling in the application.

The second solution offers a design in which our Queues are not linked to single nodes. Based on our “NewQueue” example, we would not simply instantiate a new Queue on a single node. Instead, in a 3-node scenario, we might instantiate 3 Queues; “NewQueue1”, “NewQueue2” and “NewQueue3”, where each Queue is instantiated on a separate node.

The second solution offers a design in which our Queues are not linked to single Nodes. Based on our “NewQueue” example, we would not simply instantiate a new Queue on a single Node. Instead, in a 3-node scenario, we might instantiate 3 Queues; “NewQueue1”, “NewQueue2” and “NewQueue3”, where each Queue is instantiated on a separate Node:

RabbitMQ Cluster with Separate Queues

RabbitMQ Cluster with Separate Queues


Our client application can now implement, for example, a simple randomising function that selects one of the above Queues and explicitly connects to it. In a web-based application, given 3 separate HTTP requests, each request would target one of the above Queues, and no Queue would feature more than once across all 3 requests. Now we’ve achieved a reasonable balance of load across our cluster, without the use of a traditional Load Balancer.
RabbitMQ Cluster with Randomiser

RabbitMQ Cluster with Randomiser


But we’re still faced with the same problem; our client application needs to know where our Queues reside. So let’s look at advancing the solution further, so that we can avoid this shortcoming.

Firstly, we need to provide mapping metadata that describes our RabbitMQ infrastructure. Specifically, where Queues reside. This should be a resilient data-source such as a database or cache, as opposed to something like a flat file, because multiple sources (2, at least) may access this data concurrently.

Now introduce an always-on service that polls RabbitMQ, determining whether or not nodes are alive. New Queues should also be registered with this service, and it should keep an up-to-date registry, providing metadata about Nodes and their Queues:

RabbitMQ Cluster with Monitor Service

RabbitMQ Cluster with Monitor Service


Our client application, on initial load, should poll this service and retrieve RabbitMQ metadata, which should then be retained for incoming requests. Should a request fail due to the fact that a Node becomes compromised, the client application can poll the Queue Metadata Store, return up-to-date RabbitMQ metadata, and re-route the message to a working Node.

This approach is a design that I’ve had some success with. From a conceptual point-of-view, it forms a small section of an overall Microservice Architecture, which I’ll talk about in a future post.

Connect with me:

RSSGitHubTwitter
LinkedInYouTubeGoogle+