Sunday, July 24, 2016

Cool illustrated guide to Kubernetes

If you want to understand, at a high level, the magic of Kubernetes and how it can be used to manage Docker containers, then the following illustrated guide is awesome :)

https://deis.com/blog/2016/kubernetes-illustrated-guide/

Cool DevOps Tools - Gerrit and Let's Chat

I would recommend that all teams leverage the following open source tools to add more juice to their DevOps operations and improve team collaboration.

Gerrit - A valuable web-based code review tool that comes with Git embedded. It can be very useful for helping your junior team-mates learn good coding practices and refactoring. A good introduction video is here - https://www.youtube.com/watch?v=Wxx8XndqZ7A

Let's Chat - Digital natives don't like to write long emails and abhor email chains. Use this self-hosted, on-premise web-based chat server to create discussion rooms, share knowledge and get questions answered.

Scaling Node-RED horizontally for high volume IoT event processing

We were pretty impressed with the ease of visual programming in Node-RED; our prototyping productivity actually increased by 40-50%. We used Node-RED both on the gateways and on the server for sensor event processing.

But we were not sure if Node-RED could be used to ingest and process a large volume of events - i.e. thousands of events/sec. I posted the question on the Node-RED Google Group and got some interesting answers. Jotting down the various options below.

  1. If your input is over HTTP, then you can use any of the standard load-balancing techniques to balance requests over a cluster of nodes running the same Node-RED flow - e.g. using HAProxy, Nginx, etc. It is important to note that since we are running the same flow on many nodes, we cannot store any state in context variables; we have to store state in an external service such as Redis (see the Redis sketch after this list).
  2. If you are ingesting over MQTT, then you have multiple options:
    • Option A: Let each flow listen to a different topic. You can have different gateways publish to different topics on the MQTT broker - e.g. Node-RED instance 1 subscribes to device/a/#, instance 2 subscribes to device/b/#, and so on.
    • Option B: Some MQTT brokers (e.g. HiveMQ) support the concept of a 'Shared Subscription' that is equivalent to point-to-point messaging - i.e. each message is delivered to only one consumer in the subscription group, with the broker load-balancing across consumers in a round-robin fashion. A good explanation of how to enable this using HiveMQ is given here - http://www.hivemq.com/blog/mqtt-client-load-balancing-with-shared-subscriptions/. The good thing about the HiveMQ support for load-balancing consumers is that no change is required in the consumer code. You can continue using any MQTT consumer - only the topic URL would change :)
    • Option C: Put a simple Node-RED flow in front for message ingestion that reads the payload and makes an HTTP request to a cluster of load-balanced Node-RED flows (similar to Option 1).
    • Option D: This is an extension of Option C and entails creating a buffer between message ingestion and message processing using Apache Kafka. We ingest the messages from devices over MQTT, extract the payload and post it on a Kafka topic. Kafka supports a message-queue paradigm through the concept of consumer groups, so we can have multiple Node-RED flow instances subscribing to the Kafka topic using the same consumer group (see the Kafka sketch after this list). This option also makes sense if your message broker does not support load-balancing consumers.
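On the shared-state point in Option 1: since the load balancer may send consecutive requests from the same device to different Node-RED instances, any counters or session data must live in a store all instances can reach. Below is a minimal Java sketch of that idea using the Jedis client - the Redis key name and the per-device counter are purely illustrative, not part of any Node-RED API.

```java
import redis.clients.jedis.Jedis;

// Minimal sketch: a per-device event counter kept in Redis so that every
// load-balanced flow instance sees the same value. The key name is illustrative.
public class SharedStateSketch {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // INCR is atomic, so concurrent instances can increment the same
            // counter safely without any extra coordination.
            long count = jedis.incr("events:device-42");
            System.out.println("Events seen across all instances: " + count);
        }
    }
}
```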
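To make Option D concrete, here is a minimal sketch of the consumption side using the standard Java kafka-clients API (recent versions of the library). Because every processing instance joins the same consumer group, Kafka hands each message to exactly one of them. The topic and group names are illustrative.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DeviceEventConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // All processing instances use the same group.id, so each message
        // is delivered to exactly one instance (point-to-point semantics).
        props.put("group.id", "flow-processors");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("device-events"));
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Run the same processing logic the Node-RED flow would apply.
                    System.out.printf("partition=%d payload=%s%n",
                            record.partition(), record.value());
                }
            }
        }
    }
}
```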

Thus, by leveraging the above options, we can scale Node-RED horizontally to handle a huge volume of events.

Wednesday, July 20, 2016

Extending SonarQube with custom rules

SonarQube has today become our de facto standard for code analysis. We also use it in our migration projects, where we define custom rules to check whether the current application can be ported to the new technology stack.

The links below give a good overview of writing custom rules in SonarQube for Java, .NET and JavaScript.

1. Custom Rules in Java
2. Custom Rules in .NET - using the Roslyn analyzer.
3. Custom Rules in JavaScript 

By leveraging the code templates and SDKs provided by these tools, it is easy to create new custom rules. Behind the scenes, the analysers first build a syntax tree of the code; then, for each rule, the visitor design pattern is applied to walk all the nodes and run the check/business logic.
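To illustrate the visitor mechanics, here is a minimal sketch of a custom Java rule written against the sonar-java plugin's API - the rule key, class name and the 5-parameter threshold are all illustrative:

```java
import java.util.Collections;
import java.util.List;

import org.sonar.check.Rule;
import org.sonar.plugins.java.api.IssuableSubscriptionVisitor;
import org.sonar.plugins.java.api.tree.MethodTree;
import org.sonar.plugins.java.api.tree.Tree;

// Toy rule: flag methods that declare more than 5 parameters. The analyser
// builds the syntax tree and calls visitNode() for every node kind we
// subscribe to - the visitor pattern described above.
@Rule(key = "TooManyParameters")
public class TooManyParametersRule extends IssuableSubscriptionVisitor {

    @Override
    public List<Tree.Kind> nodesToVisit() {
        // Only method declarations are of interest to this rule.
        return Collections.singletonList(Tree.Kind.METHOD);
    }

    @Override
    public void visitNode(Tree tree) {
        MethodTree method = (MethodTree) tree;
        if (method.parameters().size() > 5) {
            reportIssue(method.simpleName(),
                    "Refactor this method to take at most 5 parameters.");
        }
    }
}
```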

After doing the analysis, it is also possible to auto-remediate / refactor the source code using predefined rules. The following open source tools can be used for auto-remediation.

http://autorefactor.org/html/samples.html
http://walkmod.com/
https://github.com/facebook/pfff

Friday, July 01, 2016

Ruminating on serverless execution environments

For the past few months, I have been closely watching the serverless execution trends in the industry. I find the whole concept of writing serverless code on the cloud extremely exciting and a great paradigm shift.

Especially for mobile and IoT apps, I think the serverless execution environments below hold great promise. Developers don't have to worry about provisioning servers and horizontally scaling their apps - everything is seamlessly handled by the cloud. And you only pay when your code is invoked!
  1. IBM OpenWhisk - http://www.ibm.com/cloud-computing/bluemix/openwhisk/
  2. Azure Functions - https://azure.microsoft.com/en-in/services/functions/
  3. Google Cloud Functions - https://cloud.google.com/functions/docs/
  4. AWS Lambda - https://aws.amazon.com/lambda/
It is also interesting to note that IBM has made the OpenWhisk platform open source under the Apache 2 license. The entire source code is available here - https://github.com/openwhisk/openwhisk
A good article explaining the underlying components of OpenWhisk is available here
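To show how little scaffolding such a function needs, here is a minimal sketch of a handler using AWS Lambda's aws-lambda-java-core library - the class name and the 'deviceId' field in the incoming event are illustrative:

```java
import java.util.Map;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

// A minimal Lambda handler: no servers to provision, no scaling logic to
// write - the cloud invokes this method on demand and bills per invocation.
public class SensorEventHandler implements RequestHandler<Map<String, Object>, String> {

    @Override
    public String handleRequest(Map<String, Object> event, Context context) {
        // 'deviceId' is an illustrative field name in the incoming event payload.
        Object deviceId = event.get("deviceId");
        context.getLogger().log("Processing event from device: " + deviceId);
        return "processed";
    }
}
```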

Design Patterns for Legacy Migration and Digital Modernization

While designing the approach for any legacy migration, the following design patterns crafted by Martin Fowler can be very helpful.

The gist of the above patterns is that, instead of taking a rip-and-replace approach to legacy modernization, we slowly build a new system around the edges of the old one. To do this, we leverage event-driven architecture paradigms to capture events inbound to the old system and route them to the new system, as shown in the sketch below. This is done incrementally till we can kill the old system.
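Purely as an illustration of this event-interception idea (all class and type names here are hypothetical), a thin router in front of the old system might look like this, with the set of migrated event types growing release by release until the old system can be switched off:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: a router captures every event inbound to the legacy
// system and, per event type, decides which system should handle it.
public class EventInterceptor {

    // Illustrative client interfaces for the two systems.
    interface LegacySystemClient { void handle(String type, String payload); }
    interface NewSystemClient { void handle(String type, String payload); }

    private final LegacySystemClient legacy;
    private final NewSystemClient modern;
    // Grows with each release until it covers all event types and the
    // legacy system can be retired.
    private final Set<String> migratedTypes = new HashSet<>();

    public EventInterceptor(LegacySystemClient legacy, NewSystemClient modern) {
        this.legacy = legacy;
        this.modern = modern;
    }

    public void migrate(String eventType) {
        migratedTypes.add(eventType);
    }

    public void onInboundEvent(String type, String payload) {
        if (migratedTypes.contains(type)) {
            modern.handle(type, payload);   // the new system now owns this event type
        } else {
            legacy.handle(type, payload);   // everything else still flows to the old system
        }
    }
}
```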

Having been in the architecture field for over a decade, I have realized that 'current state' and 'future state' architectures are just temporal states of reality!

It's impossible to predict the future; we can only be prepared for it by designing our systems to be modular and highly flexible to change. Build an architecture that can evolve with time - be future-ready, don't try to be future-proof.

Another humble realization is that the code we are writing today is nothing but the legacy code of tomorrow :)
And in today's fast-paced world, systems become 'legacy' within a short period of time. Legacy need not just mean a 50-year-old mainframe program. Even a monolithic Java application can be considered legacy. Gartner now defines legacy as any system that is not sufficiently flexible to meet the changing demands of the business.