Wednesday, August 16, 2017

Cool open source tool for ER diagrams

Recently, one of my colleagues introduced me to a cool Java tool called SchemaSpy, which can be used to create beautiful database documentation and ER diagrams.

Would highly recommend perusing the following links and utilizing this tool:

Monday, August 07, 2017

Ruminating on Telematics standards

While creating training material for our internal staff on telematics, we found the below site that gives a very good introduction for novices to the basics of telematics -

I had covered the business benefits of telematics in this blog post. One of the fundamental challenges for the widespread adoption of telematics has been the lack of standards around aggregating data from multiple TCU (Telematics Control Unit) vendors.

But now, the Association of Equipment Managers (AEM) and the Association of Equipment Management Professionals (AEMP) have created an ISO standard to help systems
from different manufacturers all speak the same language -

The AEMP 1.0 standard defined an XML data format that is available here.
An example of the AEMP REST API by John Deere is available here (in both XML and JSON formats) -

The newer version (v 2.0) of the AEMP telematics standard is now an ISO standard (ISO 15143-3).
An example of ISO 15143-3 telematics API (with XML data format) is here.

As part of this standard, equipment makers would report the following 19 data elements and 42 fault codes using a standard protocol, allowing mixed equipment fleets to be managed from a single application.

Data Elements
  1. Serial Number
  2. Asset ID
  3. Hours
  4. Location
  5. GPS Distance Traveled
  6. Machine Odometer
  7. Fault Codes
  8. Idle Time
  9. Fuel Consumption
  10. Fuel Level
  11. Engine Running Status (on/off)
  12. Switch Input Events
  13. PTO Hours
  14. Average Load Factor
  15. Max Speed
  16. Ambient Air Temp
  17. Load Counts
  18. Payload Totals
  19. Active Regen Hours

Fault Codes
  1. Engine coolant temperature
  2. Engine oil pressure
  3. Coolant level
  4. Engine oil temperature
  5. Hydraulic oil temperature
  6. Transmission oil temperature
  7. Engine overspeed
  8. Transmission oil pressure
  9. Water in fuel indicator
  10. Transmission oil filter, blocked
  11. Air cleaner
  12. Fuel supply pressure
  13. Air filter pressure drop
  14. Coolant system thermostat
  15. Crankcase pressure
  16. Cool water temp, outlet cooler
  17. Rail pressure system
  18. ECU temperature
  19. Axle oil temp, front
  20. Axle oil temp, rear
  21. Intake manifold temperature
  22. Secondary steering pressure
  23. After-treatment reagent internal filter heater
  24. Crankcase ventilation 
  25. Boost pressure
  26. Outgoing brake pressure status
  27. Brake pressure, output
  28. Brake pressure, output
  29. All wheel drive hydraulic filter, blocked
  30. All wheel drive hydraulic filter, blocked
  31. Brake pressure acc status
  32. Injection control pressure
  33. Brake circuit differential pressure
  34. Brake pressure acc charging
  35. Brake pressure actuator
  36. Reserve steering control pressure
  37. HVAC water temperature
  38. Rotor pump temperature
  39. Refrigerant temperature
  40. Alert and alarms help power down machines quickly
  41. Multiple pressure indicators
  42. Switching gear events
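For a sense of what consuming such a standard feed looks like, here is a small Python sketch that parses a hypothetical, simplified equipment snapshot. The element names below are illustrative only - loosely modeled on the AEMP/ISO 15143-3 style, not copied from the official schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified snapshot - NOT the official ISO 15143-3 schema.
SAMPLE = """
<Equipment>
  <EquipmentHeader>
    <Make>Acme</Make>
    <Model>X100</Model>
    <SerialNumber>ACX100-0042</SerialNumber>
  </EquipmentHeader>
  <CumulativeOperatingHours><Hour>1234.5</Hour></CumulativeOperatingHours>
  <Location><Latitude>51.5</Latitude><Longitude>-0.12</Longitude></Location>
</Equipment>
"""

def parse_snapshot(xml_text):
    """Extract a few of the standard data elements into a plain dict."""
    root = ET.fromstring(xml_text)
    return {
        "serial": root.findtext("EquipmentHeader/SerialNumber"),
        "hours": float(root.findtext("CumulativeOperatingHours/Hour")),
        "lat": float(root.findtext("Location/Latitude")),
        "lon": float(root.findtext("Location/Longitude")),
    }

print(parse_snapshot(SAMPLE))
```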

Monday, July 17, 2017

Performance benefits of HTTP/2

The HTTP/2 protocol brings a lot of performance benefits for modern web applications. To understand these benefits, it is important to first understand the limitations of the HTTP/1.1 protocol.

The HTTP/1.1 protocol only allows one 'outstanding' request per TCP connection - i.e. if a page is downloading 100 assets, only one request can be made per TCP connection at a time. Browsers typically open 4-8 TCP connections per page to load the page faster (parallel requests), but this results in network congestion.

The HTTP/2 protocol is fully multiplexed - i.e. it allows multiple parallel requests/responses on the same TCP connection. It also compresses HTTP headers (reducing payload size) and is a binary protocol (not textual). The protocol also allows servers to proactively push responses directly to the browser's cache, thus avoiding HTTP requests.

The following links give an excellent explanation of the HTTP/2 protocol and its benefits.

The following site shows the performance of HTTP/2 compared to HTTP/1.1 when loading tons of images from the server -

There is another site - which states that HTTPS is faster than HTTP, but that is only because the HTTPS version was using HTTP/2 by default in the background.

Sunday, July 16, 2017

Ruminating on DMN - a new modeling notation for business rules

We all know how the BPMN standard helped in the interoperability of business process definitions across lines of business and also across organizations. But there was one element missing in the BPMN standard - the ability to model business rules/decisions in a standard way so that they too can be interoperable.

Most of the current business rule engines (aka BRMS) follow a proprietary format for modeling rules, and migration from one BRMS suite to another is usually painful. Hence, we are pretty excited about DMN (Decision Model and Notation), a standard from OMG that can be considered complementary to BPMN.

Many rule modeling tools such as IBM Decision Composer, OpenRules and Drools already support the DMN standard. Part of the DMN standard is also the Friendly Enough Expression Language (FEEL), which defines a syntax for embedding expressions in rules.
An excellent tutorial about DMN using decision tables can be found here - A good video tutorial on IBM Decision Composer is here -

It is important to understand the difference between a decision modeling tool and a decision execution engine. A decision modeling tool gives a GUI to define the decision model using the DMN standard. This decision model can be exported as a *.dmn file (which is in XML format - example here). The decision execution engine is the actual runtime that executes the model. Most of the existing rule engines have extended their support to run rules defined in a DMN model. For example, the new IBM Decision Composer is a web-based rule modeling tool, but the modeled rules can be exported to run in the existing ODM engines.
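To make the decision-table idea concrete, here is a tiny Python sketch that evaluates a hypothetical discount table the way a DMN engine would under a UNIQUE hit policy (at most one rule may match). The rules and values are invented purely for illustration.

```python
# A hypothetical discount decision table. Each rule is a (condition, output)
# pair; the UNIQUE hit policy means at most one rule may match an input.
RULES = [
    (lambda ctype, size: ctype == "Business" and size >= 10, 0.15),
    (lambda ctype, size: ctype == "Business" and size < 10, 0.10),
    (lambda ctype, size: ctype == "Private", 0.05),
]

def decide_discount(customer_type, order_size):
    matches = [out for cond, out in RULES if cond(customer_type, order_size)]
    if len(matches) > 1:
        raise ValueError("UNIQUE hit policy violated: overlapping rules")
    return matches[0] if matches else None

print(decide_discount("Business", 12))  # 0.15
```

A real DMN engine would read these rules from the *.dmn XML file instead of hard-coding them, but the hit-policy semantics are the same.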

So in theory, the DMN model created by one tool can be used to execute the model in another tool. Some folks have tested this and noted down the challenges in this whitepaper - The Effectiveness of DMN Portability

Sunday, April 16, 2017

Ruminating on the single-threaded model of NodeJS and Node-RED

Many developers and architects have asked me questions on the single-threaded nature of NodeJS and whether we can use it effectively on multi-core machines. Since Node-RED is based on NodeJS, whatever is valid for NodeJS is also valid for Node-RED.

NodeJS has a single-thread-per-process model. When you start NodeJS, it starts a single process with one thread in it. Due to its non-blocking IO paradigm, it can easily handle multiple client requests concurrently, as no thread blocks on any IO operation.

Now the next question is around the optimal usage of multi-core machines. If you have 8 core or 16 core machines on your cloud and just run one NodeJS process on it, then you are obviously under-utilizing the resources you have paid for. In order to use all the cores effectively, we have the following options in NodeJS:

1) For compute-intensive tasks: NodeJS can fork out child processes for heavy-duty stuff - e.g. if you are processing images or video files, you can fork out child NodeJS processes from the parent process. Communication between parent and child processes can happen over IPC.

2) For scaling REST APIs: You can start multiple NodeJS processes on a single server - e.g. if you have 8 cores, start 8 NodeJS processes. Put a load balancer (e.g. nginx) in front of these processes. You would anyways have some kind of load-balancing setup in your production environment and the same can be leveraged.

3) Cluster support in newer versions of NodeJS: The latest versions of NodeJS have support for clusters. Cluster enables us to start multiple NodeJS processes that all share the same server port - e.g. 8 NodeJS processes all listening on port 80. This is the best OOTB option available today in NodeJS.

Hence it is indeed possible to effectively utilize NodeJS on multi-core machines. A good discussion on the same is available on StackOverflow here.
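The per-core pattern itself is language-agnostic. As a rough illustration (in Python using its multiprocessing module, rather than NodeJS), this is what "one worker process per core" looks like for CPU-bound work:

```python
import multiprocessing as mp

def square(n):
    # CPU-bound work handled inside a worker process.
    return n * n

def squares_in_parallel(nums):
    # One worker per core, mirroring "if you have 8 cores, start 8 processes".
    with mp.Pool(processes=mp.cpu_count()) as pool:
        return pool.map(square, nums)

print(squares_in_parallel([1, 2, 3, 4]))  # [1, 4, 9, 16]
```

In NodeJS, the cluster module (or child_process.fork) plays the role of the pool here, with IPC between parent and children.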

Another interesting question that is often asked is whether Java developers should jump ship and start development in NodeJS because of its speed and simplicity. Also, NodeJS evangelists keep harping about the fact that the single-threaded nature of Node removes the complexity of multi-threading, etc.

Here are my thoughts on this:

1) Today Java has first-class support for non-blocking IO, similar to NodeJS. Take a look at the superb open-source Netty library that powers all the cloud services at Apple.

2) Most of the complexity of multithreading is abstracted away by popular frameworks such as Spring - e.g. Spring Cloud enables developers to write highly scalable and modular distributed applications that are cloud-native without dealing with the complexities of multi-threading.

3) The Servlet 3.0 specification introduced async requests and the Spring REST MVC framework also supports non-blocking REST services as described here.  

Thus today, all the goodies of NodeJS are available on the Java platform, and there are plenty of Java OSS libraries to kick-start your development with all the necessary plumbing code.

Ruminating on a nifty tool called 'ngrok'

Quite often, developers want to test the APIs running on their development machine against a mobile app. To do so, developers typically do one of the following:

Option A) Create a Wifi hotspot on their development machine and connect the mobile phone to it. Since both the mobile phone and the development machine are on the same network, the mobile app can call the APIs running on the localhost of the developer machine.

Option B) Host the APIs on a public cloud (with a public IP) so that the mobile app can access them. For this, the developer needs to set up an automated CI/CD process that deploys the REST APIs on the middleware.

There is a third option, wherein the developer runs the API server on his local machine and just creates a proxy server on the internet that securely tunnels requests to the local machine through the firewall. This can be done using ngrok -

ngrok can be set up on your local machine within minutes and can serve as a great debugging tool. It has a built-in network sniffer that can be accessed at http://localhost:4040
We would highly recommend this tool for any mobile/API developer. 

Monday, February 13, 2017

IoT energy standards in Europe for Smart Home / Smart Grid

European governments are pretty energy conscious, with a number of initiatives in the EMEA region around smart energy grids. Jotting down some of the energy standards that are relevant in Europe.

In order to successfully build a smart grid system, it is important to have standards through which home appliances can publish information on how much energy they are using in real-time, which in turn can be provided to consumers and energy management systems to manage energy demands more effectively.

Smart appliances are also technically very heterogeneous, and hence there is a need for standardized interfaces. The standardization can happen at two levels - the communication protocol level or the message-structure level.
  • SAREF (Smart Appliances REFerence) is a standard for exchange of energy related information between home appliances and the third-party energy management systems. SAREF creates a new reference language for energy-related data. 
  • EEBus - A non-profit organization for interoperability in the sectors of smart energy, smart home & building. EEBus has defined a message-structure standard called SPINE (Smart Premises Interoperable Neutral-message Exchange). SPINE only defines the message structure at the application level (OSI Layer 7) and is completely independent of the transport protocol used. SPINE can be considered a technical realization of the SAREF ontology. 
  • Energy@Home - Another non-profit organization that is creating an ecosystem for the smart grid. 
  • SAREF4EE - The extension of SAREF for EEBus and Energy@Home initiatives. By using SAREF4EE, smart appliances from different manufacturers that support the EEBus or Energy@Home standards can easily communicate with each other using any energy management system at home or in the cloud.

A good presentation illustrating the current challenges in Smart Home is available here -

Thursday, December 22, 2016

Ruminating on AMQP internals and JMS equivalent semantics

Many Java developers often use JMS APIs to communicate with the message broker. The JMS API abstracts away the internal implementation complexities of the message broker and provides a unified interface for the developer.

But if you are interested in understanding the internals of AMQP, then the following old tutorial on the Spring site is still the best -

By understanding the core concepts of exchange, queue and binding keys, you can envisage multiple integration patterns such as pub/sub, req/reply, etc.

Jotting down snippets from the above sites on the similarities between JMS and AMQP.

JMS has queues and topics. A message sent on a JMS queue is consumed by no more than one client. A message sent on a JMS topic may be consumed by multiple consumers. AMQP only has queues. AMQP producers don't publish directly to queues. A message is published to an exchange, which through its bindings may get sent to one queue or multiple queues, effectively emulating JMS queues and topics.

AMQP has exchanges, routes, and queues. Messages are first published to exchanges with a routing key. Routes define on which queue(s) to pipe the message. Consumers subscribing to that queue then receive a copy of the message. If more than one consumer subscribes to the same queue, the messages are dispensed in a round-robin fashion.

It is very important to understand the difference between a routing key and a binding key. Publishers publish messages to an AMQP exchange with a routing key, which is typically of the form {company}.{product-category}.{appliance-type}.{appliance-id}.....

You create a queue on AMQP by specifying a binding key. E.g. if your binding key is {company}.{product-category}.# then all messages for that product category would come to this queue.
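The wildcard semantics of binding keys ('*' matches exactly one word, '#' matches zero or more words) can be sketched in a few lines of Python. This is an illustrative re-implementation of the matching logic, not code from any AMQP broker:

```python
def binding_matches(binding_key, routing_key):
    """AMQP topic-exchange matching: '*' matches exactly one dot-separated
    word, '#' matches zero or more words."""
    def match(b, r):
        if not b:
            return not r                     # both exhausted -> match
        if b[0] == "#":
            # '#' may swallow zero or more words of the routing key.
            return any(match(b[1:], r[i:]) for i in range(len(r) + 1))
        if not r:
            return False
        if b[0] == "*" or b[0] == r[0]:
            return match(b[1:], r[1:])
        return False
    return match(binding_key.split("."), routing_key.split("."))

print(binding_matches("acme.appliances.#", "acme.appliances.dishwasher.d42"))  # True
```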

While creating subscribers, you have two options - either bind to an existing queue (by giving the queue name) or create a subscriber private queue by specifying a binding key. 

Thursday, December 01, 2016

Ruminating on non-blocking REST services

In a typical REST service environment, a thread is allocated to each incoming HTTP request for processing. So if you have configured your container to start 50 threads, then you can handle 50 concurrent HTTP requests and any additional HTTP request would be queued till a thread is free.

Today, using the principles of non-blocking IO and reactive programming, we can break the tight coupling between a thread and a web request. The Servlet-3 specification also supports async requests, as explained in this article - The core idea is to delegate the long-running or asynchronous processing to another background thread (Task Executor), so that the HTTP handler threads are not starved.

One might argue that we are just moving the 'blocking thread' bottleneck from the HTTP threads to the backend thread pool (Task Executors). But this does result in better performance, as we can serve more HTTP clients.
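As a rough illustration of this hand-off (sketched in Python rather than Java), the request handler submits the slow call to a small "Task Executor" pool and returns a future immediately, leaving the handler thread free:

```python
from concurrent.futures import ThreadPoolExecutor
import time

executor = ThreadPoolExecutor(max_workers=4)   # the "Task Executor" pool

def slow_backend_call(order_id):
    time.sleep(0.05)                 # stands in for a long-running backend call
    return f"order {order_id} processed"

def handle_request(order_id):
    # The "HTTP thread" submits the work and is free again immediately;
    # the response is completed later from the future.
    return executor.submit(slow_backend_call, order_id)

futures = [handle_request(i) for i in range(8)]
print([f.result() for f in futures])
```

This is exactly the shape of a Servlet-3 async request: accept, delegate, release the container thread, complete later.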

The Spring MVC documentation on Async REST MVC is worth a perusal to understand the main concepts -

A good article that demonstrates how Spring MVC can be used to build non-blocking REST services that call a backend service exposed via JMS - All the source code for this example is available here. One must understand the basics of the Spring Integration framework to work with the above example.

The following discussion thread on Stack Overflow would also give a good idea of how to implement this -

The following blog-post shows another example using Spring Boot on Docker -

Concurrency and scaling strategies for MDPs and MDBs

Message Driven Beans offer a lot of advantages over a standalone JMS consumer as listed in this blog-post. The Spring framework provides us with another lightweight alternative called MDP (Message Driven POJOs) that has all the goodies of MDB, but without any heavy JEE server baggage.

The strategy for implementing a scalable and load-balanced message consumption solution is a bit different in MDBs vs. MDPs.

An MDB is managed by the JEE container and is 'thread-safe' by default. The JEE container maintains a pool of MDBs and allows only one thread to execute an MDB at one time. Thus if you configure your JEE container to spawn 10 MDBs, you can have 10 JMS consumers processing messages concurrently.

Spring MDPs are typically managed by the DMLC (DefaultMessageListenerContainer). Each MDP is typically a singleton, but can have multiple threads running through it. Hence an MDP is NOT 'thread-safe' by default and we have to make sure that our MDPs are stateless - e.g. do not have instance variables, etc. The DMLC can be configured for min/max concurrent consumers to dynamically scale the number of consumers. All connection, session and consumer objects are cached by Spring to improve performance. Jotting down some important stuff to remember regarding DMLC from the Spring Java Docs.

"On startup, DMLC obtains a fixed number of JMS Sessions to invoke the listener, and optionally allows for dynamic adaptation at runtime (up to a maximum number). Actual MessageListener execution happens in asynchronous work units which are created through Spring's TaskExecutor abstraction. By default, the specified number of invoker tasks will be created on startup, according to the "concurrentConsumers" setting."

It is also possible in Spring to create a pool of MDPs and give one to each TaskExecutor, but we have never tested this and never felt the need for this. Making your MDPs stateless and hence 'thread-safe' would suffice for almost all business needs. Nevertheless, if you want to create a pool of MDP objects, then this link would help -

Another good article on Spring MDP scaling that is worth perusing is here -

Implementing a Request Response with Messaging (JMS)

Quite often, we need to implement a request/response paradigm on JMS - e.g. calling a backend function on a mainframe or a third-party interface that has an AMQP endpoint. To implement this, we have the following 3 options:

Option 1: The client creates a temporary queue and embeds the name-address of this queue in the message header (JMSReplyTo message header) before sending it to the server. The server processes the request message and writes the response message on to the temp queue mentioned in the JMSReplyTo header.
The advantage of this approach is that the server does not need to know the destination of the client response message in advance.
An important point to remember is that a temporary queue is only valid till the connection object is open. Temporary queues are generally light-weight, but it depends on the implementation of the MOM (message-oriented middleware).

 Option 2: Create a permanent queue for response messages from the server. The client sets a correlation ID on the message before it is sent to the server. The client then listens on the response queue using a JMS selector - using the correlation ID header value as the selector property. This ensures that only the appropriate message from the response queue is delivered to the appropriate client.

Option 3: Use a combination of Option 1 and Option 2.  Each client creates a temporary queue and JMS Consumer on startup. We need to use this option if the client does not block for each request. So the client can keep on sending messages and listen to multiple response messages on the temp queue and then match the req/res using the correlation ID.
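Here is a minimal Python simulation of Option 3 - in-memory queues stand in for the real broker's request queue and the client's temporary queue, and the correlation ID is the only thing that ties a reply back to its request:

```python
import queue
import uuid

# In-memory stand-ins for the broker's request queue and the client's
# temporary reply queue (the one named in JMSReplyTo).
request_q = queue.Queue()
reply_q = queue.Queue()

def client_send(payload):
    corr_id = str(uuid.uuid4())
    request_q.put({"corr_id": corr_id, "body": payload})
    return corr_id                       # client remembers the outstanding ID

def server_process_one():
    msg = request_q.get()
    # The "server" copies the correlation ID onto its reply.
    reply_q.put({"corr_id": msg["corr_id"], "body": msg["body"].upper()})

def client_collect(pending):
    # Match replies back to outstanding requests by correlation ID.
    results = {}
    while len(results) < len(pending):
        reply = reply_q.get()
        if reply["corr_id"] in pending:
            results[reply["corr_id"]] = reply["body"]
    return results

ids = [client_send(p) for p in ("ping", "pong")]
server_process_one()
server_process_one()
print(client_collect(set(ids)))
```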

Spring has implemented this design pattern in the JMS Outbound Gateway class -

The JMS spec also contains an API for basic req/res using temporary queues - QueueRequestor and TopicRequestor. So if your MOM supports JMS, then you can use this basic implementation if it suffices your needs.

The following links would serve as a good read on this topic.

Monday, November 14, 2016

Ruminating on AMQP vs MQTT

While working on IoT projects, a lot of our customers ask us to recommend an open protocol to be used between the remote devices and the cloud-based IoT platform - e.g. MQTT or AMQP.

First and foremost, it is important to note that both MQTT and AMQP operate on top of TCP/IP and hence can use TLS to encrypt the transport layer. Jotting down some of the important characteristics of each protocol to help you make a decision.

MQTT:
  • Very low resource (CPU/memory) footprint - suitable for constrained IoT devices. 
  • Support for the pub-sub model (topics), but NO support for peer-to-peer messaging (queues). 
  • Only supports a binary message format, so your JSON strings need to be converted to bytes by your MQTT client (be cautious about encoding/decoding, especially UTF).
  • Does not support advanced messaging features such as transactions, custom message headers, expiration, correlation ID, etc. 

AMQP:
  • Was designed for interoperability between different messaging middleware (MOM). 
  • Supports advanced features such as reliable queuing, transactions, etc.
  • A lot of fine-grained control available over all the features. 
  • Useful for preventing vendor lock-in. 
Final verdict (recommendation) - Use MQTT for IoT applications and use AMQP for enterprise messaging.

Some good reading links provided below:

Monday, October 24, 2016

Printing an object's properties for debug purposes

Quite often, we override the 'toString()' method of an object to print its properties. But there are a number of reusable utilities that can be used for the same.

In Java, the popular Apache Commons Lang3 package contains a class called ReflectionToStringBuilder that has static overloaded 'toString()' methods to print all fields using reflection. You can also specify a few formatting options and the fields to exclude.

Another option in Java is to use a JSON library to print the object as a JSON string. The Google library Gson can be used to print (or pretty-print) an object as a JSON string.

In .NET, you can either use the ObjectDumper library or a JSON formatter such as the Newtonsoft JSON library.
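For comparison, the same reflection-style dump takes only a few lines in Python - vars() plays the role of reflection and the json module does the formatting. The Customer class below is just a made-up example:

```python
import json

class Customer:
    def __init__(self):
        self.name = "Jane"
        self.city = "Pune"
        self.orders = [101, 102]

def dump(obj, exclude=()):
    # Pull the instance fields via reflection, skip excluded ones,
    # and pretty-print the rest as JSON.
    fields = {k: v for k, v in vars(obj).items() if k not in exclude}
    return json.dumps(fields, indent=2, default=str)

print(dump(Customer(), exclude=("orders",)))
```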

Monday, October 17, 2016

Ruminating on staffing for 24*7 support

Many production applications need 24*7 support. So how many resources do we need to support an application round the clock? The below simple calculation can help - 

Assume each resource works for 40 hours in a week. 
In one week, you have to cover 24*7 = 168 hours / week.
So total FTE = 168/40 = 4.2 FTE

To accommodate vacation and sick leave, we can round this up to 5 FTEs for 24*7 support. 
The typical shift timings are 7am-4pm, 3pm-12 midnight, and 11pm-8am. This ensures that there is a 1-hour overlap between each shift. 
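The same arithmetic as a tiny Python sketch:

```python
import math

coverage_hours = 24 * 7      # 168 hours/week to be covered
fte_hours = 40               # one resource's working week
raw_fte = coverage_hours / fte_hours
staff = math.ceil(raw_fte)   # round up to whole people (vacation/sick buffer)
print(raw_fte, staff)        # 4.2 5
```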

Wednesday, August 17, 2016

Load Testing Tools for IoT platforms over MQTT

While IoT platforms can be tested using traditional load testing tools if standard protocols such as HTTP are used, there is a suite of other tools that can be used if you need to test the MQTT throughput capacity of your IoT platform.

Jotting down a list of tools that can be used to test the MQTT broker of IoT platforms.

Thursday, August 11, 2016

Ruminating on MQTT

MQTT is a lightweight messaging protocol over TCP/IP that supports the publish-subscribe paradigm. It is most suited for low bandwidth / high latency and unreliable networks and hence is a natural fit for field IoT devices.  A good MQTT primer is available here.

MQTT was the de-facto protocol for all our IoT applications, but of late we have started experimenting with MQTT even for mobile apps, after we learned that Facebook Messenger app uses MQTT :)

MQTT sessions can survive across TCP re-connects and are thus very useful in unreliable network conditions. Also, in MQTT you can specify the QoS level - e.g.
  • Fire and forget (QoS 0)
  • At least once (QoS 1)
  • Exactly once (QoS 2)
It is very important to check if the MQTT broker we choose supports the required QoS levels. 
MQTT supports a hierarchy of topics, so you can subscribe to a top level topic and get all the messages to the subscriber. 
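The topic-hierarchy matching rules ('+' matches one level, a trailing '#' matches that level and everything below it) can be sketched in Python. This is an illustrative re-implementation of the matching logic, not code from any MQTT broker:

```python
def topic_matches(topic_filter, topic):
    """MQTT-style topic filter matching: '+' matches exactly one level,
    a trailing '#' matches that level and everything below it."""
    f, t = topic_filter.split("/"), topic.split("/")
    for i, part in enumerate(f):
        if part == "#":
            return True                  # '#' is last; matches the rest
        if i >= len(t):
            return False                 # topic is shorter than the filter
        if part != "+" and part != t[i]:
            return False
    return len(f) == len(t)              # no wildcard left; lengths must agree

print(topic_matches("home/+/temperature", "home/kitchen/temperature"))  # True
```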

Most of the popular message brokers such as ActiveMQ, RabbitMQ and HiveMQ already support MQTT. A good comparison of the various MQTT brokers is available here -

A performance benchmark of MQTT brokers is available here -

Sunday, July 24, 2016

Cool illustrated guide to Kubernetes

If you want to understand the magic of Kubernetes and how it can be used to manage Docker containers at a high level, then the following illustrated guide is awesome :)

Cool DevOps Tools - Gerrit and Let's Chat

I would recommend all teams to leverage the following open source tools to add more juice to their DevOps operations and improve team collaboration.

Gerrit - A valuable web-based code review tool that comes with Git embedded. Can be very useful to help your junior team-mates learn about good coding practices and refactoring. A good introduction video is here -

Let's Chat - Digital Natives don't like to write long emails and abhor email chains. Use this on-premise hosted web-based chat server to create discussion rooms and share knowledge and get questions answered.

Scaling Node-RED horizontally for high volume IoT event processing

We were pretty impressed with the ease of visual programming in Node-RED. Our productivity in prototyping actually increased by 40-50% using Node-RED. We used Node-RED both on the gateways as well as the server for sensor event processing.

But we were not sure if Node-RED can be used to ingest and process a large volume of events - i.e. thousands of events/sec. I posted the question on the Google Groups Node-RED forum and got interesting answers. Jotting down the various options below.

  1. If your input is over HTTP, then you can use any of the standard load-balancing techniques to load balance requests over a cluster of nodes running the same Node-RED flow - e.g. one can use HAProxy, Nginx, etc. It is important to note that since we are running the same flow over many nodes, we cannot store any state in context variables. We have to store state in an external service such as Redis. 
  2. If you are ingesting over MQTT, then we have multiple options:
    • Option A: Let each flow listen to a different topic. You can have different gateways publish to different topics on the MQTT broker - e.g. flow instance 1 subscribes to device/a/#, flow instance 2 subscribes to device/b/#, and so on.
    • Option B: Some MQTT brokers support the concept of 'Shared Subscriptions' (HiveMQ) that is equivalent to point-to-point messaging - i.e. each consumer in a subscription group gets a message and the broker load-balances using round-robin. A good explanation of how to enable this using HiveMQ is given here - The good thing about the HiveMQ support for load-balancing consumers is that no change is required in the consumer code. You can continue using any MQTT consumer - only the topic URL would change :)
    • Option C: Put a simple Node-RED flow in front for message ingestion that reads the payload and makes an HTTP request to a cluster of load-balanced Node-RED flows (similar to option 1 above).
    • Option D: This is an extension of Option C and entails creating a buffer between message ingestion and message processing using Apache Kafka. We ingest the message from devices over MQTT, extract the payload and post it on a Kafka topic. Kafka can support a message-queue paradigm using the concept of consumer groups. Thus we can have multiple Node-RED flow instances subscribing to the Kafka topic using the same consumer group. This option also makes sense if your message broker does not support load-balancing consumers. 

Thus, leveraging the above options we can scale Node-RED horizontally to handle a huge volume of events. 

Wednesday, July 20, 2016

Extending SonarQube with custom rules

SonarQube has today become our de-facto standard for code analysis. We also use it in our migration projects, where we define custom rules to check if the current application can be ported to the new technology stack.

The below links give a good overview of writing custom rules in SonarQube for Java, .NET and JS.

1. Custom Rules in Java
2. Custom Rules in .NET - using the Roslyn analyzer.
3. Custom Rules in JavaScript 

By leveraging the code templates and SDKs given by these tools, it is easy to create new custom rules. Behind the scenes, the analysers first create a syntax tree of the code; then, for each rule, a visitor design pattern is applied to run through all the nodes and apply the check/business logic.
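The visitor idea is easy to demonstrate with Python's own ast module - below is a made-up rule that flags functions with too many parameters, purely to illustrate how such analysers walk the syntax tree (real SonarQube rules use the SonarQube SDKs, not this code):

```python
import ast

class LongParameterListRule(ast.NodeVisitor):
    """Hypothetical rule: flag any function with more than MAX_PARAMS params."""
    MAX_PARAMS = 3

    def __init__(self):
        self.issues = []

    def visit_FunctionDef(self, node):
        # The visitor is called once per function-definition node in the tree.
        if len(node.args.args) > self.MAX_PARAMS:
            self.issues.append(
                f"line {node.lineno}: '{node.name}' has too many parameters")
        self.generic_visit(node)         # keep walking nested nodes

SOURCE = "def ok(a, b): pass\ndef bad(a, b, c, d, e): pass\n"
rule = LongParameterListRule()
rule.visit(ast.parse(SOURCE))
print(rule.issues)
```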

After doing the analysis, it is also possible to auto-remediate / refactor the source code using predefined rules. The following open source tools can be used for auto-remediation.

Friday, July 01, 2016

Ruminating on serverless execution environments

For the past few months, I have been closely watching the serverless execution trends in the industry. I find the whole concept of writing serverless code on the cloud extremely exciting and a great paradigm shift.

Especially for mobile and IoT apps, I think the below serverless execution environments hold great promise. Developers don't have to worry about provisioning servers or horizontal scaling of their apps - everything is seamlessly handled by the cloud. And you only pay when your code is invoked!
  1. IBM OpenWhisk -
  2. Azure Functions -
  3. Google Cloud Functions -
  4. AWS Lambda -
It is also interesting to note that IBM has made the OpenWhisk platform open source under the Apache 2 license. The entire source code is available here -
A good article explaining the underlying components of OpenWhisk is available here

Design Patterns for Legacy Migration and Digital Modernization

While designing the approach for any legacy migration, the following design patterns crafted by Martin Fowler can be very helpful.

Instead of a rip-and-replace approach to legacy modernization, the gist of the above patterns is to slowly build a new system around the edges of the old one. To do this, we leverage event-driven architecture paradigms to capture inbound events to the old system and route these events to the new system. This is done incrementally till we can kill the old system. 

Having been in the architecture field for over a decade, I have realized that 'current state' and 'future state' architectures are just temporal states of reality!

It's impossible to predict the future; we can only be prepared for the future by designing our systems to be modular and highly flexible to change. Build an architecture that can evolve with time and be future-ready and not try to be future-proof. 

Another humble realization is that - the code we are writing today; is nothing but the legacy code of tomorrow :)
And in today's fast-paced world, systems become 'legacy' within a short period of time. Legacy need not just mean a 50-year-old mainframe program. Even a monolithic Java application can be considered legacy. Gartner now defines legacy as any system that is not sufficiently flexible to meet the changing demands of the business. 

Thursday, June 23, 2016

Business benefits of Telematics

Telematics can provide tremendous value to OEMs and Tier-1 vendors in improving the quality of their products and also delivering a superior customer experience.

Jotting down my thoughts on the business value of implementing telematics.

1. Predictive Maintenance - Every organization wants to reduce the maintenance costs associated with sudden unexpected failure of components in a vehicle. The downtime that occurs due to a failure can result in huge losses for all parties involved. Thus, it is imperative to make the paradigm shift from reactive maintenance to preventive/predictive maintenance.
Telematics would provide organizations with the ability to discover issues before they cause downtime and take the appropriate proactive steps to reduce costs. Various machine learning techniques are used to identify patterns.
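One of the simplest such techniques is statistical anomaly detection on sensor readings. The sketch below flags readings more than three standard deviations from the mean; the 3-sigma threshold and the temperature data are illustrative assumptions, not a production model.

```python
# Minimal z-score anomaly check on a batch of sensor readings.
# The 3-sigma threshold is an illustrative assumption.
from statistics import mean, stdev

def find_anomalies(readings, threshold=3.0):
    """Return readings more than `threshold` std-devs from the mean."""
    mu, sigma = mean(readings), stdev(readings)
    if sigma == 0:
        return []
    return [x for x in readings if abs(x - mu) / sigma > threshold]
```

In a real predictive-maintenance pipeline this would be replaced by trained models, but the feedback loop - collect telemetry, detect deviation, act before failure - is the same.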

2. Improve Product Quality - The insights gathered from telematics can be used as a feedback loop for product development teams. It would also help the management prioritize R&D investments in the appropriate areas.

3. Optimize Warranty Costs - Telematics data can provide visibility into anticipated component/part recalls, helping forecast warranty reserves. Using the power of analytics on telematics data, we can also identify suspicious warranty claims and effectively structure future warranty contracts.


Friday, June 10, 2016

Ruminating on LoRa technology

LoRa (Long Range) is a wireless technology developed for the Internet of Things. It enables long-range data communications with very low power requirements - e.g. a transmitter battery can last for 10 years and communicate with nodes over 15-20 km!

LoRa technology was initially developed by Semtech, but has now become a standard and is being further developed by the LoRa Alliance.

A good tutorial to get the basics of LoRa is available here -

Most IoT applications also need only to exchange small data packets with low throughput. LoRa is designed for such low data-rate connectivity. 

Monday, May 30, 2016

Ruminating on IoT datastores

The most popular data-store choice for storing high volumes of IoT sensor data is a NoSQL time-series database.

The following link contains a good list of NoSQL time-series databases that can be used in an IoT project. We have worked with both OpenTSDB and KairosDB and found both to be enterprise-grade.
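As an illustration of how simple ingestion into such a store can be, KairosDB accepts datapoints as JSON posted to its `/api/v1/datapoints` REST endpoint. The sketch below only builds the payload (no network call); the metric name and tags are hypothetical examples.

```python
# Build a KairosDB-style datapoint payload (for POST /api/v1/datapoints).
# The metric name and tags below are hypothetical.
import json
import time

def build_datapoints(metric, value, tags):
    ts_millis = int(time.time() * 1000)  # KairosDB timestamps are in millis
    return [{
        "name": metric,
        "datapoints": [[ts_millis, value]],
        "tags": tags,  # KairosDB requires at least one tag per metric
    }]

payload = json.dumps(build_datapoints("engine.temp", 92.5, {"vin": "TEST123"}))
```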

Tuesday, May 24, 2016

Ruminating on Power BI

We were building our dashboard using Power BI and were looking at the various options available to refresh the data.

The following link would give a good overview of the various data refresh options -

It is also possible to pump in live streaming data to Power BI using its REST APIs -
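The push pattern boils down to POSTing a JSON body of rows to the dataset's `rows` endpoint. The sketch below only constructs the URL and body; the dataset ID, table name and row fields are placeholders, and a real call would also need an Azure AD bearer token in the `Authorization` header.

```python
# Sketch of a Power BI "push rows" request (URL + JSON body only).
# dataset_id and table name are hypothetical placeholders.
import json

def build_push_request(dataset_id, table, rows):
    url = (f"https://api.powerbi.com/v1.0/myorg/datasets/"
           f"{dataset_id}/tables/{table}/rows")
    body = json.dumps({"rows": rows})
    return url, body

url, body = build_push_request(
    "abcd-1234", "Telemetry",
    [{"ts": "2016-05-24T10:00:00Z", "temp": 21.5}],
)
```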

But we were a bit concerned about the dataset size limit of 10 GB in the Power BI Pro version. An excellent article by Reza Rad discusses this -

Essentially in Power BI, you have two options - either import the entire dataset into memory OR establish a live connection between Power BI and your data-source.

Power BI uses some nifty compression techniques on all data that is imported into it - Reza observed an 800 MB file compress down to an 8 MB Power BI file. Hence, for all practical purposes, a 10 GB limit should suffice for most use-cases.
In case you are working with large volumes of data (GB, TB, PB), then a live connection with the data-source is the only option.

Some snippets from Reza's article:
"Live connection won’t import data into the model in Power BI. Live connection brings the metadata and data structure into Power BI, and then you can visualize data based on that. With every visualization, a query will be sent to the data source and brings the response.

Limitations of Live Connection - 
1. With Live connection, there won’t be any Data tab in Power BI to create calculated measures, columns or tables. You have to create all calculations at the data source level.
2. Multiple Data Sources is not supported.
3. No Power Q&A
4. Power Query still is available with Live Connection. This gives you ability to join tables, flatten them if you require, apply data transformation and prepare the data as you want. Power Query can also set the data types in a way that be more familiar for the Power BI model to understand.
5. You need to do proper index and query optimization at data-source."

Monday, May 23, 2016

Ruminating on API Key Security

Kristopher Sandoval has written an excellent blog post on the prevalent usage of using API keys to secure your APIs.

We must not rely solely on API keys to secure our APIs, but rather use open standards such as OAuth 2, OpenID Connect, etc. to secure access to them. Many developers use insecure methods of storing API keys in mobile apps or push the API key to GitHub.

Snippets from the article -

"Most developers utilize API Keys as a method of authentication or authorization, but the API Key was only ever meant to serve as identification.
API Keys are best for two things: identification and analytics (API metrics).

If an API is limited specifically in functionality where “read” is the only possible command, an API Key can be an adequate solution. Without the need to edit, modify, or delete, security is a lower concern."
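That split - API key for identification and analytics, a real token for authorization - can be sketched as a toy gateway check. Everything here is hypothetical: the key registry, the token set (a stand-in for real OAuth 2 token validation or introspection) and the read/write policy.

```python
# Toy gateway check: the API key only identifies the caller (for metrics);
# writes require a separate access token. The token check below is a
# placeholder for real OAuth 2 / OpenID Connect validation.
KNOWN_KEYS = {"key-123": "acme-mobile-app"}   # hypothetical client registry
VALID_TOKENS = {"tok-abc"}                    # stand-in for token introspection

metrics = {}

def handle_request(api_key, method, token=None):
    client = KNOWN_KEYS.get(api_key)
    if client is None:
        return 401, "unknown API key"
    metrics[client] = metrics.get(client, 0) + 1  # identification + analytics
    if method == "GET":
        return 200, "ok"                          # read-only can be key-only
    if token in VALID_TOKENS:
        return 200, "ok"
    return 403, "writes require an OAuth access token"
```

The key never grants write access on its own; it only tells you *who* is calling, which is exactly the "identification and analytics" role the article describes.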

Another great article by NordicAPIs is on the core concepts of Authentication, Authorization, Federation and Delegation -
The next article demonstrates how these 4 core concepts can be implemented using OAuth and OpenID Connect protocols -

Serverless Options for Mobile Apps

A lot of MBaaS platforms today provide mobile developers with tools that enable them to quickly roll out mobile apps without worrying about the backend.
In a traditional development project, we would first have to build the backend storage DB, develop the APIs and then build the mobile app.

But if you are looking for quick go-to-market approach, then you can use the following options:

  • Google Firebase Platform - Developers can use the Firebase SDK and directly work with JSON objects. All data would be synced with the server automatically. No need to write any server-side code. REST APIs are also available to access data from the server for other purposes. 
  • AWS MBaaS: AWS Mobile SDK provides libraries for working with DynamoDB (AWS NoSQL Store). The developer just uses the DynamoDB object mapper to map objects to table columns. Again no need to write server-side code and everything is handled automatically. 
  • Other open source MBaaS platforms such as BaasBox, Convertigo, etc. 
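Firebase also exposes the same data tree over plain REST - append `.json` to any path under the project URL. The sketch below only builds the request URL and body (no network call); the project URL and record are hypothetical.

```python
# Build a Firebase Realtime Database REST request (no SDK, no server code).
# The project URL and record below are hypothetical.
import json

def build_firebase_put(base_url, path, record):
    url = f"{base_url}/{path}.json"  # Firebase REST: append .json to the path
    return url, json.dumps(record)

url, body = build_firebase_put(
    "https://demo-app.firebaseio.com", "users/u1", {"name": "Asha"}
)
```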

Open Source API Management Tools

For folks who are interested in setting up their own API Management tools, given below are a few options:

HTTP proxy tools for capturing network traffic

In the past, we had used tools such as Fiddler and Wireshark to analyse the network traffic between clients and servers. But these tools need to be installed on the machine, and within corporate networks this would entail obtaining proper InfoSec approvals.

If you are looking for a nifty network traffic capture tool that does not need installation - then 'TcpCatcher' is a good option. It is a simple jar file that can run on any machine with Java installed.

Whenever we are using such proxy tools, we have two options -
1. Change the client to point to the IP of the tool instead of the server. The tool then forwards the request to the server. (Explicit man-in-the-middle)
2. Configure the tool's IP as a proxy in your browser. (Implicit man-in-the-middle)
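For option 2, any HTTP client can be pointed at the intercepting proxy. A minimal sketch using Python's standard library is below; `127.0.0.1:8080` is an assumed listener address (it happens to be Burp's default), so adjust it to wherever your proxy tool is running.

```python
# Point a Python urllib client at a local intercepting proxy.
# 127.0.0.1:8080 is an assumed listener address (Burp's default).
import urllib.request

proxy = urllib.request.ProxyHandler({
    "http": "http://127.0.0.1:8080",
    "https": "http://127.0.0.1:8080",
})
opener = urllib.request.build_opener(proxy)
# opener.open("https://example.com")  # requests now flow via the proxy
```

For HTTPS interception you would additionally need to trust the proxy's root certificate, as described later for Burp.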

Update: 25May2016
The TcpCatcher jar tool started behaving strangely today with an alert stating - "This version of TcpCatcher has expired. Please download the latest version". We had the latest version, so this looks like a bug in the tool.

We moved on to use Burp Suite free edition. This tool is also available as a jar file and can run on any machine having Java. There is an excellent article by Oleg Nikiforov that explains how to set up the Burp proxy and use it to intercept all HTTP requests. You can also download their root certificate and install it on your machine or mobile phone to log all HTTPS traffic.
We could set up Burp in under 20 minutes to monitor all HTTPS traffic between our mobile apps and APIs.