Wednesday, July 18, 2018

Creating a RabbitMQ pipeline using listeners/publishers

In event-driven architectures, you often have to create a pipeline of event processing. One of my teams was using the Spring AMQP library and wanted to implement the following basic steps.
1. Read a message from the queue.
2. Do some processing and transform the message.
3. Publish the message downstream.

The sample code given below will help developers in implementing this.

Another scenario is multi-threaded code that publishes messages to RabbitMQ; here you can use the RabbitTemplate with channel caching. Sample code is given below.
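A minimal sketch of the listen/transform/publish pipeline and a channel-caching template using Spring AMQP. The queue, exchange and routing-key names ("orders.in", "orders.exchange", "orders.out") and the uppercase transform are illustrative assumptions:

```java
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.stereotype.Component;

@Component
public class OrderPipeline {

    private final RabbitTemplate rabbitTemplate;

    public OrderPipeline(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    // 1. Read a message from the queue.
    @RabbitListener(queues = "orders.in")
    public void onMessage(String payload) {
        // 2. Do some processing and transform the message.
        String transformed = payload.toUpperCase();
        // 3. Publish the transformed message downstream.
        rabbitTemplate.convertAndSend("orders.exchange", "orders.out", transformed);
    }

    // For multi-threaded publishers, back the template with a
    // CachingConnectionFactory so channels are cached and reused safely.
    public static RabbitTemplate pooledTemplate() {
        CachingConnectionFactory cf = new CachingConnectionFactory("localhost");
        cf.setChannelCacheSize(25); // number of channels kept cached
        return new RabbitTemplate(cf);
    }
}
```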

Wednesday, July 04, 2018

Wednesday, June 27, 2018

Simple multi-threading code

Very often, we have to process a list of objects. Using a for-loop would process these objects in sequence. But if we want to process them in parallel, then the following code snippet will help.
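A self-contained sketch using an ExecutorService; the `toUpperCase()` call stands in for whatever per-item processing you need:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;

public class ParallelProcessor {

    // Submits each item to a fixed-size thread pool and waits for all results,
    // preserving input order. toUpperCase() stands in for real processing.
    public static List<String> processAll(List<String> items) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = items.stream()
                    .map(item -> pool.submit(() -> item.toUpperCase()))
                    .collect(Collectors.toList());

            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // blocks until that task completes
            }
            return results;
        } finally {
            pool.shutdown(); // release the pool's threads once work is done
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(List.of("one", "two", "three")));
    }
}
```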

Simple Utils for File & Stream IO

I still see many developers reinventing the wheel when it comes to IO/Stream operations. There are so many open-source libraries for doing this for you today :)

Given below are some code snippets for the most common use-cases for IO.
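For most of these use-cases, even the JDK's own `java.nio.file.Files` (Java 11+) is enough, without any third-party library — a few common one-liners:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

public class IoUtils {

    // Read a whole file into a String.
    public static String readFile(Path path) throws IOException {
        return Files.readString(path, StandardCharsets.UTF_8);
    }

    // Write a String to a file (creating or overwriting it).
    public static void writeFile(Path path, String content) throws IOException {
        Files.writeString(path, content, StandardCharsets.UTF_8);
    }

    // Copy an InputStream to a file.
    public static void copy(InputStream in, Path target) throws IOException {
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Read all lines of a file into a list.
    public static List<String> readLines(Path path) throws IOException {
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }
}
```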

Wednesday, May 23, 2018

Ruminating on Agile estimates

Over the past few years, we have been using story points for estimation, rather than using man-hours.
For a quick introduction to agile estimation, please peruse the following links that give a good overview.

Still, many folks struggle to understand the advantages of estimating with story points. During the planning poker session, all team members discuss each story and arrive at its story-point estimate through consensus. Thus each team member has skin in the game and is involved in the estimation process.

The time needed to complete a story point will vary based on a developer’s level of experience, but the amount of work is correctly estimated using story points.

IMHO, velocity should be calculated only after 2-3 sprints. This average velocity (#story-points/sprint) can be used to estimate the calendar timelines for the project.

Thursday, March 29, 2018

Cool Java client library for secure FTP

My team was looking for a library to PUT and GET files from an SFTP server. We used the cool Java library called JSch.

Sample code to download and upload files to the SFTP server is given below.
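A minimal sketch with JSch; the host, credentials and file paths are illustrative assumptions:

```java
import com.jcraft.jsch.Channel;
import com.jcraft.jsch.ChannelSftp;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.Session;

public class SftpClient {

    // Uploads and downloads a file over SFTP using JSch.
    public static void transfer() throws Exception {
        JSch jsch = new JSch();
        Session session = jsch.getSession("user", "sftp.example.com", 22);
        session.setPassword("secret");
        // Demo only - in production, verify host keys instead of disabling the check.
        session.setConfig("StrictHostKeyChecking", "no");
        session.connect();

        Channel channel = session.openChannel("sftp");
        channel.connect();
        ChannelSftp sftp = (ChannelSftp) channel;
        try {
            sftp.put("/local/report.csv", "/remote/report.csv"); // upload
            sftp.get("/remote/data.csv", "/local/data.csv");     // download
        } finally {
            sftp.disconnect();
            session.disconnect();
        }
    }
}
```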

Sunday, March 18, 2018

Spring WebSocket STOMP tips and tricks

Recently we successfully implemented secure Websockets in one of our projects and learned a lot of tricks to get things working together. Given below are some tips that would help teams embarking on implementing WebSockets in their programs.

1)  Spring uses the STOMP protocol for Web Sockets. The other popular protocol for Web Sockets is WAMP, but Spring does not support it. Hence if you are using Spring, make sure that your Android, iOS and JS  libraries support STOMP.

2)  The Spring websocket library by default also supports SockJS as a fall-back for web JS clients. If your use-case only entails supporting Android and iOS clients, then disable SockJS in your Spring configuration. A use-case might work with SockJS on a web client, yet fail in native mobile code.

3) The default SockJS implementation of Spring sends a server heartbeat header (char 'h') every 25 seconds to the clients, so there is no timeout on the socket connection for JS clients. But in mobile apps (pure STOMP), there is no heartbeat configured by default. Hence we have to explicitly set the heartbeat on the server OR on the client to keep the connection alive when idle; otherwise, idle connections get dropped after 1 minute. Reference Links:
Sample server side code below.
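A sketch of enabling a server-side heartbeat in a Spring STOMP broker configuration; the 10-second interval and the "/chat" endpoint are assumptions (Spring requires a TaskScheduler when a heartbeat value is set on the simple broker):

```java
import org.springframework.context.annotation.Configuration;
import org.springframework.messaging.simp.config.MessageBrokerRegistry;
import org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler;
import org.springframework.web.socket.config.annotation.EnableWebSocketMessageBroker;
import org.springframework.web.socket.config.annotation.StompEndpointRegistry;
import org.springframework.web.socket.config.annotation.WebSocketMessageBrokerConfigurer;

@Configuration
@EnableWebSocketMessageBroker
public class WebSocketConfig implements WebSocketMessageBrokerConfigurer {

    @Override
    public void configureMessageBroker(MessageBrokerRegistry registry) {
        ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
        scheduler.initialize();
        // Send and expect a heartbeat every 10 seconds so idle connections stay alive.
        registry.enableSimpleBroker("/topic")
                .setHeartbeatValue(new long[]{10000, 10000})
                .setTaskScheduler(scheduler);
        registry.setApplicationDestinationPrefixes("/app");
    }

    @Override
    public void registerStompEndpoints(StompEndpointRegistry registry) {
        registry.addEndpoint("/chat"); // assumed endpoint name
    }
}
```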

4) There is an Android websocket library that works very well with Spring websockets. Unfortunately, this library does not support heartbeats, so you have to send heartbeats explicitly from the client.

5) On iOS, we found a socket library that worked very well with Spring websockets. We faced an issue where this library threw array-index-out-of-range exceptions on server-side heartbeat messages, but we made small changes in the library to skip processing of heartbeat header messages.

6)  It is recommended to use the wss:// URL scheme, although we found that https:// also worked fine. If you have SockJS enabled on Spring, then append /websocket to the URL, or you will get an exception about an invalid protocol upgrade. Hence clients should connect to /{endpoint}/websocket.

7)  Both libraries also support sending headers during the CONNECT step. Very often, teams send an authorization token in a header during the CONNECT step, which can be used to authenticate the client. To access these headers on the server, we need to read the nativeHeaders map. Sample code -
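A sketch of reading a CONNECT header on the server via a ChannelInterceptor (registered on the client inbound channel in your WebSocketMessageBrokerConfigurer); the "Authorization" header name is an assumption about what the client sends:

```java
import org.springframework.messaging.Message;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.simp.stomp.StompCommand;
import org.springframework.messaging.simp.stomp.StompHeaderAccessor;
import org.springframework.messaging.support.ChannelInterceptor;

public class AuthChannelInterceptor implements ChannelInterceptor {

    @Override
    public Message<?> preSend(Message<?> message, MessageChannel channel) {
        StompHeaderAccessor accessor = StompHeaderAccessor.wrap(message);
        if (StompCommand.CONNECT.equals(accessor.getCommand())) {
            // "Authorization" is an assumed native header sent by the client on CONNECT.
            String token = accessor.getFirstNativeHeader("Authorization");
            // ... validate the token here and throw to reject the connection ...
        }
        return message;
    }
}
```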


Tuesday, March 13, 2018

Ruminating on Jackson JSON Parsing

In my previous post, we discussed how to extract an arbitrary JSON value out of a JSON string. Very often, developers face another error while using Jackson - com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field.

This happens when your JSON string has an attribute that is not present in your POJO and you are trying to deserialize it. Your POJO might not be interested in these fields or these fields could be optional.

To resolve this error, you have two options:

Option 1:  Disable the error checking on the Jackson Mapper class as follows.
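A short sketch of option 1 — switching off the unknown-field check on the ObjectMapper:

```java
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class LenientMapper {

    // Returns an ObjectMapper that silently ignores JSON attributes
    // which have no matching field in the target POJO.
    public static ObjectMapper create() {
        ObjectMapper mapper = new ObjectMapper();
        mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
        return mapper;
    }
}
```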

Option 2: If you have access to the POJO object, then you can annotate it as follows
@JsonIgnoreProperties(ignoreUnknown = true)

Saturday, March 10, 2018

Identifying RabbitMQ consumers

One of the challenges my team was facing was to accurately identify the consumers of a RabbitMQ queue. Let's say you have 5 consumers and you want to kill one of them through the dashboard; you first need to identify the consumer.

Each consumer of RabbitMQ has a tag value that can be used to uniquely identify it. By default, RabbitMQ assigns some random number as the consumer tag, but just by looking at this random string tag there is no way to identify the actual consumer. 

To resolve this, you need to create a ConsumerTagStrategy and associate it with the MessageListenerContainer. Code snippet given below: 
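A sketch of wiring a ConsumerTagStrategy into a Spring AMQP listener container; the "my-app" prefix is an illustrative naming scheme:

```java
import org.springframework.amqp.rabbit.connection.ConnectionFactory;
import org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer;

public class ContainerFactory {

    // Tags each consumer with an app name and the queue name, so consumers
    // become identifiable in the RabbitMQ management dashboard.
    public static SimpleMessageListenerContainer create(ConnectionFactory cf) {
        SimpleMessageListenerContainer container = new SimpleMessageListenerContainer(cf);
        container.setConsumerTagStrategy(queue ->
                "my-app-" + queue + "-" + System.currentTimeMillis());
        return container;
    }
}
```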

Tuesday, February 27, 2018

Ruminating on BPM vs Case Management

Most BPM tools today support both BPMN (Business Process Model and Notation) as well as CMMN (Case Management Model and Notation). But when to use what?

It all depends on the process that you want to model. Given below are some tips that can be used to decide whether to model the process as a traditional BPM or a case management solution.

Traditional BPM: 
  • If your process is a predefined and ordered sequence of tasks - e.g. sending out an insurance renewal message, onboarding an employee, etc. 
  • The order of the steps rarely change - i.e. the process is repeatable.
  • Business users cannot dynamically change the process. The process determines the sequence of events. 
Case Management:
  • When the process does not have a strict ordering of steps - e.g. settling a claim.
  • The process depends on the knowledge worker, who decides the next steps. 
  • External events (submission of documents) determine what next step the knowledge worker will take.  
  • Case management empowers knowledge workers and provides them with access to all the information concerning the case. The knowledge worker then uses his discretion and control to move the case towards the next steps. 
Using the above guidelines, you can model your process using BPMN or CMMN. Business rules can be modeled using DMN (Decision Model and Notation). 

Monday, February 19, 2018

Adding custom filters to Spring Security

One of my teams was looking for an option for adding filters on the Spring Security OAuth Server.

As we know, the Spring Security OAuth2 Server is a complex mesh of filters that get the job done in implementing all the grant types of the OAuth specification -

The team wanted to add additional filters to this pipeline of Security filters. There are many ways of achieving this:

Option 1: Create a custom filter by extending the Spring GenericFilterBean class. You can set the order by using the @Order annotation.

Option 2: Register the filter manually in the WebSecurityConfigurerAdapter class using the addFilterAfter/addFilterBefore methods.

Option 3: Set the property "security.filter-order=5" in your application.properties file. Now you can add up to 4 custom filters and set their order as 1, 2, 3 or 4.
Another option is to manually set the order (without annotations) using FilterRegistrationBean in any @Configuration class.
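A sketch of option 1 — a custom filter extending GenericFilterBean, ordered via @Order (the filter name, order value and javax-era servlet API are assumptions):

```java
import java.io.IOException;

import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

import org.springframework.core.annotation.Order;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.GenericFilterBean;

@Component
@Order(2) // position of this filter in the chain
public class AuditFilter extends GenericFilterBean {

    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain) throws IOException, ServletException {
        // ... auditing/logging logic before the rest of the chain runs ...
        chain.doFilter(request, response);
    }
}
```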

The following 2 blogs helped us explore all options and use the appropriate one.

Monday, January 29, 2018

Ruminating on the V model of software testing

In the V model of software testing, the fundamental concept is to interweave testing activities into each and every step of the development cycle. Testing is NOT a separate phase of the SDLC; rather, testing activities are carried out right from the start of the requirements phase.

  • During requirements analysis, the UAT test cases are written. In fact, many teams have started using user stories and acceptance criteria as test cases. 
  • System test cases and Performance test cases are written during the Architecture definition and Design phase. 
  • Integration test cases are written during the coding and unit testing phase. 
A good illustration of this is given at

Monday, January 22, 2018

Ruminating on Netflix Conductor Orchestration

We have been evaluating the Netflix open source Conductor project and are intrigued by the design decisions made in it.

Netflix Conductor can be used for orchestration of microservices. Microservices are typically loosely coupled using pub/sub semantics and leveraging a resilient message broker such as Kafka. But this simple and proven event-driven architecture did not suffice for Netflix's needs.

A good article on the challenges faced by the Netflix team and the genesis of Conductor is available here. Snippets from the article:

"Pub/sub model worked for simplest of the flows, but quickly highlighted some of the issues associated with the approach:

  • Process flows are “embedded” within the code of multiple applications - e.g. If you have a pipeline of publishers and subscribers, it becomes difficult to understand the big picture without proper design documentation. 
  • There is tight coupling and assumptions around input/output, SLAs etc, making it harder to adapt to changing needs  - i.e. How will you monitor that the response for a particular task is completed within a 3 second SLA? If not done within this timeframe, we need to mark this task as a failure. Doing this is very difficult with pub/sub unless we code this ourselves.
  • No easy way to monitor the progress - When did the process complete? At what step did the transaction fail?"

To address all the above issues, the Netflix team created Conductor. Conductor servers are also stateless and can be deployed on multiple servers to handle scale and availability needs.

Ruminating on idempotency of REST APIs

Most REST services are stateless - i.e. they do not store state in memory and can be scaled out horizontally.

There is another concept called an 'idempotent' operation. An operation is considered to be idempotent if making the same call with the same input parameters repeatedly gives the same result. In other words, the service does not differentiate between one call or a million calls.

Typically, all GET, HEAD and OPTIONS calls are safe and idempotent. PUT and DELETE are also considered idempotent; repeated calls may return different responses, but cause no further state change. A good video describing this is here  -

POST calls are NOT idempotent, as every call creates a new record. 

Wednesday, January 17, 2018

Why GPU computing and Deep Learning are a match made in heaven?

Deep learning is a branch of machine learning that uses multi-layered neural networks for solving a number of challenging AI problems.

GPU (Graphics Processing Unit) architecture is fundamentally different from CPU architecture. An individual GPU core is significantly slower than a CPU core, but a single GPU might have thousands of cores while a CPU usually has no more than 12.

Hence any task that can be parallelized over multiple cores is a perfect fit for GPUs. It so happens that deep learning algorithms involve a lot of matrix multiplications, which are excellent candidates for parallel processing across the cores of a GPU. Hence GPUs make deep learning algorithms run faster by an order of magnitude; even training of models is much faster, so you can expedite the GTM of your AI solutions.

As an example of how GPUs are better for AI solutions, consider AlexNet, a well-known image-classification deep network. A modern GPU costing about $1000 takes 2.5 days to fully train AlexNet on the very large ImageNet dataset. On the other hand, it takes a CPU costing several thousand dollars nearly 43 days.

A good video demonstrating the difference between GPU and CPU is here:

Timeout for REST calls

Quite often, we need a REST API to respond within a specified time frame, say 1500 ms.
Implementing this using the excellent Spring RestTemplate is very simple. A good blog explaining how to use RestTemplate is available here -
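A sketch of building a RestTemplate with the 1500 ms limits mentioned above:

```java
import org.springframework.http.client.SimpleClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

public class RestClientConfig {

    // Builds a RestTemplate whose calls fail fast if establishing the
    // connection or reading the response takes longer than 1500 ms.
    public static RestTemplate restTemplate() {
        SimpleClientHttpRequestFactory factory = new SimpleClientHttpRequestFactory();
        factory.setConnectTimeout(1500); // ms to establish the connection
        factory.setReadTimeout(1500);    // ms to wait for the response
        return new RestTemplate(factory);
    }
}
```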


Monday, January 08, 2018

The curious case of Java Heap Memory settings on Docker containers

On docker containers, developers often come across OOM (out-of-memory) errors and get baffled because most of their applications are stateless applications such as REST services.

So why do docker containers throw OOM errors even when there are no long-living objects in memory and all short-lived objects should be garbage collected?
The answer lies in how the JVM calculates the max heap size that should be allocated to the Java process.

By default, the JVM assigns 1/4 of the total physical memory as the max heap size for the Java runtime. But this is the physical memory of the server (or VM) and not of the docker container.
So let's say your server has a memory of 16 GB on which you are running 8 docker containers with each container configured for 2 GB. But your JVM has no understanding of the docker max memory size. The JVM will allocate 1/4 of 16 GB = 4 GB as the max heap size. Hence garbage collection MAY not run unless the full heap is utilized and your JVM memory may go beyond 2 GB.
When this happens, docker or Kubernetes would kill your JVM process with an OOM error.

You can simulate the above scenario and print the Java heap memory stats to understand the issue. Print runtime.freeMemory(), runtime.maxMemory() and runtime.totalMemory() to understand the patterns of memory allocation and garbage collection.
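A small runnable sketch printing these stats:

```java
public class MemoryStats {

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        // maxMemory:   the ceiling (-Xmx) the heap may grow to
        // totalMemory: heap currently reserved from the OS
        // freeMemory:  unused portion of totalMemory
        System.out.println("max   = " + rt.maxMemory() / mb + " MB");
        System.out.println("total = " + rt.totalMemory() / mb + " MB");
        System.out.println("free  = " + rt.freeMemory() / mb + " MB");
        long usedHeap = (rt.totalMemory() - rt.freeMemory()) / mb;
        System.out.println("used  = " + usedHeap + " MB");
    }
}
```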

The diagram below gives a good illustration of the memory stats of JVM.

So how do we solve the problem? There is a very simple solution - just explicitly set the max heap size of the JVM when launching the program - e.g. java -Xmx2000m -jar myjar.jar
This will ensure that the garbage collection runs appropriately and an OOM error does not occur.
A good article throwing more details on this is available here.

Also, it is important to understand that the total memory consumed by a Java process (called the Resident Set Size, RSS) equals the heap size + perm size (metaspace) + native memory (required for thread stacks, file pointers, etc.). If you have a large number of threads, then native memory consumption can also be high (number of threads * -Xss).

Max memory = [-Xmx] + [-XX:MaxPermSize/MaxMetaSpace] + [number_of_threads * (-Xss)] 

Since JDK 8, you can also utilize the following -XX parameters:
-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap //This allows the JDK to understand the CGroup memory limits on the Linux kernel
-XX:MaxRAM //This specifies the max memory allocated to the JVM...includes both on-heap and off-heap memory. 

In Java 10, a lot of changes have been made to the JDK to make it container friendly and you would not need to specify any parameters.

Other articles worth perusing are:

If you wish to check all the -XX options for a JVM, then you can specify the following command.
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version

If you have a debug build of Java, then you can also try:
java -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -XX:+PrintFlagsFinal -XX:+PrintFlagsWithComments -version

There are also tools that you can utilize to understand the best memory options to configure, such as the Java Memory Build Pack.

Monday, December 04, 2017

Cool online browser based BPMN, DMN and CMMN editor

If you are looking for an extremely lightweight BPMN tool then please try out
It is made by the team at Camunda BPM and I was pleased with the simplicity of use.

The JS-based tool also provides a UI editor for DMN (Decision Model and Notation) and CMMN (Case Management Model and Notation).

You can even download the JS package and run it locally on your browser. 

Sunday, November 19, 2017

Encrypting sensitive data in the Spring application.properties file

If you want to encrypt passwords, keys and other sensitive information in your application.properties file, then you have a nifty solution in an open-source encryption library called Jasypt.

We begin by adding the Maven dependency jasypt-spring-boot-starter to our Spring Boot application. The steps involved in integrating Jasypt into your Spring Boot application are as follows:

1. First, using Jasypt and a secret password, create encrypted tokens of all sensitive values.

2. Put each encrypted token in your properties file with the value wrapped in ENC() - e.g. password=ENC(encrypted-token)

3. Retrieve your properties in Spring classes the same old way - e.g. using the @Value annotation or env.getProperty() method.
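A sketch of step 3; the property name db.password is an illustrative assumption, and the secret password must be supplied at runtime (e.g. via -Djasypt.encryptor.password=...):

```java
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

// With jasypt-spring-boot-starter on the classpath, ENC(...) values in
// application.properties are decrypted transparently before injection.
@Component
public class DbConfig {

    @Value("${db.password}")   // db.password=ENC(encrypted-token) in the properties file
    private String dbPassword; // injected already decrypted
}
```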

A good example explaining this is here -  with source code available here.

Friday, November 17, 2017

Changing the hashing algorithm for passwords

Recently, one of my teams wanted to update their password hashing algorithm that was in use for more than 2 years. As discussed here, passwords should always be hashed in the database.

If we directly employ the new hashing algorithm, then all users would lose the ability to login with their old passwords and would be forced to change their password.

Hence, we need to take the following approach:

1. Add a new column to the user table that stores the hashing algorithm name (or version).
2. Set this version to 1 for all old passwords.
3. When a user logs in, first verify the hash using the old algorithm. If it is correct, hash the password using the new algorithm and save the new hash.
4. Update the version column to 2.
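The steps above can be sketched as plain Java. The `old:`/`new:` prefix hashes are stand-ins for the real algorithms (e.g. a legacy SHA-1 hash and a modern bcrypt hash):

```java
public class PasswordMigrator {

    // Stand-ins for the real hash implementations.
    static boolean oldMatches(String raw, String storedHash) { return storedHash.equals("old:" + raw); }
    static String newHash(String raw) { return "new:" + raw; }
    static boolean newMatches(String raw, String storedHash) { return storedHash.equals(newHash(raw)); }

    // Verifies a login attempt against the stored hash and algorithm version.
    // Returns {newStoredHash, newVersion} on success - re-hashing version-1
    // passwords with the new algorithm - or null if the password is wrong.
    public static String[] login(String raw, String storedHash, int version) {
        if (version == 2) {
            return newMatches(raw, storedHash) ? new String[]{storedHash, "2"} : null;
        }
        // Version 1: verify with the old algorithm, then upgrade the hash.
        if (oldMatches(raw, storedHash)) {
            return new String[]{newHash(raw), "2"};
        }
        return null;
    }
}
```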

Saturday, October 21, 2017

Object pooling made simple using Apache Commons Pool2

If you are looking for a quick implementation of an object pool, then look no further than the excellent Apache Commons Pool2 implementation. Pool2 is far better and faster than the original commons pool library.

An object pool can be used to cache objects that are expensive to set up and cannot be created for every request or thread - e.g. DB connections, MQTT broker connections, AMQP broker connections, etc. 

The following code snippets would show you how simple it is to create a pool for your 'expensive-to-create' objects. 
Any object pool typically requires two things:
1) A factory object to handle creation and destruction of your objects.
2) A configuration object (GenericObjectPoolConfig) to configure your pool.
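A sketch using Commons Pool2's BasePooledObjectFactory and GenericObjectPool; ExpensiveClient stands in for whatever costly-to-create object you are pooling:

```java
import org.apache.commons.pool2.BasePooledObjectFactory;
import org.apache.commons.pool2.PooledObject;
import org.apache.commons.pool2.impl.DefaultPooledObject;
import org.apache.commons.pool2.impl.GenericObjectPool;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;

// Stand-in for any expensive object (DB/MQTT/AMQP connection, etc.).
class ExpensiveClient { }

class ExpensiveClientFactory extends BasePooledObjectFactory<ExpensiveClient> {
    @Override
    public ExpensiveClient create() {
        return new ExpensiveClient(); // the expensive setup happens here
    }
    @Override
    public PooledObject<ExpensiveClient> wrap(ExpensiveClient client) {
        return new DefaultPooledObject<>(client);
    }
}

public class PoolDemo {
    public static void main(String[] args) throws Exception {
        GenericObjectPoolConfig config = new GenericObjectPoolConfig();
        config.setMaxTotal(10); // at most 10 pooled instances
        GenericObjectPool<ExpensiveClient> pool =
                new GenericObjectPool<>(new ExpensiveClientFactory(), config);
        ExpensiveClient client = pool.borrowObject();
        try {
            // ... use the client ...
        } finally {
            pool.returnObject(client); // always return it to the pool
        }
    }
}
```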

Friday, October 20, 2017

Caching Simplified - Magic of Spring Annotations - Part 2

In the previous blog-post, we saw how we can enable caching with the magic of annotations.
Now, let's consider how to evict the cache based on our application needs.

Spring provides a simple annotation called @CacheEvict that can be used to evict a cache. In the example below, we add two methods that evict the caches 'SomeStaticData' and 'CountryList'.
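A sketch of such evict methods, scheduled to run once per day at midnight (the cron expression is an assumption about when you want eviction to happen):

```java
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class CacheEvictor {

    // Clears all entries from 'SomeStaticData' once a day at midnight.
    @Scheduled(cron = "0 0 0 * * *")
    @CacheEvict(cacheNames = "SomeStaticData", allEntries = true)
    public void evictSomeStaticData() {
        // intentionally empty - the annotations do the work
    }

    // Clears all entries from 'CountryList' once a day at midnight.
    @Scheduled(cron = "0 0 0 * * *")
    @CacheEvict(cacheNames = "CountryList", allEntries = true)
    public void evictCountryList() {
    }
}
```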

But when do we call the evict method? We can schedule it to be called by the Spring Scheduler as per defined schedule. In the above example, we have scheduled the evict method once per day. Again so simple using the magic of annotations in Spring !

To enable scheduling by a background thread in Spring Boot, we need to just add the @EnableScheduling annotation on the Spring Boot main class. If Actuator is enabled, then it automatically creates a background thread and the @EnableScheduling annotation is then NOT required. 

 Another option is to have a "/evict/{CacheName}" endpoint registered in Spring MVC and call it from a browser to manually evict the cache.

Caching Simplified - Magic of Spring Annotations

Spring Boot makes it super simple to add caching abilities to your application. With just a few lines of code and some annotation magic, you have a complete caching solution.

Spring provides wrappers for many cache implementations - Google Guava, ehcache, MongoDB, Redis, etc. At the simplest level, you can also use a ConcurrentHashMap as the cache manager.

Given below are three code snippets that show how simple it is to enable caching for static data in a Spring Boot application - welcome to the power of Spring!

Step 1: Configure a ConcurrentHashMap-based cache manager.

Step 2: Annotate your Spring Boot application with @EnableCaching.

Magic behind the scenes: The @EnableCaching annotation triggers a post-processor that inspects every Spring bean for the presence of caching annotations on public methods. If such an annotation is found, a proxy is automatically created to intercept the method call and handle the caching behavior accordingly.

Step 3: Annotate the methods that return static data with the @Cacheable annotation. You can pass the cache name in the annotation - e.g. @Cacheable("SomeStaticData")
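The three steps above in one sketch; the cache name, service and country list are illustrative:

```java
import java.util.Arrays;
import java.util.List;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.concurrent.ConcurrentMapCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.stereotype.Service;

@SpringBootApplication
@EnableCaching                              // Step 2: switch on the caching post-processor
public class DemoApplication {

    @Bean
    public CacheManager cacheManager() {    // Step 1: ConcurrentHashMap-backed cache manager
        return new ConcurrentMapCacheManager("SomeStaticData");
    }

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}

@Service
class CountryService {

    // Step 3: the first call runs the method; later calls hit the cache.
    @Cacheable("SomeStaticData")
    public List<String> getCountries() {
        return Arrays.asList("India", "USA", "UK");
    }
}
```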


That's it! You have configured caching. In the next blog-post, we will see how we can evict items from the cache when necessary.

Tuesday, October 17, 2017

Circular Dependencies in Spring Boot

Recently, one of my team members was struggling with a strange issue in Spring Boot. The application ran fine on his local machine, but when deployed to the cloud it started throwing BeanCurrentlyInCreationException errors.

The core issue was identified as circular dependencies in Spring. An excellent article on how to identify and resolve such dependencies is here -

It is best to avoid circular dependencies, but in the worst case, if you must have them, there are multiple options:
1. Using the @Lazy annotation on one of the dependencies.
2. Using setter injection instead of constructor injection.
3. Using @PostConstruct to set the bean.

Tuesday, October 10, 2017

Generating secure keys for encryption

Very often, we need to create secure keys that can be used in digital signatures or for signing a JWT.
It is very important to create a secure key so that the encryption is strong. A good post about this is here -

Jotting down some snippets from the article:

The strength of encryption is related to the difficulty of discovering the key, which in turn depends on both the cipher used and the length of the key. 

Encryption strength is often described in terms of the size of the keys used to perform the encryption: in general, longer keys provide stronger encryption. Key length is measured in bits.

Different ciphers may require different key lengths to achieve the same level of encryption strength

A key is nothing but a byte array that is passed to the encryption algorithm. Hence, if you are storing the key in a properties file, please make sure that you store it in Base64 format.

Spring provides very simple methods for generating secure keys as shown below.
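A sketch using Spring Security's KeyGenerators helper; the 32-byte key length is an illustrative choice:

```java
import java.util.Base64;

import org.springframework.security.crypto.keygen.KeyGenerators;

public class KeyDemo {

    public static void main(String[] args) {
        // 256-bit (32-byte) random key from a SecureRandom source.
        byte[] key = KeyGenerators.secureRandom(32).generateKey();
        // Store keys as Base64 (or hex) text in properties files, never raw bytes.
        System.out.println(Base64.getEncoder().encodeToString(key));

        // KeyGenerators.string() produces a hex-encoded random key directly.
        System.out.println(KeyGenerators.string().generateKey());
    }
}
```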

Monday, October 09, 2017

Handling RabbitMQ broker re-connections

The RabbitMQ Java Client Library has default support for auto-recovery (since version 4.0.0). Hence the client will try to recover a broken connection unless you explicitly disable it.

If you want to fine-tune the retry mechanism, then the examples given in the below link would help.

Or alternatively, you can use the super easy Spring RabbitTemplate class that has retry options. The RabbitTemplate class wraps the RabbitMQ Java client and provides all the goodies of Spring. 
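A sketch of attaching a Spring Retry policy to a RabbitTemplate; the backoff values (500 ms doubling up to 10 s) and the localhost broker are assumptions:

```java
import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.retry.backoff.ExponentialBackOffPolicy;
import org.springframework.retry.support.RetryTemplate;

public class RabbitRetryConfig {

    // RabbitTemplate that retries failed sends with exponential backoff.
    public static RabbitTemplate rabbitTemplate() {
        RetryTemplate retry = new RetryTemplate();
        ExponentialBackOffPolicy backOff = new ExponentialBackOffPolicy();
        backOff.setInitialInterval(500);  // first retry after 500 ms
        backOff.setMultiplier(2.0);       // double the wait each time
        backOff.setMaxInterval(10000);    // cap at 10 seconds
        retry.setBackOffPolicy(backOff);

        RabbitTemplate template = new RabbitTemplate(new CachingConnectionFactory("localhost"));
        template.setRetryTemplate(retry);
        return template;
    }
}
```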

Handling MQTT broker re-connections

Any client that connects to an MQTT broker needs the ability to handle a connection failure.

The popular Eclipse Paho library now has support for reconnects as described here -
Sample client code available here -
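A minimal sketch of enabling Paho's automatic reconnect; the broker URL, client ID and topic are assumptions:

```java
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttConnectOptions;
import org.eclipse.paho.client.mqttv3.MqttException;

public class MqttReconnectDemo {

    public static void main(String[] args) throws MqttException {
        MqttConnectOptions options = new MqttConnectOptions();
        options.setAutomaticReconnect(true); // Paho retries the connection with backoff
        options.setCleanSession(false);      // resume subscriptions after reconnecting

        MqttClient client = new MqttClient("tcp://localhost:1883", "demo-client");
        client.connect(options);
        client.subscribe("sensors/#");
    }
}
```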

If you want to understand how Paho implements reconnect, then have a look at this source file -

Alternatively, we can use the Spring Integration Framework that encapsulates the Paho library and provides options to configure connection retry logic.

Sunday, October 08, 2017

Ruminating on MQTT load balancing for scalable IoT event processing

MQTT brokers support the publish/subscribe paradigm. But what if you need to scale out the processing of MQTT messages over a cluster of nodes (message consumers)?

In IoT environments, we need to process thousands of events per second and hence need to load-balance incoming messages across multiple processing nodes. 
Unfortunately, the standard MQTT specification does not support this concept.

But many MQTT brokers support this as a non-standard feature - e.g. HiveMQ supports Shared Subscriptions

The IBM Watson IoT platform also supports shared subscriptions as described here -
If you are using the IBM Watson IoT Java libraries, you need to set the "Shared-Subscription" property to "true". If you are using any other client like Eclipse Paho, then you must use a client ID of the form "A:org_id:app_id".

Note the capital 'A' in the client ID - it marks the application as a scalable application for load-balancing. We just changed the small 'a' to a capital 'A' and could load-balance our MQTT consumers. 

Ruminating on CORS in REST APIs

Of all the articles I have studied on CORS, the below article by Derric Gilling is the most awesome. It is highly recommended to peruse this article to understand the fundamentals of CORS and how to enable REST APIs to support this.

Jotting down snippets from the above article:

CORS is a security mechanism that allows a web page from one domain or Origin to access a resource with a different domain (a cross-domain request). CORS is a relaxation of the same-origin policy implemented in modern browsers. Without features like CORS, websites are restricted to accessing resources from the same origin through what is known as same-origin policy.

The cross-domain vulnerability existed earlier because a hacker website could make authenticated malicious AJAX calls (e.g. POST /withdraw) to a bank's site, even though the hacker website doesn't have direct access to the bank's cookies. This is due to the browser behavior of automatically attaching any cookies bound to the bank's domain on any HTTP calls to that domain, including AJAX calls made from the hacker's page.

Why was CORS created?

There are legitimate reasons for a website to make cross-origin HTTP requests. Maybe a single-page app on one domain needs to make AJAX calls to an API on another domain; or maybe a site incorporates 3rd-party fonts or analytics providers like Google Analytics or MixPanel. Cross-Origin Resource Sharing (CORS) enables these cross-domain requests. 

The CORS standard specifies the handshake between the browser and the server. The server has control over whether to allow the request or not depending on the origin of the request (Origin Header). The browser guarantees that the Origin request header is set reliably and accurately. Hence the server can restrict access to only selected URLs. 

Implementing CORS in Spring Boot is very easy. The following article shows the various options available in a Spring MVC REST service to enable CORS -
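One of the simplest options is the @CrossOrigin annotation on a controller; the origin and path here are illustrative assumptions:

```java
import org.springframework.web.bind.annotation.CrossOrigin;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class AccountController {

    // Allow cross-origin calls to this endpoint from the listed origin only.
    @CrossOrigin(origins = "https://app.example.com")
    @GetMapping("/api/accounts")
    public String accounts() {
        return "[]";
    }
}
```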

Database connection reconnection strategy

In any database connection pool, there is a risk of stale connections due to network outages or database server restarts. How do we refresh the connection pool without restarting the client application?

The standard technique is to plug in a validation query that gets fired every time a connection is requested from the pool. This validation query is typically a trivial test query that does not result in any IO/disk access. Examples of validation queries for different databases are given below:
  • Oracle - select 1 from dual
  • SQL Server - select 1 (tested on SQL-Server 9.0, 10.5 [2008])
  • Postgresql - select 1
  • hsqldb - select 1 from INFORMATION_SCHEMA.SYSTEM_USERS
  • derby - values 1
  • H2 - select 1
  • DB2 - select 1 from sysibm.sysdummy1
  • Mysql - select 1
We were using Spring Boot that uses the Tomcat JDBC Connection Pool by default. We tried setting all the parameters required for the validation check as given here, but in vain. 
Finally we decided to go with HikariCP connection pool as suggested here

First, we added the dependency in our maven pom as shown below. 

Next, we added the following properties to our application.properties file. We will pick up these properties to create our HikariCP connection pool.
Finally, we wrote a @Configuration bean to wire up the HikariCP datasource as below:
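A sketch of such a configuration bean; the app.datasource.* property names and pool settings are illustrative assumptions:

```java
import javax.sql.DataSource;

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class DataSourceConfig {

    @Value("${app.datasource.url}")
    private String url;

    @Value("${app.datasource.username}")
    private String username;

    @Value("${app.datasource.password}")
    private String password;

    @Bean
    public DataSource dataSource() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl(url);
        config.setUsername(username);
        config.setPassword(password);
        config.setConnectionTestQuery("select 1"); // validation query (MySQL/Postgres style)
        config.setMaximumPoolSize(10);
        return new HikariDataSource(config);
    }
}
```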