Sunday, November 19, 2017

Encrypting sensitive data in Spring application.properties file

If you want to encrypt passwords, keys and other sensitive information in your application.properties file, then you have a nifty solution from an open source encryption library called as Jasypt.

We begin by adding the maven dependency of jasypt-spring-boot-starter to our Spring Boot application. The steps involved in integrating Jasypt into your Spring Boot application is as follows:

1. First using Jasypy and a secret password, created encrypted tokens of all sensitive information.

2. Put this encrypted token in your properties file with the value enclosed with string 'ENC' - e.g. password=ENC(encrypted-token)

3. Retrieve your properties in Spring classes the same old way - e.g. using the @Value annotation or env.getProperty() method.

A good example explaining this is here - https://www.ricston.com/blog/encrypting-properties-in-spring-boot-with-jasypt-spring-boot/  with source code available here.


Friday, November 17, 2017

Changing the hashing algorithm for passwords

Recently, one of my teams wanted to update their password hashing algorithm that was in use for more than 2 years. As discussed here, passwords should always be hashed in the database.

If we directly employ the new hashing algorithm, then all users would lose the ability to login with their old passwords and would be forced to change their password.

Hence, we need to follow the following approach:

1. Add a new column to the user table that stores the hashing algorithm name (or version).
2. Set this version to 1 for all old passwords.
3. When the user logs in, first check the hash using the old algorithm. If it is correct, then hash the password using the new algorithm and save the hash.
4. Update the version column to 2.  

Saturday, October 21, 2017

Object pooling made simple using Apache Commons Pool2

If you are looking for a quick implementation of an object pool, then look no further than the excellent Apache Commons Pool2 implementation. Pool2 is far better and faster than the original commons pool library.

Object pool can be used to cache those objects that are expensive to setup and cannot be created for every request or thread - e.g. DB connections, MQTT broker connections, AMQP broker connections, etc. 

The following code snippets would show you how simple it is to create a pool for your 'expensive-to-create' objects. 
Any object pool typically requires 2 parameters  [GenericObjectPool.java] --- 
1) A factory object to handle creation and destruction of your objects [MyObjectFactory.java]
2) A configuration object to configure your pool. [GenericObjectPoolConfig,java]

Friday, October 20, 2017

Caching Simplified - Magic of Spring Annotations - Part 2

In the previous blog-post, we saw how we can enable caching with the magic of annotations.
Now, let's consider how to evict the cache based on our application needs.

Spring provides a simple annotation called @CacheEvict that can be used to evict the cache. In the below example, we add two methods that can be used to evict the cache 'SomeStaticData' and 'CountryList'.

But when do we call the evict method? We can schedule it to be called by the Spring Scheduler as per defined schedule. In the above example, we have scheduled the evict method once per day. Again so simple using the magic of annotations in Spring !

To enable scheduling by a background thread in Spring Boot, we need to just add the @EnableScheduling annotation on the Spring Boot main class. If Actuator is enabled, then it automatically creates a background thread and the @EnableScheduling annotation is then NOT required. 

 Another option is to have a "/evict/{CacheName}" endpoint registered in Spring MVC and call it from a browser to manually evict the cache.

Caching Simplified - Magic of Spring Annotations

Spring Boot makes is super simple to add caching abilities to your application. With just a few lines of code and some annotation magic, you have a complete caching solution available with you.

Spring provides wrappers for many cache implementations - Google Guava, ehcache, MongoDB, Redis, etc. At the simplest level, you can also use a ConcurrentHashMap as the cache manager.

Given below are three code snippets that shows how simple it is to enable caching for static data in a Spring Boot application - Welcome to the power of Spring !

Step 1:  [CachingConfiguration.java]  Configure a ConcurrentHashMap cache manager

Step 2:  [SpringBootCachingApplication.java] Annotate your Spring Boot application with @EnableCaching.

Magic behind the scenes: The @EnableCaching annotation triggers a post processor that inspects every Spring bean for the presence of caching annotations  on public methods. If such an annotation is found, a proxy is automatically created to intercept the method call  and handle the caching behavior accordingly.

Step 3: [StaticDataService.java] Annotate your methods that return the static data with  @Cacheable annotation. You can pass the cache name in the annotation - e.g.  @Cacheable("SomeStaticData")


 

That's it! You have configured caching. In the next blog-post, we will see how we can evict items from the cache when necessary.

Tuesday, October 17, 2017

Circular Dependencies in Spring Boot

Recently one of my team member was struggling with a strange issue in Spring Boot. The application was running fine on his local machine, but when deployed to the cloud it started giving errors - throwing BeanCurrentlyInCreationException.

 The core issue was identified as Circular Dependencies in Spring. An excellent article on how to identify such dependencies and resolve them is here - http://www.baeldung.com/circular-dependencies-in-spring

It is recommended best to avoid circular dependencies, but in worst case if you must have them, then there are multiple options such as:
1. Using the @Lazy annotation on one of the dependency.
2. Not use Constructor injection, but use a Setter injection
3. Use @PostConstruct to set the Bean

Tuesday, October 10, 2017

Generating secure keys for encryption

Very often, we need to create secure keys that can be used in digital signatures or for signing a JWT.
It is very important to create a secure key so that the encryption is strong. A good post about this is here - https://docs.oracle.com/cd/E19424-01/820-4811/aakfw/index.html

Jotting down some snippets from the article:

The strength of encryption is related to the difficulty of discovering the key, which in turn depends on both the cipher used and the length of the key. 

Encryption strength is often described in terms of the size of the keys used to perform the encryption: in general, longer keys provide stronger encryption. Key length is measured in bits.

Different ciphers may require different key lengths to achieve the same level of encryption strength

------------------------------------------------------------------------
A key is nothing but a byte array that is passed to the encryption algorithm.  Hence if you are storing the key in a properties file, please makes sure that you have stored it in Base64 format.

Spring provides very simple methods for generating secure keys as shown below.

Monday, October 09, 2017

Handling RabbitMQ broker re-connections

The RabbitMQ Java Client Library has default support for auto-recovery (since version 4.0.0). Hence the client will try to recover a broken connection unless you explicitly disable it.

If you want to fine-tune the retry mechanism, then the examples given in the below link would help.
http://www.rabbitmq.com/api-guide.html#recovery

Or alternatively, you can use the super easy Spring RabbitTemplate class that has retry options. The RabbitTemplate class wraps the RabbitMQ Java client and provides all the goodies of Spring. 

Handling MQTT broker re-connections

Any client that connects to an MQTT broker needs the ability to handle a connection failure.

The popular Eclipse Paho library now has support for reconnects as described here - http://www.eclipse.org/paho/files/mqttdoc/MQTTAsync/html/auto_reconnect.html
Sample client code available here - https://github.com/SolaceLabs/mqtt-3.1.1-test

If you want to understand how Paho implements reconnect, then have a look at this source file - https://github.com/eclipse/paho.mqtt.java/blob/master/org.eclipse.paho.client.mqttv3/src/main/java/org/eclipse/paho/client/mqttv3/MqttAsyncClient.java

Alternatively, we can use the Spring Integration Framework that encapsulates the Paho library and provides options to configure connection retry logic. https://docs.spring.io/spring-integration/reference/html/mqtt.html


Sunday, October 08, 2017

Ruminating on MQTT load balancing for scalable IoT event processing

MQTT brokers support the publish/subscribe paradigm. But what if you need to scale out the processing of MQTT messages over a cluster of nodes (message consumers)?

In IoT environments, we need to process thousands of events per second and hence need to load-balance incoming messages across multiple processing nodes. 
Unfortunately, the standard MQTT specification does not support this concept.

But many MQTT brokers support this as a non-standard feature - e.g. HiveMQ supports Shared Subscriptions

The IBM Watson IoT platform also supports shared subscriptions as described here - https://developer.ibm.com/recipes/tutorials/shared-subscription-in-ibm-iot-foundation/
If you are using the IBM Watson IoT java libraries, you need to "Shared-Subscription” property is set to “true”. If you are using any other client like Eclipse Paho, then you must use the client id of the form “A:org_id:app_id”

Note: Please note the capital 'A' in the client ID. This marks the application as a scalable application for load-balancing. We just changed the small 'a' to a capital 'A' and could load-balance our mqtt consumers. 

Ruminating on CORS in REST APIs

Of all the articles I have studied on CORS, the below article by Derric Gilling is the most awesome. It is highly recommended to peruse this article to understand the fundamentals of CORS and how to enable REST APIs to support this.

https://www.moesif.com/blog/technical/cors/Authoritative-Guide-to-CORS-Cross-Origin-Resource-Sharing-for-REST-APIs/

Jotting down snippets from the above article:

-------------------------------------------------------
CORS is a security mechanism that allows a web page from one domain or Origin to access a resource with a different domain (a cross-domain request). CORS is a relaxation of the same-origin policy implemented in modern browsers. Without features like CORS, websites are restricted to accessing resources from the same origin through what is known as same-origin policy.

The cross-domain vulnerability existed earlier because a hacker website could make authenticated malicious AJAX calls to https://examplebank.com/api to POST /withdraw even though the hacker website doesn’t have direct access to the bank’s cookiesThis is due to the browser behavior of automatically attaching any cookies bounded to https://examplebank.com for any HTTP calls to that domain, including AJAX calls from https://evilunicorns.com to https://examplebank.com.

Why was CORS created?

There are legitimate reasons for a website to make cross-origin HTTP requests. Maybe a single-page app at https://mydomain.com needs to make AJAX calls to https://api.mydomain.com; or maybe https://mydomain.com incorporates some 3rd party fonts or analytics providers like Google Analytics or MixPanel. Cross-Origin Resource Sharing (CORS) enables these cross-domain requests. 

The CORS standard specifies the handshake between the browser and the server. The server has control over whether to allow the request or not depending on the origin of the request (Origin Header). The browser guarantees that the Origin request header is set reliably and accurately. Hence the server can restrict access to only selected URLs. 
-------------------------------------------------------

Implementing CORS in Spring Boot is very easy. The following article shows the various options available in a Spring MVC REST service to enable CORS - http://www.baeldung.com/spring-cors

Database connection reconnection strategy

In any database connection pool, there is a risk of stale connections due to network outages or database server restarts. How do we refresh the connection pool without restarting the client application?

The standard technique used is to plug-in a validation query; that gets fired every time a connection is requested from the pool. This validation query is typically a default test query that does not result in any IO / disk access.  Examples of validation queries for different databases are given below:
  • Oracle - select 1 from dual
  • SQL Server - select 1 (tested on SQL-Server 9.0, 10.5 [2008])
  • Postgresql - select 1
  • hsqldb - select 1 from INFORMATION_SCHEMA.SYSTEM_USERS
  • derby - values 1
  • H2 - select 1
  • DB2 - select 1 from sysibm.sysdummy1
  • Mysql - select 1
We were using Spring Boot that uses the Tomcat JDBC Connection Pool by default. We tried setting all the parameters required for the validation check as given here, but in vain. 
Finally we decided to go with HikariCP connection pool as suggested here

First, we added the dependency in our maven pom as shown below. 

Next we added the following properties in our application.properties file. We will pick up these properties to create our HikariCP connection. pool.
Finally we wrote a @Configuration bean to wire up the HirakiCP datasource as below:

Saturday, October 07, 2017

Accessing Spring beans in a static method

Very often, we need to access a Spring bean in a static method. This is a bit tricky as you can only directly access static fields in a static method and Spring beans are not static.

But there are a couple of neat tricks that we can employ. One option is given below wherein we create a static field and populate it in the constructor when Spring auto-wires the bean. In the below example, MyThing is a spring managed (and created) bean.

Another option is to store the ApplicationContext instance in a static field and use that to get any bean managed by Spring. Sample code given below.


Spring beans annotations, scope and loading sequence

Spring beans can have different scopes as listed here . By default, the scope is singleton and the bean is eagerly initialized on start-up. If you want to lazy load a bean, then you have to decorate it with the @Lazy annotation.

By default, a bean has singleton scope and does not have the lazy-init property set to true. Hence by default all beans are eagerly created by the application context. This will affect the time taken for your Spring Boot application to start-up. 

The @Lazy annotation can be used on both @Component and @Bean. It is also important to understand how objects are constructed in Spring. The whole Spring framework works on the fundamental design pattern of 'Inversion of Control'or 'Dependency Injection'. Using Spring, you just declare components and Spring wires them up and manages the dependency between them.

When we mark a class with a @Component annotation, Spring will automatically create an instance of this class (make a bean) by scanning the classpath. This bean can be assessed anywhere in your application using the @Autowired annotation on fields, constructors, or methods in a component.

But what if you don't want Spring to auto-create the bean and want to control how the bean is constructed. To do this, you write a method that constructs your bean as you want and annotate that method with the @Bean tag. Please note that the @Bean annotation decorates a method and not a class. Thus a method marked with the @Bean annotation is a bean producer.

But how will Spring know where you have created @Bean annotated methods? Scanning all the classes will take a lot of time. Hence you annotate these classes which have @Bean methods with two annotations - @Configuration and @ComponentScan.
@Configuration annotation marks a class as a source of the bean definitions.
@ComponentScan annotation is used to declare the packages where Spring should scan for annotated components.

An excellent cheat sheet on Spring Annotations is available here.

Ruminating on Builder design pattern in Spring Security

While configuring Spring Security, you can use the properties file to configure the filters and other classes, but another popular option is declaring the configuration using code - using the Builder design pattern.

Understanding the way to use the builder is a bit tricky. Lets look at the below code and try to dissect it.

The 'http' object represents the org.springframework.security.config.annotation.web.builders.HttpSecurity object and is similar to Spring Security's XML element.
The .authorizeRequests() method returns the ExpressionInterceptUrlRegistry method that can be used to add HTTP URL matching patterns. After this, you can add additional information as required. 

A good blog-post explaining more such options is given here - https://www.javabullets.com/securing-urls-using-spring-security/

Wednesday, October 04, 2017

Spring Sleuth Tips and Tricks

Spring Sleuth coupled with Zipkin in an excellent tool to measure the performance bottlenecks in your distributed microservices architecture.

Spring Sleuth automatically creates HTTP and Messaging interceptors when calls are made to other microservices over REST or RabbitMQ. But what if you want to instrument the time taken to execute a method or time take for a third-party service to respond?

Initially we tried with the @NewSpan annotation on methods, but soon realized that this annotation DOES NOT work in the same class. A detailed discussion thread on this limitation is available here - https://github.com/spring-cloud/spring-cloud-sleuth/issues/617.

The core reason is that the proxies that Spring creates do not work for the method calls in the same object. Sleuth creates a proxy of the object so if we are calling a proxied method from the proxy itself then the method will be called directly without going through the proxy logic, hence no span will be created. Hence we need to manually add the span as follows and it would show up in the beautiful Zipkin UI.

Another great blog post around these techniques is here - http://www.baeldung.com/spring-cloud-sleuth-single-application

Tuesday, October 03, 2017

Parsing an arbitrary JSON string and extracting values

Developers use libraries such as Jackson to convert JSON strings to Java objects and vice-versa.
But what if you need to parse a JSON string and do not know the complete schema of it. Since you do not have the complete schema of the JSON, you cannot create the Java class.

Another situation could be when you don't want to create a Java class and just want to extract a few values from the JSON string. Given below are 2 techniques for achieving this:

Option 1:

Convert the JSON string to a HashMap. All libraries such as Jackson, Gson provide methods to convert the JSON string into a map. Then by using standard map functions, you can extract the values you are interested in. Sample code available here.

Option 2: 

Convert the JSON string into a 'tree model' object. Jackson has a default tree model object called JsonNode. Jackson also supports JSON Pointer Expression, so you can directly access a node value like this:


Monday, September 04, 2017

REST APIs for IoT?

Currently MQTT continues to be the most popular standard for IoT devices to communicate to the cloud. But for many scenarios, even REST can be a simple answer - especially if the network latency is low.

IoT Device as REST Server
Zetta (http://www.zettajs.org/) is a NodeJS framework that can convert any IoT device into an API endpoint (REST server). It is so lightweight that it can run on Raspberry Pi and other gateways. The minimum hardware requirements to run Zetta are 500MB RAM, 1 GB storage and 500 MHz CPU.

IoT Device as REST Client
dweet.io is another new project that simplifies publishing of device data to the cloud.  dweet.io enables your machine and sensor data to become easily accessible through a web based RESTful API, allowing you to quickly make apps or simply share data. 

Wednesday, August 16, 2017

Cool open source tool for ER diagrams

Recently one of my colleagues introduced me to a cool Java tool called SchemaSpy.
SchemaSpy is a java tool that can be used to create beautiful database documentation and also ER diagrams.

Would highly recommend perusing the following links and utilizing this tool:

Monday, August 07, 2017

Ruminating on Telematics standards

While creating training material for our internal staff on telematics, we found the below site that gives a very good introduction to novices on the basics of telematics -
http://www.equipmentworld.com/tag/telematics-101/

I had covered the business benefits of telematics in this blog post. One of the fundamental challenge for the wide spread adoption of telematics has been the lack of standards around aggregating data from multiple TCU (Telematics Control Unit) vendors.

But now, the Association of Equipment Managers (AEM) and Association of Equipment Management Professionals (AEMP) have created a ISO standard to help systems
from different manufacturers all speak the same language - http://www.aemp.org/?page=AEMPTelematicsAPI

The AEMP 1.0 standard defined a XML data format that is available here.
An example of the AEMP REST API by John Deere is available here (in both XML and JSON formats) - https://developer.deere.com/content/documentation/aemptwo/fleet.htm#

The newer version (v 2.0) of the AEMP telematics standard is now an ISO standard (ISO 15143-3).
An example of ISO 15143-3 telematics API (with XML data format) is here.

As part of this standard, equipment makers would report the following data points and 42 fault codes using a standard protocol that will allow mixed equipment fleets to be managed from a single application.

Data Elements
  1. Serial Number
  2. Asset ID
  3. Hours
  4. Location
  5. GPS Distance Traveled
  6. Machine Odometer
  7. Fault Codes
  8. Idle Time
  9. Fuel Consumption
  10. Fuel Level
  11. Engine Running Status (on/off)
  12. Switch Input Events
  13. PTO Hours
  14. Average Load Factor
  15. Max Speed
  16. Ambient Air Temp
  17. Load Counts
  18. Payload Totals
  19. Active Regen Hours

Fault Codes
  1. Engine coolant temperature
  2. Engine oil pressure
  3. Coolant level
  4. Engine oil temperature
  5. Hydraulic oil temperature
  6. Transmission oil temperature
  7. Engine overspeed
  8. Transmission oil pressure
  9. Water in fuel indicator
  10. Transmission oil filter, blocked
  11. Air cleaner
  12. Fuel supply pressure
  13. Air filter pressure drop
  14. Coolant system thermostat
  15. Crankcase pressure
  16. Cool water temp, outlet cooler
  17. Rail pressure system
  18. ECU temperature
  19. Axle oil temp, front
  20. Axle oil temp, rear
  21. Intake manifold temperature
  22. Secondary steering pressure
  23. After-treatment reagent internal filter heater
  24. Crankcase ventilation 
  25. Boost pressure
  26. Outgoing brake pressure status
  27. Brake pressure, output
  28. Brake pressure, output
  29. All wheel drive hydraulic filter, blocked
  30. All wheel drive hydraulic filter, blocked
  31. Brake pressure acc status
  32. Injection control pressure
  33. Brake circuit differential pressure
  34. Brake pressure acc charging
  35. Brake pressure actuator
  36. Reserve steering control pressure
  37. HVAC water temperature
  38. Rotor pump temperature
  39. Refrigerant temperature
  40. Alert and alarms help power down machines quickly
  41. Multiple pressure indicators
  42. Switching gear events

Monday, July 17, 2017

Performance benefits of HTTP/2

The HTTP/2 protocol brings in a lot of benefits in terms of performance for modern web applications. To understand what benefits HTTP/2 brings, it is important to understand the limitations of HTTP/1.1  protocol.

The HTTP protocol only allows one 'outstanding' request per TCP connection - i.e. if a page is downloading 100 assets, only one request can be made per TCP connection. Browsers typically open 4-8 TCP connections per page to load the page faster (parallel requests), but this results in network congestion.

The HTTP/2 protocol is fully multiplexed - i.e. it allows multiple parallel requests/responses on the same TCP connection. It also compresses HTTP headers (reduce payload size) and is a binary protocol (not textual). The protocol also allows servers to proactively push responses directly to the browsers cache, thus avoiding HTTP requests.

The following links give an excellent explanation of the HTTP/2 protocol and its benefits.
https://http2.github.io/faq/
https://bagder.gitbooks.io/http2-explained/en/

The following site shows the performance of HTTP/2 compared to HTTP/1.1 when loading tons of images from the server - https://www.tunetheweb.com/performance-test-360/

There is another site - https://www.httpvshttps.com/ which states that HTTPS is faster than HTTP, but that is because HTTPS was using HTTP/2 by default in the background.

Sunday, July 16, 2017

Ruminating on DMN - a new modeling notation for business rules

We all know how the BPMN standard helped in interoperability of business process definitions across lines of business and also across organizations. But there was one element missing in the BPMN standard - i.e. the ability to model business rules/decisions in a standard way so that they can be interoperable.

Most of the current business rule engines (aka BRMS) follow a proprietary standard for modeling rules and migration from one BRMS suite to another was usually painful. Hence, we are pretty excited about DMN (Decision Model and Notation), a standard from OMG that can be considered complimentary to BPMN.

Many Rule Modeling tools such as IBM Decision Composer, OpenRules and Drools already support the DMN standard. Part of the DMN standard is also a Friendly Enough Expression Language (FEEL). FEEL defines a syntax for embedding expressions in the rule.
An excellent tutorial about DMN using decision tables can be found here - https://camunda.org/dmn/tutorial/. A good video tutorial on IBM decision composer is here - https://www.youtube.com/watch?v=hG2rGDhowcU

It is important to understand the difference between a decision modeling tool and a decision execution engine. A decision modeling tool would give a GUI to define the decision model using the DMN standard. This decision model can be exported as a *.dmn file (which is in XML format - example here). The decision execution engine actually is the runtime to execute the model. Most of the existing rule engines have extended their support to run rules defined in DMN model. For example, the new IBM Decision Composer is a web-based rule modeling tool, but the modeled rules can be exported to run in the existing ODM engines.

So in theory, the DMN model created by one tool can be used to execute the model in another tool . Some folks have tested this and noted down the challenges in this whitepaper - The Effectiveness of DMN Portability

Sunday, April 16, 2017

Ruminating on the single-threaded model of NodeJS and Node-RED

Many developers and architects have asked me questions on the single-threaded nature of NodeJS and whether we can use it effectively on multi-core machines. Since Node-RED is based on NodeJS, whatever is valid for NodeJS is also valid for Node-RED.

NodeJS has a single thread per process model. When you start NodeJS, it would start a single process with one thread in it. Due to it's non-blocking IO paradigm, it can easily handle multiple client requests concurrently, as there is no thread that is blocking for any IO operation.

Now the next question is around the optimal usage of multi-core machines. If you have 8 core or 16 core machines on your cloud and just run one NodeJS process on it, then you are obviously under-utilizing the resources you have paid for. In order to use all the cores effectively, we have the following options in NodeJS:

1) For compute-intensive tasks: NodeJS can fork out child processes for heavy-duty stuff - e.g. if you are processing images or video files, you can fork out child NodeJS processes from the parent process. Communication between parent and child processes can happen over IPC.

2) For scaling REST APIs: You can start multiple NodeJS processes on a single server - e.g. if you have 8 cores, start 8 NodeJS processes. Put a load balancer (e.g. nginx) in front of these processes. You would anyways have some kind of load-balancing setup in your production environment and the same can be leveraged.

3) Cluster support in newer versions on NodeJS: In the latest versions of NodeJS, you have support for Clusters. Cluster enables us to start multiple NodeJS processes that all share the same server port - e.g. 8 NodeJS processes all listening to port 80.  This is the best OOTB option available today in NodeJS.

Hence it is indeed possible to effectively utilize NodeJS on multi-core machines. A good discussion on the same is available on StackOverflow here.

Another interesting question that is often asked is whether Java developers should jump ship and start development in NodeJS because of its speed and simplicity. Also, NodeJS evangelists keep harping about the fact that the single-threaded nature of Node removes the complexity of multi-threading, etc.

Here are my thoughts on this:

1) Today Java has first class support for non-blocking IO, similar to NodeJ. Take a look at the superb open-source  Netty library that powers all the cloud services at Apple.

2) Most of the complexity of multithreading is abstracted away by popular frameworks such as Spring - e.g. Spring Cloud enables developers to write highly scalable and modular distributed applications that are cloud-native without dealing with the complexities of multi-threading.

3) The Servlet 3.0 specification introduced async requests and the Spring REST MVC framework also supports non-blocking REST services as described here.  

Thus today, all the goodies of NodeJS are available on the Java platform. Also plenty of  Java OSS to kick start your development with all the necessary plumbing code.

Ruminating on a nifty tool called 'ngrok'

Many a times developers want to test their APIs running on their development machine against a mobile app. To do so, developers typically do the following:

Option A) Create a hotspot Wifi on their development machine and connect the mobile phone to this Wifi. Since both the mobile phone and their development machine is on the same network, the mobile app can call the APIs running on the localhost of the developer machine.

Option B) We host the APIs on a public cloud (with a public IP), so that the mobile app can access it. For this, the developer needs to setup a automated CI/CD process that would deploy the REST APIs on the middleware.

There is a third option where-in, the developer can run the API server on his local machine and just create a proxy server on the internet that would securely tunnel the request to this local machine through the firewall. This can be done using ngrok - https://ngrok.com/

ngrok can be setup on your local machine within minutes and can serve as a great debugging tool. It has a network sniffer built in that can be assessed from http://localhost:4040
We would highly recommend this tool for any mobile/API developer. 

Monday, February 13, 2017

IoT energy standards in Europe for Smart Home / Smart Grid

European governments are pretty energy conscious with a number of initiatives in the EMEA region around smart energy grids. Jotting down some of the energy standards that are relevant in Europe.

In order to successfully build a smart grid system, it is important to have standards through which home appliances can publish information on how much energy they are using in real-time, which in turn can be provided to consumers and energy management systems to manage energy demands more effectively.

Smart appliances are also technically very heterogeneous, and hence there is a need for standardized interfaces. The standardization can happen at two levels - the communication protocol level or the message-structure level.
  • SAREF (Smart Appliances REFerence) is a standard for exchange of energy related information between home appliances and the third-party energy management systems. SAREF creates a new reference language for energy-related data. 
  • EEBus - A non-profit organization for interoperability  in sectors of smart energy, smart home & building. EEBus has defined a message-structure standard called as SPINE (Smart Premises Interoperable Neutral-message Exchange). SPINE only defines message structure at the application level (OSI Layer 7) and is completely independent from the used transport protocol. SPINE can be considered a technical realization of the SAREF onthology. 
  • Energy@Home - Another non-profit organization that creating an ecosystem for smart-grid. 
  • SAREF4EE - The extension of SAREF for EEBus and Energy@Home initiatives. By using SAREF4EE, smart appliances from different manufacturers that support the EEBus or Energy@Home standards can easily communicate with each other using any energy management system at home or in the cloud.

A good presentation illustrating the current challenges in Smart Home is available here - http://www.slideshare.net/semanticsconference/laura-daniele-saref-and-saref4ee-towards-interoperability-for-smart-appliances-in-the-iot-world

Thursday, December 22, 2016

Ruminating on AMQP internals and JMS equivalent semantics

Many Java developers often use JMS APIs to communicate with the message broker. The JMS API abstracts away the internal implementation complexities of the message broker and provides a unified interface for the developer.

But if you are interested in understanding the internals of AMQP, then the following old tutorial on the Spring site is still the best - https://spring.io/blog/2010/06/14/understanding-amqp-the-protocol-used-by-rabbitmq/ and https://spring.io/understanding/AMQP

By understanding the core concepts of exchange, queue and binding keys, you can envisage multiple integration patterns such as pub/sub, req/reply, etc.

Jotting down snippets from the above sites on the similarities between JMS and AMQP.

JMS has queues and topics. A message sent on a JMS queue is consumed by no more than one client. A message sent on a JMS topic may be consumed by multiple consumers. AMQP only has queues. AMQP producers don't publish directly to queues. A message is published to an exchange, which through its bindings may get sent to one queue or multiple queues, effectively emulating JMS queues and topics.

AMQP has exchanges, routes, and queues. Messages are first published to exchanges with a routing key. Routes define on which queue(s) to pipe the message. Consumers subscribing to that queue then receive a copy of the message. If more than one consumer subscribes to the same queue, the messages are dispensed in a round-robin fashion.

It is very important to understand the difference between routing key and binding key. Publishers publish messages to the AMQP Exchange by giving a routing key; which is typically in the form of {company}.{product-category}.{appliance-type}.{appliance-id}.....

You create a queue on AMQP by specifying the binding key. For e,g, if your binding key is {company}.{product-category}.# then all messages for that product category would come to this queue.

While creating subscribers, you have two options - either bind to an existing queue (by giving the queue name) or create a subscriber private queue by specifying a binding key. 

Thursday, December 01, 2016

Ruminating on non-blocking REST services

In a typical REST service environment, a thread is allocated to each incoming HTTP request for processing. So if you have configured your container to start 50 threads, then you can handle 50 concurrent HTTP requests and any additional HTTP request would be queued till a thread is free.

Today, using principles of non-blocking IO and reactive programming, we can break the tight coupling between a thread and a web request. The Servlet-3 specification also supports async requests as explained in this article - http://www.journaldev.com/2008/async-servlet-feature-of-servlet-3. The core idea is to  delegate the long-running or asynchronous processing to another background thread (Task Executor), so that the HTTP handler threads are not starved.

One might argue that we are just moving the 'blocking thread' bottleneck from the HTTP threads to the backend thread pool (Task Executors). But this does result in better performance, as we can serve more HTTP clients.

The Spring MVC documentation on Async REST MVC is worth a perusal to understand the main concepts - http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#mvc-ann-async

A good article that demonstrates how Sprint MVC can be used to build non-blocking REST services that call a backend service exposed via JMS - https://dzone.com/articles/non-blocking-rest-services-with-spring. All the source code for this example is available here. One must understand the basics of Spring Integration framework to work with the above example.

The following discussion thread on StackOverFlow would also give a good idea on how to implement this - http://stackoverflow.com/questions/14134086/request-queue-in-front-of-a-rest-service

The following blog-post shows another example using Spring Boot on Docker - http://spiral-architect.pl/2016/04/25/asynchronous-processing-of-restful-web-services-requests-in-spring-and-java-ee-worlds-wrapped-in-docker-part-1-spring/




Concurrency and scaling strategies for MDPs and MDBs

Message Driven Beans offer a lot of advantages over a standalone JMS consumer as listed in this blog-post. The Spring framework provides us with another lightweight alternative called MDP (Message Driven POJOs) that has all the goodies of MDB, but without any heavy JEE server baggage.

The strategy for implementing a scalable and load-balanced message consumption solution is a bit different in MDBs vs. MDPs.

An MDB is managed by the JEE container and is 'thread-safe' by default. The JEE container maintains a pool of MDBs and allows only one thread to execute an MDB at one time. Thus if you configure your JEE container to spawn 10 MDBs, you can have 10 JMS consumers processing messages concurrently.

Spring MDPs are typically managed by the DMLC (DefaultMessageListenerContainer). Each MDP is typically a singleton, but can have multiple threads running through it. Hence an MDP is NOT 'thread-safe' by default and we have to make sure that our MDPs are stateless - e.g. do not have instance variables, etc. The DMLC can be configured for min/max concurrent consumers to dynamically scale the number of consumers. All connection, session and consumer objects are cached by Spring to improve performance. Jotting down some important stuff to remember regarding DMLC from the Spring Java Docs.

"On startup, DMLC obtains a fixed number of JMS Sessions to invoke the listener, and optionally allows for dynamic adaptation at runtime (up to a maximum number). Actual MessageListener execution happens in asynchronous work units which are created through Spring's TaskExecutor abstraction. By default, the specified number of invoker tasks will be created on startup, according to the "concurrentConsumers" setting."

It is also possible in Spring to create a pool of MDPs and give one to each TaskExecutor, but we have never tested this and never felt the need for this. Making your MDPs stateless and hence 'thread-safe' would suffice for almost all business needs. Nevertheless, if you want to create a pool of MDP objects, then this link would help - http://forum.spring.io/forum/spring-projects/integration/jms/25794-mdb-vs-mdp-concurrency

Another good article on Spring MDP scaling that is worth perusing is here - http://bsnyderblog.blogspot.in/2010/05/tuning-jms-message-consumption-in.html


Implementing a Request Response with Messaging (JMS)

Quite often, we need to implement a request/response paradigm on JMS - e.g. calling a backend function on a mainframe or a third-party interface that has an AMQP endpoint. To implement this, we have the following 3 options:

Option 1: The client creates a temporary queue and embeds the name-address of this queue in the message header (JMSReplyTo message header) before sending it to the server. The server processes the request message and writes the response message on to the temp queue mentioned in the JMSReplyTo header.
The advantage of this approach is that the server does not need to know the destination of the client response message in advance.
An important point to remember is that a temporary queue is only valid till the connection object is open. Temporary queues are generally light-weight, but it depends on the implementation of the MOM (message-oriented middleware).

 Option 2: Create a permanent queue for response messages from the server. The client sets a correlation ID on the message before it is sent to the server. The client then listens on the response queue using a JMS selector - using the correlation ID header value as the selector property. This ensures that only the appropriate message from the response queue is delivered to the appropriate client.

Option 3: Use a combination of Option 1 and Option 2.  Each client creates a temporary queue and JMS Consumer on startup. We need to use this option if the client does not block for each request. So the client can keep on sending messages and listen to multiple response messages on the temp queue and then match the req/res using the correlation ID.

Spring has implemented this design pattern in the JMS Outbound Gateway class - http://docs.spring.io/spring-integration/reference/html/jms.html#jms-outbound-gateway

The JMS spec also contains an API for basic req/res using temporary queues - QueueRequestor and TopicRequestor. So if your MOM supports JMS, then you can use this basic implementation if it suffices your needs.

The following links would serve as a good read on this topic.
http://www.enterpriseintegrationpatterns.com/patterns/messaging/RequestReplyJmsExample.html
http://activemq.apache.org/how-should-i-implement-request-response-with-jms.html

Monday, November 14, 2016

Ruminating on AMQP vs MQTT

While working on IoT projects, a lot of our customers ask us to recommend the open protocol to be used between the remote devices and the cloud-based IoT platform - e.g. MQTT or AMQP.

First and foremost, it is important to note that both MQTT and AMQP operate on top of TCP/IP and hence can use TLS to encrypt the transport layer. Jotting down some of the important characteristics of each protocol to help you make a decision.

MQTT
  • Very low resource (CPU/memory) footprint - suitable to constrained IoT devices. 
  • Support for pub-sub model (topics), but NO support for peer-to-peer messaging (queues). 
  • Only supports binary message format. So your JSON strings would need to be converted to bytes by your MQTT client - (be cautious about encoding/decoding especially UTF)
  • Does not support advanced messaging features such as transactions, custom message headers, expiration, correlation ID, etc. 
AMQP
  • Was designed for interoperability between different messaging middleware (MOM). 
  • Supports advanced features such as reliable queuing, transactions, etc.
  • A lot of fine-grained control available to control all the features. 
  • Useful for preventing vendor lock-in. 
Final verdict (recommendation) - Use MQTT for IoT applications and use AMQP for enterprise messaging.

Some good reading links provided below:

https://lists.oasis-open.org/archives/amqp/201202/msg00086/StormMQ_WhitePaper_-_A_Comparison_of_AMQP_and_MQTT.pdf

https://www.linkedin.com/pulse/mqtt-enterprise-matt-pavlovich?articleId=8012630610818408760