Sunday, May 31, 2020

Ruminating on Mutual Authentication

In mutual authentication, both the server and the client have digital certificates and authenticate each other. If both the server and the client use CA-signed certificates, then everything works OOTB and there is no need to import any certificates, because the default trust stores of both the server and the client already contain the root certificates of most CAs.

But during testing and in lower environments, teams often use self-signed certificates. To enable mutual authentication using self-signed certificates, we have 2 options. 
  • Peer-2-Peer: Create a client certificate for each agent and import each of these certs into the trust store of the server. 
  • Root-cert-derived client certificates: Create a client root certificate and use it to create/derive client certs for each agent. Then you only have to import the client root certificate into the server trust store (and not the certificates of all the agents).    
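For illustration, here is a minimal Java sketch (file name and password are placeholders) showing how a server could load a trust store containing the self-signed client root certificate and build an SSLContext from it; the server's own key managers are omitted for brevity.

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

public class MutualTlsTrustStore {

    public static SSLContext buildServerSslContext() throws Exception {
        // Trust store that holds the self-signed client root certificate (path/password are placeholders)
        KeyStore trustStore = KeyStore.getInstance("JKS");
        try (InputStream in = Files.newInputStream(Paths.get("server-truststore.jks"))) {
            trustStore.load(in, "changeit".toCharArray());
        }

        TrustManagerFactory tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);

        // Key managers (the server's own certificate) are omitted here for brevity
        SSLContext sslContext = SSLContext.getInstance("TLS");
        sslContext.init(null, tmf.getTrustManagers(), null);
        return sslContext;
    }
}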

Thursday, May 28, 2020

Ruminating on Azure RTOS

Microsoft acquired ThreadX from Express Logic and re-branded it as Azure RTOS. ThreadX was already a popular RTOS, used by more than 6.5 billion devices worldwide.
** Gartner predicts that by 2021, one million new IoT devices will come online every hour of every day. In 2019, there were approximately 27 billion IoT devices.

Besides ThreadX, Azure RTOS has also packaged other modules such as GuiX, FileX, NetX, USBX, etc. 

The below link points to an interesting conversation with Bill Lamie - founder of ThreadX. 

Jotting down some interesting points below. 
  • The most important characteristic of an RTOS is size. RTOS size is typically in KB, whereas a general-purpose OS is in MB or GB. Because of this small size, an RTOS can be used in the smallest of devices, even battery-powered ones - e.g. fitness wearables, medical implants, etc. So essentially an RTOS is great for constrained/smaller devices. 
  • An RTOS is "real-time" because the OS responds to real-time events in a deterministic time frame. An RTOS guarantees that certain actions happen on IoT devices within defined time limits - a property called determinism. 
  • The size of Azure RTOS can scale down all the way to 2 KB. A cloud-connected configuration takes around 50 KB.
  • Azure RTOS also brings in best-of-class security with multiple security certifications. 
  • The complete source code of Azure RTOS is open-source and available on GitHub at https://github.com/azure-rtos
Before acquiring Express Logic, Microsoft already had an offering called Azure Sphere OS that was positioned as an OS for edge devices. Azure Sphere is Linux-kernel based and offers strong security, but it cannot run on highly constrained devices; and because it is not an RTOS, it cannot provide deterministic execution. 

Though Microsoft is currently stating that Azure RTOS and Azure Sphere are complementary, only time will tell which OS the industry adopts. 

Saturday, April 18, 2020

Performance instrumentation via DataDog

Recently my team was looking for a solution to implement custom metrics in Java microservices that would ultimately be fed to DataDog. We explored the following options to add custom performance instrumentation.
  • Using StatsD: StatsD is a network daemon that runs on the Node.js platform and listens for statistics, like counters and timers, sent over UDP or TCP, and sends aggregates to one or more pluggable backend services (e.g., Graphite, DataDog). StatsD is very popular and has become a de facto standard for collecting metrics. Open-source libraries are available in all popular languages to define and collect metrics. More information on StatsD can be found here - https://github.com/statsd/statsd
  • Using DogStatsD: DogStatsD is a custom daemon by DataDog. You can consider it an extension of StatsD with support for many more metric types. This daemon needs to be installed on the node where you need to collect metrics. If a DataDog agent is already installed on the node, then this daemon is started by default. DataDog has also provided a Java library for interfacing with DogStatsD. More information can be found here - https://docs.datadoghq.com/developers/dogstatsd/
  • Using the DataDog HTTP API: DataDog also exposes a REST API that can be used to push metrics to the DataDog server. But it does not make sense to push each and every metric over HTTP. We would need some kind of aggregator on the client side that collates all data for a time period and then makes an HTTP call to the DataDog server. https://docs.datadoghq.com/api/
  • Using the DropWizard bridge: If you are already using the popular DropWizard metrics library, then the developers at Coursera have created a neat open-source library that acts as a bridge between DropWizard and DataDog - https://github.com/coursera/metrics-datadog
  • Using the Micrometer metrics facade: If you are using Spring Boot, then this is the most seamless option available. Spring Boot Actuator has default support for the Micrometer facade library and already provides a Datadog meter registry implementation that can be used to push metrics to DataDog. The advantage of using the Micrometer facade is that we can easily switch to any other metrics backend - e.g. from DataDog to AWS CloudWatch. We can also use a composite registry to publish the same metrics to multiple backends. 
We finally decided to use the Micrometer metrics library, as all our microservices were on Spring Boot. Spring Boot 2 has many OOTB metrics configured in micrometer that are of tremendous value for DevOps teams - https://spring.io/blog/2018/03/16/micrometer-spring-boot-2-s-new-application-metrics-collector

Behind the scenes, the Micrometer Datadog registry uses the DataDog HTTP APIs to push metrics to the server. A background thread collects/aggregates data and then makes a periodic call to the DataDog server. Perusing the following source code files gives a good overview of how this works: 
https://git.io/JfJDC
https://git.io/JfJD8

To configure DataDog in Spring Boot, you just need to set the following 2 properties. 
management.metrics.export.datadog.api-key=YOUR_KEY //API key 
management.metrics.export.datadog.step=30s //the interval at which metrics are sent to Datadog

It is also very easy to add Micrometer instrumentation to Spring Boot code. Sample code below: 
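As a minimal sketch (the metric names are illustrative, not from the original post), a custom counter and timer can be registered through the auto-configured MeterRegistry:

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Service;

@Service
public class OrderMetrics {

    private final Counter ordersCreated;
    private final Timer checkoutTimer;

    public OrderMetrics(MeterRegistry registry) {
        this.ordersCreated = registry.counter("orders.created");          // custom counter
        this.checkoutTimer = registry.timer("orders.checkout.latency");   // custom timer
    }

    public void recordCheckout(Runnable checkout) {
        checkoutTimer.record(checkout);   // times the wrapped call
        ordersCreated.increment();        // bumps the counter
    }
}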

Wednesday, April 15, 2020

Kafka poll() vs heartbeat()

In older versions of Kafka, the consumer was responsible for polling the broker frequently to prove that it is still alive. If the consumer does not poll() within a specified time-limit, then the broker considers that consumer to be dead and starts re-balancing the messages to other consumers.

But in the latest versions of the Kafka consumer, a dedicated background heartbeat thread is started. This heartbeat thread sends periodic heartbeats to the broker to say - "Hey, I am alive and kicking! I am processing messages and will poll() again soon".

Thus the newer versions of Kafka decouple polling functionality and heartbeat functionality. So now we have two threads running, the heartbeat thread and the processing thread (polling thread).
The heartbeat interval is defined by the heartbeat.interval.ms property (default = 3 seconds), and the broker considers the consumer dead if no heartbeat arrives within session.timeout.ms (default = 10 seconds).

Since there is a separate heartbeat thread now, the authors of the Kafka consumer decided to set the default for the polling timeout (max.poll.interval.ms) to INTEGER_MAX.
Hence, no matter how long the processing takes (on the processing/polling thread), the Kafka broker will never consider the consumer to be dead. Only if no poll() request is received within that interval is the consumer considered dead.
Caveat: If your processing has a bug (e.g. an infinite loop, or a call to a third-party web service that is stuck), then the consumer will never be pronounced dead and messages will keep piling up in that partition. Hence, it may be a good idea to set a realistic value for max.poll.interval.ms, so that the partition can be rebalanced to another consumer. 
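A minimal consumer configuration sketch is given below (broker address, group id, topic name and the chosen values are illustrative) showing the settings discussed above:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class TunedConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "order-processors");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("session.timeout.ms", "10000");      // broker waits at most 10s for a heartbeat
        props.put("heartbeat.interval.ms", "3000");    // heartbeat thread pings every 3s
        props.put("max.poll.interval.ms", "300000");   // processing thread must call poll() within 5 min

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                records.forEach(r -> System.out.println(r.value()));
            }
        }
    }
}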

The following 2 stackoverflow discussions helped us understand the above behaviour.

https://stackoverflow.com/questions/47906485/max-poll-intervals-ms-set-to-int-max-by-default
https://stackoverflow.com/questions/39730126/difference-between-session-timeout-ms-and-max-poll-interval-ms-for-kafka-0-10-0


Wednesday, January 22, 2020

Converting Java libraries to .NET DLLs

If you have a nifty Java library that you love and want to use in your .NET program, then have a look at this useful toolkit called IKVM.NET - https://www.ikvm.net/uses.html

ikvmc -target:library {mylib.jar}   (this will create mylib.dll)

Java libraries for SSH and Powershell automation

If you are doing some basic automation and want to execute commands on Linux or Windows, then the following open source libraries would help.

JSCH : http://www.jcraft.com/jsch/
JSch is a pure Java implementation of SSH2; once you connect to a Linux server, you can execute any command. A good tutorial is available here - https://linuxconfig.org/executing-commands-on-a-remote-machine-from-java-with-jsch

jPowerShell:  https://github.com/profesorfalken/jPowerShell
This is a simple Java API that allows you to interact with the PowerShell console. Sample code below:
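The original snippet is not reproduced here, but a minimal sketch along the lines of the jPowerShell documentation (the command is just an example) would look something like this:

import com.profesorfalken.jpowershell.PowerShell;
import com.profesorfalken.jpowershell.PowerShellResponse;

public class PowerShellDemo {

    public static void main(String[] args) {
        PowerShell powerShell = null;
        try {
            powerShell = PowerShell.openSession();
            // Execute a PowerShell command and read its console output
            PowerShellResponse response = powerShell.executeCommand("Get-Process");
            System.out.println(response.getCommandOutput());
        } finally {
            if (powerShell != null) {
                powerShell.close();   // always release the underlying PowerShell process
            }
        }
    }
}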


Saturday, June 08, 2019

On @Cacheable annotation of Spring

The @Cacheable annotation of Spring works like magic, allowing us to create caches easily with just one line of code.
But it is important to note that caching will NOT work if the calling method is in the same class!
This happens because Spring injects a proxy at runtime to implement caching, and if you call a cacheable method from within the same class, the call never goes through the proxy.
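A small sketch (class and method names are made up) shows the pitfall:

import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class RateService {

    @Cacheable("rates")
    public double fetchRate(String currency) {
        return callExpensiveRateApi(currency);   // cached when invoked through the Spring proxy
    }

    public double convert(double amount, String currency) {
        // WARNING: this is a self-invocation - it bypasses the proxy, so the cache is NOT used
        return amount * fetchRate(currency);
    }

    private double callExpensiveRateApi(String currency) {
        return 1.1;   // placeholder for a slow remote call
    }
}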

More information on this thread - https://stackoverflow.com/questions/16899604/spring-cache-cacheable-not-working-while-calling-from-another-method-of-the-s

Sunday, April 14, 2019

Ruminating on Digital Twin

A 'Digital Twin' is a digital replica of any asset, process or system.

Digital Twin of an Asset

Let's take for example a digital replica of a fixed asset like a CNC machine, or a moving asset like a truck. Using the power of IoT, we could create a digital twin of the asset on the cloud that has the same 'state' as the asset in the field.
Once you have the digital twin of an asset on the cloud, you can realize business value from it in the following broad areas:

  • Remote Monitoring and Diagnostics - using sensor data
  • Predictive Maintenance - using ML 
  • Enhancing Product Quality - using closed loop engineering and integrating your PLM data and Field Service data with IoT data
  • Transforming Consumer Experience - understand how consumers are using your products and build new differentiated digital capabilities such as hyper-personalization, etc. 
Digital Twin of a Process/System

A digital twin of a complete process would enable us to identify process bottlenecks and take necessary steps to streamline processes and improve the productivity. Such a capability is of immense business value in the process industry. 

Digital Twin of a Person (Relationship Graph)

This is something which all social media companies already have - e.g. FB, LinkedIn, Google, etc. Besides storing all personal attributes, a digital twin of a person would also create a relationship graph that shows how the person is connected to other people and leverage this graph for business value. There are ethical concerns around how much data we are sharing with these large social media companies and the potential misuse of that data. 

Wednesday, December 12, 2018

Blockchain in Healthcare

The usage of blockchain in healthcare is gaining traction. Any information that can be privately and securely shared between payers and providers is a good case for blockchain. Jotting down some of the blockchain use-cases that are being explored today -

1) Improving Provider Data Accuracy: Every year, Payers spend millions of dollars in maintaining an up-to-date record of their providers. Provider data management is crucial for maintaining an accurate provider directory. Accurate provider data is critical for connecting patients with appropriate network care providers.

But each Payer has an independent provider directory and repeats the process of collecting and validating provider credentials.
Having a secure blockchain-backed provider directory makes sense, as all Payers can collaborate and share provider information. The Synaptic Health Alliance was set up for this exact purpose. The Alliance views blockchain technology as a means to a critical end: ensuring that provider data is accurate and sharable for reliable use across the healthcare ecosystem. More information about this alliance can be found in its whitepaper.


2) Clinical Member Profile: Payers often need longitudinal clinical records of a member from different providers to manage the care better. Instead of spending time and effort in setting up brittle interfaces with hospital systems, payers and providers can use a private blockchain system to share clinical member records.
All data in the blockchain would be encrypted and immutable. A private blockchain can also help in regulatory compliance because it establishes a trusted audit trail.

HIPAA-compliant blockchain EHR platforms are already available in the market. 

Caveat: Today, large volumes of unstructured data (e.g. DICOM images, PDF files) are not cost-effective to store on a blockchain. It is recommended to store only the 'link references' to these resources in the blockchain.

Tuesday, November 27, 2018

Autonomous car levels

There are 2 systems of classification for autonomous cars prevalent today: the National Highway Traffic Safety Administration (NHTSA) levels and the Society of Automotive Engineers (SAE) levels.

A good article illustrating the various levels is here - https://jalopnik.com/whats-a-level-4-autonomous-car-this-chart-explains-eve-1785466324

Ruminating on POC vs. POV

Of late, it has become a fad to label any pilot project as a POV (Proof of Value) rather than a POC (Proof of Concept). But is there a real difference between a POV and a POC?
Jotting down my views below:
  • A POC is typically an internal project and is not exposed to the real end consumer. The objective of a POC is to validate whether the technology works or whether a concept is viable.
  • POCs can be used to explore emerging technologies and share knowledge within the team. POCs also help teams come up with more accurate estimation of stories. 
  • Thus a POC will prove that the technology works, but will it deliver the promised business value to the enterprise? 
  • A POV will prove the business value of a concept - it can be in terms of increased ROI, lower TCO, faster GTM or increased customer satisfaction. If these factors can be measured, then you are delivering a POV!

Saturday, October 20, 2018

Cloud computing from the trenches

My team is loving building applications on the cloud and scaling them. Some of the best practices that we implemented in the last few cloud projects are as follows:

1. Infrastructure as Code: It is imperative that you develop automation to build up and tear down your complete application infrastructure at the click of a button. This includes provisioning data services such as an RDBMS database or a NoSQL data store and populating them with data. Follow this with the deployment of your Docker containers containing your web apps/microservices - all managed by Kubernetes. Finally, run a few synthetic transactions to validate the complete setup.

In one of our projects, we could build-up and tear-down the complete pre-prod environment using automation scripts.

2. Stateless Services: In order to achieve seamless elastic scalability on the cloud, it is important to design your services to be completely stateless. Any state that needs to be saved should be kept in an external store. We have successfully used Redis as that external store for many stateful applications and for sharing data across microservices.

3. Circuit Breakers and Graceful Degradation: Make sure that service calls happen through a circuit breaker (e.g. Netflix Hystrix). This prevents overloading any system component in the event of a partial failure. Put in mechanisms for graceful degradation wherever possible - e.g. return data from a cache if the database is down. Such measures avoid cascading failures.

4. Rolling updates and Canary Deployments: Kubernetes supports rolling updates, so your downtime is reduced during service updates. Canary deployments reduce the risk of introducing new features while still enabling faster GTM.

5. Autoscaling: Automate for elastic scalability - e.g. if a microservice is running on 4 containers, you should be able to scale up and scale down at the click of a button.

6. Redundancy / High Availability: Make sure that all the infrastructure services you consume as managed services on the cloud have redundancy across geographies built in - data stores, messaging middleware, NoSQL stores, etc. Make sure that your own services are also deployed in a redundant manner across data centers.

Wednesday, October 17, 2018

Ruminating on Consumer Driven Contracts

One of my teams has successfully implemented the paradigm of consumer driven contracts in a recent digital transformation program. We were very happy with the Pact framework and the OOTB integration available in Java (Spring Boot).

Anyone still not convinced on using Consumer Driven Contracts should peruse the below link without fail - https://docs.pact.io/faq/convinceme

The fundamental advantage of Consumer Driven Contracts is that only the parts of the API that are actually used by the consumer get tested. Hence any provider behavior not used by current consumers is free to change without breaking tests.
When you are about to change the contract of a popular API, you can quickly check which consumers would be affected and where to focus your efforts.

It is important to remember that Pact should not be used for functional testing - it is to be used only for contract adherence. Pact can also be integrated into your CI/CD process, wherein you run all the consumer contracts as part of the build. 

Saturday, September 08, 2018

Ruminating on Kafka consumer parallelism

Many developers struggle to understand the nuances of parallelism in Kafka, so jotting down a few points from the Kafka documentation that should help.
  • Consumers label themselves with a consumer group name, and each record published to a topic is delivered to one consumer instance within each subscribing consumer group. Consumer instances can be in separate processes or on separate machines.
  • Publishers can publish events into different partitions of Kafka. The producer is responsible for choosing which record to assign to which partition within the topic. This can be done in a round-robin fashion simply to balance load or it can be done according to some semantic partition function (say based on some key in the record).
  • The partitions in the log serve several purposes. First, they allow the log to scale beyond a size that will fit on a single server. Each individual partition must fit on the servers that host it, but a topic may have many partitions, so it can handle an arbitrary amount of data. Second, they act as the unit of parallelism. 
Unlike other messaging middleware, parallel consumption of messages (aka load-balanced consumers) in Kafka is ONLY POSSIBLE using partitions. 

Kafka keeps one offset per [consumer-group, topic, partition]. Hence there cannot be more consumer instances within a single consumer group than there are partitions.
So if you have only one partition, you can have only one consumer (within a particular consumer group). You can of course have consumers across different consumer groups, but then the messages would be duplicated and not load-balanced. 
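As a small sketch (broker, topic and group names are illustrative): if the 'orders' topic has 4 partitions, running the snippet below as 4 separate processes with the same group.id gives you 4 load-balanced consumers; a 5th instance in the same group would sit idle.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class GroupedConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "order-processors");   // same group.id => partitions are divided among instances
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));   // topic assumed to have 4 partitions
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                records.forEach(r -> System.out.println("partition " + r.partition() + " -> " + r.value()));
            }
        }
    }
}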

Batch ETL to Stream Processing

Many of our customers are moving their traditional ETL jobs to real-time stream processing.
The following article is an excellent read on why Kafka is a great choice for unified batch processing and stream processing.

https://www.infoq.com/articles/batch-etl-streams-kafka

Snippets from the article:

  • Several recent data trends are driving a dramatic change in the old-world batch Extract-Transform-Load (ETL) architecture: data platforms operate at company-wide scale; there are many more types of data sources; and stream data is increasingly ubiquitous.
  • Enterprise Application Integration (EAI) was an early take on real-time ETL, but the technologies used were often not scalable. This led to a difficult choice with data integration in the old world: real-time but not scalable, or scalable but batch.
  • Apache Kafka is an open source streaming platform that was developed seven years ago within LinkedIn.
  • Kafka enables the building of streaming data pipelines from “source” to “sink” through the Kafka Connect API and the Kafka Streams API.
  • Logs unify batch and stream processing. A log can be consumed via batched “windows”, or in real time by examining each element as it arrives.

Thursday, August 30, 2018

Tips and Tricks for Thread Dumps

Tip #1: To find out the number of threads spawned by the JVM, run the following command: ps -eLF
This command will also print a column called 'LWP ID' (light-weight process ID) that prints the thread-id and the CPU utilization of that thread. This same thread-id can be correlated in the thread-dump obtained from the JVM.

Tip #2: The thread-dump can be obtained by using the following command: jstack PID
If you are using the Spring Boot framework, then you can also use the Actuator endpoints (e.g. /actuator/threaddump in Spring Boot 2) to download the thread-dump.

Tip #3: This thread-dump file can be uploaded to a cool online tool, http://fastthread.io/, which gives a nice report on the threads running inside the JVM that can be analyzed.

Wednesday, August 29, 2018

Ruminating on Thread Pool sizes and names

Recently we were analyzing the thread-dumps of some JVMs in production and found a large number of threads created in certain thread-pools. Since the names of the thread-pools were generic (e.g. thread-pool-3), it was very difficult to identify the code that spawned them.

The following code snippet should help developers properly name their thread-pools and also limit the number of threads. In a cloud environment, the default number of threads in a pool will vary based on the CPUs available - e.g. in the absence of an explicit pool size, the thread pool may grow to hundreds of threads.

We had seen this happen with the RabbitMQ Java client. The default RabbitMQ Java client uses a thread pool for callback messages (ACKs), and the size of this thread-pool depends on the number of CPUs. Since the JVM is not aware of the Docker config, all the processors on the host machine are counted and a large thread pool is created.

More details available here - https://github.com/logstash-plugins/logstash-input-rabbitmq/issues/93
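As a sketch of the kind of snippet referred to above (pool name and size are illustrative), a custom ThreadFactory gives every thread a meaningful name, and a fixed pool size keeps the thread count bounded regardless of the number of visible CPUs:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedThreadPools {

    public static ExecutorService newNamedPool(String poolName, int size) {
        ThreadFactory factory = new ThreadFactory() {
            private final AtomicInteger counter = new AtomicInteger(1);

            @Override
            public Thread newThread(Runnable runnable) {
                // The name shows up in thread dumps, e.g. "payment-callback-3"
                return new Thread(runnable, poolName + "-" + counter.getAndIncrement());
            }
        };
        // Fixed size, so the pool cannot balloon based on the number of visible CPUs
        return Executors.newFixedThreadPool(size, factory);
    }

    public static void main(String[] args) {
        ExecutorService pool = newNamedPool("payment-callback", 8);
        pool.submit(() -> System.out.println(Thread.currentThread().getName()));
        pool.shutdown();
    }
}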


Wednesday, July 18, 2018

Creating a RabbitMQ pipeline using listeners/publishers

In Event Driven Architectures, you often have to create a pipeline of event processing. One of my teams was using the Spring AMQP library and wanted to implement the following basic steps:
1. Read a message from the queue using a RabbitMQ channel.
2. Do some processing and transform the message.
3. Publish the message downstream using the same channel.

The sample code given below will help developers in implementing this.
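The original snippet is not reproduced here, but a minimal sketch using Spring AMQP's ChannelAwareMessageListener (exchange and routing key names are made up; the listener container must be configured for manual acknowledgement) could look like this:

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import org.springframework.amqp.core.Message;
import org.springframework.amqp.core.MessageProperties;
import org.springframework.amqp.rabbit.listener.api.ChannelAwareMessageListener;

public class PipelineListener implements ChannelAwareMessageListener {

    @Override
    public void onMessage(Message message, Channel channel) throws Exception {
        // 1. The container has already read this message from the inbound queue
        String payload = new String(message.getBody());

        // 2. Do some processing / transformation
        String transformed = payload.toUpperCase();

        // 3. Publish downstream on the SAME channel
        AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                .contentType(MessageProperties.CONTENT_TYPE_TEXT_PLAIN)
                .build();
        channel.basicPublish("pipeline.exchange", "stage2.key", props, transformed.getBytes());

        // 4. Ack the inbound message only after the downstream publish succeeds
        // (the container must be set to AcknowledgeMode.MANUAL)
        channel.basicAck(message.getMessageProperties().getDeliveryTag(), false);
    }
}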

Another scenario is when you have multi-threaded code publishing messages to RabbitMQ; in that case you can use the RabbitTemplate with channel caching. Sample code given below.
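A minimal sketch for the multi-threaded publisher case (host, exchange and routing key are placeholders): the CachingConnectionFactory caches channels, and the thread-safe RabbitTemplate can be shared across threads.

import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.core.RabbitTemplate;

public class MultiThreadedPublisher {

    public static void main(String[] args) {
        CachingConnectionFactory factory = new CachingConnectionFactory("localhost");
        factory.setChannelCacheSize(25);   // channels are cached and reused across threads

        RabbitTemplate template = new RabbitTemplate(factory);   // RabbitTemplate is thread-safe

        Runnable task = () -> template.convertAndSend("pipeline.exchange", "stage2.key",
                "hello from " + Thread.currentThread().getName());

        for (int i = 0; i < 10; i++) {
            new Thread(task, "publisher-" + i).start();
        }
    }
}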

Wednesday, June 27, 2018

Simple multi-threading code

Very often, we have to process a list of objects. Using a for-loop would process these objects in sequence. But if we want to process them in parallel, then the following code snippet will help.
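A minimal sketch (the item list and the 'processing' are placeholders) using an ExecutorService, with a parallel stream shown as a one-line alternative:

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelProcessing {

    public static void main(String[] args) throws InterruptedException {
        List<String> items = Arrays.asList("a", "b", "c", "d");

        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (String item : items) {
            pool.submit(() -> process(item));    // each item is processed on a pool thread
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);

        // Java 8+ one-liner alternative: items.parallelStream().forEach(ParallelProcessing::process);
    }

    private static void process(String item) {
        System.out.println(Thread.currentThread().getName() + " processed " + item);
    }
}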


Simple Utils for File & Stream IO

I still see many developers reinventing the wheel when it comes to IO/Stream operations. There are so many open-source libraries for doing this for you today :)

Given below are some code snippets for the most common use-cases for IO.
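A few illustrative snippets (file names are placeholders) using Apache Commons IO and the JDK's java.nio.file API:

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;

public class IoSnippets {

    public static void main(String[] args) throws IOException {
        // Read a whole file into a String (Apache Commons IO)
        String content = FileUtils.readFileToString(new File("input.txt"), StandardCharsets.UTF_8);

        // Copy an InputStream into a file (Commons IO)
        try (InputStream in = Files.newInputStream(Paths.get("input.txt"))) {
            FileUtils.copyInputStreamToFile(in, new File("copy.txt"));
        }

        // Read a stream fully into a String (Commons IO)
        try (InputStream in = Files.newInputStream(Paths.get("copy.txt"))) {
            String copied = IOUtils.toString(in, StandardCharsets.UTF_8);
            System.out.println(copied.equals(content));
        }

        // Plain JDK alternative: read all lines of a file
        List<String> lines = Files.readAllLines(Paths.get("input.txt"), StandardCharsets.UTF_8);
        System.out.println(lines.size() + " lines read");
    }
}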


Wednesday, May 23, 2018

Ruminating on Agile estimates

Over the past few years, we have been using story points for estimation, rather than using man-hours.
For a quick introduction of agile estimation, please peruse the following links that give a good overview.

https://rubygarage.org/blog/how-to-estimate-with-story-points
https://rubygarage.org/blog/how-to-estimate-project-cost
https://rubygarage.org/blog/3-reasons-to-estimate-with-story-points

But folks still struggle to understand the advantages of estimating with story points. During the planning poker session, all team members discuss each story and arrive at its story points through consensus. Thus each team member has skin in the game and is involved in the estimation process.

The time needed to complete a story will vary based on a developer's level of experience, but the relative amount of work is correctly captured by the story points.

IMHO, velocity should be calculated only after 2-3 sprints. This average velocity (#story-points/sprint) can be used to estimate the calendar timelines for the project.


Thursday, March 29, 2018

Cool Java client library for secure FTP

My team was looking for a library to PUT and GET files from an SFTP server. We used the cool Java library called JSch (www.jcraft.com/jsch/).

Sample code to download and upload files to the SFTP server is given below.
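The original gist is not shown here, but a minimal JSch sketch (host, credentials and paths are placeholders) for PUT and GET would look roughly like this:

import com.jcraft.jsch.ChannelSftp;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.Session;

public class SftpClient {

    public static void main(String[] args) throws Exception {
        JSch jsch = new JSch();
        Session session = jsch.getSession("user", "sftp.example.com", 22);
        session.setPassword("secret");
        session.setConfig("StrictHostKeyChecking", "no");   // for testing only
        session.connect();

        ChannelSftp sftp = (ChannelSftp) session.openChannel("sftp");
        sftp.connect();

        sftp.put("/local/path/report.csv", "/remote/path/report.csv");      // upload (PUT)
        sftp.get("/remote/path/report.csv", "/local/path/report-copy.csv"); // download (GET)

        sftp.disconnect();
        session.disconnect();
    }
}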


Sunday, March 18, 2018

Spring WebSocket STOMP tips and tricks

Recently we successfully implemented secure Websockets in one of our projects and learned a lot of tricks to get things working together. Given below are some tips that would help teams embarking on implementing WebSockets in their programs.

1)  Spring uses the STOMP protocol for WebSockets. The other popular protocol for WebSockets is WAMP, but Spring does not support it. Hence if you are using Spring, make sure that your Android, iOS and JS libraries support STOMP.

2)  The Spring WebSocket library by default also supports SockJS as a fallback for web JS clients. If your use-case only entails supporting Android and iOS clients, then disable SockJS in your Spring configuration. A use-case might work with SockJS on a web client, but fail in native mobile code.

3) The default SockJS implementation in Spring sends a server heartbeat header (the char 'h') every 25 seconds to the clients, so there is no timeout on the socket connection for JS clients. But on mobile apps (pure STOMP), there is no heartbeat configured by default. Hence we have to explicitly set the heartbeat on the server OR on the client to keep the connection alive when idle; otherwise, idle connections get dropped after 1 minute. Reference link: https://stackoverflow.com/questions/28841505/spring-4-stomp-websockets-heartbeat
Sample server side code below.
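A minimal sketch of such server-side configuration (endpoint and destination prefixes are illustrative) that sets a 10-second heartbeat on the simple broker, along with the task scheduler it requires:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.messaging.simp.config.MessageBrokerRegistry;
import org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler;
import org.springframework.web.socket.config.annotation.EnableWebSocketMessageBroker;
import org.springframework.web.socket.config.annotation.StompEndpointRegistry;
import org.springframework.web.socket.config.annotation.WebSocketMessageBrokerConfigurer;

@Configuration
@EnableWebSocketMessageBroker
public class WebSocketConfig implements WebSocketMessageBrokerConfigurer {

    @Override
    public void registerStompEndpoints(StompEndpointRegistry registry) {
        registry.addEndpoint("/ws");   // endpoint name is illustrative
    }

    @Override
    public void configureMessageBroker(MessageBrokerRegistry registry) {
        registry.setApplicationDestinationPrefixes("/app");
        registry.enableSimpleBroker("/topic", "/queue")
                .setHeartbeatValue(new long[]{10000, 10000})   // server sends/expects heartbeats every 10s
                .setTaskScheduler(heartbeatScheduler());       // a scheduler is mandatory when heartbeats are enabled
    }

    @Bean
    public ThreadPoolTaskScheduler heartbeatScheduler() {
        ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
        scheduler.setThreadNamePrefix("ws-heartbeat-");
        scheduler.setPoolSize(1);
        return scheduler;
    }
}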

4) One Android websocket library that works very well with Spring websockets is https://github.com/NaikSoftware/StompProtocolAndroid. But unfortunately, this library does not support heartbeats. Hence you have to explicitly send heartbeats using sample code like this - https://github.com/NaikSoftware/StompProtocolAndroid/issues/18.

5) On iOS, the following socket library worked very well with Spring WebSockets - https://github.com/rguldener/WebsocketStompKit. We faced an issue where this library was throwing array-index-out-of-range exceptions on server-side heartbeat messages, but we made small changes in the library to skip processing of heartbeat frames.

6)  It is recommended to use the wss:// URL scheme, although we found that https:// was also working fine. If you have SockJS enabled on Spring, then please append /websocket to the URL, as you would otherwise get an exception about an invalid protocol upgrade. Hence clients should subscribe to /{endpoint}/websocket. https://github.com/rstoyanchev/spring-websocket-portfolio/issues/14

7)  Both libraries also support sending headers during the CONNECT step. Very often, teams send an authorization token in the headers during the CONNECT step, which can be used to authenticate the client. To access these headers on the server, we need to read the native headers map. Sample code - https://github.com/rsparkyc/WebSocketServer/blob/master/src/main/java/hello/ConnectionListener.java


Tuesday, March 13, 2018

Ruminating on Jackson JSON Parsing

In my previous post, we discussed how to extract an arbitrary JSON value out of a JSON string. Very often, developers face another error while using Jackson - com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field error.

This happens when your JSON string has an attribute that is not present in your POJO and you are trying to deserialize it. Your POJO might not be interested in these fields or these fields could be optional.

To resolve this error, you have two options:

Option 1: Disable the error checking on the Jackson ObjectMapper as follows:
objectMapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);

Option 2: If you have access to the POJO class, then you can annotate it as follows:
@JsonIgnoreProperties(ignoreUnknown = true)
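Putting the two options together, a small self-contained sketch (the POJO and JSON are made up) would be:

import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JacksonUnknownFieldsDemo {

    @JsonIgnoreProperties(ignoreUnknown = true)   // Option 2: tolerate extra JSON attributes
    static class Customer {
        public String name;
    }

    public static void main(String[] args) throws Exception {
        String json = "{\"name\":\"Alice\",\"loyaltyTier\":\"GOLD\"}";   // loyaltyTier is not in the POJO

        ObjectMapper mapper = new ObjectMapper();
        // Option 1: relax the check globally on the mapper
        mapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);

        Customer customer = mapper.readValue(json, Customer.class);
        System.out.println(customer.name);
    }
}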

Saturday, March 10, 2018

Identifying RabbitMQ consumers

One of the challenges my team was facing was to accurately identify the consumers of a RabbitMQ queue. Let's say you have 5 consumers and you want to kill one of them through the dashboard - you first need to identify which consumer is which.

Each RabbitMQ consumer has a tag value that can be used to uniquely identify it. By default, RabbitMQ assigns a random string as the consumer tag, and just by looking at this random tag there is no way to identify the actual consumer. 

To resolve this, you need to create a ConsumerTagStrategy and associate it with the MessageListenerContainer. Code snippet given below: 
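The original snippet is not reproduced here, but a minimal Spring AMQP sketch (queue name and tag prefix are illustrative) would be:

import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer;

public class TaggedConsumerConfig {

    public SimpleMessageListenerContainer listenerContainer() {
        SimpleMessageListenerContainer container =
                new SimpleMessageListenerContainer(new CachingConnectionFactory("localhost"));
        container.setQueueNames("orders.queue");

        // The tag pattern <service>-<queue>-<suffix> shows up in the RabbitMQ management dashboard,
        // making it easy to tell which application instance a consumer belongs to.
        container.setConsumerTagStrategy(queue -> "order-service-" + queue + "-" + System.currentTimeMillis());

        // a MessageListener would also be set on the container here
        return container;
    }
}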

Tuesday, February 27, 2018

Ruminating on BPM vs Case Management

Most BPM tools today support both BPMN (Business Process Model and Notation) as well as CMMN (Case Management Model and Notation). But when should you use which?

It all depends on the process that you want to model. Given below are some tips that can be used to decide whether to model the process as a traditional BPM or a case management solution.

Traditional BPM: 
  • If your process is a predefined and ordered sequence of tasks - e.g. sending out an insurance renewal notice, onboarding an employee, etc. 
  • The order of the steps rarely change - i.e. the process is repeatable.
  • Business users cannot dynamically change the process. The process determines the sequence of events. 
Case Management:
  • When the process does not have a strict ordering of steps - e.g. settling a claim.
  • The process depends on the knowledge worker, who decides the next steps. 
  • External events (e.g. submission of documents) determine what next step the knowledge worker will take.  
  • Case management empowers knowledge workers and provides them with access to all the information concerning the case. The knowledge worker then uses their discretion and control to move the case towards the next steps. 
Using the above guidelines, you can model your process using BPMN or CMMN. Business rules can be modeled with DMN (Decision Model and Notation). 

Monday, February 19, 2018

Adding custom filters to Spring Security

One of my teams was looking for a way to add custom filters to the Spring Security OAuth server.

As we know, the Spring Security OAuth2 Server is a complex mesh of filters that get the job done in implementing all the grant types of the OAuth specification - https://docs.spring.io/spring-security/site/docs/current/reference/html/security-filter-chain.html#filter-ordering

The team wanted to add additional filters to this pipeline of Security filters. There are many ways of achieving this:

Option 1: Create a custom filter by extending the Spring GenericFilterBean class. You can set the order by using the @Order annotation.

Option 2: Register the filter manually in the WebSecurityConfigurerAdapter class using the addFilterAfter/addFilterBefore methods.

Option 3: Set the property "security.filter-order=5" in your application.properties. Now you can add up to 4 custom filters and set their order as 1, 2, 3 or 4.
Another option is to manually set the order (without annotations) using a FilterRegistrationBean in any @Configuration class.
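As a rough sketch of options 1 and 2 (the filter and class names are made up), a filter extending GenericFilterBean is registered at a specific position in the Spring Security chain:

import java.io.IOException;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;
import org.springframework.security.web.authentication.www.BasicAuthenticationFilter;
import org.springframework.web.filter.GenericFilterBean;

// Option 1: a custom filter extending GenericFilterBean
public class AuditFilter extends GenericFilterBean {
    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // custom pre-processing (e.g. audit logging) would go here
        chain.doFilter(request, response);
    }
}

// Option 2: register the filter at a specific position in the Spring Security filter chain
@EnableWebSecurity
class SecurityConfig extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http.addFilterAfter(new AuditFilter(), BasicAuthenticationFilter.class)
            .authorizeRequests().anyRequest().authenticated();
    }
}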

The following 2 blogs helped us explore all the options and choose the appropriate one.
https://mtyurt.net/post/spring-how-to-insert-a-filter-before-springsecurityfilterchain.html
http://anilkc.me/understanding-spring-security-filter-chain/

Monday, January 29, 2018

Ruminating on the V model of software testing

In the V model of software testing, the fundamental concept is to interweave testing activities into each and every step of the development cycle. Testing is NOT a separate phase in the SDLC; rather, testing activities are carried out right from the start of the requirements phase.

  • During requirements analysis, the UAT test cases are written. In fact, many teams have started using user stories and acceptance criteria as test cases. 
  • System test cases and Performance test cases are written during the Architecture definition and Design phase. 
  • Integration test cases are written during the coding and unit testing phase. 
A good illustration of this is given at http://www.softwaretestinghelp.com/what-is-stlc-v-model/