Wednesday, January 17, 2018

Why GPU computing and Deep Learning are a match made in heaven?

Deep learning is a branch of machine learning that uses multi-layered neural networks for solving a number of challenging AI problems.

GPU (Graphics Processing Unit) architecture are fundamentally different from CPU architecture. A GPU chip would be significantly slower that a CPU, but a single GPU might have thousands of cores while a CPU usually has not more than 12 cores.

Hence any task that can be parallelized over multiple core is a perfect fit for GPUs. Now it so happens, that deep learning algorithms involve a lot of matrix multiplications that are an excellent candidate for parallel processing over the cores of a GPU. Hence GPUs make deep learning algorithms run faster by an order of magnitude. Even training of models is much faster and hence you can expedite the GTM of your AI solutions.

An example of how GPUs are better for AI solutions is considering AlexNet, a well known image classification deep network. A modern GPU costing about $1000 takes 2.5 days to fully train AlexNet on the very large ImageNet dataset. On the otherhand, it takes a CPU costing several thousand dollars nearly 43 days.

A good video demonstrating the difference between GPU and CPU is here:

Timeout for REST calls

Quite often, we need a REST API to respond within a specified time frame, say 1500 ms.
Implementing this using the excellent Spring RestTemplate is very simple. A good blog explaining how to use RestTemplate is available here -


Monday, January 08, 2018

The curious case of Java Heap Memory settings on Docker containers

On docker containers, developers often come across OOM (out-of-memory) errors and get baffled because most of their applications are stateless applications such as REST services.

So why do docker containers throw OOM errors even when there are no long-living objects in memory and all short-lived objects should be garbage collected?
The answer lies in how the JVM calculates the max heap size that should be allocated to the Java process.

By default, the JVM assigns 1/4 of the total physical memory as the max heap size for the Java runtime. But this is the physical memory of the server (or VM) and not of the docker container.
So let's say your server has a memory of 16 GB on which you are running 8 docker containers with each container configured for 2 GB. But your JVM has no understanding of the docker max memory size. The JVM will allocate 1/4 of 16 GB = 4 GB as the max heap size. Hence garbage collection MAY not run unless the full heap is utilized and your JVM memory may go beyond 2 GB.
When this happens, docker or Kubernetes would kill your JVM process with an OOM error.

You can simulate the above scenario and print the Java heap memory stats to understand the issue. Print the runtime.freeMemory(), runtime.MaxMemory() and runtime.totalMemory() to understand the patterns of memory allocation and garbage collection.

The diagram below gives a good illustration of the memory stats of JVM.

So how do we solve the problem? There is a very simple solution - Just explicitly set the max memory of the JVM when launching the program - e.g. java -Xms 2000m -jar myjar.jar
This will ensure that the garbage collection runs appropriately and an OOM error does not occur.
A good article throwing more details on this is available here.