Thursday, May 14, 2015

Ruminating on Apple HealthKit backup

While my team was working on the Apple HealthKit iOS APIs, we came to know a few interesting things that many folks are not aware of. Jotting down our findings -
  • HealthKit data is only locally stored on the user's device
  • HealthKit data is not automatically synced to iCloud - even if you have enabled iCloud syncing for all apps.
  • HealthKit data is not backed up as part of the normal device backup in iTunes. So if you restore your device, all HealthKit data would be lost!
  • HealthKit is not available on iPads. 
The only way we can take a backup of HealthKit data is to enable "encrypted backup" in iTunes. If this option is selected in iTunes, then your HealthKit data would get backed up.

Another interesting point from a developer's perspective is that the HealthKit store is encrypted on the phone and is accessible by authorized apps only when the device is unlocked. If the device is locked, no authorized app can access the data during that time. But apps can continue sending data via the iOS APIs. 

Thursday, February 05, 2015

Comparing two columns in Excel to find duplicates

Quite often, you have to compare two columns in Excel to find duplicates or 'missing' rows. Though there are many ways to do this, the following MS article gives a simple solution.

Depreciation of fixed assets in accounting

Would like to recommend the following site that gives a very simple explanation of the concept of depreciation in accounting. Worth a perusal for beginners.

Monday, January 26, 2015

Patient Engagement Framework for Healthcare providers

HIMSS (Healthcare Information and Management Systems Society) has published a good framework for engaging patients so as to improve health outcomes.

Patients want to be engaged in their healthcare decision-making process, and those who are engaged as decision-makers in their care tend to be healthier and have better outcomes. The whole idea is to treat patients not just as customers, but as partners in their journey towards wellness.

The following link provides a good reference for designing technology building blocks for improving patient experience.

Inform Me --- Engage Me --- Empower Me --- Partner with me

Ruminating on Open Graph Protocol

Ever wondered how some links on Facebook are shown with an image and a brief paragraph? I dug deeper to understand what Facebook was doing behind the scenes to visualize the link.

To my surprise, there is something called the "Open Graph Protocol" that defines a set of rules for telling Facebook how your shared content should be displayed on it.

For example, we can add the following meta tags to any web page, and Facebook would parse these tags when you post a link to this page.

  • <meta property="og:title" content=""/>
  • <meta property="og:type" content=""/>
  • <meta property="og:url" content=""/>
  • <meta property="og:image" content=""/>
  • <meta property="fb:admins" content=""/>
  • <meta property="og:site_name" content=""/>
  • <meta property="og:description" content=""/>
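
To get a feel for what a consumer of these tags has to do, here is a minimal sketch that collects the og:* properties from a page using Python's standard html.parser module. The class name OpenGraphParser, the helper parse_open_graph and the sample HTML are made up for illustration; this is not a description of how Facebook's crawler actually works.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta .../> tags also end up here via handle_startendtag.
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        if prop.startswith("og:"):
            self.tags[prop] = attr.get("content") or ""


# Sample page: every value below is a placeholder.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Example title"/>
<meta property="og:type" content="article"/>
<meta property="og:image" content="https://example.com/image.png"/>
<meta name="viewport" content="width=device-width"/>
</head><body></body></html>
"""


def parse_open_graph(html):
    """Returns a dict of og:* property names to their content values."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


og = parse_open_graph(SAMPLE_HTML)
print(og)
```

Only tags whose property starts with "og:" are kept, so the viewport tag in the sample is ignored.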

More information can be found at this link -

    Router blocking HTTPS traffic?

    Recently I got a new cloud router for my broadband connection. Though the speed was very good, I was facing intermittent problems in accessing HTTPS sites. For example, webmail would hang sometimes, payment gateway pages would not load, the Amazon app would not load screens, etc.

    At first, I was not sure whether the router or the internet connection itself was to blame. A quick Google search revealed that this is a common problem with many routers and has to do with the MTU (Maximum Transmission Unit) size limit. I was surprised that the MTU size would affect HTTPS, which is an application-level protocol.

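    The following links show an easy method to find the correct MTU size for your network using the ping command. For example, on Windows, ping -f -l 1472 followed by the target host sends a 1472-byte payload with the Don't Fragment flag set.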
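
The number 1472 is not arbitrary: a ping payload travels inside an 8-byte ICMP header and a 20-byte IPv4 header (assuming no IP options), so 1472 + 28 gives the usual Ethernet MTU of 1500. The sketch below shows that arithmetic and a binary search for the largest payload that passes unfragmented. The fits callback is a stand-in for actually running ping with the Don't Fragment flag, and the 1492-byte link is a simulated example, not a measurement.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def mtu_from_ping_payload(payload_bytes):
    """Packet size on the wire for a given ping payload size."""
    return payload_bytes + IPV4_HEADER_BYTES + ICMP_HEADER_BYTES


def largest_unfragmented_payload(fits, low=0, high=1472):
    """Binary search for the largest payload for which fits(payload) is True.

    `fits` is a hypothetical callback; in practice it would run ping with
    the Don't Fragment flag and report whether a reply came back.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if fits(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Simulated link with a path MTU of 1492 bytes (an assumed value for the demo).
simulated_path_mtu = 1492


def fits(payload):
    return mtu_from_ping_payload(payload) <= simulated_path_mtu


best = largest_unfragmented_payload(fits)
print(best, mtu_from_ping_payload(best))
```

Once the largest working payload is known, adding 28 bytes gives the MTU value to configure on the router.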

    Thursday, January 15, 2015

    Applying Analytics to Clinical Trials

    The below link is a good article on using Big Data Analytics to improve the efficiency of clinical trials.

    Snippets from the article -

    "Recruiting patients has been a challenge for pharmaceutical companies. 90 per cent of trials are delayed with patient enrollment as a primary cause.  
    Effective target segmentation for enrollment is a key to success. Traditional methods of enrollment rely upon campaign and segmentation based on disease lines across wider populations. Using data science, we can look at the past data to identify proper signals and help planners with more precise and predictive segmentation. 

    Data scientists will look at the key attributes that matter for a given patient to successfully get enrolled. For each disease type, there may be several attributes that matter. For example, a clinical trial that is focused on a new diabetes medication targets populations’ A1C levels, age group, demographics, outreach methods, and site performance. Data science looks at the above attribute values for the target users past enrollment data and then builds ‘patient enrollment propensity’ and ‘dropout propensity’ models. These models can generate multi variant probabilities for predicting future success. 

    In addition to the above modeling, we can identify the target segment’s social media footprint for valuable clues. We can see which outreach methods are working, and which social media channels the ‘generation Googlers’ are using.  Natural language processing (NLP) techniques to understand the target population’s sentiment on clinical trial sites, physicians, and facilities can be generated and coded into a machine understandable form. Influencer segments can be generated from this user base to finely tune campaign methods for improving effectiveness."

    Thursday, January 08, 2015

    Ruminating on the Open Bank Project

    A colleague of mine introduced me to the 'Open Bank Project'. It's an interesting open source project to create a generic API layer on top of various core banking products. Third-party developers would then use these APIs to build cool mobile and social apps.

    The open source project was started by a Berlin-based company named Tesobe. Right now, they seem to have successfully built adapters/connectors to 3 German banks and are planning to add connectors for more banks. A full list is given here -

    A few sample apps have been built using the Open Bank API -

    I think the concept is interesting, but the challenge would be to build the connectors to the various core banking products out there. I will download the APIs from GitHub and evaluate them further.

    Monday, September 22, 2014

    Exploring Apache Kafka..

    We had successfully used ActiveMQ and RabbitMQ in many projects and never felt the need to explore any other message broker. Today, my colleague introduced me to 'Apache Kafka' and was drooling over the high performance and reliability it provided. Kafka is extensively used within LinkedIn and can be used in many use-cases.

    The following blog post gives a good performance benchmark of Kafka.

    Another good blog post worth reading is:

    Another good tutorial on using Kafka to push messages to Hadoop is available here -

    Thursday, September 11, 2014

    Monitoring TomEE using VisualVM

    A few years back, we moved to JBoss from Tomcat for our production servers, because there was no viable enterprise support for Tomcat.

    Today, we have viable options such as support from Tomitribe.

    The below article on Tomitribe gives a good overview of setting up VisualVM for monitoring Tomcat.

    Default tools in the JDK

    Found the below article worth a perusal. We get so used to using sophisticated tools that we forget there are things we can do with a bare JDK :)

    Monday, September 01, 2014

    Does Digital Transformation need 'Skunk Works' kind of environment?

    Skunk Works is a term that originated during WWII and is the official alias for Lockheed Martin’s Advanced Development Programs (ADP).

    Today Skunk Works is used to describe a small group of people who work on advanced technology projects. This group is typically given a high degree of autonomy and is unhampered by bureaucracy. The primary driver for setting up a Skunk Works team is to develop something quickly with minimal management constraints.
    The term also refers to technology projects developed in semi-secrecy, such as Google X Lab or the 50-person team established by Steve Jobs to develop the Macintosh computer.

    For any organization embarking on a Digital Transformation journey, it would be worthwhile to build such a Skunk Works team that can innovate quickly and bring an idea to a required threshold of technology readiness. I have seen so many ideas die under the shackles of bureaucracy and long processes. Having a Skunk Works team operate like a start-up within your organization can do wonders in leap-frogging your competition in the digital age.

    Monday, August 25, 2014

    Ruminating on Showrooming and Webrooming in the Digital Age

    When e-Commerce giants such as Amazon took the retail industry by storm, there was a lot of FUD on showrooming. As a digital native, even I indulged in showrooming before heading out to my favourite e-commerce site to buy the product online.

    But a recent study conducted in the US has found that many folks also engage in reverse showrooming (aka webrooming). In "reverse showrooming," or "webrooming," consumers go online to research products, but then actually go to a bricks-and-mortar store to complete their purchase.

    The following link on Business Insider provides more details on this phenomenon.

    This report came as a surprise to me and I would assume that retailers are happy about this trend :)
    Retailers are also trying out innovative techniques to capitalize on this trend. Some of them include deploying knowledgeable sales staff who educate the customer and create a superior in-store customer experience. BLE-enabled beacons push personalized offers to the customer's mobile app while they are in the store. m-Wallets enable contactless and hassle-free payments at the POS.

    Retailers are also embracing BOPiS (Buy Online Pick Up In Store)! This greatly reduces the logistics/shipping costs, as the existing transportation network is used for delivery.

    Popular e-Commerce software vendors such as Hybris have also started catering to this market and have an in-store solution for retailers.

    Friday, August 22, 2014

    A good comparison of BLE and Classic Bluetooth

    The following link gives a good overview of the differences between BLE (Bluetooth Low Energy) and Classic Bluetooth. Definitely worth a perusal.

    The fundamental reason why BLE is becoming so popular in beacons is its extremely low power consumption, which makes it possible to power a small device with a tiny coin cell battery for 5–10 years!

    Tuesday, August 12, 2014

    How does Facebook protect its users from malicious URLs?

    The following post gives a good overview of the various techniques (such as link shim) used by Facebook to protect its users from malicious websites - whose links would be embedded in posts.

    Facebook has its own internal blacklist of malicious links and also queries external partners such as McAfee, Google, Web of Trust, and Websense. When FB detects that a URL is malicious, it displays an interstitial page before the browser actually requests the suspicious page. This protects the user, who now has to make a conscious decision whether to proceed to the malicious page.

    BTW, if you have not already installed the 'Web of Trust' browser plugin for your browser, do so immediately :)

    Another interesting point was the fact that it is more secure to run a check at click time than at display time. If one relied on display-time filtering alone, we would not be able to retroactively block any malicious URLs - lying in an email or an old page.

    Wednesday, July 09, 2014

    Collection of free books from Microsoft

    Eric Ligman has provided links to a large collection of free Microsoft books on a variety of topics in his blog post (link below).

    Some of the books that I found interesting were on Azure Cloud Design Patterns, SharePoint, Office 365, etc.

    Tuesday, June 03, 2014

    Categorization of applications in IT portfolio

    During any portfolio rationalization exercise, we categorize applications based on various facets, as explained in one of my old posts here.

    Interestingly, Gartner has defined three application categories, or "layers," to distinguish application types and help organizations develop more appropriate strategies for each of them.

    Snippets from the Gartner news site:

    Systems of Record — Established packaged applications or legacy homegrown systems that support core transaction processing and manage the organization's critical master data. The rate of change is low, because the processes are well-established and common to most organizations, and often are subject to regulatory requirements.
    Systems of Differentiation — Applications that enable unique company processes or industry-specific capabilities. They have a medium life cycle (one to three years), but need to be reconfigured frequently to accommodate changing business practices or customer requirements.
    Systems of Innovation — New applications that are built on an ad hoc basis to address new business requirements or opportunities. These are typically short life cycle projects (zero to 12 months) using departmental or outside resources and consumer-grade technologies.

    Ruminating on RTB, GTB and TTB

    The IT industry loves TLAs (three-letter acronyms)! Recently a customer was explaining their IT budget distribution to us in terms of 'Run the business investments', 'Grow the business investments' and 'Transform the business investments'.

    RTB investments are for 'keeping the lights on'. This budget is required to keep the operations running that support the core business functions. In RTB investments, the core focus is on efficiency and performance optimization. RTB-type applications are increasingly being outsourced to an IT vendor under a managed services contract.

    GTB investments are used to support organic growth and increased customer demand. For e.g. adding capacity to an existing data center, bolstering your DR site, virtualization for quick provisioning, etc.

    TTB investments are for creating new products or introducing new services; i.e. making changes to the current business model. For e.g. Apple entered the music industry with iTunes, IBM moved to services from hardware, etc. 

    Monday, May 26, 2014

    Ruminating on HIPAA compliance

    I was a bit confused about the intricacies of which entities are covered under HIPAA. The following article helped me clear a few cobwebs and also helped me appreciate the fact that it's impossible to protect all healthcare information all the time.

    The crux of the HIPAA regulation is that your information is protected only when it is held by a 'covered entity'. HIPAA defines 3 types of covered entities - Payer, Provider and Clearing House.

    Posting interesting snippets from the site:

    Health information is protected when held by a covered entity. It may have no privacy protections when the information is held by someone who is not a covered entity. In other words, health privacy protections depend on who has the information and not on the nature of the information.

    It is important to understand that HIPAA does not automatically cover all health care providers. A free health clinic may not be subject to HIPAA because it doesn’t bill anyone. A doctor who charges every patient $25 cash and does not submit a bill to any insurance company may not be covered by HIPAA. A first aid room at your workplace may or may not be covered by HIPAA.

    Most school health records are not subject to HIPAA. Instead, school records (private schools are a major exception) are usually covered by another federal privacy law, the Family Educational Rights and Privacy Act (FERPA). 

    The list of unregulated health record keepers is shockingly long. These include gyms, medical and fitness apps and devices not offered by covered entities, health websites not offered by covered entities, Internet search engines, life and casualty insurers, Medical Information Bureau, employers (but this one is complicated), worker’s compensation insurers, banks, credit bureaus, credit card companies, many health researchers, National Institutes of Health, cosmetic medicine services, transit companies, hunting and fishing license agencies, occupational health clinics, fitness clubs, home testing laboratories, massage therapists, nutritional counselors, alternative medicine practitioners, disease advocacy groups, marketers of non-prescription health products and foods, and some urgent care facilities.

    Friday, May 23, 2014

    Ruminating on Rate Limiting

    As architects, when we define the API strategy for any organization, we also need to design the 'Rate Limiting' features for that API. The concept of Rate Limiting is not new; the term has long been used in the networking world to describe controlling the rate of traffic over a network.

    Other common examples of Rate Limiting that we see very often are as follows:
    1. Limit consecutive wrong password entries to 3.
    2. Maximum size of an email attachment.
    3. Max number of emails one can send in a day.
    4. Max number of search queries one can fire every minute.
    5. Max. broadband download size per day, etc. 
    Rate Limiting is also an important line of defense from a security perspective. Jeff Atwood has a good blog post on 'Rate Limiting' available at:

    For services or APIs, there are standard ways in which we can rate limit the requests. For e.g.
    • Based on API key: This is how Twitter rate limits their API. Each account with an API key can only make x requests/{time period}. For example, 10 requests every 5 mins, 500 requests per day, etc.
    • Based on IP address: This may not work behind a proxy due to NATing.
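
To make the API-key approach concrete, here is a minimal sketch of a fixed-window rate limiter keyed by API key, using the "10 requests every 5 mins" figure from above. The class name FixedWindowRateLimiter and the key "key-A" are made up for illustration, and fixed-window counting is just one simple strategy; it is not a description of how Twitter implements its limits.

```python
import time


class FixedWindowRateLimiter:
    """Allows at most max_requests per window_seconds for each API key."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the demo can fake time
        self._windows = {}          # api_key -> (window_start, request_count)

    def allow(self, api_key):
        now = self.clock()
        start, count = self._windows.get(api_key, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired: start a fresh one.
            start, count = now, 0
        if count >= self.max_requests:
            self._windows[api_key] = (start, count)
            return False
        self._windows[api_key] = (start, count + 1)
        return True


# Demo with a fake clock: 10 requests every 5 minutes (300 seconds).
fake_now = [0.0]
limiter = FixedWindowRateLimiter(10, 300, clock=lambda: fake_now[0])

results = [limiter.allow("key-A") for _ in range(12)]
other_key = limiter.allow("key-B")   # a different key has its own counter

fake_now[0] = 300.0                  # five minutes later
after_reset = limiter.allow("key-A")

print(results.count(True), results.count(False), other_key, after_reset)
```

A production limiter would usually keep these counters in a shared store so that all API servers see the same counts; the in-memory dict here only works for a single process.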

    Tuesday, May 20, 2014

    Appending the current date to the file-name in a DOS batch program

    I was writing a utility batch for my backup folders and wanted to have the current date appended to the filename. Using just %DATE% was throwing errors as the default output of the date command contains a space on Windows. For e.g. echo %DATE% would return "Tue 05/20/2014".

    The following format of the %DATE% variable did the job for me. A good trick to have up your sleeve :)
    echo %date:~-10,2%-%date:~-7,2%-%date:~-4,4%

    Each %date:~start,length% fragment extracts a substring: a negative start counts from the end of the string, and length is the number of characters to keep. Just copy-paste fragments of the above string to understand how this works. For example, echo %date:~-10% prints the last 10 characters.

    I had used this to create a date-stamped jar file as follows.
    jar -cvf backup_%date:~-10,2%-%date:~-7,2%-%date:~-4,4%.jar data/*
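
For anyone who wants to see the substring arithmetic outside of a batch file, here is a small Python sketch that mimics the %var:~start,length% syntax for the cases used above (negative or positive start, optional non-negative length). The function name batch_substring is hypothetical, and the date string is the sample value from this post; the real %DATE% format depends on the regional settings of the machine.

```python
def batch_substring(value, start, length=None):
    """Mimics %var:~start,length% for a non-negative or omitted length.

    A negative start counts from the end of the string, as in batch files.
    Negative lengths (also supported by cmd.exe) are not handled here.
    """
    if start < 0:
        start = max(len(value) + start, 0)
    if length is None:
        return value[start:]
    return value[start:start + length]


date_value = "Tue 05/20/2014"  # sample output of echo %DATE% from the post

month = batch_substring(date_value, -10, 2)
day = batch_substring(date_value, -7, 2)
year = batch_substring(date_value, -4, 4)
stamp = f"{month}-{day}-{year}"

print(batch_substring(date_value, -10))  # the last 10 characters
print(f"backup_{stamp}.jar")
```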

    Wednesday, May 14, 2014

    Ruminating on Insurance Agents, Brokers, Producers

    In the insurance industry, the terms 'Agent', 'Broker' and 'Producer' are often used interchangeably. But in different markets, they have different meanings and are also governed by different regulations. Jotting down the information I have gathered after discussions and Q&A sessions with my friends in the Insurance industry.

    • Agents have a primary alliance with the insurance carrier, whereas Brokers have a primary alliance with the insurance buyer. But in the Healthcare industry, both the terms are used interchangeably and agents/brokers are also called 'Producers'.
    • Agents can be 'captive' or 'independent'. A captive agent only represents a single insurer. He is typically on the salary rolls of the carrier and earns a commission on every policy he sells. An independent agent can represent multiple insurance carriers. Independent insurance agents are not on the insurance carriers' salary rolls and earn only commissions. Several insurance carriers may authorize an agent to sell for them.
    • Independent insurance agents may also work with insurance intermediaries that aggregate quotes from multiple insurance carriers and allow the agent to compare and select the best fit for the customer. Independent agents also provide packaged policies, for example combining auto and home insurance as a single policy. The customer benefits from lower premiums.
    • Both captive and independent agents have a contract with the insurance carrier that details the binding authority of the agent - essentially the authority to bind a policy on the insurer’s behalf.
    • Brokers typically do not have the authority to bind policies. Since brokers cannot bind policies, they have to obtain a binder from the insurance carrier. A binder is a legal document that serves as a temporary insurance policy for around 30 days, and must be signed by a representative of the insurer. A binder is replaced by a policy, once the policy is generated.
    • Brokers may or may not earn commissions from the insurance carrier. They get a flat fee from the insurance buyer for their services. 
    • Brokers can be retail or wholesale. Retail brokers directly engage with the end customers. Sometimes for very specialized insurance needs, retail brokers may contact a wholesale broker. For e.g. a wholesale broker can specialize in auto-manufacturing liability insurance, etc. 
    • Commissions are of two types - a flat (base) commission that is paid for every policy sold and an incentive commission that is paid if a particular volume or other growth target is met. There is a lot of debate on the incentive commissions received by independent agents and brokers. This is because these bonuses may affect the neutrality of the broker, who is supposed to represent the insured. In many countries there are regulations requiring brokers to disclose to customers the commissions that they would earn.

    Friday, May 02, 2014

    Ruminating on the #hashtag economy

    Hashtags, marked with the '#' (hash) symbol, were originally adopted by Twitter users to categorize their messages and to identify keywords and trending topics. After Twitter, users on other social channels such as Instagram, Tumblr and Pinterest jumped on the bandwagon and started using #hashtags to participate in online conversations on a topic. A good article explaining the origin of the hashtag is available here.

    Ironically Facebook adopted #hashtags quite late in the game. But today Facebook has full support for hashtags and when we click on a hashtag, we see a feed of all posts (what people are saying) about that event or topic. Hashtags have become so popular that recently Obama encouraged citizens to use the hashtag "#1010Means" to protest against low minimum wages.

    Needless to say, hashtags are a powerful concept for advertisers to capitalize on. They help advertisers market their products/services to the right audience, i.e. people who are interested in a particular subject.

    Organizations have also started using the power of hashtags in social channels to build innovative services that bring in new sources of revenue or help improve customer satisfaction rates. This whole new paradigm is known as the "#hashtag economy".

    For e.g. Amex has creatively used hashtags to send promotions to their customers.

    Kotak Bank has created a Customer Self-Service platform on Twitter, wherein customers can tweet the right hashtags to perform banking transactions such as checking their balance, requesting a checkbook, etc.

    Wednesday, April 23, 2014

    Ruminating on Network Port Mirroring

    For any network sniffer (analyzer) or Network Intrusion Detection Systems to work, the concept that is applied behind the scenes is 'Network Port Mirroring'.

    Port mirroring is needed for traffic analysis on a switch because a switch normally sends packets only to the port to which the destination device is connected. Hence most switches support configuring 'port mirroring' to send a copy of each network packet to another port (a local port or a separate VLAN port).

    The following links are worth a perusal.

    Monday, April 14, 2014

    Updating content in an iOS app

    Any mobile app needs to have a design strategy for updating content from the server. We were exploring multiple options for retrieving content from the server and updating the local cache in our iOS app.
    After considerable research, we have found that iOS 7 provides a very neat design for background fetching of new content - one that uses silent push events to raise events in the client app. Even if the app is not running, it would be launched in the background (with UI invisible, rendered off-screen) to process the event. The following article gives a very good overview of the technique.

    Some snippets from the above article:

    A Remote Notification is really just a normal Push Notification with the content-available flag set. You might send a push with an alert message informing the user that something has happened, while you update the UI in the background. 
    But Remote Notifications can also be silent, containing no alert message or sound, used only to update your app’s interface or trigger background work. You might then post a local notification when you’ve finished downloading or processing the new content. 

     iOS 7 adds a new application delegate method, which is called when a push notification with the content-available key is received. Again, the app is launched into the background and given 30 seconds to fetch new content and update its UI, before calling the completion handler.

    How is the App launched in the background? 
    If your app is currently suspended, the system will wake it before calling application:performFetchWithCompletionHandler:. If your app is not running, the system will launch it, calling the usual delegate methods, including application:didFinishLaunchingWithOptions:. You can think of it as the app running exactly the same way as if the user had launched it from Springboard, except the UI is invisible, rendered offscreen.
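
On the server side, the difference between a normal and a silent push is just the shape of the JSON payload. Here is a minimal sketch that builds both variants. The helper build_remote_notification and the custom key "article-id" are hypothetical; the "aps" dictionary with "content-available" set to 1 is the flag the article refers to. Sending the payload to Apple's push service is out of scope here.

```python
import json


def build_remote_notification(alert=None, custom=None):
    """Builds a push payload with the content-available flag set.

    With alert=None the notification is silent: no message, no sound,
    it only gives the app a chance to fetch content in the background.
    """
    aps = {"content-available": 1}
    if alert is not None:
        aps["alert"] = alert
    payload = {"aps": aps}
    # App-specific keys live next to "aps", not inside it.
    payload.update(custom or {})
    return payload


silent = build_remote_notification(custom={"article-id": 42})
visible = build_remote_notification(alert="New content is available")

print(json.dumps(silent, sort_keys=True))
print(json.dumps(visible, sort_keys=True))
```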

    Thursday, March 20, 2014

    Mobile Device Management Products

    My team was helping our organization to evaluate different Mobile Device Management (MDM) tools for enterprise level deployment. The following 2 articles are an excellent read for understanding the various features provided by MDM tools and how products compare to each other.

    Monday, March 17, 2014

    Ruminating on Distributed Logging

    Of late, we have been experimenting with different frameworks available for distributed logging. I recollect that a decade back, I had written my own rudimentary distributed logging solution :)

    To better appreciate the benefits of distributed log collection, it's important to visualize logs as streams and not files, as explained in this article.
    The most promising frameworks we have experimented with are:

    1. Logstash: Logstash combined with Elasticsearch and Kibana gives us a cool OOTB solution. Also, Logstash runs on the Java platform and was very easy to set up and start running.
    2. Fluentd: Another cool framework for distributed logging. A good comparison between Logstash and Fluentd is available here.
    3. Splunk: The most popular commercial tool for log management and data analytics.
    4. Graylog: A new kid on the block. Uses Elasticsearch. Need to keep a watch on this.
    5. Flume: Flume's main goal is to deliver data from applications to Apache Hadoop's HDFS. It has a simple and flexible architecture based on streaming data flows.
    6. Scribe: Scribe is written in C++ and uses Thrift for the protocol encoding. This project was released as open-source by Facebook.  

    Sunday, March 16, 2014

    Calling a secure HTTPS webservice from a Java client

    Over the last decade, I have seen so many developers struggle with digital certificates when they have to call a secure webservice. A lot of confusion arises when a secure HTTPS webservice call is made from a servlet running in Tomcat. This is because the exception stack shows an SSLHandshake exception, and then developers keep fiddling with the Tomcat connector configuration as stated here.

    But when we make a connection to a secure server, what we need is to trust the digital certificate of the server. If the digital certificate of the server has been signed by a trusted root authority such as 'Verisign', 'eTrust', then our default Java Trust Store would automatically validate it. But if the server has a self-signed certificate, then we have to add the server's digital certificate to the trust store.

    There are multiple ways of doing this. A long time ago, I had blogged about one option that entails setting the Java system properties. This can be done through code or by setting the Java properties of the JVM during startup. For e.g.

    System.setProperty("javax.net.ssl.trustStore", trustFilename);
    System.setProperty("javax.net.ssl.trustStorePassword", "changeit");
    Different AppServers (WebSphere, Weblogic, etc.) may provide different ways to add certs to the trust store.

    Another option is to create a cert-store (filename:jssecacerts) that contains the digital cert of the server and copy that cert-store file to the “$JAVA_HOME\jre\lib\security” folder. There is also a nifty program called that downloads the certificate and creates the cert-store file. A good tutorial on the same is available here.
    I have also created a mirror of here. This program can be run without any dependencies on external libraries and I have found it to be very handy.

    So what is the difference between setting the TrustStore system property and adding the jssecacerts file?
    Well, the documentation of JSSE should help our understanding here. The TrustManager performs the following steps to search for trusted certs:
    1. The javax.net.ssl.trustStore system property
    2. $JAVA_HOME/lib/security/jssecacerts
    3. $JAVA_HOME/lib/security/cacerts (shipped by default)

    It's important to note that if the TrustManager finds the jssecacerts file, it does not read the cacerts file! Hence it may be a better option to add the server's digital cert to the cacerts keystore file. To add a certificate to a keystore, there is a nice GUI program called portecle. Alternatively, do it from the command prompt using the keytool command as stated here.
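
The search order described above can be sketched in a few lines. This is only a model of the documented lookup order, written in Python for brevity; resolve_trust_store is a hypothetical helper and not part of JSSE, and the directory layout in the demo is created in a temp folder purely for illustration.

```python
import os
import tempfile


def resolve_trust_store(java_home, trust_store_property=None):
    """Returns the trust store file that would be picked, in lookup order."""
    # 1. An explicitly configured trust store property wins.
    if trust_store_property:
        return trust_store_property
    security_dir = os.path.join(java_home, "lib", "security")
    # 2. jssecacerts, if present, is used INSTEAD of cacerts.
    jssecacerts = os.path.join(security_dir, "jssecacerts")
    if os.path.isfile(jssecacerts):
        return jssecacerts
    # 3. Fall back to the cacerts file shipped with the runtime.
    return os.path.join(security_dir, "cacerts")


with tempfile.TemporaryDirectory() as java_home:
    security_dir = os.path.join(java_home, "lib", "security")
    os.makedirs(security_dir)

    open(os.path.join(security_dir, "cacerts"), "w").close()
    only_cacerts = resolve_trust_store(java_home)

    open(os.path.join(security_dir, "jssecacerts"), "w").close()
    with_jssecacerts = resolve_trust_store(java_home)

    with_property = resolve_trust_store(java_home, "/path/to/custom.jks")

print(os.path.basename(only_cacerts))
print(os.path.basename(with_jssecacerts))
print(with_property)
```

The second result shows why a jssecacerts file containing only one self-signed cert can break calls to other HTTPS servers: the default root certificates in cacerts are no longer consulted.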

    Wednesday, February 19, 2014

    MongoDB support for geo-location data

    It was interesting to learn that Foursquare uses MongoDB to store all its geo-spatial data. It was also enlightening to see JSON standards for expressing GPS coordinates and other geo-spatial data. The following links give quick information about GeoJSON and TopoJSON.

    MongoDB supports the GeoJSON format and allows us to easily build location-aware applications. For example, using MongoDB geospatial query operators, you can -
    • Query for locations contained entirely within a specified polygon.
    • Query for locations that intersect with a specified geometry. We can supply a point, line, or polygon, and any document that intersects with the supplied geometry will be returned.
    • Query for the points nearest to another point.
    As you can see these queries make it very easy to find documents (JSON objects) that are near a given point or documents that lie within a given polygon.
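
The three query types listed above map to the $geoWithin, $geoIntersects and $near operators. Here is a minimal sketch that builds the corresponding query documents as plain Python dicts, so it runs without a database. The field name "location", the helper functions and the coordinates are assumptions made for illustration; with a driver such as pymongo, these dicts would be passed to a find call on a collection that has a 2dsphere index on that field.

```python
import json


def geojson_point(longitude, latitude):
    # GeoJSON stores coordinates as [longitude, latitude], in that order.
    return {"type": "Point", "coordinates": [longitude, latitude]}


def geojson_polygon(ring):
    # A polygon ring must be closed: the last position repeats the first.
    closed = [list(p) for p in ring]
    if closed[0] != closed[-1]:
        closed.append(closed[0])
    return {"type": "Polygon", "coordinates": [closed]}


def within_polygon_query(field, polygon):
    """Locations contained entirely within the polygon."""
    return {field: {"$geoWithin": {"$geometry": polygon}}}


def intersects_query(field, geometry):
    """Documents whose geometry intersects the supplied geometry."""
    return {field: {"$geoIntersects": {"$geometry": geometry}}}


def near_query(field, point, max_distance_m):
    """Points nearest to the given point, limited to a distance in meters."""
    return {field: {"$near": {"$geometry": point,
                              "$maxDistance": max_distance_m}}}


# Hypothetical area and position, chosen only for the demo.
area = geojson_polygon([(77.55, 12.95), (77.65, 12.95),
                        (77.65, 13.05), (77.55, 13.05)])
here = geojson_point(77.59, 12.97)

q_within = within_polygon_query("location", area)
q_intersects = intersects_query("location", here)
q_near = near_query("location", here, 500)

print(json.dumps(q_near))
```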

    Ruminating on SSL handshake using Netty

    I was again amazed at the simplicity of using Netty for HTTPS (SSL) authentication. The following examples are a good starting point for anyone interested in writing an HTTP server supporting SSL.

    Adding support for 2-way SSL authentication (aka mutual authentication) is also very simple. Here are some hints on how to get this done.