Thursday, December 24, 2015

List of Prototyping tools

Found a good list of prototyping tools on cooper.com.

Liked the way they have compared the tools on various parameters :)

Saturday, December 12, 2015

PhoneGap vs. Cordova

Many folks are still confused about the distinction between PhoneGap and Apache Cordova. There are so many articles on the web that further blur our minds :)

I found this blog post on the Ionic site to be the most accurate comparison of PhoneGap vs Cordova.
Jotting down some snippets from the blog.

"PhoneGap is Cordova plus extra Adobe stuff. At first, the differences between Cordova and PhoneGap were minimal. But Adobe always had plans to build out a proprietary set of services around the PhoneGap ecosystem, and has started to execute on that plan with PhoneGap Build."

Building cool hybrid mobile apps - Lessons from Basecamp

Basecamp has successfully utilized hybrid techniques for building their iOS and Android apps. The following links are worth a perusal to understand the techniques they used.

https://signalvnoise.com/posts/3743-hybrid-sweet-spot-native-navigation-web-content

https://signalvnoise.com/posts/3438-drawing-the-nativeweb-line-in-basecamp-for-iphone

https://signalvnoise.com/posts/3766-hybrid-how-we-took-basecamp-multi-platform-with-a-tiny-team


PhoneGap on its blog has made an interesting distinction between two types of Hybrid applications -

  1. Web Hybrid: This is the default approach that PhoneGap takes. You package your app as HTML5/CSS3 and then run it in a thin native web-view container. All the UI controls are HTML or JavaScript controls.
  2. Native Hybrid: In this approach, you build a native app and use native controls for navigation, menus, etc. But most of the content pages are HTML views rendered in a web view. The HTML content can come from the local store or the server. 

Friday, December 11, 2015

Apple enforcing HTTPS connections using TLS v1.2

In iOS 9, Apple has implemented a new security feature called ATS (App Transport Security), which is enabled by default.

So what is ATS? In simple words, ATS forces all HTTP requests to be made over SSL/TLS - i.e. any API call your app makes to the backend servers must be over HTTPS. If you want to make an unsecured HTTP call, then you have to explicitly list those exceptions in your Info.plist file.

ATS also enforces the latest protocol version of TLS - i.e. Transport Layer Security version 1.2. This can cause issues if your server is using HTTPS, but an older version of TLS. In such cases, you have two options - either upgrade your server to use the latest TLS protocol or add an exception to your app for these URLs.
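
As an illustration, here is a minimal sketch of such an exception in Info.plist (the domain 'legacy.example.com' is just a placeholder for your non-compliant server):

    <key>NSAppTransportSecurity</key>
    <dict>
        <key>NSExceptionDomains</key>
        <dict>
            <key>legacy.example.com</key>
            <dict>
                <!-- allow plain HTTP for this domain only -->
                <key>NSExceptionAllowsInsecureHTTPLoads</key>
                <true/>
                <!-- or, if the server speaks HTTPS but only an older TLS version -->
                <key>NSExceptionMinimumTLSVersion</key>
                <string>TLSv1.0</string>
            </dict>
        </dict>
    </dict>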

More details on ATS can be found here

Thursday, December 10, 2015

Git vs GitHub vs GitHub Enterprise

A lot of folks get confused over the differences between Git and GitHub and use the words interchangeably. Also, when folks talk about GitHub, it is assumed that it is only available on the public cloud and cannot be hosted on-premises.

Git is essentially a distributed version control system. It is called 'distributed' because we can use it locally and disconnected from the internet and then push our changes to another centralized repository (such as GitHub, Team Foundation Server, CodePlex, etc.) as and when required.
For a good comparison of centralized vs distributed source control systems, please read this blog post.

GitHub is a hosted service (public cloud) that can host your repositories and allows you to access them via a web-based interface. It is possible to use Git without GitHub, but then your repositories live only on your local machine (or on a server you host yourself). Hence, to collaborate and work in a team, most folks use a hosted service such as GitHub.
In the free plan of GitHub, you can create any number of public repositories with unlimited collaborators. In the paid plans, you can also create private repositories.
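
As a quick sketch of this local-then-push workflow (the repository URL below is a placeholder):

    git init                                      # create a local repo - works fully offline
    git add .
    git commit -m "first commit"                  # commit locally, no network needed
    git remote add origin https://github.com/youruser/yourrepo.git
    git push origin master                        # sync to the central repo when connected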

GitHub Enterprise is the on-premises version of GitHub, which you can deploy and manage in your own, secure environment (private cloud).

Tuesday, December 08, 2015

Ruminating on the UX Design Process

Centerline has published a neat infographic illustrating the UX design process. While there are a lot of UX-related infographics on the net, I liked the simplicity and clear thought process of this one :)

When we create compelling user experiences for our customers, we follow a similar process.

  1. Gain a deeper understanding of the customer and the industry segment the customer operates in. Who are their end-customers? What is the market positioning of their product? 
  2. Based on customer segmentation, create personas and user journey maps. 
  3. Create a high level information architecture
  4. Create low fidelity prototypes (mockups) using Visio, PowerPoint, etc.
  5. After review, create high fidelity dynamic prototypes using tools such as iRise, Axure, etc. Work with Visual/Graphic Designers during this phase. 
  6. Once the application is developed, do a usability test using tools such as TechSmith Morae. Create a feedback loop for UX changes that gets incorporated in the next agile release. 
  7. Make sure that your UX team and Web/Mobile Analytics teams are working in tandem to resolve all UX concerns and improve the customer experience. 



Monday, December 07, 2015

Markdown and Pandoc

Over the past decade, the simple Markdown text formatting syntax has gained a lot of popularity. Many bloggers and web writers have shifted to using Markdown, though a few still use word processors or WYSIWYG editors.

A good introduction to Markdown can be found here - http://readwrite.com/2012/04/17/why-you-need-to-learn-markdown

John Gruber, the inventor of Markdown, gives the below explanation for creating Markdown -

"The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible. The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions."

This is the reason that Markdown has become popular with web writers who publish their content onto the web or other digital channels.

Markdown text can be converted to HTML or many other formats (e.g. PDF, Word, etc.) using tools such as Pandoc. There are also online editors for Markdown, such as http://dillinger.io/, that show you the formatted HTML side-by-side. Blogging platforms such as WordPress have also started supporting Markdown syntax. Even the Ghost blogging platform supports Markdown.

But there are limitations on what you can do in Markdown when it comes to complex formatting. Hence the format allows you to embed HTML code inline whenever you need something more complex.

It's important to understand that Markdown is good for creating content, but it would not be a good fit as a general-purpose website creation tool. As John Gruber says in his philosophy:

"Markdown is not a replacement for HTML, or even close to it...The idea for Markdown is to make it easy to read, write, and edit prose."

Pandoc can also be used for the reverse translation - i.e. to convert HTML or Word (docx) files to the Markdown format.
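
A few sample Pandoc invocations (the file names here are illustrative):

    pandoc -s notes.md -o notes.html             # Markdown to HTML
    pandoc -s notes.md -o notes.docx             # Markdown to Word
    pandoc -s page.html -t markdown -o page.md   # HTML back to Markdown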

Ruminating on SSL and encrypted URLs

Recently, a colleague of mine asked an innocuous question that was quite interesting. We all knew that SSL protects the URL and hence it is not possible to snoop details out of the URL - e.g. GET params, the resource path on the server, etc.

But if the URL is encrypted by SSL, how does DNS work? How will the DNS server route the request to the right server?

The way it happens is as below:

  1. When an HTTP client (browser / API client) makes a request to an HTTPS URL, it only sends the server-name part of the URL to the DNS server. E.g. if you are making a request to https://myserver.com/secret_path/secret_resource, then the HTTP client would only send 'myserver.com' to the DNS server for lookup. 
  2. The DNS server responds back with the actual IP address of the server. 
  3. The HTTP client then makes a call to the server using the IP address. What follows is the SSL handshake protocol and a secure connection is established with the server.
  4. The HTTP client then requests the actual resource over this secure pipe/tunnel. 

Quite simple actually, if you break down the steps :)
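
A minimal Java sketch of this flow (using the same placeholder URL as above):

    import java.net.InetAddress;
    import java.net.URL;
    import javax.net.ssl.HttpsURLConnection;

    public class SecureUrlDemo {
        public static void main(String[] args) throws Exception {
            // Step 1: only the host name goes to the DNS server
            InetAddress address = InetAddress.getByName("myserver.com");
            System.out.println("Resolved IP: " + address.getHostAddress());

            // Steps 3-4: the path '/secret_path/secret_resource' travels only inside the TLS tunnel
            URL url = new URL("https://myserver.com/secret_path/secret_resource");
            HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
            System.out.println("Response code: " + conn.getResponseCode());
        }
    }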

Whitepaper on APIs - Digital Glue in the new economy

Recently I coauthored a whitepaper on APIs and the important role they play in the digital economy.

Link - http://www.syntelinc.com/insights-and-resources/publications/api-management-platforms-digital-glue-api-economy

Saturday, December 05, 2015

Orchestrating Microservices using Event Driven Architecture (EDA) paradigms

If you follow the microservices architecture style, you would have a bunch of services running in their own independent process space.

But how do you orchestrate services for a given business workflow - e.g. a business transaction that spans multiple calls to microservices?
Point-to-point integrations would result in 'Dependency Hell'. If each microservice calls the other microservices directly over HTTP API calls, then very soon we have a spaghetti of API dependencies.

One simple design pattern to resolve this is the EDA (Event Driven Architecture) paradigm. Each microservice does its job and then publishes an event. Other microservices subscribe to the event and act on it as necessary.

This pub/sub model results in loose coupling between the services and makes the system much more maintainable. A good blog post covering this paradigm in more detail is present here
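
To make the idea concrete, here is a toy Java sketch (the event bus, event names and services are all hypothetical; a real system would use a message broker such as RabbitMQ or Kafka):

    import java.util.*;
    import java.util.function.Consumer;

    // A toy in-process event bus - stands in for a real message broker
    class EventBus {
        private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

        void subscribe(String eventType, Consumer<String> handler) {
            subscribers.computeIfAbsent(eventType, k -> new ArrayList<>()).add(handler);
        }

        void publish(String eventType, String payload) {
            subscribers.getOrDefault(eventType, Collections.emptyList())
                       .forEach(handler -> handler.accept(payload));
        }
    }

    public class EdaDemo {
        public static void main(String[] args) {
            EventBus bus = new EventBus();
            // The shipping service never calls the order service directly - it just subscribes
            bus.subscribe("OrderPlaced", orderId -> System.out.println("Shipping order " + orderId));
            // The order service does its job and publishes an event
            bus.publish("OrderPlaced", "ORD-42");
        }
    }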

Wednesday, December 02, 2015

Ruminating on the 'Infrastructure as Code' paradigm

Setup (installation and configuration) of servers and other infrastructure components is a tedious process.

In the olden days, process-oriented teams created meticulous documentation on 'how to setup a server' - typically in a word document with screenshots and step-by-step instructions.
Folks then tried to automate some tasks using scripts - e.g. unix shell scripts/bash etc.

But today, in a cloud-first world, setup of servers and deployment of applications need to be completely automated. The whole premise of 'Infrastructure-as-Code' is to write code in a high level language (e.g. Java, Python, Ruby) or a DSL (domain specific language) to automate the provisioning of infrastructure and managing configurations.

So this goes beyond just writing simple scripts. You utilize all the best practices of agile development projects - i.e. version control, unit testing, iterative development, etc. The whole revolution happening in DevOps acted as a catalyst in promoting the concept of 'programmable infrastructure'. In DevOps, the core concept of 'You built it, You run it' promotes closer collaboration between the development teams and IT ops team.

Popular tools such as Ansible, Kubernetes, Puppet, Chef, etc. can be used to automate your complete deployment cycle and help you achieve Continuous Delivery. 
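
E.g. a minimal Ansible playbook sketch (the host group and package names are illustrative) - a plain text file that can be version-controlled and tested like any other code:

    - hosts: webservers
      become: yes
      tasks:
        - name: Install nginx
          apt: name=nginx state=present
        - name: Ensure nginx is running and starts on boot
          service: name=nginx state=started enabled=yes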

List of Microservices resources

Paweł Pacana has compiled a long list of 72 useful resources to learn about Microservices :)

The link to the list of resources is here - http://blog.arkency.com/2014/07/microservices-72-resources/

It's important to understand that microservices is an architecture style and follows many of the best practices and principles of SOA. In fact, IMHO microservices is nothing but SOA done right! :)

Microservices is an architectural paradigm of building systems as a suite of independent services, each running in its own process space and communicating with each other using lightweight REST calls.

I found Martin Fowler's talk on Microservices an excellent source of information for beginners to learn about microservices.
The YouTube video is available here - https://www.youtube.com/watch?v=wgdBVIX9ifA

Tuesday, December 01, 2015

Weird laptop battery problems and ridiculous solutions

Laptops suffer from so many idiosyncratic battery problems, each with its own ritual of trial-and-error solutions that appear so weird!

Recently, I was facing a unique problem on my HP ProBook 440. The charger was connected, but the battery was not charging. The status showed - "Plugged in, not charging". I tried multiple options to rectify this - removing and re-inserting the battery, restarting the machine - but nothing worked.

The following steps posted on the HP forum finally did the trick.

- While the computer is running, disconnect the AC adapter from the outlet
- Restart the computer
- After it boots up, reconnect the AC adapter

Hope this helps someone in dire need :)

Wednesday, October 21, 2015

Classification of medical devices by FDA

In the US, the Food and Drug Administration (FDA) regulates any apparatus involved in diagnosing or treating disease.

While we were working on an IoT-enabled Diabetes Management Solution, we learned that the FDA classifies all medical devices into three categories - Class 1, Class 2 and Class 3.

  • Class 1 devices are low-risk devices and have minimal regulatory control, e.g. dental floss, lancets, etc. These devices must be listed in the FDA's medical device registry, but do not have a stringent approval process. 
  • Class 2 devices have higher risk and need stronger regulatory controls, e.g. blood glucose meters, test strips, insulin pumps, etc. 
  • Class 3 devices have the highest risk and therefore the highest level of regulatory control, e.g. heart valves, continuous glucose monitors, the artificial pancreas, etc. 

Monday, October 19, 2015

Digital in brick-n-mortar stores

While a lot of attention has been given to online experiences in digital transformation, there are a lot of opportunities in enhancing the in-store experience in brick-n-mortar stores.

Google has published some interesting stats on the use of smartphones within stores here - https://ssl.gstatic.com/think/docs/mobile-in-store_infographics.pdf

Some interesting stats:

  1. 84% of customers use their mobile phone in stores to help them shop. 
  2. Customers spend an average of 15 mins using their smartphones inside stores.
  3. Customers use their smartphones to search for products/services - 82% use a search engine, 62% use the store's website, 50% use the brand's website. 

Thus mobile has the power to transform the shopping experience in stores. Beacons can also be utilized to provide location-aware, contextual promotions to customers. 

SweetIQ has put up a neat infographic that illustrates how beacons can be used to enhance the in-store digital experience. 

Friday, October 09, 2015

Managing Database Versions in an Agile Project

Today we have a robust set of tools for code versioning, CI and release management for Java, .NET or Ruby web/REST applications. Examples of such tools are GitHub, Hudson, Jenkins, etc.

But what about the RDBMS? How do we manage it across the various environments - i.e. from development to integration to UAT to production? A good illustration of the typical challenges is given here.

Flyway is a tool that addresses these problems. Using simple techniques such as a schema-version table and automatically applied DB scripts (that follow a naming convention for sequence tracking), the tool can help any Agile project in managing RDBMS instances across different environments. It would also be a nifty addition to your existing DevOps tools. 
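
As a sketch (file and database names are illustrative), Flyway picks up versioned SQL scripts by their file names and applies the pending ones in order, recording each in the schema-version table:

    sql/V1__create_customer_table.sql
    sql/V2__add_email_column.sql

    flyway -url=jdbc:mysql://localhost/mydb -user=dev migrate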

Sunday, October 04, 2015

Service Discovery Mechanisms in Microservices

In a microservices-based architecture, we would not know the number of instances of a service or their IP addresses beforehand. This is because microservices typically run in VMs or Docker containers that are dynamically spawned based on the usage load.

So consumers would need some kind of service discovery mechanism to communicate with microservices. There are two options to design this -

a) Server-side Service Discovery - Here the consumers make a request to a load-balancer/service registry and then the request is routed to the actual service end-point. This paradigm is clearly explained on this blog here. An example of this design pattern is the AWS Elastic Load Balancer.


b) Client-side Service Discovery - Here the consumers use a small library for making service calls. This library makes calls to the service registry and obtains the load-balanced actual service end-point. Netflix uses this approach and its service registry is called Eureka and its client library is called Ribbon.
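
A simplified Java sketch of the client-side approach (the registry interface here is hypothetical; Eureka and Ribbon are production-grade versions of the same idea):

    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical registry client - this is the role Eureka plays at Netflix
    interface ServiceRegistry {
        List<String> getInstances(String serviceName); // e.g. ["10.0.0.5:8080", "10.0.0.6:8080"]
    }

    // Naive round-robin load balancing - similar in spirit to what Ribbon does
    class ServiceClient {
        private final ServiceRegistry registry;
        private final AtomicInteger counter = new AtomicInteger();

        ServiceClient(ServiceRegistry registry) {
            this.registry = registry;
        }

        String endpointFor(String serviceName) {
            List<String> instances = registry.getInstances(serviceName);
            return "http://" + instances.get(counter.getAndIncrement() % instances.size());
        }
    }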



Saturday, October 03, 2015

Handling failures and improving resilience in microservices

In a microservices architecture, one has to build services that can handle failures, e.g. if a microservice calls another dependent microservice that is down, then we need to handle this using timeouts and implement the Circuit Breaker pattern.

Netflix has open-sourced an incredibly useful library called Hystrix to solve such problems. Anyone building large-scale distributed architectures on the Java platform would find Hystrix a boon. When you make a remote service call through the Hystrix libraries, it does the following:

  1. If the remote service call does not return within a specified threshold, Hystrix times-out the call.
  2. If a service is throwing errors and the number of errors exceed a threshold, then Hystrix would trip the circuit-breaker and all requests would fail-fast for a specified amount of time (recovery period)
  3. Hystrix enables developers to implement a fall-back action when a request fails, e.g. returning a default value, a null value or a cached value. 
The full operating model of Hystrix is explained in great detail on the GitHub wiki. 
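
As a sketch, wrapping a remote call in a Hystrix command looks roughly like this (the user-service call is a placeholder):

    import com.netflix.hystrix.HystrixCommand;
    import com.netflix.hystrix.HystrixCommandGroupKey;

    public class GetUserNameCommand extends HystrixCommand<String> {
        private final String userId;

        public GetUserNameCommand(String userId) {
            super(HystrixCommandGroupKey.Factory.asKey("UserService"));
            this.userId = userId;
        }

        @Override
        protected String run() throws Exception {
            // the remote call that may be slow or fail (placeholder)
            return callRemoteUserService(userId);
        }

        @Override
        protected String getFallback() {
            // invoked when the call times out, errors out or the circuit is open
            return "guest";
        }

        private String callRemoteUserService(String userId) {
            return "alice"; // placeholder for a real HTTP/REST call
        }
    }

    // Usage: String name = new GetUserNameCommand("42").execute();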

It was also interesting to learn that the tech guys at Flipkart have taken Hystrix and implemented a service proxy on top of it called 'Phantom'. Looks like the advantage of using Phantom is that your consumers do not have to code against the Hystrix libraries. 

Ruminating on SemVer

Semantic Versioning (aka SemVer) of components has become mainstream today. The official page laying out the guidelines is available here - http://semver.org/

Following SemVer, each component has a three-part version number in the format 'Major.Minor.Patch' - e.g. 2.3.23
  • You increment the major version, when you make incompatible changes. 
  • You increment the minor version, when you make changes but those changes are backward compatible.
  • The patch digit is incremented when you just make a bug-fix and it is obviously backward compatible.
  • With SemVer, pre-releases can be defined by appending a hyphen and an identifier such as 'alpha' or 'beta'. E.g. a pre-release for version 3.0.0 could be 3.0.0-alpha.1. 
Following SemVer is a boon in managing dependencies between components. So if component A is using version 4.2.3 of component B, then you know that as long as component B's version does not become 5.x.y, there would be no breaking changes. You can specify such dependencies in the manifest file of a component.
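
E.g. in an npm-style manifest (the component name is illustrative), a caret range encodes exactly this rule - "^4.2.3" matches any version >= 4.2.3 but below 5.0.0:

    {
      "dependencies": {
        "component-b": "^4.2.3"
      }
    }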

While using SemVer for software components is fine, does it make sense to have the x.y.z version in the URL of public APIs?
APIs are the interfaces you expose to your consumers. Do your consumers really need to know about the bug fixes you have made? Or the new features you have added? Maybe yes, maybe no!
IMHO, just using a single version number in your API URL would suffice for the majority of real-life business use cases, e.g. https://api.company.com/v1/customer.

A good blog post by Apigee on API versioning is available here. As stated in the blog - "Never release an API without a version and make the version mandatory."

If you want to constrain the amount of information that you get back from the API (e.g. for mobile clients on slow networks), then you can follow this strategy. Another alternative is to look at newer options such as GraphQL.

Ruminating on Netflix Simian Army

A friend of mine introduced me to a suite of powerful tools used at Netflix for testing the resilience and availability of their services. The suite is called the 'Simian Army' - essentially a collection of tools such as 'Chaos Monkey', 'Latency Monkey', 'Security Monkey', etc.

I was aware that Netflix runs its entire IT infrastructure on AWS and was happy to hear that all the tools are available on Github here - https://github.com/Netflix/SimianArmy/wiki

A good introduction to the genesis behind these tools is given on the Netflix blog here - http://techblog.netflix.com/2011/07/netflix-simian-army.html

Another interesting blog on the lessons that Netflix learned after migrating to AWS is available here.


Wednesday, September 16, 2015

Ruminating on Apple's DEP

Apple's Device Enrollment Program (DEP) makes it easy for enterprises to roll out the deployment of their Apple devices to their employees, agents, partners, etc.

DEP helps in automating the enrollment of devices into an MDM (Mobile Device Management) platform. The enterprise can also streamline the initial set-up process and modify it to suit its needs.

For any organization embarking on a mobile strategy, it is worthwhile to check if the selected MDM platform has support for DEP. 

Tuesday, September 15, 2015

Advantage of using Story Points instead of hours

Using story points for estimating user stories is helpful because it encourages us to use 'relative sizing' - estimating the 'size of work' and not the real effort required.

Mike Cohn has given a good analogy by relating this concept to running a trail. Two people can agree on the fact that the trail is 5 miles long, but one may take 30 mins and the other may take 45 mins.

During the Planning Poker game, each developer is given cards with numbers 1,2,3,5,8 on them. Then the Scrum Master and Product Owner take the effort sizing from all developers to arrive at a consensus.

The Fibonacci scale is quite popular for estimating user-story or epic size, as there is sufficient difference between the numbers to prevent confusion. E.g. if the scale were sequential, there would be debates over a sizing of 6 vs. 7 vs. 8. A Fibonacci scale makes relative sizing easy. 

Do we need a dedicated Scrum Master?

The need for a full-time Scrum Master is often a topic of hot debate in many Agile projects. Based on the numerous agile projects that we have successfully executed, I would give the following recommendations -

  • If your team is adopting SCRUM for the first time, then it is better to have a full-time Scrum Master. He would be responsible for ensuring that all agile processes are followed and everyone understands the rules of the game. The Scrum Master essentially acts as an evangelist educating teams on all aspects on SCRUM.
  • Once the teams have become comfortable with SCRUM processes, then we can have a part-time Scrum Master. IMHO, the technical architect or tech lead is most suited to play this role.
  • One of the main functions of a Scrum Master is to remove all impediments that the team faces. To be successful in this role, you need someone who can understand the technical complexities, business drivers and has a good rapport with the product owner. Hence architects are a good fit for the role of a Scrum Master. 
  • The Scrum Master also facilitates the daily Scrum and the weekly Scrum of Scrums, fostering collaboration across teams. He also leads the retrospectives and facilitates combined learning. 

Static code analyzers for native mobile app development

Listing down the tools used by my mobility team for static code analysis of mobile apps.

For iOS, the most popular tool is Clang. The default IDE (Xcode) also comes with a static code analyzer in-built in the IDE.

Sonar also provides a commercial plug-in for Objective-C that can be very useful if you are already using Sonar for all other platforms. There is another open-source Sonar plug-in for Objective-C available here - https://github.com/octo-technology/sonar-objective-c

For Android, the most popular static code analyzer is lint. Lint integrates very well with Eclipse and Android Studio.

Facebook recently released an open-source static code analyzer for Android and iOS called Infer. Facebook uses Infer to detect bugs in its Android and iOS apps. 

Ruminating on Less and Sass

CSS has been a boon to all web developers and allows for the clear separation of presentation from HTML markup. But CSS comes with its own limitations, e.g.
  • CSS does not have the ability to declare variables. Hence if you want a color to be used across multiple element types, you have to repeat the color. 
  • CSS does not support nesting of properties. Hence we end up repeating the code again and again. 
To counter these limitations, new languages have cropped up that are known as 'CSS-extension' languages. These languages support variables, nesting, etc. and make it super-easy to define themes in CSS.

Two of the most popular CSS-extension languages are Less and Sass. These languages are compiled into pure CSS before being deployed to production. 
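
A small Less sketch (variable and selector names are illustrative) showing both variables and nesting; Sass syntax is very similar:

    @brand-color: #0044cc;        // declared once, reused everywhere

    .header {
      color: @brand-color;
      .logo {                     // nested rule - compiles to '.header .logo'
        border: 1px solid @brand-color;
      }
    }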

Sunday, September 13, 2015

Ruminating on the timelessness of the Agile Manifesto

I had signed the Agile Manifesto a decade back (in 2005) and was amazed to realize how relevant its principal tenets are even today!

It is imperative for any software development project to imbibe the following principles to succeed -
  1. Individuals and interactions over processes and tools
  2. Working software over comprehensive documentation
  3. Customer collaboration over contract negotiation
  4. Responding to change over following a plan

Applying the Start-Stop-Continue paradigm to Sprint Retrospective

We all know the importance of Retrospective meetings after a Sprint. This is an excellent time to reflect on what worked, what did not work and what areas need improvement.

A simple way to conduct a retrospective with the entire team is to follow the Start-Stop-Continue model. You ask each team member to articulate -

  • what according to him/her should we start doing, 
  • what should we stop doing and 
  • what should we continue doing (with some changes if required). 

Then after collecting everyone's views, the team should brainstorm and debate around all the ideas presented and select the top 3 or 5 ideas that they would implement in the next sprint.

Many teams start skipping the retrospective if their project is running smoothly, but it is important to remember that there is always scope for improvement, no matter how well your team is currently functioning. 

Wednesday, August 05, 2015

Ruminating on Data Lake

Anyone seeking to understand the concept of a Data Lake should peruse the wonderful article by Martin Fowler on the topic - http://martinfowler.com/bliki/DataLake.html

Jotting down important points from the article -

  1. Traditional data warehouses (data marts) have a fixed schema - it could be a star schema or a snowflake schema. But having a fixed schema imposes many restrictions on data analysis. A Data Lake is essentially schema-less. 
  2. Data warehouses also typically cleanse the incoming data and improve the data quality. They also aggregate data for faster reporting. In contrast, a Data Lake stores raw data from source systems. It is up to the data scientist to extract the data and make sense of it. 
  3. We still need Data Marts - Because the data in a data lake is raw, you need a lot of skill to make any sense of it. You have relatively few people who work in the data lake; as they uncover generally useful views of data in the lake, they can create a number of data marts, each of which has a specific model for a single bounded context. A larger number of downstream users can then treat these lake-shore marts as an authoritative source for that context.

Monday, July 27, 2015

builtwith.com - A nifty tool

We used to use browser tools such as Firebug to find out 'backend' information about a particular site - e.g. what servers does it run on? What server-side web technology is being used? What web content management tool is being used? etc.

Found a nifty website that gives all this info in the form of a neat table - http://builtwith.com/
A useful tool to have in the arsenal for any web-master. 

Friday, July 24, 2015

Correlation does not imply Causation!

One of the fundamental tenets that any analytics newbie needs to learn is - Correlation does not imply Causation!

Using statistical techniques, we might find a relationship between two events, but that does not mean that the occurrence of one event causes the other. Jotting down a few amusing examples that I found on the internet.
  • The faster windmills are observed to rotate, the more wind is observed. Therefore, wind is caused by the rotation of windmills. 
  • Sleeping with one's shoes on is strongly correlated with waking up with a headache. Therefore, sleeping with one's shoes on causes headache.
  • As ice cream sales increase, the rate of drowning deaths increases sharply. Therefore, ice cream consumption causes drowning.
  • Since the 1950s, both the atmospheric CO2 level and obesity levels have increased sharply. Hence, atmospheric CO2 causes obesity.
  • The more firemen are sent to a fire, the more damage is done.
  • Children who get tutored get worse grades than children who do not get tutored
  • In the early elementary school years, astrological sign is correlated with IQ, but this correlation weakens with age and disappears by adulthood.
  • My dog is more likely to have an accident in the house if it’s very cold out.
A good site showcasing such spurious correlations is here - http://www.tylervigen.com/spurious-correlations

Thursday, July 23, 2015

Using the Solver add-in in Excel for finding optimal solutions

Today we learned about a nifty tool in Excel that can be used to find the 'maximum' or 'most optimal' solution to a problem, e.g. given a set of constraints, should we make cars or trucks?

The below links give a quick idea of how to use this tool to find optimal solutions and also carry out 'what-if' analysis. You enter the objective, constraint and decision-variable cells and let the tool do the magic.

http://www.excel-easy.com/data-analysis/solver.html

http://www.solver.com/excel-solver-help

Wednesday, July 15, 2015

How can large enterprises compete with new-age digital startups?

Chief Executive magazine recently featured an article by Nitin Rakesh on how large enterprises can compete with digital startups. The article is available at the following links:

http://issuu.com/chiefexecutive/docs/jul_aug2015
Retraining Goliath to face digital David

The article advises large enterprises to capitalize on their strengths - i.e.

a) Utilize financial power to acquire digital competitors - e.g. how Allstate acquired Esurance.
b) Leverage existing brand equity - e.g. how Amex partnered with Walmart to launch Bluebird.
c) Mine existing customer data - leverage customer insights to deliver highly personalized services.
d) If possible, collaborate rather than compete with digital startups.


Thursday, June 25, 2015

import.io - Next Generation Web Crawler

We had used many open source web crawlers in the past, but recently a friend of mine referred me to a cool tool at import.io

Import.io essentially parses the data on any website and structures it into a table of rows/columns - "Turn web pages into data". This data can be exported as a CSV file, and it also provides a REST API to extract the data. This kind of higher abstraction over raw web crawling can be extremely useful for developers.

We can use the magic tool for automatic extraction or use their free tool to teach it how to extract data. 

Ruminating on Email marketing software

Recently we were looking for mass email software for a marketing use-case. Jotting down the various online tools/platforms that we are currently evaluating.

  1. Mailjet - Has a free plan for 200 emails/day
  2. MailChimp - Has a free plan for 12000 emails/month
  3. Campaign Monitor
  4. Active Campaign 
  5. Salesforce Marketing Cloud 

APIs in Fleet Management

Fleet Management software is used by fleet owners to manage their moving assets. The software enables them to have a centralized data-store of their vehicle and driver information and also maintain maintenance logs (service and repair tracking).

The software also allows you to schedule preventive maintenance activities, monitor fuel efficiency, maintain fuel card records, calculate metrics such as "cost per mile", etc. You can also set up reminders for certification renewals and license expirations.

It was interesting to see Fleetio (a web-based fleet management company) roll out an API platform for their fleet management software. Their vision is to become a digital hub for all fleet-related stuff and turn their software product into a platform that can be leveraged by partners to create a digital ecosystem.

The API would allow customers to seamlessly integrate data in Fleetio with their operational systems in real time - e.g. pulling work orders from your fleet management system and pushing them to your accounting software in real time, pushing mileage updates from a bespoke remote application to your fleet management software, integrating driver records with payroll systems, etc. All the tedious importing and exporting of data is gone!

TomTom also has a web-based fleet management platform called WEBFLEET that provides an API (WEBFLEET.connect) for integration. The Fleetlynx platform also has an API to integrate with payroll and maintenance systems.


Saturday, June 20, 2015

Ruminating on bimodal IT

Over the past couple of years, Gartner has been evangelizing the concept of bimodal IT to organizations for succeeding in the digital age. A good note by Gartner on the concept is available here.

Mode 1, which refers to the traditional "run the business" model, focuses on stability and reliability.
Mode 2, which typically covers "change the business" initiatives, focuses on speed, agility, flexibility and the ability to operate under conditions of uncertainty.

Bimodal IT would also need resources with different skills. As an analogy, Mode 1 IT resources would be the marathon runners, whereas Mode 2 IT resources need to be like sprinters. It would be difficult for an IT resource to be both - there is a risk that he might end up as a mid-distance runner... and today's IT does not need mid-distance runners.

Tuesday, June 16, 2015

Ruminating on Section 508 Accessibility standards

In the UX world, you often come across phrases such as "compliance with Section 508". So what exactly is Section 508 and how does it relate to User Experience?

"Section 508" is actually an amendment to the Workforce Rehabilitation Act of 1973 and was signed into a law in 1998. This law mandates that all IT assets developed by or purchased by the Federal Agencies be accessible by people with disabilities. The law has stated web guidelines that should be followed while designing and developing websites.

It is important to note that Section 508 does not directly apply to private-sector websites or to public sites which are not U.S. Federal agency sites. But there are other forces at play that may compel an organization to make its websites accessible. The ADA (Americans with Disabilities Act), passed way back in 1990, prohibits any organization from discriminating on the basis of disability.
The following link gives examples of lawsuits filed for violation of the ADA - http://www.law360.com/articles/513033/doj-focuses-on-ada-compliance-in-the-digital-age

Beyond the legal regulations, there are also open initiatives aimed at improving the accessibility of websites. W3C has an initiative named "Web Accessibility Initiative (WAI)" that lays down standards and guidelines for accessibility. There is also a standard for content authoring called - "Web Content Accessibility Guidelines (WCAG)".

The following sites provide good reading material on Accessibility -


Jotting down the high level guidelines that should be followed for accessibility.

  1. A text equivalent for every non-text element shall be provided (e.g., via "alt", "longdesc", or in element content).
  2. Equivalent alternatives for any multimedia presentation shall be synchronized with the presentation. For e.g.  synchronized captions.
  3. Web pages shall be designed so that all information conveyed with color is also available without color, for example from context or markup. Color is not used solely to convey important information. Ensure that foreground and background color combinations provide sufficient contrast when viewed by someone having color deficits or when viewed on a black and white screen. 
  4. Documents shall be organized so they are readable without requiring an associated style sheet. If style-sheets are turned off, the document should still be readable. 
  5. Client-side image maps are used instead of server-side image maps. Appropriate alternative text is provided for the image as well as each hot spot area.
  6. Data tables have column and/or row headers appropriately identified (using the <th> element).
  7. Pages shall be designed to avoid causing the screen to flicker with a frequency greater than 2 Hz and lower than 55 Hz. No element on the page flashes at a rate of 2 to 55 cycles per second, thus reducing the risk of optically-induced seizures.
  8. When electronic forms are designed to be completed on-line, the form shall allow people using assistive technology to access the information, field elements, and functionality required for completion and submission of the form, including all directions and cues.
  9. When a timed response is required, the user shall be alerted and given sufficient time to indicate more time is required.

Friday, June 12, 2015

Implementing sliding window aggregations in Apache Storm

My team was working on implementing CEP (Complex Event Processing) capabilities using Apache Storm. We evaluated multiple options for doing so - one option was using a lightweight in-process CEP engine like Esper within a Storm Bolt.

But there was another option of manually implementing CEP-like aggregations (over a sliding window) using Java code. The following links show us how to do so.

http://www.michael-noll.com/blog/2013/01/18/implementing-real-time-trending-topics-in-storm/

Rolling Count Bolt on Github

While the above code would help in satisfying certain scenarios, it does not provide the flexibility of a CEP engine. We need to understand that CEP engines (like TIBCO BE, Esper, StreamInsight) are fundamentally different from Apache Storm, which is more of a highly distributed stream-computing platform.

A CEP engine provides you with SQL-like declarative queries and OOTB high-level operators like time windows, temporal patterns, etc. This brings down the complexity of writing temporal queries and aggregations. CEP engines can also detect patterns in events. But most CEP engines do not support a distributed architecture.

Hence it makes sense to combine CEP with Apache Storm - e.g. embedding Esper within a Storm bolt. The following links would serve as good references -

http://stackoverflow.com/questions/29025886/esper-ha-v-s-esper-storm
http://stackoverflow.com/questions/9164785/how-to-scale-out-with-esper/9776881#9776881
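
As a sketch, here is how a 30-second sliding-window average can be computed by embedding Esper (the API shown is as per Esper 5.x); in a Storm topology, the sendEvent() call would sit inside a bolt's execute() method:

    import com.espertech.esper.client.*;

    public class EsperSlidingWindowDemo {
        // A simple event type - Esper reads the JavaBean getters
        public static class StockTick {
            private final String symbol;
            private final double price;
            public StockTick(String symbol, double price) { this.symbol = symbol; this.price = price; }
            public String getSymbol() { return symbol; }
            public double getPrice() { return price; }
        }

        public static void main(String[] args) {
            EPServiceProvider engine = EPServiceProviderManager.getDefaultProvider();
            engine.getEPAdministrator().getConfiguration().addEventType(StockTick.class);

            // Declarative sliding-window query - no manual bookkeeping of time buckets
            EPStatement stmt = engine.getEPAdministrator().createEPL(
                "select symbol, avg(price) as avgPrice from StockTick.win:time(30 sec) group by symbol");
            stmt.addListener((newEvents, oldEvents) ->
                System.out.println(newEvents[0].get("symbol") + " -> " + newEvents[0].get("avgPrice")));

            // In a Storm bolt, each incoming tuple would be fed to the engine like this
            engine.getEPRuntime().sendEvent(new StockTick("ACME", 10.5));
        }
    }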

Monday, June 01, 2015

Ruminating on Shipping Containers and Docker

Today during one of the lectures at IIMB, I was introduced to a book called 'The Box' by Marc Levinson.

The book narrates the story of how the invention of the shipping container completely changed the face of global commerce. A snippet from the book -

"the cost of transporting goods was decisive in determining what products they would make, where they would manufacture and sell them, and whether importing or exporting was worthwhile. Shipping containers didn't just cut costs but rather changed the whole economic landscape. It changed the global consumption patterns, revitalizing industries in decay, and even allowing new industries to take shape."

A nice video explaining the same is available on YouTube - https://www.youtube.com/watch?v=IDmLEFDDd-c

A similar revolution is happening in the IT landscape by means of a new software container technology called Docker. In fact, the logo of Docker contains an image of shipping containers :)

Docker provides an additional layer of abstraction (through a Docker engine, a.k.a. Docker server) that can run a Docker container containing any payload. This has made it really easy to package and deploy applications from one environment to another.

A Docker container encapsulates all the code and dependencies required to run an application. Containers are quite different from virtualization technology. A hypervisor running on a 'Host OS' essentially loads an entire 'Guest OS' and then runs the apps on top of it. In the Docker architecture, you have a Docker engine (a.k.a. Docker server) running on the Host OS. Each Docker server can host many Docker containers. Docker clients can remotely talk to Docker servers using a REST API to start/stop containers, patch them with new versions of an app, etc.
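
A quick sketch of this workflow from the client's point of view (the image name is illustrative):

    docker pull nginx                # client asks the engine to fetch an image
    docker run -d -p 8080:80 nginx   # engine starts a container from the image
    docker ps                        # list running containers
    docker stop <container-id>       # stop the container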

A good article describing the differences between them is available here - http://scm.zoomquiet.io/data/20131004215734/index.html


All docker containers are isolated from each other using the Linux Kernel process isolation features.

In fact, it is these OS-level virtualization features of Linux that have enabled Docker to become so successful.

Other OSes such as Windows or MacOS do not have such features as part of their core kernel to support Docker. Hence the current way to run Docker on them is to create a lightweight Linux VM (boot2docker) and run Docker within it. A good article explaining how to run Docker on MacOS is here - http://viget.com/extend/how-to-use-docker-on-os-x-the-missing-guide

Docker was so successful that even Microsoft was forced to admit that it was a force to reckon with!
Microsoft is now working with Docker to enable native support for docker containers in its new Nano server operating system - http://thenewstack.io/docker-just-changed-windows-server-as-we-know-it/

This, IMHO, is going to be a big game-changer for MS and would make its server OS a strong contender for cloud infrastructure. 

Ruminating on bare metal cloud environments

Virtualization has been the underpinning technology that powered the Cloud revolution. In a typical virtualized environment, you have the hypervisor (virtualization software) running on the Host OS. These types of hypervisors are called "Type 2 hypervisors".

But there are hypervisors that can be installed directly on the hardware (i.e. bare metal). These hypervisors, known as "Type 1 hypervisors", do not need a host OS to run and have their own device drivers and other software to interact with the hardware components directly. A major advantage of this is that any problems in one virtual machine do not affect the other guest operating systems running on the hypervisor.

The below image from Wikipedia gives a good illustration.


Thursday, May 14, 2015

Ruminating on Apple HealthKit backup

While my team was working on the Apple HealthKit iOS APIs, we came to know a few interesting things that many folks are not aware of. Jotting down our findings -
  • HealthKit data is only locally stored on the user's device
  • HealthKit data is not automatically synced to iCloud - even if you have enabled iCloud syncing for all apps. 
  • HealthKit data is not backed up as part of a normal device backup in iTunes. So if you restore your device, all HealthKit data would be lost!
  • HealthKit is not available on iPads. 
The only way to take a backup of HealthKit data is to enable "encrypted backup" in iTunes. If this option is selected, then your HealthKit data gets backed up too.

Another interesting point from a developer's perspective is that the HealthKit store is encrypted on the phone and is accessible by authorized apps only when the device is unlocked. If the device is locked, no authorized app can access the data during that time. But apps can continue sending data via the iOS APIs. 

Thursday, February 05, 2015

Comparing two columns in excel to find duplicates

Quite often, you have to compare two columns in Excel to find duplicates or 'missing' rows. Though there are many ways to do this, the following MS article gives a simple solution.

https://support.microsoft.com/kb/213367?wa=wsignin1.0
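
One common approach is a helper-column formula - a sketch, assuming the two lists are in columns A and B and the formula is dragged down a helper column starting at row 2:

    =IF(COUNTIF($B:$B, A2)=0, "Missing in B", "Duplicate")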

Depreciation of fixed assets in accounting

Would like to recommend the following site that gives a very simple explanation of the concept of depreciation in accounting. Worth a perusal for beginners.

Monday, January 26, 2015

Patient Engagement Framework for Healthcare providers

HIMSS (Healthcare Information and Management Systems Society) has published a good framework for engaging patients so as to improve health outcomes.

Patients want to be engaged in their healthcare decision-making process, and those who are engaged as decision-makers in their care tend to be healthier and have better outcomes. The whole idea is to treat patients not just as customers, but as partners in their journey towards wellness.

The following link provides a good reference for designing technology building blocks for improving patient experience.

Inform Me --- Engage Me --- Empower Me --- Partner with me

http://himss.files.cms-plus.com/HIMSSorg/NEHCLibrary/HIMSS_Foundation_Patient_Engagement_Framework.pdf

Ruminating on Open Graph Protocol

Ever wondered how some links on Facebook are shown with an image and a brief paragraph? I dug deeper to understand what Facebook was doing behind the scenes to visualize the link.

To my surprise, there was something called the "Open Graph Protocol" that defined a set of rules for telling Facebook how your shared content should be displayed on it.

E.g. we can add the following meta-tags in any web page and Facebook would parse these tags when you post the link to this page.


  • <meta property="og:title" content=""/>
  • <meta property="og:type" content=""/>
  • <meta property="og:url" content=""/>
  • <meta property="og:image" content=""/>
  • <meta property="fb:admins" content=""/>
  • <meta property="og:site_name" content=""/>
  • <meta property="og:description" content=""/>

More information can be found at this link - http://www.optimizesmart.com/how-to-use-open-graph-protocol/

Router blocking HTTPS traffic?

Recently I got a new cloud router for my broadband connection. Though the speed was very good, I was facing intermittent problems in accessing HTTPS sites - e.g. webmail would hang sometimes, payment gateway pages would not load, the Amazon app would not load screens, etc.

At first, I was not sure if the router was to blame, or the internet connection itself. A quick Google search revealed that this is a common problem with many routers and has to do with the MTU (Maximum Transmission Unit) size limit. I was surprised that the MTU size would affect HTTPS, which is an application-level protocol.

The following links show an easy method to find out the correct MTU size for your network using the ping command, e.g. ping www.google.com -f -l 1472

http://www.tp-link.com/lb/article/?faqid=190

http://www.tp-link.com/sa/article/?faqid=69

Thursday, January 15, 2015

Applying Analytics to Clinical Trials

The below link is a good article on using Big Data Analytics to improve the efficiency of clinical trials.

http://archive.expresspharmaonline.com/specials/pharma-technology-review/2126-leveraging-data-science-to-accelerate-clinical-trial-results

Snippets from the article -

"Recruiting patients has been a challenge for pharmaceutical companies. 90 per cent of trials are delayed with patient enrollment as a primary cause.
Effective target segmentation for enrollment is a key to success. Traditional methods of enrollment rely upon campaign and segmentation based on disease lines across wider populations. Using data science, we can look at the past data to identify proper signals and help planners with more precise and predictive segmentation.

Data scientists will look at the key attributes that matter for a given patient to successfully get enrolled. For each disease type, there may be several attributes that matter. For example, a clinical trial that is focused on a new diabetes medication targets populations’ A1C levels, age group, demographics, outreach methods, and site performance. Data science looks at the above attribute values for the target users past enrollment data and then builds ‘patient enrollment propensity’ and ‘dropout propensity’ models. These models can generate multi variant probabilities for predicting future success.

In addition to the above modeling, we can identify the target segment’s social media footprint for valuable clues. We can see which outreach methods are working, and which social media channels the ‘generation Googlers’ are using. Natural language processing (NLP) techniques to understand the target population’s sentiment on clinical trial sites, physicians, and facilities can be generated and coded into a machine understandable form. Influencer segments can be generated from this user base to finely tune campaign methods for improving effectiveness."

Thursday, January 08, 2015

Ruminating on the Open Bank Project

A colleague of mine introduced me to the 'Open Bank Project'. It's an interesting open source project to create a generic API layer on top of various core banking products. Third-party developers would then use these APIs to build cool mobile and social apps.

The open source project was started by a Berlin based company named Tesobe. Right now, they seem to have successfully built adapters/connectors to 3 German banks and are planning to add connectors for more banks. A full list is given here - https://api.openbankproject.com/connectors-status/

A few sample apps have been built using the Open Bank API - https://openbankproject.com/apps/

I think the concept is interesting, but the challenge would be to build the connectors to the various core banking products out there. Will download the APIs from GitHub and evaluate them further.