Monday, March 17, 2014

Ruminating on Distributed Logging

Of late, we have been experimenting with different frameworks available for distributed logging. I recollect that a decade back, I had written my own rudimentary distributed logging solution :)

To better appreciate the benefits of distributed log collection, it's important to visualize logs as streams and not files, as explained in this article.
The most promising frameworks we have experimented with are:

  1. Logstash: Logstash combined with ElasticSearch and Kibana gives us a cool OOTB solution. Also Logstash is developed on the Java platform and was very easy to setup and start running. 
  2. Fluentd: Another cool framework for distributed logging. A good comparison between Logstash and Fluentd is available here
  3. Splunk: The most popular commercial tool for log management and data analytics. 
  4. GrayLog: A new kid on the block. Uses ElasticSearch. Need to keep a watch on this. 
  5. Flume: Flume's main goal is to deliver data from applications to Apache Hadoop's HDFS. It has a simple and flexible architecture based on streaming data flows.
  6. Scribe: Scribe is written in C++ and uses Thrift for the protocol encoding. This project was released as open-source by Facebook.