Thursday, May 31, 2012

Byte code instrumentation and the ORM magic

Most ORM tools use some kind of byte-code instrumentation to do the persistence magic behind the scenes. As an architect, it is important to understand what Hibernate or any other JPA tool does to the entity classes.

Hibernate 'enhances' entity classes at runtime using a byte-code library called Javassist. For example, it adds a '_dirty' flag to each field. It also adds a '_loaded' flag for each field to support lazy loading. A good blog explaining these concepts is here. Hibernate reads the XML configuration or obtains the annotations at runtime using reflection, and then applies the byte-code instrumentation.
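
To make the idea concrete, here is a rough, hand-written sketch of what a runtime-generated subclass could look like. This is not Hibernate's actual generated code: the names Customer, CustomerEnhanced and dirtyFields are made up for illustration, and a real enhancer would emit the equivalent byte-code at runtime instead of Java source.

```java
import java.util.HashSet;
import java.util.Set;

// A plain entity, as the application developer would write it.
class Customer {
    private String name;
    private String email;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }
}

// Hypothetical stand-in for the subclass an enhancer might generate.
// Every setter is intercepted so that the change can be recorded.
class CustomerEnhanced extends Customer {
    private final Set<String> dirtyFields = new HashSet<>();

    @Override
    public void setName(String name) {
        dirtyFields.add("name");   // remember that 'name' was modified
        super.setName(name);
    }

    @Override
    public void setEmail(String email) {
        dirtyFields.add("email");  // remember that 'email' was modified
        super.setEmail(email);
    }

    public boolean isDirty() { return !dirtyFields.isEmpty(); }
    public Set<String> getDirtyFields() { return dirtyFields; }
    public void clearDirty() { dirtyFields.clear(); }  // e.g. after a flush
}

class DirtyTrackingDemo {
    public static void main(String[] args) {
        CustomerEnhanced c = new CustomerEnhanced();
        c.setName("Alice");
        // Only the fields that were touched are reported as dirty.
        System.out.println("Dirty fields: " + c.getDirtyFields());
    }
}
```

With this kind of tracking, a persistence layer only needs to write the modified columns back to the database.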

There are various ways of doing byte-code instrumentation, using libraries such as CGLib, ASM, Javassist, etc.
This byte-code enhancement can be done at compile time or at run time. For Hibernate, apart from a few special cases that require compile-time 'enhancement' of the byte-code, all common scenarios can be satisfied with runtime instrumentation.

The following link gives a good overview of all the enhancement options available in JPA.
http://openjpa.apache.org/builds/1.2.1/apache-openjpa-1.2.1/docs/manual/ref_guide_pc_enhance.html

In the .NET world, NHibernate uses the LinFu or Castle DynamicProxy byte-code enhancement providers.
http://nhforge.org/blogs/nhibernate/archive/2008/11/09/nh2-1-0-bytecode-providers.aspx

Mapping between Entity Objects and DTOs

Very often, we need to map between our Entity objects and DTOs. This mapping code can be quite tedious to write.
There is a lot of heated debate on whether to use DTOs or just pass the entity objects directly to the view or web services. There are pros and cons to each approach. Some good links on this debate are listed here:

Data Transfer Object - MSDN

http://stackoverflow.com/questions/5216633/jpa-entities-and-vs-dtos

Pros and Cons of Data Transfer Objects 

If you are using popular ORM tools such as Hibernate, iBatis or any other JPA-compliant tool, then it may not even be possible to use the Entity objects directly in your service or presentation tier. This is because these ORM toolkits typically use some kind of byte-code instrumentation to do the persistence magic behind the scenes. A good link explaining this is available here.

To avoid the drudgery of writing the 'Adapter/Mapping' code for each Entity object and DTO object, we can use some cool AutoMapper tools. These tools use reflection to automatically map the properties of the source object to those of the target object. Custom mapping is supported using XML configuration or through code.

In the .NET world, there is a popular AutoMapper tool that has become the de facto standard for a lot of .NET projects. In the Java world, there are two popular alternatives - Dozer and ModelMapper.
I found Dozer to be more comprehensive, with some pretty good features. The usage is super-simple if you use the Singleton Wrapper and place the custom mapping file in the classpath.

If you are using the Spring Framework, then the 'BeanUtils' class has some simple static methods (such as copyProperties) to copy properties from one object to another.
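
To show roughly what such reflection-based mappers do under the hood, here is a minimal sketch that uses only the JDK. It is not how Dozer, ModelMapper or Spring are implemented; the class names (CustomerEntity, CustomerDto, SimplePropertyCopier) are made up for illustration, and it only handles 'get'/'set' pairs with exactly matching names and types.

```java
import java.lang.reflect.Method;

// Hypothetical entity with one field that should never reach the view.
class CustomerEntity {
    private long id;
    private String name;
    private String passwordHash;

    public long getId() { return id; }
    public void setId(long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getPasswordHash() { return passwordHash; }
    public void setPasswordHash(String passwordHash) { this.passwordHash = passwordHash; }
}

// Hypothetical DTO exposing only the safe properties.
class CustomerDto {
    private long id;
    private String name;

    public long getId() { return id; }
    public void setId(long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

class SimplePropertyCopier {
    // For every getXxx() on the source, call setXxx(value) on the target
    // if a setter with the same name and parameter type exists.
    static void copy(Object source, Object target) {
        for (Method getter : source.getClass().getMethods()) {
            String name = getter.getName();
            boolean isGetter = name.startsWith("get") && name.length() > 3
                    && getter.getParameterCount() == 0
                    && !name.equals("getClass");
            if (!isGetter) {
                continue;
            }
            try {
                Method setter = target.getClass()
                        .getMethod("set" + name.substring(3), getter.getReturnType());
                setter.invoke(target, getter.invoke(source));
            } catch (NoSuchMethodException e) {
                // The target has no matching property; skip it.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not copy property via " + name, e);
            }
        }
    }
}
```

Calling SimplePropertyCopier.copy(entity, dto) fills in id and name, while passwordHash is silently skipped because the DTO has no setter for it. The real libraries add the features this sketch lacks, such as type conversion, nested objects and custom mappings.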

Wednesday, May 30, 2012

Performance benchmarks

Every development project needs a formal performance engineering process - one that emphasizes early performance testing and benchmarking.

For performance benchmarks, it is recommended to do a shallow and wide implementation of a few critical use-cases and then run the load tests against the target hardware. These test results would help in some basic capacity planning.

But what if you have to do some initial rough capacity planning to allocate budgets, and do not have the time to do a formal benchmarking exercise? It is here that standard performance benchmarks help. These standard performance benchmarks take a sample transactional use-case (e.g. an Order Processing System) and run this workload on various platforms to gather statistics. There are two standards that are quite popular -

  1. TPC (Transaction Processing Performance Council) - a non-profit organization founded to define transaction processing and database benchmarks and to disseminate objective, verifiable TPC performance data to the industry. TPC-C is the benchmark for OLTP workloads.

  2. SPECjEnterprise2010 - SPECjEnterprise2010 is an industry-standard benchmark designed to measure the performance of application servers conforming to the Java EE 5.0 or later specifications.

Interesting results of the performance benchmarks on various hardware can be found here:
http://www.tpc.org/tpcc/results/tpcc_perf_results.asp
http://www.spec.org/jEnterprise2010/results/jEnterprise2010.html

For the past few years, the Java DayTrader application and its .NET equivalent, the StockTrader application, have been used by vendors to compare the performance of Java vs .NET on their respective platforms. Jotting down some links that point to some interesting debatable data :)

http://www.ibm.com/developerworks/opensource/library/os-perfbenchmk/index.html

http://blogs.msdn.com/b/wenlong/archive/2007/08/10/trade-benchmark-net-3-0-vs-ibm-websphere-6-1.aspx

http://msdn.microsoft.com/en-us/netframework/bb499684.aspx

https://cwiki.apache.org/GMOxDOC22/daytrader-a-more-complex-application.html

JavaDB (Derby) in JDK 1.6 and above

JDK 1.6 and above ship with a default pure Java database called "JavaDB". It is based on the open source Apache Derby project.

By default, on a Windows platform, JavaDB gets installed at "C:\Program Files\Sun\JavaDB".
Set the 'DERBY_HOME' environment variable to this path. Also add 'DERBY_HOME/bin' to the PATH environment variable.

There is a good tutorial here that should get you up and running with JavaDB in 10-15 mins :)

Derby does not have a default GUI admin tool, but one can use many third-party tools such as SQuirreL SQL Client and others. I think JavaDB provides a good alternative to MySQL for some scenarios.


Monday, May 28, 2012

What is a framework?

When someone says they have defined a "framework", what does it mean? Is a framework just a library of reusable components? Or is it something more?

There is a good article on CodeProject on the same topic - http://www.codeproject.com/Articles/5381/What-Is-A-Framework

The book "Applying UML and Patterns" by Craig Larman also gives a very good understanding of the concept. Jotting down snippets from both these resources, in my own words. One may consider them the 10 guiding principles while designing a framework.
  1. At the risk of oversimplification, a framework can be defined as a cohesive set of classes/interfaces that provide services for the core part of a logical subsystem. 
  2. A framework contains both concrete and abstract classes that define interfaces to conform to, and other object interactions.
  3. Frameworks usually allow the end-users to define sub-classes of existing framework classes for customization and extension of the framework services.
  4. A framework enforces adherence to a consistent design approach.
  5. A framework relies on the "Hollywood Principle" - "Don't call us, we will call you". This pattern is also called IoC (Inversion of Control).
  6. A framework makes it easier to work with complex technologies.
  7. A framework reduces/eliminates repetitive tasks.
  8. A framework is often re-usable across multiple scenarios - regardless of high level design considerations. Frameworks offer a higher degree of reuse - much more than individual classes.
  9. A framework forces the team to implement code in a way that promotes consistent coding, fewer bugs, and more flexible applications.
  10. A framework can be used as a software building block in the system architecture definition.
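
Points 2, 3 and 5 above can be illustrated with a tiny sketch. The names ReportJob and OrderReport are made up for this example and do not come from either of the resources; the idea is simply that the framework class owns the control flow and calls back into the application's subclass.

```java
import java.util.Arrays;
import java.util.List;

// Framework side: owns the control flow and defines the extension points.
abstract class ReportJob {
    // Template method: final, so the sequence of steps cannot be changed.
    public final String run() {
        StringBuilder out = new StringBuilder(header()).append('\n');
        for (String row : loadRows()) {          // "we will call you"
            out.append(formatRow(row)).append('\n');
        }
        return out.toString();
    }

    // Mandatory hook: the application must supply the data.
    protected abstract List<String> loadRows();

    // Optional hooks with sensible defaults.
    protected String header() { return "REPORT"; }
    protected String formatRow(String row) { return "- " + row; }
}

// Application side: only fills in the blanks and never drives the flow.
class OrderReport extends ReportJob {
    @Override
    protected List<String> loadRows() {
        return Arrays.asList("order-1", "order-2");
    }

    @Override
    protected String header() { return "ORDERS"; }
}
```

The application code just calls new OrderReport().run(); when and in which order the hooks are invoked is decided by the framework class. This is the difference from a plain library, where the application code calls the library and stays in control.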

Thursday, May 24, 2012

Eclipse Memory Analyzer

Read the following good reviews of the Eclipse Memory Analyzer. Looks like it can read both Sun JVM HPROF memory dumps and IBM JDK dumps.

http://memoryanalyzer.blogspot.in/2010/01/heap-dump-analysis-with-memory-analyzer.html

http://memoryanalyzer.blogspot.in/2010/02/heap-dump-analysis-with-memory-analyzer.html#more

http://www.eclipse.org/mat/

Some other interesting blogs that would help us resolve OOM errors :)

http://www.rallydev.com/engblog/2011/09/20/outofmemoryerror-fun-with-heap-dump-analysis/

http://www.rallydev.com/engblog/2012/03/16/java-memory-problems-why-is-my-heap-exhausted/

There is also a good article that contains sample code to simulate a Java OOM error and uses the Memory Analyzer tool to identify the root cause of the error - http://www.javacodegeeks.com/2012/05/gc-overhead-limit-exceeded-java-heap.html

C heap vs Java heap

Found this interesting discussion on Stack Overflow around the C heap and the Java heap.
A good read, and it's important to understand that the JVM is also ultimately a C program :)
 
http://stackoverflow.com/questions/78352/what-runs-in-a-c-heap-vs-a-java-heap-in-hp-ux-environment-jvms

Thursday, May 17, 2012

RAID basics

Found this good blog that explains, in simple terms, the various levels of RAID (Redundant Array of Independent Disks).

RAID 10 has become the de facto standard for relational databases, due to the excellent redundancy and performance it provides. In RAID 10 (also known as 1+0), blocks are mirrored and also striped.
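
A simplified way to picture "mirrored and also striped" is the block-placement sketch below. It is only a teaching model under some assumptions of mine: disks are numbered from 0, disks 2k and 2k+1 form a mirrored pair, and logical blocks (non-negative numbers) are striped across the pairs one block at a time. Real controllers stripe in larger chunks and may lay out the disks differently; the class name Raid10Layout is made up.

```java
class Raid10Layout {
    private final int mirrorPairs;

    Raid10Layout(int diskCount) {
        if (diskCount < 4 || diskCount % 2 != 0) {
            throw new IllegalArgumentException(
                    "RAID 10 needs an even number of disks, at least 4");
        }
        this.mirrorPairs = diskCount / 2;
    }

    // Striping: consecutive logical blocks go to consecutive mirrored pairs.
    // Mirroring: the block is written to both disks of the chosen pair.
    int[] disksFor(long logicalBlock) {
        int pair = (int) (logicalBlock % mirrorPairs);
        return new int[] { 2 * pair, 2 * pair + 1 };
    }

    // Position of the block on each of the two disks that hold it.
    long offsetOnDisk(long logicalBlock) {
        return logicalBlock / mirrorPairs;
    }

    // Every block is stored twice, so only half of the raw capacity is usable.
    long usableBlocks(long blocksPerDisk) {
        return blocksPerDisk * mirrorPairs;
    }
}
```

With four disks, block 0 lands on disks 0 and 1, block 1 on disks 2 and 3, and block 2 again on disks 0 and 1 at the next offset. Reads and writes are spread across the pairs (performance), and any single disk can fail without data loss (redundancy), at the cost of half the raw capacity.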