Tech Talk: 2011

Friday, December 30, 2011

OutOfMemoryError while using StringBuilder/StringBuffer

I was helping a friend debug an OutOfMemoryError in a Java web application. The program made heavy use of StringBuilder and was appending a large number of strings. An entire record set (containing thousands of records) was essentially converted into an XML string.
Strangely, when the OOM error occurred, there was still plenty of heap memory available. Further deep-dive debugging and some googling around taught a few important lessons.

Whenever the internal buffer capacity of a StringBuilder/StringBuffer is exceeded, it allocates a new character array of roughly twice the current size (in the Sun/OpenJDK implementation, twice the old capacity plus two) and copies the old contents across, so both arrays are briefly live at the same time. A good blog explaining this is here. Hence it is better to set the initial capacity of the StringBuilder to a reasonable value up front.
StringBuilder needs a contiguous block of memory for its backing array. E.g. you may have 20MB of free heap space, but it may be fragmented. Hence even a 5MB StringBuilder allocation may fail and result in an OOM error. Links to forums - Link 1 Link2 Link3
Try to use a 64-bit JVM on a 64-bit machine: the practical limit on heap size is far higher than on 32-bit, and the much larger address space makes it easier to find a contiguous block of memory.
Alter the design of the program to not store the string in memory, but in a file. Alternatively stream it directly to the HTTP response.
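
To make the first and last lessons concrete, here is a minimal sketch in Java. The class name XmlExport, the method names and the estimatedCharsPerRecord parameter are made up for illustration and are not from the application I was debugging. The first method pre-sizes the builder; the second never holds the whole document in memory and writes each record straight to a PrintWriter (which is also the type a servlet's response.getWriter() returns).

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

class XmlExport {

    // Option 1: pre-size the builder so it does not repeatedly grow and copy.
    // estimatedCharsPerRecord is a caller-supplied guess (hypothetical parameter).
    // Note: no XML escaping is done here, to keep the sketch short.
    static String buildInMemory(List<String> records, int estimatedCharsPerRecord) {
        StringBuilder sb = new StringBuilder(records.size() * estimatedCharsPerRecord + 32);
        sb.append("<records>");
        for (String r : records) {
            sb.append("<record>").append(r).append("</record>");
        }
        sb.append("</records>");
        return sb.toString();
    }

    // Option 2: write each record to the output as it is produced,
    // so no single large character array is ever needed.
    static void stream(List<String> records, PrintWriter out) {
        out.write("<records>");
        for (String r : records) {
            out.write("<record>");
            out.write(r);
            out.write("</record>");
        }
        out.write("</records>");
        out.flush();
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "c");
        System.out.println(buildInMemory(records, 32));

        // A StringWriter stands in for a file or HTTP response here.
        StringWriter sink = new StringWriter();
        stream(records, new PrintWriter(sink));
        System.out.println(sink);
    }
}
```

The streaming variant is usually the one to aim for with large record sets, since memory use stays flat no matter how many records there are.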

Friday, December 02, 2011

Taxonomy of Services

In one of my previous posts, I had blogged about creating a taxonomy of services using functional categorization.
E.g. Entity Services, Task/Activity Services, Process Services and Infrastructure Services.

But services can also be categorized from different perspectives, such as layers or intent of use. For example:
Categorization based on Service Layer:

1. Business Services: Represent high level business functions that define an enterprise.
2. Application Services: Application specific and usually will be aggregated in a composite service at the business level.
3. Infrastructure Services: Utility functions that deliver cross cutting functions.

Categorization based on scope:
1. Enterprise Services: Multiple LOBs use the service.
2. Domain Services: Applicable only within a LOB.
3. Application Services: Local to the App Level.

Thursday, November 17, 2011

Analysis vs Design

This age-old debate keeps cropping up every now and then :)
Found a couple of good articles reflecting on the difference between the two.

http://butunclebob.com/ArticleS.UncleBob.AnalysisVsDesign

http://devhawk.net/2004/03/30/analysis-vs-design-modeling/

Thursday, November 03, 2011

Oracle Web Service Manager vs Oracle Enterprise Gateway

Oracle Web Service Manager is an integral part of the Oracle SOA suite and it allows us to implement security declaratively, without any coding from the developer. Security policies can be enforced at run-time using WSM agents or WSM gateways.
There is another product called "Oracle Enterprise Gateway" that has features that overlap with OWSM, which results in a lot of confusion.

So, let's understand the concepts one by one. A WSM agent is a component that is installed alongside the endpoint service, so it provides 'last-mile' security (the last security layer). OWSM also has a gateway component, where security policies can be enforced at a central location. A gateway can also perform functions that an agent cannot, such as message routing, transformations, and failover. OWSM also has an extension for OSB that allows us to use OWSM policies at the OSB (ESB) layer. These capabilities would meet the requirements of most intranet SOA infrastructures.

Oracle positions the Oracle Enterprise Gateway as the first line of defence ("perimeter security") when SOA services are exposed to the outside world. This is the equivalent of a DMZ firewall. So it looks like Oracle Enterprise Gateway is a more expansive product that can do everything that OWSM does, plus all the bells and whistles.

Tuesday, November 01, 2011

Ruminating on the Oracle MDM suite

Recently during one of our internal brainstorming sessions, there was a lot of confusion over the various components available on the Oracle platform to build a robust MDM solution. Part of the confusion was because Oracle has picked up best-of-breed components from various acquisitions and integrated them together to form the MDM suite. It's important to understand that when someone talks about Oracle MDM, it is a suite of components and NOT just one product.

Oracle's acquisition of Hyperion has further added to the confusion, as Hyperion has full capabilities to be used as an MDM solution. Oracle promotes Hyperion Data Relationship Manager as a component in its MDM suite that can be used for managing the relationships between different attributes of master data from disparate sources.

At the fundamental level, to build an end-to-end MDM solution, you need basic components such as an ETL tool, a Data Profiler, and a Data Cleansing Engine that can be used for standardization, de-duplication, validation, etc. Given below are the core components of the Oracle MDM suite, followed by optional components that help in jump-starting your MDM journey.

Oracle Data Integrator Enterprise Edition: ODI can be used for ELT-style bulk data movement, near real-time updates, and data services. ODI can consolidate master data from various sources and also publish master data to downstream applications. (Note: Oracle has acquired the GoldenGate product that enables real-time integration of data across disparate data sources. GG can also be used in conjunction with ODI for movement of data.)
Oracle Data Quality / Data Profiling: Oracle Data Profiling allows users to profile the master data and investigate the content and structure of its different data sources. It also gives users the ability to monitor the evolution of data quality over time using Time Series. Oracle Data Quality allows us to standardize, validate, cleanse and enrich master data – e.g. master list of securities, issuers, official list, etc. Using both these tools helps ensure the integrity of data stored in the MDM data store.
Oracle Business Intelligence Suite: OBIEE can be used for analytics and reporting on master data entities.

Besides these core components, Oracle MDM suite also contains pre-packaged MDM solutions such as “Customer Hub”, “Product Hub”, “Site Hub”, “Supplier Hub”, etc. (Some of these are part of Siebel MDM, I believe.)

Thursday, October 27, 2011

Ruminating on IRM (Information Rights Management)

From a security architecture perspective, it is important to consider the need for using IRM technology. Traditionally we have secured access to documents using RBAC patterns for secure access and download.

But how do you control the information once it is downloaded to the user's machine? Can the user copy/paste from the document? Can the user print the document? Can the user forward the document to someone or upload it somewhere? Can the user run macros on the document? So how can an enterprise have total control over sensitive information?

These questions cannot be answered by classical access control mechanisms; they need a different security concept called "Information Rights Management". Many traditional ECM vendors also offer IRM adapters or add-ons to help customers have total centralized control over their digital assets. E.g. SharePoint 2010 has IRM protectors that can be plugged in for end-to-end protection of documents on the users' computers. Oracle UCM can be extended with Oracle IRM, etc.

Across all these IRM product architectures, it is necessary to have some form of client application installed on all users' machines. Files that get downloaded from the DMS are special encrypted, rights-managed files. The file format contains meta-data that defines the access that can be given to the user. The client application decrypts the file, reads the access constraints and grants rights to the user accordingly. On the Windows platform, MS has long offered Windows Rights Management Services - a comprehensive API to address IRM challenges.

Thursday, August 11, 2011

How does .NET TPL control the number of threads

I often wondered what heuristics the Task Parallel Library (TPL) in .NET uses to control the number of threads for optimal utilization on multi-core machines.

Found a great discussion thread on Stack Overflow explaining the details.

Thursday, July 21, 2011

Techniques for Service Identification in SOA

It's very important to use proper service identification techniques to identify services in a portfolio. In fact, service identification should be the first step in your service lifecycle management process.

Jotting down some of the techniques that we have been using for identifying services:

Domain decomposition approach - Look at the high level business entities and create entity services for them.
Top down BPM driven approach - Start from the business processes and divide them into sub-processes. Each business process consists of tasks & activities, that would orchestrate between different service components.
Business Goal driven approach - Derive services from business goals. Decompose the business goals into a set of services that would help satisfy the business goal. Provide traceability between business goals and IT services by a Goal/Business Service matrix.
Existing systems - Service wrappers are created on existing systems - to surface them for orchestration in a business process or a composite service. This technique may not be appropriate if the existing IT landscape is not aligned with business goals.
UI driven approach - Identify/discover services based on the user interface requirements. UI technologies such as Flash, Silverlight, Ext-JS directly call JSON/REST services on back-end application servers.

Tuesday, June 28, 2011

Ruminating on SEO

SEO (Search Engine Optimization) is an integral part of any Internet Marketing Campaign. SEO strives to increase the visibility of a website in search results. For SEO, we have to consider both on-page factors and off-page factors. Given below are some examples of what can be done on the site pages and what needs to be done outside the site pages.

On-Page factors examples:

Meta tags
Headings
Links
Keyword frequency (Internal Keyword linking strategy)
Site Structure (Create and submit site maps)
UI design that is "Search Engine Friendly" (Image Alt tags, Menus, etc.)
Make tagging and bookmarking easy.
Robots.txt
URL Normalization (if different URLs lead to the same content)

Off-Page factors examples:

PageRank analysis (Increase no. of inbound links)
Utilize Social Networks, Forums to form a 'link partnership'. (Social Media Optimization - Social SEO)
Create and submit articles, blogs, RSS feeds that link to the site.
Create Sharable content - Mashup ready.

Monday, June 27, 2011

XSL transformations on the browser

Recently, I came across a web framework that was quite unconventional - the framework was performing XML transformations using XSLT on the browser. All web requests were directed to a legacy system that returned XML, and this XML was sent directly to the browser. The browser had already loaded the necessary XSL and JavaScript, and did the XML -> HTML transformation.

There are certain pros and cons of this approach.
Advantages:

Clear separation of markup/layout from content.
Heavy XSL processing is offloaded to the browser; browser JS code can check for browser compatibility and emit the right markup.

Disadvantages:

Not all browsers support XSL in a standard way. It can be quite a pain to make the look-n-feel compatible with all browsers.
Search engines/bots see raw XML, and may not be able to interpret and understand it.
Disables progressive rendering. The user won't see anything at all until the entire stylesheet and data are loaded completely.
XSL is pretty tough to master. Resource skill could become a potential bottleneck.
Could run into scalability issues for large XML payloads. When processing XML files, XSLT must load the entire document into memory.

IMHO, although XSLT processing in-browser is fast compared to server side, it is still better if no transformation is required at all. There are tons of server side web-frameworks that can do the job better and faster.
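
For comparison, the same XML -> HTML step can be done on the server with the standard JAXP API (javax.xml.transform), so the browser only ever receives HTML. This is just a minimal sketch under my own assumptions: the class name ServerSideXslt, the sample XML and the stylesheet are invented for illustration and have nothing to do with the framework described above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ServerSideXslt {

    // Applies the given stylesheet to the given XML and returns the result as a string.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped as unchecked to keep the sketch short.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the legacy system's XML response.
        String xml = "<records><record>a</record><record>b</record></records>";
        String xsl =
              "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"html\" indent=\"no\"/>"
            + "<xsl:template match=\"/records\">"
            + "<ul><xsl:for-each select=\"record\"><li><xsl:value-of select=\".\"/></li></xsl:for-each></ul>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform(xml, xsl));
    }
}
```

In a real application you would probably compile the stylesheet once (TransformerFactory.newTemplates) and reuse it across requests rather than parsing it on every call.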

Wednesday, June 15, 2011

Open source API for read/write to Excel files

A long time back, I had blogged about native APIs in .NET and Java to read/write Excel files.
Recently came across a new native .NET library that can be used to read/write Excel files and supports the binary BIFF format. This is the format used by Excel 97-2003 files - i.e. xls files. This API is still evolving and looks quite basic at this point in time.
The code is available here: http://code.google.com/p/excellibrary/
There is also a .NET port of the popular Apache POI library available at: http://code.google.com/p/npoi/

The newer Office (2007, 2010) documents are based on XML standards (i.e. xlsx files). There are a couple of open source projects for creating and modifying 2007 'xlsx' Excel files.
1) http://epplus.codeplex.com/
2) http://excelpackage.codeplex.com/

Wednesday, April 13, 2011

Operational Reports vs MIS Reports

Once organizations create a data warehouse, a lot of people push all the reporting needs to the DW. But do all reports need to run from a DW?
The answer lies in understanding the difference between operational and informational (MIS) reports. A good article by Bill Inmon on this difference can be found here.

Operational reports are typically detail-oriented and show the latest up-to-date records. Operational reports are used by stakeholders for short-term tactical decision making. MIS reports look at summary data over a longer time horizon and are used for strategic decision making.

Examples of operational reporting include bank teller end-of-day window balancing reports, daily account audits and adjustments, daily production records, flight-by-flight traveler logs and transaction logs.

Examples of informational reporting include monthly sales trends, annual revenue, regional sales by product line for the quarter, industry production figures for the year, number of employees by quarter and weekly shipping costs by carrier.

Wednesday, March 23, 2011

Activity Diagrams vs BPMN Diagrams

For modeling business processes, there are two standards popular today – UML 2.0 Activity Diagrams and BPMN.

There are semantic differences in notation between the two standards. For e.g. the way OR-splits, AND-splits/joins are shown.

A detailed whitepaper showcasing the differences between the two notations can be found here.
A discussion thread at http://www.bpm-research.com/forum/index.php?showtopic=501 makes an interesting read.

Saturday, January 29, 2011

Business Function Models Vs Business Capability Models

The difference between these models boils down to the difference between a “business function” and a “business capability”. Many organizations use them interchangeably. E.g. a business capability model may illustrate current-state business functions and also future-state business functions that need to be built to deliver on the business vision.
But there are a few folks who would like to draw a clear line of differentiation between the two. Capability can be defined as the ability to perform actions to achieve specific strategic goals/objectives.

The following links provide interesting reading:
Link 1
Link 2

Hence a business capability is much more than a business function – it encompasses other objects such as Actors, Services, Functions, Processes and Infrastructure. Examples of business capability are – the ability to service customers through online channels, capability to survive liquidity crisis, etc.

Business Architecture Models

While defining the enterprise architecture of an organization, it is essential to understand the various processes, functions and capabilities of the business. Recently came across the FEA Business Reference Model on Wikipedia. The simplicity of the diagram impressed me.

Every organization has different business areas or LOBs. Each business area has a set of business functions that it performs. A ‘Business Function Model’ describes these functional areas and sub-functions in a graphical representation. An example of a BFM can be found here.

Each LOB executes its business functions by following certain business processes. A business process is an orchestration of different business activities and tasks that may be exposed as SOA services.

Tuesday, January 18, 2011

Various dimensions of Security

When we design our applications to be secure, we have to consider all aspects of security. I have often seen people associate security with just authentication and authorization, but there are other security principles to be considered as stated below.

Integrity: We have to ensure that messages/data have not been tampered with. Integrity of messages ensures that the data has not been maliciously modified by a 'man-in-the-middle'.
Confidentiality: This security principle ensures that all messages are encrypted and cannot be eavesdropped on.
Authentication/Authorization: Ensure that every resource access goes through a proper authentication process, and that the authenticated party is actually permitted to perform the requested operation.
Non-Repudiation: This ensures that any party involved cannot refute the validity of a message exchange.

Modern toolkits and technologies such as digital certificates help address all of the above security principles.
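
To make a couple of these principles concrete, here is a small sketch using only the standard Java crypto APIs. The class and method names (MessageSecurityDemo, hmac, sign, verify) are my own and purely illustrative. An HMAC with a shared secret covers integrity; a digital signature made with a private key covers integrity plus non-repudiation, since only the key holder could have produced it. Confidentiality (encryption, e.g. TLS) is left out of the sketch.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class MessageSecurityDemo {

    // Integrity: both parties share 'key'; any change to the message changes the tag.
    static byte[] hmac(byte[] key, byte[] message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static KeyPair newRsaKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Non-repudiation: only the holder of the private key can produce this signature.
    static byte[] sign(PrivateKey privateKey, byte[] message) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(message);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Anyone with the public key can check the signature against the message.
    static boolean verify(PublicKey publicKey, byte[] message, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(publicKey);
            verifier.update(message);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "demo-shared-secret".getBytes(StandardCharsets.UTF_8);
        byte[] original = "transfer 100 to account 42".getBytes(StandardCharsets.UTF_8);
        byte[] tampered = "transfer 900 to account 42".getBytes(StandardCharsets.UTF_8);

        // MessageDigest.isEqual is used for the tag comparison.
        System.out.println(MessageDigest.isEqual(hmac(key, original), hmac(key, tampered)));

        KeyPair keys = newRsaKeyPair();
        byte[] signature = sign(keys.getPrivate(), original);
        System.out.println(verify(keys.getPublic(), original, signature));
        System.out.println(verify(keys.getPublic(), tampered, signature));
    }
}
```

A digital certificate then ties the public key used in verify to a named identity, which is where authentication comes in.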

Monday, January 17, 2011

Concurrent Business Engineering

Some time back, I had blogged about the advantages of having a unified BPM/SOA strategy at the enterprise level. Ran through a Forrester report that touches on a similar concept and calls it "Concurrent Business Engineering".

Concurrent Business Engineering entails greater collaboration between Business and IT, jointly working on defining new business processes and also defining the technology platform for supporting the processes.
Business services are best designed with a strong understanding of the business process context, hence a top down BPM process-centric view would help in understanding what services need to be surfaced to provide maximum agility to the business process.

This is similar to the idea of having a unified BPM/SOA strategy and utilizes the best of top-down and bottom-up methods for executing the business strategy - as blogged earlier.

Friday, January 14, 2011

Types of Services in SOA

Found a nice article on MSDN describing the various types of services - a taxonomy for services.
Jotting down the concepts explained in the article.

Entity Services - Expose/surface business entities in the system, e.g. employee, customer, sales order, etc. They contain CRUD operations and additional domain-specific operations, e.g. FindOrderByPrice. Entity Services abstract the underlying data sources and persistence mechanisms.

Capability Services - Implement a specific business capability - for e.g. Pricing Service, Credit Card Processing Service, etc. They may use Entity Services for persistence.

Thus Entity Services are "data-centric" and Capability Services are "action-centric".

Process Services - Act as a facade for a BPM process. Process Services maintain state, since a workflow by its very nature has to maintain state over a long-running process. Process Services are typically implemented using BPM tools such as WWF, BizTalk, WPS, etc.

Infrastructure Services (Utility Services) - Common cross cutting functions such as Logging, Auditing, Security, Authorization, etc.
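
The "data-centric" vs "action-centric" distinction is easy to see in code. Below is a rough Java sketch; all the names (Order, OrderEntityService, PricingService and the in-memory implementations) are hypothetical and mine, not taken from the MSDN article. The entity service offers CRUD plus a domain-specific finder, and the capability service performs an action while relying on the entity service for data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ServiceTaxonomyDemo {

    // Hypothetical business entity used only for this illustration.
    static final class Order {
        final String id;
        final double price;
        Order(String id, double price) { this.id = id; this.price = price; }
    }

    // Entity service: data-centric, CRUD plus domain-specific queries.
    interface OrderEntityService {
        void create(Order order);
        Order read(String id);              // returns null if the order does not exist
        void update(Order order);
        void delete(String id);
        List<Order> findOrdersByPrice(double min, double max);
    }

    // Capability service: action-centric, implements one business capability.
    interface PricingService {
        double totalWithTax(String orderId, double taxRate);
    }

    // The map stands in for whatever data source the entity service abstracts.
    static final class InMemoryOrderService implements OrderEntityService {
        private final Map<String, Order> store = new LinkedHashMap<String, Order>();
        public void create(Order order) { store.put(order.id, order); }
        public Order read(String id) { return store.get(id); }
        public void update(Order order) { store.put(order.id, order); }
        public void delete(String id) { store.remove(id); }
        public List<Order> findOrdersByPrice(double min, double max) {
            List<Order> result = new ArrayList<Order>();
            for (Order o : store.values()) {
                if (o.price >= min && o.price <= max) {
                    result.add(o);
                }
            }
            return result;
        }
    }

    // The capability service uses the entity service for persistence.
    static final class SimplePricingService implements PricingService {
        private final OrderEntityService orders;
        SimplePricingService(OrderEntityService orders) { this.orders = orders; }
        public double totalWithTax(String orderId, double taxRate) {
            Order order = orders.read(orderId);
            if (order == null) {
                throw new IllegalArgumentException("Unknown order: " + orderId);
            }
            return order.price * (1.0 + taxRate);
        }
    }

    public static void main(String[] args) {
        OrderEntityService orders = new InMemoryOrderService();
        orders.create(new Order("A-1", 100.0));
        orders.create(new Order("A-2", 250.0));
        PricingService pricing = new SimplePricingService(orders);
        System.out.println(orders.findOrdersByPrice(0, 150).size());
        System.out.println(pricing.totalWithTax("A-1", 0.5));
    }
}
```

Process and infrastructure services are left out here; a process service would sit on top and orchestrate calls to services like these two.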

Thursday, January 06, 2011

Ruminating on SOA Governance

SOA Governance has two dimensions. First – the processes and methodologies used. Second – the tools and products used for governance.

Quite often, people assume that the purchase of a SOA Governance tool would suffice for implementing SOA Governance. But the fact is that the tools would only help in automating certain enforcement policies and service lifecycle workflows. What is first needed is a framework of processes, policies and organization structure to be defined. Any governance process needs to embrace the trilogy of “people, process and technology”.

The following diagram illustrates this point and states the various activities that need to be done for implementing SOA Governance.

The Open Group has also published a SOA Governance Framework that can be accessed here.