Monday, February 20, 2012

Business Intelligence vs Analytics

My colleague Sandeep Raut has a very simple blog post explaining the differences between traditional BI and Analytics. A few key points from the post are summarized below.

"BI traditionally is concerned with creating reports on past data or even current live data. We create OLAP cubes using which we can slice & dice the data, even do a drill down. Analytics is about analyzing the data using mathematics/statistics to identify patterns. These patterns can then be used to predict what may happen in the future. Analytics is about identifying relationships between key data variables that were unknown before. It is about surfacing unknown patterns."

But in my humble opinion, should Analytics not be a subset of BI? I can understand the hype that product vendors create to differentiate their products in the market, but can Analytics exist in isolation from BI? Even predictive data analysis using "real-time" data/text mining techniques would logically fall under BI.
After all, BI is all about meeting business needs through actionable information!
Maybe it is just a game of words and semantics. I remember a few years back, the term DSS (Decision Support Systems) was more widely used than BI :)

Wednesday, February 15, 2012

Using Parallelism in .NET WinForm applications

We have all gone through the travails of multi-threaded programming in WinForm applications. The challenge in WinForm applications is that the UI controls are bound to the thread that created/rendered them; i.e. a UI control can only be updated by the main thread or the GUI thread that created it.

But to keep the UI responsive, we cannot execute any long running task (>0.5 sec) on the UI thread, else the GUI would hang or freeze. If we run the business logic asynchronously on another thread, then how do we pass the results back to the main GUI thread to update the UI?

Traditionally this has been done using the Control.Invoke() methods. More details on this approach are available at this link: http://msdn.microsoft.com/en-gb/magazine/cc300429.aspx

But with the introduction of TPL (Task Parallel Library), there is an alternative way of doing this. We can run the heavy lifting work on a background task and then use the TaskScheduler and SynchronizationContext classes to pass the results to the main GUI thread.

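For e.g.
// Capture the scheduler while running on the GUI thread
TaskScheduler uiScheduler = 
           TaskScheduler.FromCurrentSynchronizationContext();
// A task started on uiScheduler runs on the GUI thread, so it can safely touch controls
new Task(() => { /* your UI update code here */ }).Start(uiScheduler);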

Given below are 2 excellent articles elaborating on this in detail:
http://www.codeproject.com/Articles/152765/Task-Parallel-Library-1-of-n

http://reedcopsey.com/2010/03/18/parallelism-in-net-part-15-making-tasks-run-the-taskscheduler/

Sacha Barber has an excellent six-part article series on the intricacies of TPL, which I loved reading.

Parallelism in .NET

In one of my previous blogs, I had pointed to an interesting article that shows how TPL controls the number of threads in the Thread Pool using hill-climbing heuristics.

In order to understand why TPL (Task Parallel Library) is far superior to simple multi-threading, we need to understand the concepts of global queue, local queue on each thread, work-stealing algorithms, etc.
Given below are some interesting links that explain these concepts with good illustrations.

http://www.danielmoth.com/Blog/New-And-Improved-CLR-4-Thread-Pool-Engine.aspx

http://blogs.msdn.com/b/jennifer/archive/2009/06/26/work-stealing-in-net-4-0.aspx

http://udooz.net/blog/2009/08/net-4-0-work-stealing-queue-plinq/

A few important points to remember:
  • There is one global queue for the default Thread Pool in .NET 4.0
  • There is also a local queue for each Thread. The Task Scheduler distributes the tasks from the global queue to the local queues on each Thread. Even sub-tasks created by each Thread get queued on the local queue. This improves the performance, as there is no contention to pick up work items (tasks) from the global queue; especially in a multi-core scenario.
  • If a thread is free and there are no tasks in its local queue and also global queue, then it will steal work from other threads. This ensures that all cores are optimally utilized. This concept is called 'work stealing'.
  • Tasks from the global queue are picked up in 'FIFO' order. Tasks from the local queue are picked up in 'LIFO' order based on the assumption that the last-in is still hot in the cache. Work stealing again happens in 'FIFO' order.
There is a wonderful book on parallel computing available on MSDN that is a must read for everyone.
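
The queue ordering described in the points above can be sketched in a few lines of Java. This is only an illustration of the LIFO/FIFO idea, not the actual CLR implementation (which, as far as I know, uses a lock-free structure rather than the simple synchronized deque assumed here). The names WorkStealingSketch, LocalQueue, popLocal and steal are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class WorkStealingSketch {
    // One worker thread's local queue: the owner works at the tail, thieves take from the head.
    static final class LocalQueue {
        private final Deque<String> tasks = new ArrayDeque<>();

        // The owning thread queues new (sub-)tasks at the tail.
        synchronized void push(String task) { tasks.addLast(task); }

        // The owner takes the most recently queued task (LIFO), which is likely still hot in the cache.
        synchronized String popLocal() { return tasks.pollLast(); }

        // An idle thread steals the oldest task (FIFO) from the opposite end.
        synchronized String steal() { return tasks.pollFirst(); }
    }

    public static void main(String[] args) {
        LocalQueue queue = new LocalQueue();
        queue.push("A");
        queue.push("B");
        queue.push("C");
        System.out.println("owner runs " + queue.popLocal());   // C (last in)
        System.out.println("thief steals " + queue.steal());    // A (first in)
    }
}
```

Because the owner and the thief work at opposite ends of the queue, they rarely compete for the same task, which is the point of the design.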

Monday, February 13, 2012

Data Services in the Microsoft world

In my previous blog, I ranted on the concept of Data Services in creating a data virtualization layer. In the .NET world, data services equate to WCF Data Services (formerly known as ADO.NET Data Services).

Microsoft is promoting the use of an open standard called OData for building REST style data services. A good article describing OData is available on MSDN. OData essentially leverages JSON/ATOM and HTTP semantics to build a simple data services layer across disparate data sources.
But it looks like besides Microsoft, there are no big vendors jumping on the OData bandwagon. It's interesting to note that WebSphere eXtreme Scale servers also expose an OData service.

Ruminating on Data Virtualization

The industry is flooded with confusing terms when it comes to understanding 'Data Virtualization'. We have IaaS (Information as a service), Data Services, EII (Enterprise Information Integration), Data Federation, and so on! The point is that there are no industry standard definitions for these analyst-coined terms and there is a lot of overlap between them.

Rick Lans tries to clear the cloud with some simple definitions here. Another interesting post by Barry Devlin throws more light on the concept of data virtualization.

The core concept behind data virtualization is to create an abstraction layer (Data Access Layer) that hides the complexities of the underlying disparate data sources and provides a unified view of the enterprise data to the applications. This can be implemented using "SOA style" Data Services or creating a virtual data layer that can be queried using SQL-like semantics. More info can be found at these links: Link1 & Link2

RedHat has a nice whitepaper explaining the concept of Data Services in a SOA environment. This post explains the benefits of data virtualization. Composite Software is a leader in data virtualization techniques and has shared a couple of interesting case studies that demonstrate the use of their data virtualization platform.

One thought that came to my mind was regarding the challenges in accessing NoSQL data from the data virtualization layer. While some types of NoSQL datastores, such as XML documents and Key/Value pairs, can be exposed as a relational SQL view, it may not be possible to have a uniform query interface for unstructured data. Most NoSQL data stores expose some kind of programmatic API (for e.g. a Java API) that can be used for querying. Would it be possible to create a common set of meta-data for both structured and unstructured data?
In such scenarios, IMHO, the only strategy for data virtualization is to use Data Services.

Thursday, February 09, 2012

Google Protocol Buffers

Just found a good post by the Google Engineering team ranting about the historical context of Google Protocol Buffers.
My first reaction to GPB was - "Why on earth another binary serialization format?"
I think the reason behind the popularity of GPB has been its simplicity and ease of use. 

This site has an interesting discussion on comparing GPB to XML/JSON.  A few snippets from the site comments/discussions -

  • A major difference between protocol buffers and JSON is that protocol buffers use a binary format, while JSON is plain text.  Because it's binary, the format is more compact and easier to interpret by a computer - which makes protocol buffers faster than JSON.
  • Another reason GPB is so fast is that it uses positional binding. Although JSON is less bloated than XML (which is heavily bloated), it still sends the name of the attribute with each record, which creates an enormous amount of overhead. PB, on the other hand, uses positional binding and doesn't send the attribute names at all.
  • Although binary protocols have to deal with portability issues such as byte order (little/big-endian), there are advantages when it comes to parsing dates, timestamps, etc.
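
To make the size argument in the snippets above concrete, here is a small Java sketch that hand-encodes a record in the style of the protocol buffers wire format and compares it with the equivalent JSON text. It is a simplified illustration, not the official library: the class name TaggedBinarySketch and the record {id, name} are made up for this example. As far as I understand the format, the "positional binding" mentioned above corresponds to small numeric field tags that are sent instead of attribute names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class TaggedBinarySketch {
    // Writes an unsigned integer as a base-128 varint: 7 bits per byte, low-order group first.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // continuation bit set: more bytes follow
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Field key = (field number << 3) | wire type; wire type 0 means varint.
    static void writeIntField(ByteArrayOutputStream out, int fieldNumber, long value) {
        writeVarint(out, ((long) fieldNumber << 3) | 0);
        writeVarint(out, value);
    }

    // Wire type 2 means length-delimited: key, byte length, then the raw UTF-8 bytes.
    static void writeStringField(ByteArrayOutputStream out, int fieldNumber, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, ((long) fieldNumber << 3) | 2);
        writeVarint(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Encodes a hypothetical record {id, name} as field 1 and field 2; no attribute names are sent.
    static byte[] encode(long id, String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeIntField(out, 1, id);
        writeStringField(out, 2, name);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String json = "{\"id\":150,\"name\":\"Bob\"}";
        byte[] binary = encode(150, "Bob");
        System.out.println("JSON bytes:   " + json.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("binary bytes: " + binary.length);
    }
}
```

For this tiny record the binary form is 8 bytes against 23 bytes of JSON, and the gap widens with longer attribute names, since the names are never put on the wire.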

Alternatives to XML Serialization

Today, there are a lot of alternatives to XML serialization of data structures. These data interchange formats are smaller and faster to process than XML.
The most popular are Google Protocol Buffers, Thrift (from Facebook), Avro and MessagePack. A good article comparing these alternatives is available here -
http://www.igvita.com/2011/08/01/protocol-buffers-avro-thrift-messagepack/

Wikipedia also has an interesting article comparing various data serialization formats.

Tuesday, January 03, 2012

Techniques for handling very large strings in Java

In my previous blog, I had jotted down the perils of storing large strings in memory. So what are the alternatives? Listing down a few off the top of my head right now.
  1. Stream the string to a file and read chunk-wise from the file when required.
  2. Store an array of strings, instead of storing a large string. A large contiguous block of memory may not be available, but there could be small holes in the fragmented heap.
  3. Compress the string using GZIP compression. Use the GZIPOutputStream class (wrapped around a ByteArrayOutputStream) to keep appending strings to a byte buffer.
  4. If the large XML string is to be sent back as a webservice response, utilize the streaming support in SOAP stacks such as Axis 2 and CXF. Evaluate the use of MTOM for large attachments.
  5. If you are operating on a large number of files, first deal with the 'large' files. To understand why, please peruse these links - Link 1 & Link 2
In one of the scenarios, the large XML string had to be fed to the JasperReports engine. Found a few interesting options to deal with this challenge here.
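
The compression option in the list above can be sketched with the JDK's GZIPOutputStream. This is a minimal sketch, assuming the data arrives as a series of string chunks; the class name GzipStringBuffer and its two methods are made up for this example, and readAllBytes() needs Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipStringBuffer {
    // Compresses the chunks one at a time, so the full uncompressed string never has to exist in memory.
    static byte[] compress(Iterable<String> chunks) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            for (String chunk : chunks) {
                gzip.write(chunk.getBytes(StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The try-with-resources block has closed the stream, which writes the GZIP trailer.
        return bytes.toByteArray();
    }

    // Inflates the whole buffer back into a string. Used here only to verify the round trip;
    // for a really large payload you would read from the GZIPInputStream chunk-wise instead.
    static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Repetitive XML compresses very well, so the compressed byte buffer is usually a small fraction of the original string size.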

Heap Memory in .NET

Apropos my previous post, my team was trying to resolve another memory leak problem in one of the .NET applications. It is interesting to note that a .NET program does not have any explicit way to specify the heap size. The .NET heap size will keep on growing till it consumes all of the available memory.
A hosting environment such as IIS can control the amount of memory allocated to an Application Domain.
The following discussion threads throw more light on this: Link1  Link2

Also found this amazing article by Andrew Hunter (ANTS profiler contributor) explaining the Large Object Heap concept in .NET. Understanding these concepts will make us appreciate how we can get an unexpected OutOfMemory error even if our total object size is relatively small.

Friday, December 30, 2011

OutOfMemoryError while using StringBuilder/StringBuffer

I was helping a friend debug an OutOfMemoryError in a Java web application. The program made heavy use of StringBuilder and was appending a large number of strings. An entire record set (containing thousands of records) was essentially converted into an XML string.
Strangely, when the OOM error occurred, there was still plenty of heap memory available. Further deep-dive debugging and some googling around taught a few important lessons.
  1. Whenever the internal buffer capacity of StringBuilder/StringBuffer is exceeded, it allocates a new character array roughly twice the size of the original and copies the contents across. A good blog explaining this is here. Hence it is better to set the initial capacity of the StringBuilder to a reasonable value beforehand.
  2. StringBuilder needs a contiguous block of memory for further allocation. For e.g. you may have 20MB free heap space, but it may be fragmented. Hence even a 5MB StringBuilder allocation may fail and result in an OOM error. Links to forums - Link 1  Link2 Link3
  3. Try to use a 64-bit JVM, as the practical limits on heap size are far higher and it is much easier to find a contiguous block of memory with 64-bit addressing.
  4. Alter the design of the program to not store the string in memory, but in a file. Alternatively stream it directly to the HTTP response. 
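
The last point in the list above can be sketched as follows: instead of appending every record to one huge StringBuilder, each record is written straight to a Writer. This is a minimal sketch under simple assumptions; the class name StreamingXmlSketch, the record layout and the escape helper are made up for this example.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

class StreamingXmlSketch {
    // Writes each record straight to the Writer; only one record is held as a string at a time.
    static void writeRecords(Iterable<String> names, Writer out) {
        try {
            out.write("<records>");
            for (String name : names) {
                out.write("<record>" + escape(name) + "</record>");
            }
            out.write("</records>");
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Minimal escaping of XML-significant characters in text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // A StringWriter is used only to keep the demo self-contained. In a web application
        // the Writer would typically be the HTTP response writer or a FileWriter.
        StringWriter demo = new StringWriter();
        writeRecords(List.of("Alice", "Bob & Co"), demo);
        System.out.println(demo);
    }
}
```

Since the method only depends on the Writer abstraction, the same code can target a file or the HTTP response without ever needing one large contiguous buffer.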

Friday, December 02, 2011

Taxonomy of Services

In one of my previous posts, I had blogged about creating a taxonomy of services using functional categorization.
For e.g. Entity Services, Task/Activity Services, Process Services and Infrastructure services.

But services can also be categorized from different perspectives such as layers or intent of use. For e.g.
Categorization based on Service Layer:

1. Business Services: Represent high level business functions that define an enterprise.
2. Application Services: Application specific and usually will be aggregated in a composite service at the business level.
3. Infrastructure Services: Utility functions that deliver cross cutting functions.

Categorization based on scope:
1. Enterprise Services: Multiple LOBs use the service.
2. Domain Services: Applicable only within a LOB.
3. Application Services: Local to the App Level.

Thursday, November 17, 2011

Analysis vs Design

This age-old debate keeps cropping up every now and then :)
Found a couple of good articles reflecting on the difference between the two.

http://butunclebob.com/ArticleS.UncleBob.AnalysisVsDesign

http://devhawk.net/2004/03/30/analysis-vs-design-modeling/

Thursday, November 03, 2011

Oracle Web Service Manager vs Oracle Enterprise Gateway

Oracle Web Service Manager is an integral part of the Oracle SOA suite and it allows us to implement security declaratively; without any coding from the developer. Security policies can be enforced at run-time using WSM agents or WSM gateways.
There is another product called "Oracle Enterprise Gateway" that has features that overlap with OWSM - hence this results in a lot of confusion.

So, let's understand the concepts one by one. A WSM agent is a component that is installed with the endpoint service, so it provides the 'last-mile' security (last security layer). OWSM also has a gateway component where security policies can be enforced in a central location. A gateway can also perform functions that an agent cannot do, such as message routing, transformations, and failover. OWSM also has an extension for OSB that allows us to use OWSM policies at the OSB (ESB) layer. These capabilities would meet the requirements of most intranet SOA infrastructures.

Oracle positions the Oracle Enterprise Gateway as the first line of defence ("perimeter security") when SOA services are exposed to the outside world. This is the equivalent of a DMZ firewall. So it looks like Oracle Enterprise Gateway is a more expansive product that can do everything that OWSM does, plus all the bells and whistles.

Tuesday, November 01, 2011

Ruminating on the Oracle MDM suite

Recently during one of our internal brainstorming sessions, there was a lot of confusion over the various components available on the Oracle platform to build a robust MDM solution. Part of the confusion was because Oracle has picked up best of breed components from various acquisitions and integrated them together to form the MDM suite. It's important to understand that when someone talks about Oracle MDM - it is a suite of components and NOT just one product.

Oracle's acquisition of Hyperion has further added to the confusion, as Hyperion has full capabilities to be used as an MDM solution. Oracle promotes Hyperion Data Relationship Manager as a component in its MDM suite - that can be used for managing the relationships between different attributes of master data from disparate sources.

At the fundamental level, to build an end-to-end MDM solution, you need basic components such as an ETL tool, a Data Profiler and a Data Cleansing Engine that can be used for standardization, de-duplication, validation, etc. Given below are the core components of the Oracle MDM suite, followed by optional components that help in jump-starting your MDM journey.

  • Oracle Data Integrator Enterprise Edition: ODI can be used for ELT style bulk data movement, or near real-time updates, and data services. ODI can consolidate master data from various sources and also publish master data to downstream applications. (Note: Oracle has acquired the GoldenGate product that enables real-time integration of data across disparate data sources. GG can also be used in conjunction with ODI for movement of data.)
  • Oracle Data Quality / Data Profiling: Oracle Data Profiling allows us to profile the master data and investigate the content and the structure of the different data sources. It also gives users the ability to monitor the evolution of data quality over time using Time Series. Oracle Data Quality allows us to standardize, validate, cleanse and enrich master data, for e.g. master list of securities, issuers, official list, etc. Using both these tools will help ensure the integrity of data stored in the MDM data store.
  • Oracle Business Intelligence Suite: OBIEE can be used for analytics and reporting on master data entities.
Besides these core components, Oracle MDM suite also contains pre-packaged MDM solutions such as “Customer Hub”, “Product Hub”, “Site Hub”, “Supplier Hub”, etc. (Some of these are part of Siebel MDM, I believe.)  

Thursday, October 27, 2011

Ruminating on IRM (Information Rights Management)

From a security architecture perspective, it is important to consider the need for using IRM technology. Traditionally we have secured access to documents using RBAC patterns for secure access and download.

But how do you control the information once it is downloaded to the user's machine? Can the user copy/paste from the document? Can the user print the document? Can the user forward the document to someone or upload it somewhere? Can the user run macros on the document? So how can an enterprise have total control over sensitive information?

These questions cannot be answered by classical access control mechanisms; they need a new security framework concept called "Information Rights Management". Many traditional ECM vendors also offer IRM adapters or add-ons to help customers have total centralized control over their digital assets. For e.g. SharePoint 2010 has IRM protectors that can be plugged in for end-to-end protection of documents on the users' computers. Oracle UCM can be extended with Oracle IRM, etc.

Across all these IRM product architectures, it is necessary to have some form of client application installed on all users' machines. Files that get downloaded from the DMS are special encrypted rights-managed files. The file format contains meta-data that defines the access that can be given to the user. The client application would decrypt the file, understand the access constraints and accordingly give rights to the user. On the Windows platform, Microsoft has long offered Windows Rights Management Services - a comprehensive API to address IRM challenges on the Windows platform.

Thursday, August 11, 2011

How does .NET TPL control the number of threads

I often wondered what heuristics the Task Parallel Library (TPL) in .NET uses to control the number of threads for optimal utilization on multi-core machines.
Found a great discussion thread on Stack Overflow explaining the details.

Thursday, July 21, 2011

Techniques for Service Identification in SOA

It's very important to use proper service identification techniques to identify services in a portfolio. In fact, service identification should be the first step in your service lifecycle management process.

Jotting down some of the techniques that we have been using for identifying services:
  1. Domain decomposition approach - Look at the high level business entities and create entity services for them.
  2. Top down BPM driven approach - Start from the business processes and divide them into sub-processes. Each business process consists of tasks & activities, that would orchestrate between different service components.
  3. Business Goal driven approach -  Derive services from business goals. Decompose the business goals into a set of services that would help satisfy the business goal. Provide traceability between business goals and IT services by a Goal/Business Service matrix.
  4. Existing systems -  Service wrappers are created on existing systems - to surface them for orchestration in a business process or a composite service. This technique may not be appropriate if the existing IT landscape is not aligned with business goals.
  5. UI driven approach - Identify/discover services based on the user interface requirements. UI technologies such as Flash, Silverlight, Ext-JS directly call JSON/REST services on back-end application servers.

Tuesday, June 28, 2011

Ruminating on SEO

SEO (Search Engine Optimization) is an integral part of any Internet Marketing Campaign. SEO strives to increase the visibility of a website in search results. For SEO, we have to consider both on-page factors and off-page factors. Given below are some examples of what can be done on the site pages and what needs to be done outside the site pages.

On-Page factors examples:
  • Meta tags
  • Headings
  • Links
  • Keyword frequency (Internal Keyword linking strategy)
  • Site Structure (Create and submit site maps)
  • UI design that is "Search Engine Friendly" (Image Alt tags, Menus, etc.)
  • Make tagging and bookmarking easy.
  • Robots.txt
  • URL Normalization (if different URLs lead to the same content)
Off-Page factors examples:
  • PageRank analysis (Increase no. of inbound links)
  • Utilize Social Networks, Forums to form a 'link partnership'. (Social Media Optimization - Social SEO)
  • Create and submit articles, blogs, RSS feeds that link to the site.
  • Create Sharable content - Mashup ready.

Monday, June 27, 2011

XSL transformations on the browser

Recently, I came across a web framework that was quite unconventional - the framework was performing XML transformations using XSLT on the browser. All web requests were directed to a legacy system that returned XML and this XML was directly sent to the browser. The browser had already loaded the necessary XSL and JavaScript, and performed the XML -> HTML transformation.

There are certain pros and cons of this approach.
Advantages:
  • Clear separation of markup/layout from content.
  • Heavy XSL processing is offloaded to the browser; browser JS code can check for browser compatibility and emit the right markup.
Disadvantages:
  • Not all browsers support XSL in a standard way. It can be quite a pain to make the look-n-feel compatible with all browsers.
  • Search engines/bots see raw XML, may not be able to interpret and understand.
  • Disables progressive rendering. User won't see anything at all until entire stylesheet and data is loaded completely.
  • XSL is pretty tough to master. Resource skill could become a potential bottleneck.
  • Could run into scalability issues for large XML payloads. When processing XML files, XSLT must load the entire document into memory.
IMHO, although XSLT processing in-browser is fast compared to server side, it is still better if no transformation is required at all. There are tons of server side web-frameworks that can do the job better and faster.
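
For comparison, the same XML -> HTML step can be done on the server side. The sketch below uses the JDK's built-in javax.xml.transform API; the stylesheet, the <items> payload and the class name XsltSketch are made up for this example and are not taken from the framework discussed above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
    // Hypothetical stylesheet: renders each <item> of the XML payload as a list entry.
    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/items'>"
        + "<ul><xsl:for-each select='item'><li><xsl:value-of select='.'/></li></xsl:for-each></ul>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the resulting markup.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter markup = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(markup));
            return markup.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<items><item>one</item><item>two</item></items>"));
    }
}
```

With this approach the browser and search engine bots receive finished markup, which sidesteps the browser compatibility and crawling disadvantages listed above.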

Wednesday, June 15, 2011

Open source API for read/write to Excel files

Long back, I had blogged about native APIs in .NET and Java to read/write Excel files.
Recently came across a new native .NET library that can be used to read/write Excel files and supports the binary BIFF format. This is the format used by Excel 97-2003 files - i.e. xls files. This API is still evolving and looks quite basic at this point of time.
The code is available here:  http://code.google.com/p/excellibrary/
There is also a .NET port of the popular Apache POI library available at: http://code.google.com/p/npoi/

The newer Office (2007, 2010) documents are based on XML standards (i.e. xlsx files). There are a couple of open source projects for creating and modifying Excel 2007 'xlsx' files.
1) http://epplus.codeplex.com/
2) http://excelpackage.codeplex.com/

Wednesday, April 13, 2011

Operational Reports vs MIS Reports

Once organizations create a data warehouse, a lot of people push all the reporting needs to the DW. But do all reports need to run from a DW?
The answer lies in understanding the difference between operational and informational (MIS) reports. A good article by Bill Inmon on this difference can be found here.

Operational reports are typically detail oriented and show the latest up-to-date records. Operational reports are used by stakeholders for short term tactical decision making. MIS reports look at summary data over a longer time horizon and are used for strategic decision making.

Examples of operational reporting include bank teller end-of-day window balancing reports, daily account audits and adjustments, daily production records, flight-by-flight traveler logs and transaction logs.

Examples of informational reporting include monthly sales trends, annual revenue, regional sales by product line for the quarter, industry production figures for the year, number of employees by quarter and weekly shipping costs by carrier.

Wednesday, March 23, 2011

Activity Diagrams vs BPMN Diagrams

For modeling business processes, there are two standards popular today – UML 2.0 Activity Diagrams and BPMN.

There are semantic differences in notation between the two standards. For e.g. the way OR-splits, AND-splits/joins are shown.

A detailed whitepaper showcasing the differences between the two notations can be found here.
A discussion thread at http://www.bpm-research.com/forum/index.php?showtopic=501 makes an interesting read.

Saturday, January 29, 2011

Business Function Models Vs Business Capability Models

The difference between these models boils down to the difference between a “business function” and a “business capability”.  Many organizations use them interchangeably. For e.g. A business capability model may illustrate current-state business functions and also future-state business functions that need to be built to deliver on the business vision.
But there are a few folks who would like to draw a clear line of differentiation between the two.  Capability can be defined as the ability to perform actions to achieve specific strategic goals/objectives.

The following links provide interesting reading:
Link 1
Link 2

Hence a business capability is much more than a business function – it encompasses other objects such as Actors, Services, Functions, Processes and Infrastructure.  Examples of business capability are – the ability to service customers through online channels, capability to survive liquidity crisis, etc.

Business Architecture Models

While defining the enterprise architecture of an organization, it is essential to understand the various processes, functions and capabilities of the business. Recently came across the FEA Business Reference Model on Wikipedia. The simplicity of the diagram impressed me.

Every organization has different business areas or LOBs. Each business area has a set of business functions that it performs.  A ‘Business Function Model’ describes these functional areas and sub-functions in a graphical representation. An example of a BFM can be found here.
Each LOB executes its business functions by following certain business processes.  A business process is an orchestration of different business activities and tasks that may be exposed as SOA services.

Tuesday, January 18, 2011

Various dimensions of Security

When we design our applications to be secure, we have to consider all aspects of security. I have often seen people associate security with just authentication and authorization, but there are other security principles to be considered as stated below.
  1. Integrity: We have to ensure that all messages/data have not been tampered with. Integrity of messages ensures that the data has not been maliciously modified by 'man-in-the-middle'.
  2. Confidentiality: This security principle ensures that all messages are encrypted and cannot be read by an eavesdropper.
  3. Authentication/Authorization: Ensure that all resource access goes through a proper authentication process.
  4. Non-Repudiation: This ensures that any party involved cannot refute the validity of a message exchange.
Modern toolkits and technologies such as digital certificates (together with the encryption and digital signatures built on them) help satisfy all of the above security principles.
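
As an illustration of the integrity and non-repudiation principles above, here is a minimal Java sketch of a digital signature using the JDK's java.security API. It assumes a freshly generated RSA key pair instead of a real certificate, and the class name SignatureSketch is made up for this example. Note that a signature does not provide confidentiality; that needs encryption on top.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

class SignatureSketch {
    // Generates a throwaway RSA key pair; a real system would load keys backed by a certificate.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The sender signs with the private key, which only the sender holds (non-repudiation).
    static byte[] sign(String message, PrivateKey key) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(key);
            signer.update(message.getBytes(StandardCharsets.UTF_8));
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Anyone with the public key can check that the message was not modified in transit (integrity).
    static boolean verify(String message, byte[] signature, PublicKey key) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(message.getBytes(StandardCharsets.UTF_8));
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If a 'man-in-the-middle' changes even one character of the message, verify() returns false for the original signature.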

Monday, January 17, 2011

Concurrent Business Engineering

Some time back, I had blogged about the advantages of having a unified BPM/SOA strategy at the enterprise level.  Ran through a Forrester report that touches on a similar concept and calls it "Concurrent Business Engineering".

Concurrent Business Engineering entails greater collaboration between Business and IT for jointly working on defining new business processes and also defining the technology platform for supporting the processes.
Business services are best designed with a strong understanding of the business process context, hence a top down BPM process-centric view would help in understanding what services need to be surfaced to provide maximum agility to the business process.

This is similar to the idea of having a unified BPM/SOA strategy and utilizes the best of top-down and bottom-up methods for executing the business strategy - as blogged earlier.

Friday, January 14, 2011

Types of Services in SOA

Found a nice article on MSDN describing the various types of services - a taxonomy for services.
Jotting down the concepts explained in the article.


Entity Services - Expose/surface business entities in the system, e.g. employee, customer, sales order, etc. They contain CRUD operations and additional domain specific operations - e.g. FindOrderByPrice. Entity Services abstract the underlying data sources and persistence mechanisms.

Capability Services - Implement a specific business capability - for e.g. Pricing Service, Credit Card Processing Service, etc.   They may use Entity Services for persistence.

Thus Entity Services are "data-centric" and Capability Services are "action-centric".
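
The difference between the two can be sketched in Java as follows. This is only an illustration of the idea, not code from the MSDN article: the Order entity, the in-memory store and the discount operation are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class ServiceTaxonomySketch {
    // Hypothetical business entity (amount kept in cents to avoid rounding issues).
    static final class Order {
        final String id;
        final long amountCents;
        Order(String id, long amountCents) { this.id = id; this.amountCents = amountCents; }
    }

    // Entity service: data-centric, CRUD-style operations that hide the persistence mechanism.
    interface OrderEntityService {
        void save(Order order);
        Optional<Order> findById(String id);
    }

    // In-memory stand-in for a real data source.
    static final class InMemoryOrderService implements OrderEntityService {
        private final Map<String, Order> store = new HashMap<>();
        @Override public void save(Order order) { store.put(order.id, order); }
        @Override public Optional<Order> findById(String id) { return Optional.ofNullable(store.get(id)); }
    }

    // Capability service: action-centric, and it uses the entity service for persistence.
    static final class PricingService {
        private final OrderEntityService orders;
        PricingService(OrderEntityService orders) { this.orders = orders; }

        // Applies a percentage discount to a stored order and persists the result.
        Order applyDiscount(String orderId, int percent) {
            Order current = orders.findById(orderId)
                    .orElseThrow(() -> new IllegalArgumentException("Unknown order: " + orderId));
            Order discounted = new Order(current.id, current.amountCents * (100 - percent) / 100);
            orders.save(discounted);
            return discounted;
        }
    }
}
```

The pricing logic never touches the storage directly, so the data source behind the entity service can change without affecting the capability service.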

Process Services - Act as a facade for a BPM process. Process services would maintain state due to the very nature of a workflow having to maintain state over a long running process.  Process Services are typically implemented using BPM tools such as WWF (Windows Workflow Foundation), BizTalk, WPS, etc.

Infrastructure Services (Utility Services) - Common cross cutting functions such as Logging, Auditing, Security, Authorization, etc.

Thursday, January 06, 2011

Ruminating on SOA Governance

SOA Governance has two dimensions. First – the processes and methodologies used. Second – the tools and products used for governance.

Quite often, people assume that the purchase of a SOA Governance tool would suffice for implementing SOA Governance. But the fact is that the tools would only help in automating certain enforcement policies and service lifecycle workflows. What is first needed is a framework of processes, policies and organization structure to be defined. Any governance process needs to embrace the trilogy of “people, process and technology”.

The following diagram illustrates this point and states the various activities that need to be done for implementing SOA Governance.



The Open Group has also published a SOA Governance Framework that can be accessed here.

Wednesday, December 29, 2010

SOA Registry, Repository and Service Catalog

While implementing enterprise SOA, it is important to consider deploying a service catalog for services. There is a lot of confusion between the concepts of registry, repository and service catalog.

Traditionally a registry has been a lookup service provided to service consumers. Service providers register their services in the registry and service consumers select an appropriate service for their needs. Standards such as UDDI addressed these needs. The registry would contain service descriptions, service contracts and service policies that describe a service. Service registries have also been used in practice for determining a service end-point address at runtime based on the unique name of the service.

So what is a repository? As the importance of SOA Governance grew, it became necessary to capture more meta-data about a service. A service repository integrates information about a service from multiple sources and stores it in a centralized database. Service information may include design artifacts, deployment topologies, service code repository, service monitoring stats, etc. Vendors have started positioning their generic asset management products as SOA repositories. For e.g. Rational Asset Manager.

A lot of vendors now sell a combined product that consists of the registry and repository. For e.g. IBM WebSphere Service Registry and Repository.

A service catalog is a concept that can be implemented using SOA registry/repository products.

Monday, December 27, 2010

Entities Vs Value Objects

In Domain Driven Design, we often separate Entities and Value Objects. Junior architects often confuse these two concepts.
The essential difference is that domain entities have an identity and a lifecycle. So each Entity has a unique identity and within a given domain, no two entities can have the same identity. Value objects need not have an identity. So if we have an "equals()" method that compares the attribute values of each value object, then we can have value objects that are identical. Value objects should ideally also be immutable.
The following links offer interesting stuff on this concept.
1) Lostechies
2) StackOverflow
3) Devlicious
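
The distinction can be sketched in Java with two small classes. This is a minimal sketch, assuming a hypothetical Money value object and Customer entity; the names are made up for this example.

```java
import java.util.Objects;

class DddSketch {
    // Value object: immutable, no identity, equality is based on attribute values.
    static final class Money {
        private final long cents;
        private final String currency;

        Money(long cents, String currency) {
            this.cents = cents;
            this.currency = currency;
        }

        @Override public boolean equals(Object other) {
            if (!(other instanceof Money)) return false;
            Money that = (Money) other;
            return cents == that.cents && currency.equals(that.currency);
        }

        @Override public int hashCode() { return Objects.hash(cents, currency); }
    }

    // Entity: equality is based on identity only; attributes may change over its lifecycle.
    static final class Customer {
        private final String id;
        private String name;

        Customer(String id, String name) {
            this.id = id;
            this.name = name;
        }

        String name() { return name; }

        void rename(String newName) { this.name = newName; }

        @Override public boolean equals(Object other) {
            return other instanceof Customer && id.equals(((Customer) other).id);
        }

        @Override public int hashCode() { return id.hashCode(); }
    }
}
```

Two Money instances with the same amount and currency are interchangeable, whereas a Customer stays the same customer even after a rename, because only the id takes part in equality.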