Tuesday, December 15, 2009
The most important advantage of longitudnal data is that we can measure change and the effect of various factors over the data-point time values. For e.g. what is the effect a particular drug had on a cancer patient? The effect of different teachers on a student?
So essentially, longitudnal data helps in establishing cause-n-effect relationships. Longitudnal data stores are also being used for predictive modeling and other areas. Longitudnal data stores are very popular in the Life Sciences and Healthcare industry.
I am interesting in learning the best practices for creating and optimizing a data-model for longitudnal data stores.
The term Biostatistics is a combination of the words biology and statistics. So it essentially it is the application of statistics to biology. The science of biostatistics encompasses the design of biological experiments, especially in medicine and agriculture; the collection, summarization, and analysis of data from those experiments; and the interpretation of, and inference from, the results.
Bioinformatics is the application of information technology and computer science to the field of molecular biology. Its primary use has been in genomics and genetics, particularly in those areas of genomics involving large-scale DNA sequencing. Bioinformatics now entails the creation and advancement of databases, algorithms, computational and statistical techniques, and theory to solve formal and practical problems arising from the management and analysis of biological data.
Bioinformatics focuses on applying computationally intensive techniques (e.g., pattern recognition, data mining, machine learning algorithms, and visualization) to understand biological processes.
More information can be found on Wikipedia at:
Thursday, November 19, 2009
Recently I came across a neat way to handle this using attributes in C#.NET. This article describes the use of attributes to specify the 'operationName' and the 'property-value' to be set on the control when we check for entitlements.
Example code snippet:
[YourCompany.Authorization("EditSalary", "ReadOnly", true)]
private System.Windows.Forms.TextBox Salary;
Though the example is that of a .NET program, the same concept can easily be applied in Java 5, which supports annotations. Annotations in Java are roughly equivalent to Attributes in .NET.
Wednesday, November 18, 2009
Although XSLT is more powerful than XQuery for transformations, it is much simpler to learn XQuery. I remember the tough learning curve I had to go through when learning XSL, but I could grasp XQuery basics in a couple of hours. Both XSLT and XQuery use XPath for quering XML.
Found this interesting discussion on O'Reilly site that compares the two technologies. David P. in the discussion shares some interesting views on the design philosophy behind the two technologies - XQuery is a MUST UNDERSTAND language where as XSLT is a MAY UNDERSTAND language, i.e. it is more adaptive and template driven. XSLT is untyped; conversions between nodes and strings and numbers are handled pretty much transparently. XQuery is a typed language which uses the types defined by XML Schema. XQuery will complain when it gets input that isn't of the appropriate type. XQuery is better for highly-structured data, XSLT for loosely-structured documents.
We were building a transformation engine to convert between the various ACORD formats. The Acord committee has ensured that any new version of ACORD only 'adds' new elements to the current schema. No elements are deleted or the schema changed for backwards compatibility.
Hence for these transformations, XQuery fitted the bill. We also did not have to create a Canonical Data Model as direct transfortions were possible due to the above mentioned restriction. Hence if the tool supported 15 ACORD versions, only 15 XQuery files were required.
Other links on the same subject:
Monday, October 05, 2009
Recently there was a big debate during a code reveiw session on the use of constants. The developers had used constants for the following purposes:
1. Each and every message key used in the i18N application was declared as a constant. The application contained around 3000 message keys and hence the same number of constants.
2. Each and every database column name was declared as a constant. There were around 5000 column names and still counting..
Does it make sense to have such a huge number of constants in any application? IMHO, common sense should prevail. Message keys just don't need to be declared as constants. We already have one level of indirection - why add one more?
Reg. database column names, I have mixed opinions. If a column is being used in multiple classes, does it make sense to declare it as a global constant? Maybe declaring a class as 'Table Name' and its members as 'Column Name' would be a good idea? But if you have a large number of tables and columns, it would become very tedious to even create these constants file.
I found a few tools on the net that can automate the creation of these constants files.
Friday, September 25, 2009
Wednesday, September 16, 2009
Discussions with the developers revealed that FxCop also throws an error when Dispose() is not called on Datasets.
Further investigation revleaded that the Dataset exposes the Dispose() method as a side effect of inheritance. The Dataset class inherits from the MarshalByValueComponent which implements the IDisposable interface because it is a component. The method is not overridden in the System.Data.Dataset class and the default implementation is in the MarshalByValueComponent class. The default implementation just removes the component from the parent container it is in. But in case of Dataset, there are no parent containers and hence the Dispose() method does nothing useful.
Conclusion: It is not necessary to call Dispose() on Datasets. Developers can safely ignore the FxCop warnings too :)
A XML schema defines and declares types (complex and simple) and elements. All elements have a type. If the type has not been specified, it defaults to xsd:anyType. It's easy to understand this by drawing an anology between a class and an object. New elements can be created by referencing existing elements.
In a WSDL message description, a WSDL part may reference an element or a type. If the SOAP binding specifies style="document", then the WSDL part must reference an element. If the SOAP binding specifies style="rpc", then the WSDL part must reference a type.
Now coming to namespaces, users typically get confused over the difference between a targetNamespace and defaultNamespace. The 'targetNamespace' attribute is typically used in schema files to identify and select the namespace into which new elements that are defined are created in. It is the namespace an instance is going to use to access the types it declares. For e.g. An XML document may use this schema as the default schema. Default schema is defined simply by using 'xmlns=' without a prefix.
It is important to remember that XML schema is itself an XML document. Hence a schema can contain a namespace attribute and also a targetNamespace attribute. Typically they are the same.
Jotting down links where more explanation is given:
Friday, June 26, 2009
Thursday, June 11, 2009
Tuesday, June 09, 2009
Monday, June 08, 2009
Saturday, May 16, 2009
Friday, May 15, 2009
Wednesday, May 13, 2009
Friday, May 08, 2009
Over the past few weeks, the Architecture Support Group that I head at my organization, received quite a few queries on making asynchronous web service calls in a SOA environment. So decided to blog about the various options at hand.
To make asych webservice calls in .NET, the following programming models are available. Please visit the links for furthur information.
3. Fire and Forget mechanism: Here we can decorate the server side webmethod with the 'OneWay' attribute.
On the Java side, the popular Axis-2 framework supports asych web services calls out of the box, by generating call back handlers in the webservice proxy.
The WS-Addressing specification is also trying to bring in a standard for defining different MEP (Message Exchange Patterns).
Thursday, April 30, 2009
- Obfuscate the dlls
Sunday, March 15, 2009
Monday, March 02, 2009
- Services should be platform neutral and programming language neutral
- Services should have a standard interface contract
- Services should be loosely coupled
- Design for coarse grained services
- Service Abstraction - Hide information that is not required by clients. Hide underlying technology details. Promotes loose coupling.
- Service Reusability - Position services as enterprise resources with agnostic functional context; i.e. the service can be used in other functional scenarios too. Always design the service in such a way, that it can be used beyond its original context.
- Service Autonomy - Services need to have a high degree of control over their underlying runtime execution environment. The higher up a service is in a typical composition hierarchy, the less autonomy it tends to have due to dependencies on other composed services.
- Service Statelesness - defer or delegate state to databases, rather than in memory.
- Service Discovery - services can be discoved using standards such as UDDI.