Thursday, December 29, 2005
I knew that the VM does not exit as long as there is one non-daemon thread running. But there were a couple of new things I learned. Some snippets from the above article:
There are two ways a thread can become a daemon thread (or a user thread, for that matter) without putting your soul at risk. First, you can explicitly specify a thread to be a daemon thread by calling setDaemon(true) on a Thread object. Note that the setDaemon() method must be called before the thread's start() method is invoked.Once a thread has started executing (i.e., its start() method has been called) its daemon status cannot be changed.
The second technique for creating a daemon thread is based on an often overlooked feature of Java's threading behavior: If a thread creates a new thread and does not call setDaemon() to explicitlly set the new thread's "daemon status", the new thread inherits the "daemon-status" of the thread that created it. In other words, unless setDaemon(false) is called, all threads created by daemon threads will be daemon threads; similarly, unless setDaemon(true) is called, all threads created by user threads will be user threads.
Monday, December 26, 2005
When we compile a Java source using the 'javac' exe, the generated class file by default contains some debug info - By default, only the line number and source file information is generated.
Hence when a stack trace is printed on the screen, we can see the source file name and the line number also printed on the screen. Also, when using log4J, I remember using a layout which can print the line number of each log statement - I bet log4J uses the debug infomation present in the class file to do this.
The javac exe also has a '-g' option - This option generates all debugging information, including local variables. To understand this better, I compiled a java source twice; once with the -g option and once without. I then decompiled both the class files to see how the decompiled output differs. The class file compiled with the debug option, showed all the local variable names same as in the original Java file, whereas the one without the debug option, had the local variables named by the decompiler as the local variable information was not stored in the class file.
Another interesting fact is that the class file can contain certain attributes that are non-standard; i.e. vendor specific. Suprised !!!...So was I..Please find below a snippet from the VM spec :
Compilers for Java source code are permitted to define and emit class files containing new attributes in the attributes tables of class file structures. Java Virtual Machine implementations are permitted to recognize and use new attributes found in the attributes tables of class file structures. However, all attributes not defined as part of this Java Virtual Machine specification must not affect the semantics of class or interface types. Java Virtual Machine implementations are required to silently ignore attributes they do not recognize.
For instance, defining a new attribute to support vendor-specific debugging is permitted. Because Java Virtual Machine implementations are required to ignore attributes they do not recognize, class files intended for that particular Java Virtual Machine implementation will be usable by other implementations even if those implementations cannot make use of the additional debugging information that the class files contain
Friday, December 23, 2005
Well, I found out 2 ways to do this:
1) Using TOP to get the topmost 2 rows from the table in descending order and then again getting a TOP of the resultset.
SELECT TOP 1 *
FROM(SELECT TOP 2 *
ORDER BY Marks desc)
ORDER BY Marks;
2) Using the MAX function. (I was suprised to see this work)
SELECT Max(Marks) FROM Table1
WHERE Marks Not In (SELECT MAX(Marks) FROM Table1);
The hierarchical database model existed before the far more familiar relational model in early mainframe days. Hierarchical databases were blown away by relational versions because it was difficult to model a many-to-many relationship - the very basis of the hierarchical model is that each child element has only one parent element.
In his article, Scott Ambler makes the following statement:
Hierarchical databases fell out of favor with the advent of relational databases due to their lack of flexibility because it wouldn’t easily support data access outside the original design of the data
structure. For example, in the customer-order schema you could only access an order through a customer, you couldn’t easily find all the orders that included the sale of a widget because the schema isn’t designed to all that.
I found an article on the microsoft site which discusses the areas where directory services can be used in place of Relational databases.
Some good candidates for heirarchical databases are- White pages" information, User credential & security group information, Network configuration and service policies, Application deployment policies etc.
Thursday, December 22, 2005
It is because of this heirarchy that we can have any arbitraty level of granularity in logging. For e.g. suppose we have 10 components in a distributed environment and we want to shutdown the logging of one component. This can be easily done in Log4J using config files. We can also disable logging at a class level - provided we have a Logger for that class - using the Logger.getLogger(this.class) method. Another example is if U want to switch off logging of one or more modules in a application.
We can even make dymanic changes in the configuration file at runtime and these changes would get reflected provided we have used the configureAndWatch() method.
I came across a good synopsis on classloaders here : http://www.onjava.com/pub/a/onjava/2003/04/02/log4j_ejb.html
Here are some snippets from the above article:
Classloaders, as the name suggests, are responsible for loading classes within the JVM. Before your class can be executed or accessed, it must become available via a classloader. Given a class name, a classloader locates the class and loads it into the JVM. Classloaders are Java classes themselves. This brings the question: if classloaders are Java classes themselves, who or what loads them?
When you execute a Java program (i.e., by typing java at a command prompt), it executes and launches a native Java launcher. By native, I mean native to your current platform and environment. This native Java launcher contains a classloader called the bootstrap classloader. This bootstrap classloader is native to your environment and is not written in Java. The main function of the bootstrap classloader is to load the core Java classes.
The JVM implements two other classloaders by default. The bootstrap classloader loads the extension and application classloaders into memory. Both are written in Java. As mentioned before, the bootstrap classloader loads the core Java classes (for example, classes from the java.util package). The extension classloader loads classes that extend the core Java classes (e.g., classes from the javax packages, or the classes under the ext directory of your runtime). The application classloader loads the classes that make up your application.
bootstrap classloader <-- extension classloader <-- application classloader All three default classloaders follow the delegation model. Before a child classloader tries to locate a class, it delegates that task to a parent. When your application requests a particular class, the application classloader delegates this request to the extension classloader, which in turn delegates it to the bootstrap classloader. If the class that you requested is a Java core class, the bootstrap classloader will make the class available to you. However, if it cannot find the class, the request returns to the extension classloader, and from there to the application classloader itself. The idea is that each classloader first looks up to its parent for a particular class. Only if the parent does not have the class does the child classloader try to look it up itself.
In application servers, each separately-deployed web application and EJB gets its own classloader (normally; this is certainly the case in WebLogic). This classloader is derived from the application classloader and is responsible for that particular EJB or web application. This new classloader loads all classes that the webapp or EJB require that are not already part the Java core classes or the extension packages. It is also responsible for loading and unloading of classes, a feature missing from the default classloaders. This feature helps in hot deploy of applications. When WebLogic starts up, it uses the Java-supplied application classloader to load the classes that make up its runtime. It then launches individual classloaders, derived from the Java application classloader, which load the classes for individual applications. The individual classloaders are invisible to the classloaders of the other applications; hence, classes loaded for one particular application will not be seen by another application. What if you want to make a single class available to all applications? Load it in a top-level classloader. This could be in the classpath of WebLogic. When WebLogic starts, it will automatically load this class in memory using the Java-supplied application classloader. All sub-application classloaders get access to it. However, the negatives of this approach are clear too. First, you lose the capability of hot deploy for this particular class in all individual applications. Second, any change in this class means that the server needs to be restarted, as there is no mechanism for a Java application classloader to reload classes. You will need to weigh in the pros and cons before you take this approach. WebLogic Server allows you to deploy newer versions of application modules such as EJBs while the server is running. This process is known as hot-deploy or hot-redeploy and is closely related to classloading.
Java classloaders do not have any standard mechanism to undeploy or unload a set of classes, nor can they load new versions of classes. In order to make updates to classes in a running virtual machine, the classloader that loaded the changed classes must be replaced with a new classloader. When a classloader is replaced, all classes that were loaded from that classloader (or any classloaders that are offspring of that classloader) must be reloaded. Any instances of these classes must be re-instantiated.
Wednesday, December 21, 2005
But I did not know that the market is flooded with a variety of OR tools.
Here are a few of them:
Abra - http://abra.sourceforge.net
BasicWebLib - http://basicportal.sourceforge.net/
Castor - http://castor.exolab.org
Cayenne - http://objectstyle.org/cayenne/
DataBind - http://databind.sourceforge.net
DB VisualArchitect - http://www.visual-paradigm.com/dbva.php
EnterpriseObjectsFramework - http://www.apple.com/webobjects/
Expresso - http://www.jcorporate.com
FireStorm - http://www.codefutures.com/products/firestorm
Hibernate - http://www.hibernate.org
Hydrate - http://hydrate.sourceforge.net
iBATIS DB Layer - http://www.ibatis.com
Infobjects ObjectDRIVER- http://www.infobjects.com
InterSystems Cache - http://www.intersystems.com
Jakarta ObjectRelationalBridge - http://db.apache.org/ojb
Java Ultra-Lite Persistence (JULP) - http://julp.sourceforge.net
Jaxor - http://jaxor.sourceforge.net
JdoGenie - http://www.jdogenie.com
JDOMax - http://www.jdomax.com
JDX - http://www.softwaretree.com
KodoJdo - http://www.solarmetric.com/Software/Kodo_JDO/
LiDO - http://www.libelis.com
O/R Broker - http://orbroker.sourceforge.net/
Persistence EdgeXtend - http://www.persistence.com/products/edgextend/
PlanetJ's WOW = http://planetjavainc.com/wow.html
PowerMapJdo - http://www.powermapjdo.com
Signsoft intelliBO - http://www.intellibo.com
SimpleOrm - http://www.SimpleOrm.org
SpadeSoft XJDO - http://www.spadesoft.com
Sql2java - http://sql2java.sf.net
Sun Microsystem certified (SunTone?) - FrontierSuite? for J2EE & J2SE - http://www.objectfrontier.com
Sun Microsystem certified (SunTone?) - FrontierSuite? for JDO - http://www.objectfrontier.com
The ProductivityEnvironmentForJava (PE:J) - http://www.hywy.com/
TopLink - http://otn.oracle.com/products/ias/toplink/content.html
VBSF - http://www.objectmatter.com
Also the following articles give some some good introduction to O-R mapping tools and their comparisions:
Tuesday, December 20, 2005
The big problem that Swing has is that it's big and complicated. And the reason that it's big and complicated is that it is the most fully featured, flexible, whatever-way-you-want-to-measure-it UI [user interface] tool kit on the planet. You can do the most amazing things with Swing. It's sort of the 747 cockpit of UI design tool kits, and flying a 747 is a little intimidating, and what people really want is a Cessna.
It is definately possible to write good speedy Swing applications, but U need to be a master of Swing and threading and know all the potholes which hog memory.
Here are a few links which contain some good info on fast Swing apps:
Wednesday, December 14, 2005
On the other hand, if the two token values match, then we are confident that the flow of control is exactly as expected. At this point, the token value in the session is modified to a new value and the form submission is accepted.
Struts has inbuild support for the Synchronizer Token pattern. Method such as "saveToken()" and "isTokenValid()" are present in the ActionForm class.
Here's the sample code for doing so:
WebProxy proxyObj = new WebProxy("10.10.111.19", 8080) ;
NetworkCredential networkCredential = new NetworkCredential("username", "secret_password") ;
HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create(query) ; proxyObj.Credentials = networkCredential ;
req.Proxy = proxyObj ;
For proxy setting in Java programs click here.
Databases supports various levels of lock granularity. The lowest level is a single row. Sometimes, the database doesn't have enough resources to lock each individual
row. In such cases, the database can acquire locks on a single data or index page, group of pages, or an entire table. The granularity of locks depends on the memory available to the database. Servers with more memory can support more concurrent users because they can acquire and release more locks.
We can give locking hints to the database while sending queries that would help us override locking decisions made by the database. For instance, in SQL Server. we can specify the ROWLOCK hint with the UPDATE statement to convince SQL Server to lock each row affected by that data modification.
While dealing with transactions, the type of locking used depends on the Transaction Isolation Level specified for that particular transaction.
SQL Server supports implicit and explicit transactions. By default, each INSERT, UPDATE, and DELETE statement runs within an implicit transaction.Explicit transactions, on the other hand, must be specified by the programmer. Such transactions are included in
BEGIN TRANSACTION ... COMMIT TRANSACTION block
There are two common types of locks present in all databases- Shared locks and Exclusive locks.
Shared locks (S) allow transactions to read data with SELECT statements. Other connections are allowed to read the data at the same time; however, no transactions are allowed to modify data until the shared locks are released.
Exclusive locks (X) completely lock the resource from any type of access including reads. They are issued when data is being modified through INSERT, UPDATE and DELETE statements
Locking/Transaction Isolation levels solve the following concurrency problems:
- Lost Updates: Lost updates occur when two or more transactions select the same row and then update the row based on the value originally selected. Each transaction is unaware of other transactions. The last update overwrites updates made by the other transactions, which results in lost data.
- Dirty Read: Uncommitted dependency occurs when a second transaction selects a row that is being updated by another transaction. The second transaction is reading data that has not been committed yet and may be changed by the transaction updating the row.
- Nonrepeatable Read: Inconsistent analysis occurs when a second transaction accesses the same row several times and reads different data each time.
- Phantom reads: These occur when an insert or delete action is performed against a row that belongs to a range of rows being read by a transaction. The transaction's first read of the range of rows shows a row that no longer exists in the second or succeeding read, as a result of a deletion by a different transaction. Similarly, as the result of an insert by a different transaction, the transaction's second or succeeding read shows a row that did not exist in the original read.
These concurrency problems can be solved using the correct transaction isolation level in our programs. To balance between high concurrency and data integrity, we can choose from the following isolation levels:
- READ UNCOMMITTED: This is the lowest level where transactions are isolated only enough to ensure that physically corrupt data is not read. Does not protect against dirty read, phantom read or non-repeatable reads.
- READ COMMITTED: The shared lock is held for the duration of the transaction, meaning that no other transactions can change the data at the same time. Protects against dirty reads, but the other problems remain.
- REPEATABLE READ: This setting disallows dirty and non-repeatable reads. But phantom reads are still possible.
- SERIALIZABLE: This is the highest level, where transactions are completely isolated from one another. Protects against dirty read, phantom read and non-repeatable read.
There are other types of locks available in the database such as 'update lock', 'intent lock', 'schema lock' etc. The update lock prevents deadlocks in the database. For more information about these check out the following links:
Tuesday, December 13, 2005
Well, we can have many strategies:
- Capture one instance of the unique rows using a SELECT DISTINCT .., and dump the results into a temp table. Delete all of the rows from the original table and then insert the rows from the temp table back into the original table.
- If there is an identity column, the we can also write a query to get all the rows that have duplicate entries :
SELECT Field1, Field2, Count(ID)
GROUP BY Foo1.Field1, Foo1.Field2
HAVING Count(Foo1.ID) > 1
Then loop thru the resultset and get a cursor for the query
"SELECT Field1, Field2, ID
WHERE Field1 = @FIELD1 and Field2 = @FIELD2".
Use the cursor to delete all the duplicate rows returned except one.
The following links discuss the above techniques and some more:
Thursday, December 08, 2005
- DOORS: Dynamic Object Oriented Requirements System
- RequisitePro : A requirements management tool from Rational Software
- Requirements Traceability Management (RTM)
- ArgoUML: http://argouml.tigris.org/
-Rational Rose http://www.rationalrose.com/
-Enterprise Architect http://www.sparxsystems.com.au/
-Rational Rose Real Time http://www-128.ibm.com/developerworks/rational/products/rose
-Visio : http://www.mvps.org/visio/
- Rational Application Developer
-JVider is Java Visual Interface Designer which helps with GUI building for Swing applications
SilkPerformer is load performance testing tool.
Microsoft Web Application Stress Tool
Web Testing/ GUI testing:
SilkTest from Segue Software
Configuration Management Tools:
Concurrent Version System - CVS
Will keep augmenting this list...
Great for anyone who wants to learn about data-modelling or have a template for a given problem domain.
The link is here : http://www.databaseanswers.org/data_models/index.htm
Some interesting data models I found in this site are - datamodels for Hierarchies , e-commerce etc. Worth a perusal.
In the above article, the author argues that object schemas can be quite different from physical data schemas. Using two lucid examples he shows how different object schemas can map to the same data schema and how it is possible for a single object schema to correctly map to several data schemas.
There are situations where OO varies significantly from ERD, for example in cases of inheritance and non-persistent classes.
Aslo, OO models can convey more specification on the class diagram in representing certain things like associations than ERD can.
I personally believe that if U are building a OO system, then we should first go for OO Class modelling and then think about resolving the OO-relational database impedance mismatch.
A multidimensional database - or a multidimensional database management system (MDDBMS) - implies the ability to rapidly process the data in the database so that answers can be generated quickly. A number of vendors provide products that use multidimensional databases. Approaches to how data is stored and the user interface vary.
Conceptually, a multidimensional database uses the idea of a data cube to represent the dimensions of data available to a user. For example, "sales" could be viewed in the dimensions of product model, geography, time, or some additional dimension. In this case, "sales" is known as the measure attribute of the data cube and the other dimensions are seen as feature attributes. Additionally, a database creator can define hierarchies and levels within a dimension (for example, state and city levels within a regional hierarchy).
More information can be at:
Tuesday, December 06, 2005
But there is also a lot of confusion regarding the exact definition of cursors, because different people use it in different contexts. For e.g. when a database administrator talks of cursors, he means the server-side cursor he uses inside stored procedures. But if a JDBC application developer talks about a cursor, he may mean the pointer in the JDBC ResultSet object.
I did a bit of reseach on the web, and finally cleared a few cobwebs from the head. Here's a snapshot of what I learned:
- There are implicit cursors and explicit cursors. An implicit cursor is - well, that's just a piece of memory used by the database server to work with resulsets internally. Implicit cursors are not accessible via an externally exposed API, whereas explicit cursors are.
- Server-side cursors and Client-side cursors: The difference between a client-side cursor and a server-side cursor in classic ADO is substantial and can be confusing. A server-side cursor allows you to manipulate rows on the server through calls on the client, usually storing all or a portion of the data in TEMPDB. The client-side cursor fetches data into a COM object on the client. Its name comes from the fact that the buffered data on the client exhibits cursor-like behaviors you can scroll through and, potentially, update it. The behavior difference manifests itself in a few ways. Fetching a large resultset into a client cursor causes a big performance hit on the initial fetch, and server cursors result in increased memory requirements for SQL Server and require a dedicated connection all the time you're fetching. Client-side cursors can be dangerous because using them has the side effect of retrieving all records from the server that are a result of your command/SQL statement. If you execute a procedure or SQL statement that could retrieve 10,000 records, and use a client-side cursor, you had better have enough memory to hold 10,000 records on the client. Also, control will not return to your code/application until all of these records are retrieved from the server, which can make your application appear slow to users. If you suspect that you are going to retrieve a large number of rows, client-side cursors are not the way to go.
- DYNAMIC, STATIC, and KEYSET cursors: Dynamic cursors will show changes made on the base table as you scroll through the cursor. Static cursors copy the base table, to the tempdb. The cursor then reads from the tempdb, so any changes happening on the base table will not be reflected in the cursors scrolling. Keysets are in between Dynamic and Static cursors. A keyset will copy the base table's selected records keys into the tempdb, so the cursor will select the rows from the tempdb, but the data from the base table. So change to the base table will be seen, but new record inserts will not be.
Next, I tried to find out how these cursors can be manipulated using .NET and Java APIs.
In .NET, the DataReader object acts as a wrapper class for a server-side cursor. The DataReader provides a read-only and forward-only cursor. More info about this is here.
In JDBC, we handle everything using ResultSet objects. Since JDBC 2.0/3.0, we have ResultSets that are scrollable backwards and also that allow update operations. So I believe if U configure a ResultSet to be scrollable backwards and allow update opeartions, U are essentially dealing with a server-side cursor.
Code snippet:Statement stmt = con.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE,ResultSet.CONCUR_READ_ONLY);
ResultSet srs = stmt.executeQuery("SELECT COF_NAME, PRICE FROM COFFEES");
The above code makes the ResultSet as scroll-sensitive, so it will reflect changes made to it while it is open. This is the equivalent of a dynamic cursor. In a dynamic cursor, the engine repeats the query each time the cursor is accessed so new members are added and existing members are removed as the database processes changes to the database.
The second argument is one of two ResultSet constants for specifying whether a result set is read-only or updatable: CONCUR_READ_ONLY and CONCUR_UPDATABLE.
Specifying the constant TYPE_FORWARD_ONLY creates a nonscrollable result set, that is, one in which the cursor moves only forward. If you do not specify any constants for the type and updatability of a ResultSet object, you will automatically get one that is TYPE_FORWARD_ONLY and CONCUR_READ_ONLY (as is the case when you are using only the JDBC 1.0 API). You might keep in mind, though, the fact that no matter what type of result set you specify, you are always limited by what your DBMS and driver actually provide.
Monday, December 05, 2005
On the DOS prompt type :
The 'd' (dump) option would dump the first 128 bytes. Typing 'd' again would give the hex dump of the next block of bytes.
U can type '?' to get a list of all the commands. Type 'q' to quit from debug mode.
Check out the following links:
http://www.tamirgal.com/home/dev.aspx?Item=SharpPcap (A C# API)
http://sourceforge.net/projects/jpcap (A Java API)
Friday, December 02, 2005
The article is available at : http://msdn.microsoft.com/msdnmag/issues/04/06/DataPoints/
Salient points of the article:
- A DataReader is a stream of data that is returned from a database query. When the query is executed, the first row is returned to the DataReader via the stream. The stream then remains connected to the database, poised to retrieve the next record. The DataReader reads one row at a time from the database and can only move forward, one record at a time. As the DataReader reads the rows from the database, the values of the columns in each row can be read and evaluated, but they cannot be edited.
- The DataReader can only get its data from a data source through a managed provider. The DataSet can also get its data from a data source via a managed provider, but the data source can also be loaded manually, even from an XML file on a hard drive.
- The DataReader supports access to multiple resultsets, one at a time, in the order they are retrieved. (Methods to checkout are NextResult() and Read() )
- When the SqlDataAdapter's Fill method is executed it opens its associated SqlConnection object (if not already open) and issues its associated SqlCommand object against the SqlConnection. Behind the scenes, a SqlDataReader is created implicitly and the rowset is retrieved one row at a time in succession and sent to the DataSet. Once all of the data is in the DataSet, the implicit SqlDataReader is destroyed and the SqlConnection is closed.
- The DataSet's disconnected nature allows it to be transformed into XML and sent over the wire via HTTP if appropriate. This makes it ideal as the return vehicle from business-tier objects and Web services. A DataReader cannot be serialized and thus cannot be passed between physical-tier boundaries where only string (XML) data can go.
- There are other times when a DataReader can be the right choice, such as when populating a list or retrieving 10,000 records for a business rule. When a huge amount of data must be retrieved to a business process, even on a middle tier, it can take a while to load a DataSet, pass the data to it on the business tier from the database, and then store it in memory. The footprint could be quite large and with numerous instances of it running (such as in a Web application where hundreds of users may be connected), scalability would become a problem. If this data is intended to be retrieved and then traversed for business rule processing, the DataReader could speed up the process as it retrieves one row at a time and does not require the memory resources that the DataSet requires.
A UDF is actually a kind of subroutine that contains T-SQL statements and can return a scalar value or a table value. Hence U can call a UDF from a SELECT statement. Whereas a Stored Procedure needs to be invoked using the 'EXEC' command.
So when to use what? The answer depends on the problem situation. If you have an operation such as a query with a FROM clause that requires a rowset be drawn from a table or set of tables, then a function will be your appropriate choice. However, when you want to use that same rowset in your application the better choice would be a stored procedure.
The other differences btw a UDF and a Stored Procedure are as follows:
- A stored prodecure supports output parameters, whereas a UDF does not.
- A UDF can have a return value (scalar or table). A SPROC can create a table, but cannot return it. It can only return a integer (as status code)
- SPROCs are powerful enough to change server environment variables. UDFs are not.
- SPROCs can be used in the FOR XML clause but UDFs can't be.
- Whenever an error occurs in a UDF, the function stops immediately. In a SPROC, we must include error handling code, or the next statement is processed.
- U can 'join' to a UDF, but cannot 'join' to a SPROC.