Monday, August 16, 2004

DB2 deadlocks in WebSphere applications

I spent almost the whole of last month trying to figure out the root causes of some DB2 deadlocks. I finally nailed them.

At first, the deadlocks were hard to reproduce. So I used a tool called Silk Performer to make them reproducible by gradually stressing the application, and captured a DB2 event monitor while the script was running. The hardest part was matching the WAS trace.log (by Java thread id) against the huge DB2 event monitor log (by DB2 application handle). On top of that, I had to talk to some developers who are familiar with the business logic involved to understand why the application does what it does.

It is hard but rewarding to see why the deadlocks happened. It would be nice to have a tool for the DB2 event monitor log: I want to sort by application handle and search by SQL statement too.
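Here is a minimal sketch of what such a tool could look like in Java. It rests on assumptions I have not checked against every DB2 version: that the formatted event monitor log is plain text, that records are separated by blank lines, and that each record carries a line like "Appl Handle: 42". The class name EventLogIndex and both method names are made up for illustration, and the field label in the pattern will likely need adjusting to the real log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DB2 or WebSphere.
class EventLogIndex {
    // ASSUMPTION: each record contains a line such as "Appl Handle: 42".
    private static final Pattern HANDLE =
            Pattern.compile("Appl Handle\\s*:\\s*(\\d+)");

    /** Splits the log into blank-line-separated records and groups them by handle. */
    static Map<Integer, List<String>> groupByHandle(String logText) {
        // TreeMap keeps the handles sorted in ascending order.
        Map<Integer, List<String>> byHandle = new TreeMap<>();
        for (String record : logText.split("\\R\\s*\\R")) {
            Matcher m = HANDLE.matcher(record);
            if (m.find()) {
                int handle = Integer.parseInt(m.group(1));
                byHandle.computeIfAbsent(handle, k -> new ArrayList<>())
                        .add(record.trim());
            }
        }
        return byHandle;
    }

    /** Returns every record whose text contains the SQL fragment, ignoring case. */
    static List<String> search(Map<Integer, List<String>> byHandle, String sqlFragment) {
        String needle = sqlFragment.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (List<String> records : byHandle.values()) {
            for (String record : records) {
                if (record.toLowerCase(Locale.ROOT).contains(needle)) {
                    hits.add(record);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample data in the assumed format, not real monitor output.
        String sample = "Appl Handle: 42\nText : UPDATE ORDERS SET STATUS = ?\n\n"
                + "Appl Handle: 7\nText : SELECT * FROM ITEMS";
        Map<Integer, List<String>> index = groupByHandle(sample);
        index.forEach((handle, records) ->
                System.out.println(handle + " -> " + records.size() + " record(s)"));
        System.out.println(search(index, "update orders"));
    }
}
```

Grouping by handle first makes it easier to line up one DB2 application with one Java thread from trace.log, and the search helps find which handles touched a given statement.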

Steps for solving a deadlock:

1. Take a closer look at the DB2 event monitor log, especially at how many applications are involved. Draw a diagram whenever possible.

2. Search the Java/JSP code for the deadlocked SQL and get a list of the classes and methods that may be involved.

3. So far we have been working bottom up. Now read the Java code that runs in the different threads and see whether any of the Java/JSP code found in step 2 is involved. Read the WAS trace log to verify that.

4. There are two possible sources of the deadlock:
a. The application code. This can be verified by inserting a Java stack trace at the method level.
b. The EJB container. The deadlock can come from a remote method that is not marked with the "READ" access intent.
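
For step 4a, one simple way to insert a stack trace at the method level is to create a Throwable without throwing it and print its stack together with the thread name. This is only a sketch: CallerTrace and updateOrderStatus are made-up names standing in for whatever method issues the deadlocked SQL, and in a real application the output would go to whatever logging is already in place.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper for illustration only.
class CallerTrace {
    /** Returns the current thread's name and call stack as text. */
    static String capture(String label) {
        StringWriter out = new StringWriter();
        // A Throwable records the call stack when it is constructed;
        // it is never thrown here, only printed.
        new Throwable(label + " [thread=" + Thread.currentThread().getName() + "]")
                .printStackTrace(new PrintWriter(out, true));
        return out.toString();
    }

    // Hypothetical stand-in for a method that runs the deadlocked SQL.
    static String updateOrderStatus() {
        String trace = capture("updateOrderStatus entered");
        System.out.println(trace);
        // ... the real method body (the SQL call) would follow here ...
        return trace;
    }

    public static void main(String[] args) {
        updateOrderStatus();
    }
}
```

The printed stack shows who called the method, which helps tell a call made by application code from one that arrives through the container.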

