DB locking issue

Oracle database - analyzing a deadlock [ORA-00060]

As a weblogic administrator the interaction among the application server and the database is often strong. In fact, according to Confio Software (2013) approximately 70% of applications’ performance problems are caused by the database. Whit this in mind, knowing some methods or common cases about the database performance is a relevant skill for application server administrators. The aim of this post is to describe how a performance problem that caused the crash of the database and the application server was solved through the analysis of the TRC file generated by the Oracle database.

The system affected

The system who suffered that problem was used by an important company.

Symptoms

DELETE FROM Table_A WHERE CREDITEVALUATIONID = :B
DELETE FROM Table_A WHERE CREDITEVALUATIONID = :B
DELETE FROM Table_A WHERE CREDITEVALUATIONID = :B
DELETE FROM Table_A WHERE CREDITEVALUATIONID = :B

Analysis

It was clear we had a deadlock problem, but we needed the root cause. It was found using the TRC file generated by the Oracle database and the sources of information that can be found in the references section within this document.

This is the TRC generated at that time:

/ora_control/admin/xxxxx/udump/ora_24473.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle10/product/db/10.2.0
System name:    test
Node name:      test
Release:        B.11.11
Version:        U
Machine:        test
Instance name: test
Redo thread mounted by this instance: 1
Oracle process number: 1087
Unix process pid: 24473, image: oracle@******

*** 2012-01-31 19:26:46.853
*** ACTION NAME:() 2012-01-31 19:26:46.845
*** MODULE NAME:(JDBC Thin Client) 2012-01-31 19:26:46.845
*** SERVICE NAME:(SYS$USERS) 2012-01-31 19:26:46.845
*** SESSION ID:(3988.49756) 2012-01-31 19:26:46.845
DEADLOCK DETECTED ( ORA-00060 )
[Transaction Deadlock]
The following deadlock is not an ORACLE error. It is a
deadlock due to user error in the design of an application
or from issuing incorrect ad-hoc SQL. The following
information may aid in determining the deadlock:
Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TM-0004a1a4-00000000      1087    3988    SX   SSX      733    3151    SX   SSX
TM-0004a1a4-00000000       733    3151    SX   SSX     1087    3988    SX   SSX
session 3988: DID 0001-043F-0003D9CF    session 3151: DID 0001-02DD-000FE20F
session 3151: DID 0001-02DD-000FE20F    session 3988: DID 0001-043F-0003D9CF
Rows waited on:
Session 3151: no row
Session 3988: no row
Information on the OTHER waiting sessions:
Session 3151:
  pid=733 serial=40226 audsid=292398699 user: 1633/TEST
  O/S info: user: SYSTEM, term: unknown, ospid: 1234, machine: xxxxxxxxxx
            program: JDBC Thin Client
  application name: JDBC Thin Client, hash value=2546894660
  Current SQL Statement:
  DELETE FROM Table_A WHERE CREDITEVALUATIONID = :B
End of information on OTHER waiting sessions.
Current SQL statement for this session:
DELETE FROM Table_A WHERE CREDITEVALUATIONID = :B
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
c0000003bbe25980      1941  package body  *************************
c0000003bbe25980      2470  package body  *************************
c00000035ce033a0        12  procedure ******************************
c00000035ae75f88         1  anonymous block

As can be seen in the previous trace it is evident than some delete sentences are related to the problem, it makes sense with the sentences shown in the symptoms section.

The most important piece of information given by the previous trace is the following

                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TM-0004a1a4-00000000      1087    3988    SX   SSX      733    3151    SX   SSX
TM-0004a1a4-00000000       733    3151    SX   SSX     1087    3988    SX   SSX

The TM (at the beginning of each line) means that the lock found in the trace is table lock caused by the application and often generated by relationships between parent and child tables without indexing the foreign key (Burleson D., n.d).

This problem is also described by [Oracle] (https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1528515465282).

How can I identify the tables involved in this deadlock? I have to use this data: TM-0004a1a4-00000000 , the number remarked has to be converted from hexadecimal to decimal. In that case I got 303524, whit this data you have to execute this query:

SELECT * FROM dba_objects WHERE object_id= 303524;

The previous query returns the locked object that in my case was: Table_C

Now we had to execute this query to get the relationships between this table and others:

SELECT c.TABLE_NAME,c.*
FROM ALL_CONSTRAINTS c
WHERE CONSTRAINT_TYPE='R'
AND R_CONSTRAINT_NAME IN 
  (SELECT CONSTRAINT_NAME
  FROM ALL_CONSTRAINTS
  WHERE CONSTRAINT_TYPE IN ('P','U')
  AND TABLE_NAME='Table_C');

Or you can use a tool to get the relationships visually. In that case, I got this

The previous image shows that Table_C has a relationship by FK (column CREDITEVALUATIONID) with the table Table_A. However, this FK is not indexed and this was the root cause of that problem.

After that a query was executed to detect other cases like this and after fixing them, the performance of that module was improved.

This is an example of reactive support. However, we can use health checks to discover these kinds of problems in advance (proactive support). In fact, [SYSCO AS] (http://www.sysco.no) is investing a lot of time on providing health checks to identify database problems before they happen.

References list

Confio Software (2013) The Missed Opportunity for Improved Application Performance [Online document] Available from: http://cdn.swcdn.net/creative/v13.2/pdf/Whitepapers/DevOps_Essentials_DBAs_%20Developers_WP_Confio_May2014.pdf (Accessed: 30 June, 2015)

Houri Mohamed (2012) Mohamed Houri’s Oracle Notes Performance [Online document] Available from: http://hourim.wordpress.com/category/deadlock/ (Accessed: 30 June, 2015)

Burleson Donald (n.d.) Oracle Metric TM http://www.dba-oracle.com/m_tm.htm [Online document] Available from: http://hourim.wordpress.com/category/deadlock/ (Accessed: 30 June, 2015)

Oracle (2015)Troubleshooting “ORA-00060 Deadlock Detected” Errors [ID 62365.1] [Online document] Available from: support.oracle.com (Accessed: 30 June, 2015)

Oracle (n.d) Andre – Thanks for the question regarding “Reading deadlock trace files”, version 8.1.5 [Online document] Available from: https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1528515465282 (Accessed: 30 June, 2015)