Strange ASM Hang Issue
May 31, 2011 Leave a comment
Sometime Back there was strange ASM issue. Actually Users were complaining about connectivity Issue. At first Look everything seems to be Okay but after looking more found that ASM instance was hanging and this is for Oracle 11.1.0.7.0.
Well, Oracle Support filed this as a BUG but solution was to bounce DB and ASM instance and we faced 10 Min Downtime. And Strange part is it Affected DR Site Also with same issue.
Here are some details
Current Wait Stack: 0: waiting for 'enq: CF - contention' name|mode=43460004, 0=0, operation=0 wait_id=13621 seq_num=13737 snap_id=1 wait times: snap=13 min 44 sec, exc=13 min 44 sec, total=13 min 44 sec wait times: max=15 min 0 sec wait counts: calls=1646 os=1646 in_wait=1 iflags=0x15a0 There is at least one session blocking this session. Dumping one blocker: inst: 1, sid: 1358, ser: 1 Wait State: auto_close=0 flags=0x22 boundary=(nil)/-1 So, Controlfile Enqueue Was held more then Expected. Current Wait Stack: 1: waiting for 'KSV master wait' =0, =0, =0 wait_id=6434 seq_num=7633 snap_id=2 wait times: snap=2 min 0 sec, exc=12 min 3 sec, total=12 min 4 sec wait times: max=infinite wait counts: calls=242 os=242 in_wait=1 iflags=0x15a0 0: waiting for 'ASM file metadata operation' msgop=11, locn=0, =0 wait_id=6432 seq_num=7630 snap_id=2 wait times: snap=0.000000 sec, exc=0.000012 sec, total=12 min 7 sec wait times: max=infinite wait counts: calls=0 os=0 in_wait=1 iflags=0x1520 Wait State: auto_close=0 flags=0x23 boundary=(nil)/-1 And was held my archiver process. And If we see systemstate Dump System State 2 ~~~~~~~~~~~~~~~~ 1: 2: 0: waiting for 'pmon timer' 3: 0: waiting for 'VKTM Logical Idle Wait' 4: 5: 0: waiting for 'PING' 6: 0: waiting for 'rdbms ipc message' 7: 0: waiting for 'DIAG idle wait' 8: 0: waiting for 'rdbms ipc message' 9: 0: waiting for 'ges remote message' 10: 0: waiting for 'gcs remote message' 11: 0: waiting for 'rdbms ipc message' 12: 0: waiting for 'rdbms ipc message' 13: 0: waiting for 'rdbms ipc message' 14: 0: waiting for 'rdbms ipc message' 15: 0: waiting for 'rdbms ipc message' 16: 0: waiting for 'GCS lock cvt S' 17: 0: waiting for 'rdbms ipc message' 18: 0: waiting for 'rdbms ipc message' 19: 0: waiting for 'ASM file metadata operation' 20: 0: waiting for 'ASM file metadata operation'[Enqueue FA-00000002-00000168] 21: 0: waiting for 'ASM file metadata operation' 22: 0: waiting for 'ASM file metadata operation'[Enqueue FA-00000002-00000168] 23: 0: waiting for 'ASM file metadata operation'[Enqueue FA-00000002-00000168] 24: 0: waiting for 'ASM file metadata operation'[Enqueue FA-00000002-00000168] Blockers ~~~~~~~~ Above is a list of all the processes. If they are waiting for a resource then it will be given in square brackets. Below is a summary of the waited upon resources, together with the holder of that resource. Notes: ~~~~~ o A process id of '???' implies that the holder was not found in the systemstate. Resource Holder State Enqueue FA-00000002-00000168 19: 0: waiting for 'ASM file metadata operation' Enqueue FA-00000002-00000168 21: 0: waiting for 'ASM file metadata operation' Enqueue FA-00000002-00000168 24: 24: is waiting for 19: 21: 24: System State 9 ~~~~~~~~~~~~~~~~ 1: 2: 3: 0: waiting for 'VKTM Logical Idle Wait' 4: 5: 0: waiting for 'PING' 6: 0: waiting for 'rdbms ipc message' 7: 8: 0: waiting for 'rdbms ipc message' 9: 0: waiting for 'ges remote message' 10: 0: waiting for 'gcs remote message' 11: 0: waiting for 'rdbms ipc message' 12: 0: waiting for 'rdbms ipc message' 13: 0: waiting for 'rdbms ipc message' 14: 0: waiting for 'rdbms ipc message' 15: 0: waiting for 'rdbms ipc message' 16: 0: waiting for 'rdbms ipc message' 17: 0: waiting for 'rdbms ipc message' 18: 0: waiting for 'rdbms ipc message' 19: 0: waiting for 'ASM network foreground exits' 20: 0: waiting for 'log write(odd)' 24: 0: waiting for 'ASM file metadata operation' 25: 0: waiting for 'ASM file metadata operation' 26: 0: waiting for 'ASM file metadata operation'[Latch for] 27: 0: waiting for 'ASM file metadata operation' 28: 0: waiting for 'buffer busy' [Latch for] Cmd: Select 29: 0: waiting for 'ASM file metadata operation' 30: 0: waiting for 'ASM file metadata operation' 31: 0: waiting for 'ASM file metadata operation' 33: 0: waiting for 'ASM file metadata operation' 34: 0: waiting for 'ASM file metadata operation'[Latch for] 35: 0: waiting for 'ASM file metadata operation'[Latch for] 36: 0: waiting for 'ASM file metadata operation'[Latch for] 37: 0: waiting for 'ASM file metadata operation'[Latch for] Blockers ~~~~~~~~ Above is a list of all the processes. If they are waiting for a resource then it will be given in square brackets. Below is a summary of the waited upon resources, together with the holder of that resource. Notes: ~~~~~ o A process id of '???' implies that the holder was not found in the systemstate. Resource Holder State Latch for ??? Blocker Object Names ~~~~~~~~~~~~ Latch for write state=waiting for log force
Well Came to know that If a process hangs holding the FAL queue latch, no other process will be able to acquire the FAL queue latch. As a result, all processes that attempt to acquire the FAL queue latch will hang. If a process is holding the controlfile transaction, the hang will broaden beyond ARCH processes. The Archiver Process hangs while holding the FAL queue latch and then in turn reports the error ORA-00494: enqueue [CF] held for too long (more than 900 seconds). Bug 7432601