ORA-15075

We hit an issue with ASM two weeks back, and here is a little background on it. This is Oracle 11.1.0.7.0 using ASMLib. I have changed the original disk names and other details to avoid any policy violation.

We wanted some more space on the LeftHand SCSI storage, so the SAs added one 50 GB disk. While adding it, the SA accidentally overwrote DISK1, which was already in use. Thanks to ASM, DISK1 was invalidated and its data was rebalanced to the other disks. The issue didn't cause any outage, and we were happy until the next maintenance window for this database.

This time we wanted to move these disks to SAN, and the diskgroup uses external redundancy. So the best way is to add the new disks and drop the old ones, which needs no downtime. Below is the output before starting any work. You can see that DISK1 (DISK001 below) is marked as a CANDIDATE disk: it is no longer a valid member because of the problem mentioned above.


GROUP_NUMBER DISK_NUMBER MOUNT_S HEADER_STATU MODE_ST STATE    NAME     LABEL    PATH
------------ ----------- ------- ------------ ------- -------- -------- -------- ------------
           1           0 CACHED  CANDIDATE    ONLINE  NORMAL   DISK001  DISK001  ORCL:DISK001
           1           1 CACHED  MEMBER       ONLINE  NORMAL   DISK002  DISK002  ORCL:DISK002
           1           2 CACHED  MEMBER       ONLINE  NORMAL   DISK003  DISK003  ORCL:DISK003
           1           3 CACHED  MEMBER       ONLINE  NORMAL   DISK004  DISK004  ORCL:DISK004
           1           4 CACHED  MEMBER       ONLINE  NORMAL   DISK005  DISK005  ORCL:DISK005
           1           5 CACHED  MEMBER       ONLINE  NORMAL   DISK006  DISK006  ORCL:DISK006
           2           0 CACHED  MEMBER       ONLINE  NORMAL   DISK007  DISK007  ORCL:DISK007
           2           1 CACHED  MEMBER       ONLINE  NORMAL   DISK008  DISK008  ORCL:DISK008
           0           0 CACHED  PROVISIONED  ONLINE  NORMAL            DISK009  ORCL:DISK009
           0           1 CACHED  PROVISIONED  ONLINE  NORMAL            DISK010  ORCL:DISK010
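
The query behind this listing is not shown in the post. The column headings match V$ASM_DISK, so a query along the following lines would probably produce it. This is only a sketch: the function name asm_disk_report is made up for illustration, and it assumes ORACLE_SID points at the ASM instance and that OS authentication as SYSASM is allowed.

```shell
# Sketch only: the original query is not shown in the post.
# Assumes ORACLE_HOME/ORACLE_SID are set for the ASM instance.
# asm_disk_report is a hypothetical helper name.
asm_disk_report() {
    sqlplus -S "/ as sysasm" <<'EOF'
set linesize 200 pagesize 100
select group_number, disk_number, mount_status, header_status,
       mode_status, state, name, label, path
  from v$asm_disk;
EOF
}
```

Calling asm_disk_report on each node and comparing the PATH and HEADER_STATUS columns gives a quick picture of which disks ASM can see from where.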

There were about 25-30 disks provisioned, so I cut the rest of the output. As you can see, these two disks (DISK009 and DISK010) are ready to be added. While adding them we got this error.


SQL> ALTER DISKGROUP XXXXXXDATA ADD  DISK 'ORCL:DISK009' SIZE 51200M ,
'ORCL:DISK010' SIZE 51200M
NOTE: Assigning number (1,6) to disk (ORCL:DISK009)
NOTE: Assigning number (1,7) to disk (ORCL:DISK010)
NOTE: requesting all-instance membership refresh for group=1
NOTE: initializing header on grp 1 disk DISK009
NOTE: initializing header on grp 1 disk DISK010
NOTE: cache opening disk 6 of grp 1: DISK009 label:DISK009
NOTE: cache opening disk 7 of grp 1: DISK010 label:DISK010
NOTE: requesting all-instance disk validation for group=1
Wed May 25 17:36:50 2011
NOTE: disk validation pending for group 1/0xb4587064 (XXXXXXDATA)
SUCCESS: validated disks for 1/0xb4587064 (XXXXXXDATA)
ERROR: ORA-15075 signalled during reconfiguration of diskgroup XXXXXXDATA
NOTE: membership refresh pending for group 1/0xb4587064 (XXXXXXDATA)
kfdp_query(XXXXXXDATA): 7
Wed May 25 17:36:56 2011
kfdp_queryBg(): 7
kfdp_query(XXXXXXDATA): 8
kfdp_queryBg(): 8
NOTE: cache closing disk 6 of grp 1: DISK009 label:DISK009
NOTE: cache closing disk 6 of grp 1: DISK009 label:DISK009
NOTE: De-assigning number (1,6) from disk (ORCL:DISK009)
NOTE: cache closing disk 7 of grp 1: DISK010 label:DISK010
NOTE: cache closing disk 7 of grp 1: DISK010 label:DISK010
NOTE: De-assigning number (1,7) from disk (ORCL:DISK010)
kfdp_query(XXXXXXDATA): 9
kfdp_queryBg(): 9
SUCCESS: refreshed membership for 1/0xb4587064 (XXXXXXDATA)
Wed May 25 17:36:59 2011
ORA-15032: not all alterations performed
ORA-15075: disk(s) are not visible cluster-wide
ERROR: ALTER DISKGROUP XXXXXXDATA ADD  DISK 'ORCL:DISK009' SIZE 51200M ,
'ORCL:DISK010' SIZE 51200M
Wed May 25 17:37:07 2011
SQL> ALTER DISKGROUP XXXXXXDATA ADD  DISK 'ORCL:DISK009' SIZE 51200M ,
'ORCL:DISK010' SIZE 51200M
NOTE: Assigning number (1,8) to disk (ORCL:DISK009)
NOTE: Assigning number (1,9) to disk (ORCL:DISK010)
NOTE: requesting all-instance membership refresh for group=1
NOTE: De-assigning number (1,8) from disk (ORCL:DISK009)
NOTE: De-assigning number (1,9) from disk (ORCL:DISK010)
ERROR: ORA-15033 signalled during reconfiguration of diskgroup XXXXXXDATA

There could be many reasons for this, such as scandisks not having been run on all nodes, permissions, and so on. In our case multipath was probably not set up. Either way, we needed to remove these two disks, let the SAs correct the issue, and then try to add them back.
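
Since ORA-15075 means the disks are not visible on every node, one quick check is to compare the ASMLib labels each node can see. Below is a minimal sketch, assuming the output of `oracleasm listdisks` (one label per line) has been saved to one file per node. The function name and the file names are hypothetical.

```shell
# Hypothetical helper: print ASMLib labels that are NOT visible on both nodes.
# $1 and $2 are files holding `oracleasm listdisks` output from each node.
missing_labels() {
    a=$(mktemp) b=$(mktemp)
    sort "$1" > "$a"
    sort "$2" > "$b"
    # comm -3 drops lines common to both files; tr removes the tab that
    # comm puts in front of lines unique to the second file.
    comm -3 "$a" "$b" | tr -d '\t'
    rm -f "$a" "$b"
}

# Example usage (file names are placeholders):
#   oracleasm listdisks > /tmp/node1.txt      # run on node 1
#   oracleasm listdisks > /tmp/node2.txt      # run on node 2
#   missing_labels /tmp/node1.txt /tmp/node2.txt
```

Any label this prints is a candidate for a missing scandisks run, a permission problem, or a path that was never presented to one of the nodes.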


SQL> alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK009'
ORA-15032: not all alterations performed
ORA-15054: disk "ORCL:DISK009" does not exist in diskgroup "XXXXXXDATA"
ERROR: alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK009'
Wed May 25 18:32:48 2011
SQL> alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK010'
ORA-15032: not all alterations performed
ORA-15054: disk "ORCL:DISK010" does not exist in diskgroup "XXXXXXDATA"
ERROR: alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK010'
Wed May 25 18:33:04 2011
SQL> alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK010' force
ORA-15032: not all alterations performed
ORA-15054: disk "ORCL:DISK010" does not exist in diskgroup "XXXXXXDATA"
ERROR: alter diskgroup XXXXXXDATA drop disk 'ORCL:DISK010' force

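So you can't drop these disks: as ORA-15054 says, ASM does not consider them members of the diskgroup. To clear such a disk you have to wipe its header with something like the command below. Note that ORCL:DISK009 is only the ASM discovery string, not a file path; the device path shown here assumes the default ASMLib location, and the command is destructive.

dd if=/dev/zero of=/dev/oracleasm/disks/DISK009 bs=4096 count=5000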

But as we didn't have any data on these disks and this was a QA database, we planned some downtime and let the SAs remove the SAN disks and reconfigure them as needed. When they rebooted the box, the issue got worse: we could no longer see the DATA diskgroup, and ASM was not able to mount it either.


SQL> ALTER DISKGROUP ALL MOUNT
NOTE: cache registered group XXXXXXDATA number=1 incarn=0xb7f88ac9
NOTE: cache began mount (first) of group XXXXXXDATA number=1 incarn=0xb7f88ac9
NOTE: cache registered group XXXXXXFLASH number=2 incarn=0xb7f88aca
NOTE: cache began mount (first) of group XXXXXXFLASH number=2 incarn=0xb7f88aca
NOTE:Loaded lib: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
NOTE: Assigning number (1,1) to disk (ORCL:DISK002)
NOTE: Assigning number (1,2) to disk (ORCL:DISK003)
NOTE: Assigning number (1,3) to disk (ORCL:DISK004)
NOTE: Assigning number (1,4) to disk (ORCL:DISK005)
NOTE: Assigning number (1,5) to disk (ORCL:DISK006)
NOTE: Assigning number (1,8) to disk (ORCL:DISK009)
NOTE: Assigning number (1,9) to disk (ORCL:DISK010)
ERROR: no PST quorum in group 1: required 1, found 0
NOTE: cache dismounting group 1/0xB7F88AC9 (XXXXXXDATA)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0xB7F88AC9 (XXXXXXDATA)
NOTE: cache ending mount (fail) of group XXXXXXDATA number=1 incarn=0xb7f88ac9
kfdp_dismount(): 2
kfdp_dismountBg(): 2
NOTE: De-assigning number (1,1) from disk (ORCL:DISK002)
NOTE: De-assigning number (1,2) from disk (ORCL:DISK003)
NOTE: De-assigning number (1,3) from disk (ORCL:DISK004)
NOTE: De-assigning number (1,4) from disk (ORCL:DISK005)
NOTE: De-assigning number (1,5) from disk (ORCL:DISK006)
NOTE: De-assigning number (1,8) from disk (ORCL:DISK009)
NOTE: De-assigning number (1,9) from disk (ORCL:DISK010)
ERROR: diskgroup XXXXXXDATA was not mounted
NOTE: Assigning number (2,0) to disk (ORCL:DISK007)
NOTE: Assigning number (2,1) to disk (ORCL:DISK008)
NOTE: start heartbeating (grp 2)
kfdp_query(XXXXXXFLASH): 5
kfdp_queryBg(): 5
NOTE: cache opening disk 0 of grp 2: DISK007 label:DISK007
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 2: DISK008 label:DISK008
NOTE: cache mounting (first) group 2/0xB7F88ACA (XXXXXXFLASH)
* allocate domain 2, invalid = TRUE
kjbdomatt send to node 1
NOTE: attached to recovery domain 2
NOTE: cache recovered group 2 to fcn 0.144346
NOTE: LGWR attempting to mount thread 1 for diskgroup 2
NOTE: LGWR mounted thread 1 for disk group 2
NOTE: opening chunk 1 at fcn 0.144298 ABA
NOTE: seq=13 blk=8268
NOTE: cache mounting group 2/0xB7F88ACA (XXXXXXFLASH) succeeded
NOTE: cache ending mount (success) of group XXXXXXFLASH number=2 incarn=0xb7f88aca
kfdp_query(XXXXXXFLASH): 6
kfdp_queryBg(): 6
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 2
SUCCESS: diskgroup XXXXXXFLASH was mounted
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "XXXXXXDATA"
ERROR: ALTER DISKGROUP ALL MOUNT

So now the issue is DISK1, which was wrongly relabeled last time. Oracle is looking for it and cannot find it: the overwritten disk now carries the DISK035 label, although it is actually the original disk 01.


----------------------------- DISK REPORT N0008 ------------------------------
                Disk Path: ORCL:DISK035
           Unique Disk ID:
               Disk Label: DISK035
     Physical Sector Size: 512 bytes
                Disk Size: 51200 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **

kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:                       0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000

But now comes the cool part. I opened an SR1 and learned something that others might already know, but it was new to me. As we saw above, the disk header had been wiped out.

Oracle keeps a backup copy of the disk header structure at AUNUM 1, block number 254, and you can retrieve the header information from there. So KFED came to the rescue: read the backup block into a text file, then merge it back into the disk header.

kfed read /dev/d1 aun=1 blkn=254 text=/tmp/d1.log
kfed merge /dev/d1 text=/tmp/d1.log

And then run scandisks. That's it, everything came up fine.
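
The blkn=254 used above fits a 1 MB allocation unit. As far as I understand it, the backup header sits in the second-to-last 4 KB metadata block of allocation unit 1. Under that assumption the block number for other AU sizes can be sketched like this (the function name is made up, and the 4096-byte block size is an assumption):

```shell
# Sketch: derive the kfed blkn of the backup disk header from the AU size.
# Assumes 4096-byte ASM metadata blocks and that the backup lives in the
# second-to-last block of allocation unit 1.
backup_header_blkn() {
    au_bytes=$1
    block_bytes=${2:-4096}
    echo $(( au_bytes / block_bytes - 2 ))
}

backup_header_blkn 1048576    # 1 MB AU, prints 254
```

Check the real AU size of the diskgroup before using a computed value with kfed read, and keep a copy of the text file before running kfed merge.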

About Taral
I am Rookie to Oracle Technology so let's see where it goes

2 Responses to ORA-15075

  1. Anand says:

    Hi Taral,

    Thanks for sharing this!!!!

    Anand
