ORA-21500 Bug for 11gR1 Client Using XMLTYPE

Recently one of the user got ORA-21500 and this was due to bug in 11gr1 client. User has 11gr1 client version which connects to 11gr2 DB and this bug is related to securefile handling in 11gr1 client. But, this post it not about going into much deeper details but  just wanted to document what i learned.

CREATE TABLE TARAL.BUG_TEST1(ID NUMBER(38,0) NOT NULL, XCOL1 SYS.XMLTYPE)
 XMLTYPE COLUMN XCOL1 STORE AS SECUREFILE BINARY XML(TABLESPACE USERS ENABLE STORAGE IN ROW CHUNK 8192);

CREATE TABLE TARAL.BUG_TEST2(ID NUMBER(38,0) NOT NULL, XCOL1 SYS.XMLTYPE)
 XMLTYPE COLUMN XCOL1 STORE AS BASICFILE CLOB(TABLESPACE USERS ENABLE STORAGE IN ROW CHUNK 8192);

SQL> select * from bug_test1;
ERROR:
ORA-21500: internal error code, arguments: [%s], [%s], [%s], [%s], [%s], [%s],[%s], [%s]

we are using windows client here,but if you use UNIX then it throws more specific error.

SQL> select * from bug_test1;
Errors in file :
OCI-21500: internal error code, arguments: [qmcxdEvtNextPPP], [], [], [], [], [], [], []
Ð@ø0Ð@ø0Ð@ø0ERROR:
ORA-21500: internal error code, arguments: [%s], [%s], [%s], [%s], [%s], [%s],[%s], [%s]

This indeed tells us it’s an OCI error. But, how to trace this in windows. Well, this is what i learned from this error.


No OCI Trace Generated With 10.2 Oracle Client Using Oracle Data Provider for .Net [ID 1371240.1]

set ORA_CLIENTTRACE_DIR=c:\temp
set EVENT_10842=server=all;user=all;stmt=all;level=15

The following levels are supported:
1 - Trace all server attach and server detach calls for servers listed in "server" attribute of the environment variable.
2 - Trace all session begin, logon, session end, logoff calls for the users listed in "user" attribute of the environment variable.
3 - Trace all prepare, execute, fetch calls for the specified statement types listed in "stmt" attribute of environment variable.
4 - Trace all Bind, Define, Describe calls.
5 - Trace all OCI LOB calls
7 - Get statistical info on all connection pooling /connection related calls
8 - Get statistical info on all session info
9 - Get statistical info on all handle info
10 - Get statistical info on time taken in execute and fetch calls
11 - Get statistical info on transaction related calls
15 - Trace all calls with statistical info.

ORA-00600 [kksgaGetNoAlloc_Int0]

We received this error for Interval/List partition. Here is the sample test case


drop table part_test purge;
CREATE TABLE PART_TEST (
 ID NUMBER,ST_CODE VARCHAR2(1),ST_DESC VARCHAR2(4000),
 CR_DATE date, INFO VARCHAR2(200)
 ) partition by range(ID) interval(1)
 SUBPARTITION BY LIST(ST_CODE)
 SUBPARTITION TEMPLATE
 (SUBPARTITION ST_CODE_a VALUES ('A'),
 SUBPARTITION ST_CODE_s VALUES ('S'),
 SUBPARTITION ST_CODE_v VALUES ('V'),
 SUBPARTITION ST_CODE_e VALUES ('E'),
 SUBPARTITION ST_CODE_oth VALUES (DEFAULT))
 (partition ID_1 values less than (111000));

TARAL.ORALINUX>select * from part_test where id=321000 and st_code='A';
select * from part_test where id=321000 and st_code='A'
 *
ERROR at line 1:
ORA-00600: internal error code, arguments: [kksgaGetNoAlloc_Int0], [1050005], [1048575], [], [], [], [], [], [], [], [], []

Now getting into details found that 1048575 number in that error is nothing but MAX number of allowed partitions in oracle. By the way test was done in 11.2.0.3. So, when we start partition with 111000 and add any partition it should consider one 1 list and 5 sub-partition. But, it’s considering a lot’s more and actually it’s a BUG. It calculate holes in that partition and not the  actual partitions. Here is little more details about that first argument in that error.


TARAL.ORALINUX>select ((321000-111000)*5)+5 from dual;

((321000-111000)*5)+5
---------------------
 1050005

So, it’s calculated partition value and which exceeds partition limit. That means it won’t track lowest created partition.

</pre>
TARAL.ORALINUX>select (1048570/5) + 111000 from dual;

(1048570/5)+111000
------------------
 320714

1 row selected.

TARAL.ORALINUX>select * from part_test where id=320714 and st_code='A';
select * from part_test where id=320714 and st_code='A'
 *
ERROR at line 1:
ORA-00600: internal error code, arguments: [kksgaGetNoAlloc_Int0], [1048575], [1048575], [], [], [], [], [], [], [], [], []

TARAL.ORALINUX>select * from part_test where id=320713 and st_code='A';

no rows selected
<pre>

By the way this is coming from CURSOR compilation itself i believe but not sure.  And here is the BUG

Bug 13722697  ORA-600 [kksgaGetNoAlloc_Int0] on select against composite interval partitioned table

ORA-03137 TTC protocol internal error

Recently received this error on one of our 11.1.0.7 Database. Here is the description from alert log


Errors in file /...../1.trc:
ORA-03137: TTC protocol internal error : [12333] [5] [196] [2] [] [] [] []
----- Error Stack Dump -----
ORA-03137: TTC protocol internal error : [12333] [5] [196] [2] [] [] [] []
----- Current SQL Statement for this session (sql_id=xyzxxxxxxxxxxx) -----
UPDATE test stage_table SET ..... :BIND1 ... :BIND2 ... :BIND3 ........

Well, so what does it mean. After further investigation it’s a bug regarding bind peeking.

Here is what oracle workaround for same.

Bug 8625762 ORA-3137 [12333] due to bind data not read from wire

eg: Typically the trace file will show a PROTOCOL VIOLATION with Page before data showing a SQL statement (or request to execute a statement), the last 3 bytes of which are a bind length and the start of bind data, and then will show “Current Page” with the rest of thebind data.

Workaround
Setting “_optim_peek_user_binds”=false avoids this problem but use this with caution as it can affect execution plans chosen by CBO.

Not so clear. Let’s look into some details. First part of trace


--- PROTOCOL VIOLATION DETECTED ---
----- Dump Cursor sql_id=xyzxxxxxxxxxxx xsc=0x2b99f42bc148 cur=0x2b99f3d01d40 -----
Dump Parent Cursor sql_id=xyzxxxxxxxxxxx phd=0xcccccccc plk=0xyyyyyyyyy
 sqltxt(0xcccccccc)=UPDATE test stage_table SET ..... :BIND1 ... :BIND2 ... :BIND3 ........
  hash=xxxxxxxxxxxxx
  parent=0xYYYYYY maxchild=07 plk=0xyyyyyyyyy ppn=n
.......................
.......................
.......................

  sharing failure(s)=800000040000
----- Bind Info (kkscoacd) -----
 Bind#0
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1000000 frm=00 csi=00 siz=2096 off=0
  kxsbbbfp=2b99f4241e10  bln=22  avl=00  flg=05
 Bind#1
  oacdty=01 mxl=2000(132) mxlc=33 mal=00 scl=00 pre=00
  oacflg=03 fl2=1000010 frm=01 csi=873 siz=0 off=24
  kxsbbbfp=2b99f4241e28  bln=2000  avl=00  flg=01
 Bind#2
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1000000 frm=00 csi=00 siz=0 off=2024
  kxsbbbfp=2b99f42425f8  bln=22  avl=00  flg=01
 Bind#3
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1000000 frm=00 csi=00 siz=0 off=2048
  kxsbbbfp=2b99f4242610  bln=22  avl=00  flg=01
 Bind#4
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1000000 frm=00 csi=00 siz=0 off=2072
  kxsbbbfp=2b99f4242628  bln=22  avl=00  flg=01
 Frames pfr 0x2b99f40af0b0 siz=20744 efr 0x2b99f40aefd8 siz=20720
 kxscphp=0x2b99f4633808 siz=1992 inu=1616 nps=1504
 kxscbhp=0x2b99f4633e98 siz=3104 inu=2368 nps=2120
Starting SQL statement dump

As, you can see from this avl=00 so bind length is Zero and sharing failure(s)=800000040000 which corrospond to USER_BIND_PEEK_MISMATCH + BIND_EQUIV_FAILURE. So, this is related to bind peeking.

Further there is an another section in trace that is


-----Page before-----
XXXXXXXXX 03 5E 19 29 80 00 00 00 00 00 00 01 CC 03 00 00  [.^.)............]
XXXXXXXXX 01 0D 00 00 00 01 01 00 00 00 00 01 00 00 00 00  [................]
..................................
..................................
..................................
..................................
..................................
XXXXXXXXX 00 00 00 00 00 00 00 00 00 00 02 03 00 00 15 00  [................]
XXXXXXXXX 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  [................]
XXXXXXXXX 00 00 00 00 00 00 00 07 05 C4 02                 [...........]
-----Current Page-----
nsbfdbpt 0x1b74mmmm
ifccrcp = 0x1b74mmm, ifccrcl = 0x1b74emmm
buffer length = 56
XXXXXXXXX                                  01 02 1C 21 54             [...!T]
..................................
..................................
XXXXXXXXX 04 C3 47 51 57 06 C5 04 41 64 58 51 06 C5 04 42  [..GQW...AdXQ...B]
XXXXXXXXX 01 58 51                                         [.XQ]

03 5E – The Page before shows we have an OALL8 packet and it looks like bind data has been left on the wire. As you can see last three bytes of bind data is left and read as new packet also they matches our error.

Also, in that last three bytes length of first bind data is 5 second is 21 and so on.

I have change some names of files and other stuff but data which i want to show are same from trace.