ORA600[13011]表与索引数据逻辑错误分析及解决

一、问题概述

1、数据库环境:

Oracle Database 11.2.0.3.0 for Oracle Linux Server release 6.4RAC,虚拟机

2、巡检时发现某数据库alert.log日志报ORA-00600[13011]错误,报错频繁,虽然未导致数据库宕机,但已影响业务,报错如下:

[oracle@NODE1trace]$grep-iora-00600alert*.log|grep13011|sort-u
ORA-00600::[13011],[321401],[33682485],[24],[3],[],[]--TueFeb0600:07:53开始报错
ORA-00600::[13011],[27],[]

trace文件中信息:

从这里可以看出是XXXXXMIN.XXX_XX_XX_XXX_OLD表进行DELEDE操作导致该错误发生

Dumpcontinuedfromfile:/u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_ora_19795.trc
ORA-00600:[13011],[]
=========Dumpforincident49853(ORA600[13011])========
***2018-02-0609:37:44.987
dbkedDefDump():Startingincidentdefaultdumps(flags=0x2,level=3,mask=0x0)
-----CurrentsqlStatementforthissession(sql_id=b6nmg0fpy3smf)-----
deletefrom"XXXXXMIN"."XXX_XX_XX_XXX_OLD"where"AX_ID"=:1


二、问题分析

1、MOS关于ORA 600 [13011]的描述:

Format:ORA-600[13013][a][b]{c}[d][e][f]Arg
[a]PasscountArg
[b]DataObjectnumberArg
{c}TablespaceDecimalRelativeDBA(RDBA)ofblockcontainingtherowtobeupdatedArg
[d]RowSlotnumberArg
[e]DecimalRDBAofblockbeingupdated(Typicallysameas{c})Arg
[f]Code

参考《 New and Improved: ORA-600 [13013] "Unable to get a Stable set of Records" (文档 ID 1438920.1)》和《ORA-600 [13013] "Unable to get a Stable set of Records" (文档 ID 28185.1)文章。该报错是由于对某个表执行DML操作,该表对应的某个索引损坏导致的,解决的办法是找出操作的表和受损的索引,重建索引即可。


2、查找报错对象

根据ORA-00600 [13011],[3]报错代码,查找报错对象:

selectdbms_utility.data_block_address_file(33682485)rfile,dbms_utility.data_block_address_block(33682485)blocksfromdual;

RFILEBLOCKS
--------------------
8128053
selectowner,segment_name,segment_type,tablespace_name,a.partition_namefromdba_extentsawherefile_id=8and128053betweenblock_idandblock_id+blocks-1;

OWNERSEGMENT_NAMESEGMENT_TYPETABLESPACE_NAMEPARTITION_NAME
------------------------------------------------------------------------------------
XXXXXMINXXX_XX_XX_XXX_OLDTABLEXXX

trace文件中信息:

BH(0xf60ee308)file#:8rdba:0x0201f435(8/128053)class:1ba:0xf6c96000--其对象XXXXXMIN.XXX_XX_XX_XXXXX_OLD与查询一致
set:12pool:3bsz:8192bsi:0sflg:1pwc:0,25
dbwrid:0obj:321401objn:321401tsn:8afn:8hint:f
hash:[0x13ef9fd78,0x13ef9fd78]lru:[0xc900efb0,0xaf13f128]
ckptq:[NULL]fileq:[NULL]objq:[0x132d5a950,0x132d5a950]objaq:[0x132d5a940,0x132d5a940]
st:XCURRENTmd:NULLfpin:'kddwh01:kdddel'tch:1le:0xcb0e3ee8
flags:remote_transfered
LRBA:[0x0.0.0]LSCN:[0x0.0]HSCN:[0xffff.ffffffff]HSUB:[65535]
buffertsn:8rdba:0x0201f435(8/128053)--与查询一致,其对象为XXXXXMIN.XXX_XX_XX_XXXXX_OLD
scn:0x0001.084d4f80seq:0x01flg:0x06tail:0x4f800601
frmt:0x02chkval:0x538dtype:0x06=transdata
Hexdumpofblock:st=0,typ_found=1


3、分析异常表

analyzetablexxxxxmin.xxx_xx_xx_xxxxx_oldvalidatestructurecascade;
ERRORatline1:
ORA-01499:table/indexcrossreferencefailure-seetracefile–-根据文档1499.1查找trace文件

OERR: ORA-1499 table/Index Cross Reference Failure - see trace file [ID 1499.1]

ErrorORA-1499isproducedbystatement"ANALIZETABLE|CLUSTERVALIDATESTRUCTURECASCADE"toreportaninconsistencybetweenatableoraclusteranditsindexwhereanindexkeyvalueisnotfoundintheindexorviceversa.
Thecontentofthetracefilehas:
:tsn:rdba:
description:
"rownotfoundinindex"
"Table/Indexrowcountmismatch"
"rowmismatchinindexdba"
"Tablerowcount/Bitmapindexbitcountmismatch"
"kdavls:kdcchkreturns%dwhencheckingclusterdba0x%08lxobjn%d\n"
tsn:TablespaceNumberwheretheINDEXisstored.
rdba:RelativedatablockaddressoftheINDEXsegmentheader.

根据文档 1499.1查找trace文件未找到相应的报错。看来与文档描述的情况不同,需进一步分析。


4、根据ROWID分析

通过前面的分析知道ORA-600 [13013]该报错是由于表与索引之间的逻辑数据不一致导致。查询明确关联的索引:

selectowner,index_name,index_typefromdba_indexeswheretable_name='XXX_XX_XX_XXXXX_OLD'andowner='XXXXXMIN';

OWNERINDEX_NAMEINDEX_TYPE
----------------------------------------------------------------------
XXXXXMINPK_XXX_XX_XX_XXXXXnorMAL

--索引创建语句如下:

DBMS_MetaDATA.GET_DDL(UPPER('INDEX'),UPPER('PK_XXX_XX_XX_XXXXX'),UPPER('XXXXXMIN'))
-----------------------------------------------------------------------------------------------------------
CREATEUNIQUEINDEX"XXXXXMIN"."PK_XXX_XX_XX_XXXXX"ON"XXXXXMIN"."XXX_XX_XX_XXXXX_OLD"("AX_ID","BX_ID")

根据"XXXXXMIN"."PK_XXX_XX_XX_XXXXX"索引的创建语句,该索引为B树索引,它是基于二叉树的,由分支块和叶子块组成,包括每个索引列的值和行所对应的ROWID


通过下面的语句查询出全表扫描时和索引扫描时存在差异的行:

select/*+INDEX_FFS(tpk_xxx_xx_xx_xxx)*/rowid,2dbms_rowid.ROWID_RELATIVE_FNO(rowid)relative_fno,3dbms_rowid.ROWID_BLOCK_NUMBER(rowid)block
4fromXXXXXMIN.XXX_XX_XX_XXXXX_OLDtwhere(t.AX_IdisnotnullorBX_Idisnotnull)
5minus
6select/*+FULL(t1)*/rowid,7dbms_rowid.ROWID_RELATIVE_FNO(rowid)relative_fno,8dbms_rowid.ROWID_BLOCK_NUMBER(rowid)blockfromXXXXXMIN.XXX_XX_XX_XXXXX_OLDt1;

查询结果如下:

ROWIDRELATIVE_FNOBLOCK
----------------------------------------
AABOd5AAIAAAfQ1AAP8128053
AABOd5AAIAAAfQ1AAQ8128053
AABOd5AAIAAAfQ1AAR8128053
AABOd5AAIAAAfQ1AAY8128053
AABOd5AAIAAAfQ1AAZ8128053
AABOd5AAIAAAfQ1AAa8128053
AABOd5AAIAAAfQ1AAb8128053
AABOd5AAIAAAfQ1AAc8128053
AABOd5AAIAAAfQ1AAd8128053
AABOd5AAIAAAfQ1AAe8128053
AABOd5AAIAAAfQ1AAf8128053
AABOd5AAIAAAfQ1AAg8128053
AABOd5AAIAAAfQ1AAq8128053
AABOd5AAIAAAfQ1AAr8128053
AABOd5AAIAAAfQ1AAs8128053
15rowsselected.


5、验证该表全表扫描与索引扫描时存在差异行

根据下面语句找出差异的数据:

selecte.*,rowidfromXXXXXMIN.XXX_XX_XX_XXX_OLDewheree.rowid>(selectmin(x.rowid)fromXXXXXMIN.XXX_XX_XX_XXX_OLDxwherex.AX_ID=e.AX_IDandx.BX_ID=e.BX_ID);

AX_IDBX_IDCOMMITED_XXXXXUNCOMMITED_XXXXXUNSHARED_XXXXXUPDATED_TROWID
-------------------------------------------------------------------------------------------
**************************************************AABOd5AAIAAAzAPAAM
**************************************************AABOd5AAIAAAzAPAAN
**************************************************AABOd5AAIAAAzAPAAP
**************************************************AABOd5AAIAAAzAPAAL
**************************************************AABOd5AAIAAAzAPAAQ
**************************************************AABOd5AAIAABFRCACA
**************************************************AABOd5AAIAABFRCACl
**************************************************AABOd5AAIAABFRCACk
**************************************************AABOd5AAIAAAzAPAAB
**************************************************AABOd5AAIAAAzAPAAE
**************************************************AABOd5AAIAABFRCACC
**************************************************AABOd5AAIAABFRCACm
**************************************************AABOd5AAIAAAzAPAAD
**************************************************AABOd5AAIAABFRCACB
**************************************************AABOd5AAIAAAzAPAAO
15rowsselected.

取其中一条数据来验证走全表扫描和索引扫描时的差异

--sql执行计划通过索引扫描查询的数据

sql>altersessionsetstatistics_level=all;
Sessionaltered.
sql>selecte.*,rownum,rowidfromXXXXXMIN.XXX_XX_XX_XXX_OLDewheree.AX_ID=****ande.BX_ID=**;
AX_IDBX_IDCOMMITED_XXXXXUNCOMMITED_XXXXXUNSHARED_XXXXXUPDATED_TROWNUMROWID
-----------------------------------------------------------------------------------------------------
****************************AABOd5AAIAAAGcaABR
sql>select*fromtable(dbms_xplan.display_cursor(null,null,'allstatslast'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
sql_IDcy48jvzrnuv22,childnumber1
-------------------------------------
selecte.*,rowidfromXXXXXMIN.XXX_XX_XX_XXX_OLDewheree.AX_ID=****ande.BX_ID=**
Planhashvalue:1022151449
--------------------------------------------------------------------------------------------------------------
|Id|Operation|Name|Starts|E-Rows|A-Rows|A-Time|Buffers|
--------------------------------------------------------------------------------------------------------------
|0|SELECTSTATEMENT||1||1|00:00:00.01|3|
|1|COUNT||1||1|00:00:00.01|3|
|2|TABLEACCESSBYINDEXROWID|XXX_XX_XX_XXX_OLD|1|1|1|00:00:00.01|3|
|*3|INDEXUNIQUESCAN|PK_XXX_XX_XX_XXX|1|1|1|00:00:00.01|2|
--------------------------------------------------------------------------------------------------------------
Predicateinformation(identifiedbyoperationid):
---------------------------------------------------
3-access("E"."AX_ID"=****AND"E"."BX_ID"=**)
21rowsselected.

--sql执行计划通过全表查询的数据

sql>select/*+full(e)*/e.*,rowidfromXXXXXMIN.XXX_XX_XX_XXX_OLDewheree.AX_ID=****ande.BX_ID=**;
AX_IDBX_IDCOMMITED_XXXXXUNCOMMITED_XXXXXUNSHARED_XXXXXUPDATED_TROWNUMROWID
-----------------------------------------------------------------------------------------------------
****************************AABOd5AAIAABFRCACk
sql>select*fromtable(dbms_xplan.display_cursor(null,'allstatslast'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
sql_ID14vbv6bu472ty,childnumber1
-------------------------------------
select/*+full(e)*/e.*,rowidfromXXXXXMIN.XXX_XX_XX_XXX_OLDewheree.AX_ID=****ande.BX_ID=**

Planhashvalue:3364144674
----------------------------------------------------------------------------------------------------
|Id|Operation|Name|Starts|E-Rows|A-Rows|A-Time|Buffers|
----------------------------------------------------------------------------------------------------
|0|SELECTSTATEMENT||1||1|00:00:00.01|68|
|1|COUNT||1||1|00:00:00.01|68|
|*2|TABLEACCESSFULL|XXX_XX_XX_XXX_OLD|1|1|1|00:00:00.01|68|
----------------------------------------------------------------------------------------------------
Predicateinformation(identifiedbyoperationid):
---------------------------------------------------
2-filter(("E"."AX_ID"=****AND"E"."BX_ID"=**))
20rowsselected.

通过对比走全表扫描和索引扫描时存在差异

sql>selecte.*,rowidfromXXXXXMIN.XXX_XX_XX_XXX_OLDewheree.AX_ID=****ande.BX_ID=**;

AX_IDBX_IDCOMMITED_XXXXXUNCOMMITED_XXXXXUNSHARED_XXXXXUPDATED_TROWNUMROWID
-----------------------------------------------------------------------------------------------------
****************************AABOd5AAIAAAGcaABR

sql>select/*+full(e)*/e.*,rowidfromXXXXXMIN.XXX_XX_XX_XXX_OLDewheree.AX_ID=****ande.BX_ID=**;

AX_IDBX_IDCOMMITED_XXXXXUNCOMMITED_XXXXXUNSHARED_XXXXXUPDATED_TROWNUMROWID
-----------------------------------------------------------------------------------------------------
****************************AABOd5AAIAABFRCACk

小结:在执行delete from "XXXXXMIN"."XXX_XX_XX_XXXXX_OLD" where "AX_ID" = :1时,该sql的执行计划是走索引扫描,因为表与索引之间的逻辑数据不一致(索引列的值和行所对应的ROWID组成的索引数据与表数据不一致),在变量“:1”的值刚好是异常的值时,导致了ORA 600 [13011]的报错。


三、解决方

1、重建XXXXXMIN.PK_XXX_XX_XX_XXXXX索引

XXXXXMIN.PK_XXX_XX_XX_XXXXX"AX_ID","BX_ID"列的联合主键索引并且AX_ID列作为"XXXXXMIN"."XXX_VX" ("ID")的关联外键,BX_ID列作为 "XXXXXMIN"."XXX_dataxXXXX" ("ID")的关联外键。所以为避免对业务造成影响,使用ONLINE在线重建的方式重建XXXXXMIN.PK_XXX_XX_XX_XXXXX索引。


ALTERINDEXXXXXXMIN.PK_XXX_XX_XX_XXXREBUILDONLINE;


2、验证重建索引后的数据

select/*+INDEX_FFS(tpk_xxx_xx_xx_xxx)*/rowid,8dbms_rowid.ROWID_BLOCK_NUMBER(rowid)blockfromXXXXXMIN.XXX_XX_XX_XXXXX_OLDt1;

norowsselected

相关文章

Java Oracle 结果集是Java语言中处理数据库查询结果的一种方...
Java AES和Oracle AES是现代加密技术中最常使用的两种AES加密...
Java是一种广泛应用的编程语言,具备可靠性、安全性、跨平台...
随着移动互联网的发展,抽奖活动成为了营销活动中不可或缺的...
Java和Oracle都是在计算机领域应用非常广泛的技术,他们经常...
Java 是一门非常流行的编程语言,它可以运行于各种操作系统上...