本站文章除注明转载外,均为本站原创: 转载自love wife & love life —Roger 的Oracle技术博客
朋友反馈其客户的一个库系统损坏,导致oracle 崩溃,最后通过安全模式将数据文件拷贝出来,发现无法启动,非归档环境,
而且只有dmp 备份,之前他们通过dmp 备份进行了恢复,但是发现部分dmp 可能存在问题,导致部分表无法恢复,又尝试使用ODU进行数据文件的抽取,也发现部分表无法抽取(可能是system损坏较为严重,dbv检测有1000多个坏块)。
如下是尝试open时的alert log信息:
Wed Jun 17 17:31:00 2015 Database Characterset is ZHS16GBK Wed Jun 17 17:31:00 2015 Hex dump of (file 1, block 53319) in trace file /spacedb/oracle/app/admin/workflow/bdump/workflow_smon_10972.trc Corrupt block relative dba: 0x0040d047 (file 1, block 53319) Fractured block found during buffer read Data in bad block: type: 6 format: 2 rdba: 0x0040d047 last change scn: 0x0000.0007b0ab seq: 0x1 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x00000000 check value in block header: 0x89cd computed block checksum: 0xd2d1 Reread of rdba: 0x0040d047 (file 1, block 53319) found same corrupted data Wed Jun 17 17:31:00 2015 Errors in file /spacedb/oracle/app/admin/workflow/bdump/workflow_smon_10972.trc: ORA-00604: error occurred at recursive SQL level 1 ORA-08103: object no longer exists Hex dump of (file 1, block 53319) in trace file /spacedb/oracle/app/admin/workflow/bdump/workflow_smon_10972.trc Corrupt block relative dba: 0x0040d047 (file 1, block 53319) Fractured block found during buffer read Data in bad block: type: 6 format: 2 rdba: 0x0040d047 last change scn: 0x0000.0007b0ab seq: 0x1 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x00000000 check value in block header: 0x89cd computed block checksum: 0xd2d1 Reread of rdba: 0x0040d047 (file 1, block 53319) found same corrupted data Wed Jun 17 17:31:00 2015 Errors in file /spacedb/oracle/app/admin/workflow/bdump/workflow_smon_10972.trc: ORA-00604: error occurred at recursive SQL level 1 ORA-08103: object no longer exists Hex dump of (file 1, block 53319) in trace file /spacedb/oracle/app/admin/workflow/bdump/workflow_smon_10972.trc Corrupt block relative dba: 0x0040d047 (file 1, block 53319) Fractured block found during buffer read Data in bad block: type: 6 format: 2 rdba: 0x0040d047 last change scn: 0x0000.0007b0ab seq: 0x1 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x00000000 check value in block header: 0x89cd computed block checksum: 0xd2d1 Reread of rdba: 0x0040d047 (file 1, block 53319) found same corrupted data Wed Jun 17 17:31:00 2015 Errors in file /spacedb/oracle/app/admin/workflow/bdump/workflow_smon_10972.trc: ORA-00604: error occurred at recursive SQL level 1 ORA-08103: object no longer exists Wed Jun 17 17:31:00 2015 Opening with internal Resource Manager plan where NUMA PG = 1, CPUs = 8 Wed Jun 17 17:31:00 2015 Errors in file /spacedb/oracle/app/admin/workflow/udump/workflow_ora_11168.trc: ORA-00604: error occurred at recursive SQL level 1 ORA-08103: object no longer exists Error 604 happened during db open, shutting down database USER: terminating instance due to error 604 Instance terminated by USER, pid = 11168 ORA-1092 signalled during: ALTER DATABASE OPEN...
很明显,Oracle 在执行递归SQL的适合报错了,而且遇到了坏块。通过dbv检测,我发现存在大量的坏块,而且部分块还是连续损坏,极有可能是某个extent都损坏了。如下是dbv的检测结果:
---dbv system01.dbf [oracle@zxzx workflow]$ dbv file=system01.dbf blocksize=8192 logfile=check_system.log DBVERIFY: Release 10.2.0.4.0 - Production on Wed Jun 17 23:59:40 2015 Copyright (c) 1982, 2007, Oracle. All rights reserved. ...... ...... DBVERIFY - Verification complete Total Pages Examined : 72960 Total Pages Processed (Data) : 44288 Total Pages Failing (Data) : 0 Total Pages Processed (Index): 9414 Total Pages Failing (Index): 0 Total Pages Processed (Other): 1837 Total Pages Processed (Seg) : 0 Total Pages Failing (Seg) : 0 Total Pages Empty : 15709 Total Pages Marked Corrupt : 1712 Total Pages Influx : 380 Highest block SCN : 105111246 (0.105111246) [oracle@zxzx workflow]$ [oracle@zxzx workflow]$ cat check_system.log |grep 533 Page 50533 is marked corrupt Corrupt block relative dba: 0x0040c565 (file 1, block 50533) Page 53319 is influx - most likely media corrupt Corrupt block relative dba: 0x0040d047 (file 1, block 53319) Page 53320 is marked corrupt Corrupt block relative dba: 0x0040d048 (file 1, block 53320) Page 53321 is marked corrupt Corrupt block relative dba: 0x0040d049 (file 1, block 53321) Page 53322 is marked corrupt Corrupt block relative dba: 0x0040d04a (file 1, block 53322) Page 53323 is marked corrupt Corrupt block relative dba: 0x0040d04b (file 1, block 53323) Page 53324 is marked corrupt Corrupt block relative dba: 0x0040d04c (file 1, block 53324) Page 53325 is marked corrupt Corrupt block relative dba: 0x0040d04d (file 1, block 53325) Page 53326 is marked corrupt Corrupt block relative dba: 0x0040d04e (file 1, block 53326) Page 53327 is influx - most likely media corrupt Corrupt block relative dba: 0x0040d04f (file 1, block 53327) Page 53383 is influx - most likely media corrupt Corrupt block relative dba: 0x0040d087 (file 1, block 53383) Page 53384 is marked corrupt Corrupt block relative dba: 0x0040d088 (file 1, block 53384) Page 53385 is marked corrupt Corrupt block relative dba: 0x0040d089 (file 1, block 53385) Page 53386 is marked corrupt Corrupt block relative dba: 0x0040d08a (file 1, block 53386) Page 53387 is marked corrupt Corrupt block relative dba: 0x0040d08b (file 1, block 53387) Page 53388 is marked corrupt Corrupt block relative dba: 0x0040d08c (file 1, block 53388) Page 53389 is marked corrupt Corrupt block relative dba: 0x0040d08d (file 1, block 53389) Page 53390 is marked corrupt Corrupt block relative dba: 0x0040d08e (file 1, block 53390) Page 53391 is influx - most likely media corrupt Corrupt block relative dba: 0x0040d08f (file 1, block 53391) Page 55330 is marked corrupt Corrupt block relative dba: 0x0040d822 (file 1, block 55330) Page 55331 is marked corrupt Corrupt block relative dba: 0x0040d823 (file 1, block 55331) Page 55332 is marked corrupt Corrupt block relative dba: 0x0040d824 (file 1, block 55332) Page 55333 is marked corrupt Corrupt block relative dba: 0x0040d825 (file 1, block 55333) Page 55334 is marked corrupt Corrupt block relative dba: 0x0040d826 (file 1, block 55334) Page 55335 is influx - most likely media corrupt Corrupt block relative dba: 0x0040d827 (file 1, block 55335) Page 55533 is marked corrupt Corrupt block relative dba: 0x0040d8ed (file 1, block 55533) Page 60533 is marked corrupt Corrupt block relative dba: 0x0040ec75 (file 1, block 60533) [oracle@zxzx workflow]$ cat check_system.log |grep 538 Page 53839 is influx - most likely media corrupt Corrupt block relative dba: 0x0040d24f (file 1, block 53839) Page 53840 is marked corrupt Corrupt block relative dba: 0x0040d250 (file 1, block 53840) Page 53841 is marked corrupt Corrupt block relative dba: 0x0040d251 (file 1, block 53841) Page 53842 is marked corrupt Corrupt block relative dba: 0x0040d252 (file 1, block 53842) Page 53843 is marked corrupt Corrupt block relative dba: 0x0040d253 (file 1, block 53843) Page 53844 is marked corrupt Corrupt block relative dba: 0x0040d254 (file 1, block 53844) Page 53845 is marked corrupt Corrupt block relative dba: 0x0040d255 (file 1, block 53845) Page 53846 is marked corrupt Corrupt block relative dba: 0x0040d256 (file 1, block 53846) Page 53847 is influx - most likely media corrupt Corrupt block relative dba: 0x0040d257 (file 1, block 53847) Page 57538 is marked corrupt Corrupt block relative dba: 0x0040e0c2 (file 1, block 57538) [oracle@zxzx workflow]$
我们看到,部分坏块是连续的,这种情况处理就比较麻烦了。开始我尝试通过一些其他手段想把库先拉起来,发现不行,如下是10046 trace的内容:
===================== PARSING IN CURSOR #3 len=169 dep=1 uid=0 oct=3 lid=0 tim=1400932967838961 hv=1173719687 ad='ddb76e68' select col#, grantee#, privilege#,max(mod(nvl(option$,0),2)) from objauth$ where obj#=:1 and col# is not null group by privilege#, col#, grantee# order by col#, grantee# END OF STMT EXEC #3:c=0,e=67,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1400932967838959 FETCH #3:c=0,e=18,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=4,tim=1400932967839016 STAT #3 id=1 cnt=0 pid=0 pos=1 obj=0 op='SORT GROUP BY (cr=6 pr=0 pw=0 time=81 us)' STAT #3 id=2 cnt=0 pid=1 pos=1 obj=57 op='TABLE ACCESS BY INDEX ROWID OBJAUTH$ (cr=6 pr=0 pw=0 time=52 us)' STAT #3 id=3 cnt=0 pid=2 pos=1 obj=103 op='INDEX RANGE SCAN I_OBJAUTH1 (cr=6 pr=0 pw=0 time=49 us)' BINDS #7: kkscoacd Bind#0 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0 kxsbbbfp=7f5b4afb40c0 bln=22 avl=02 flg=05 value=16 ===================== PARSING IN CURSOR #7 len=151 dep=1 uid=0 oct=3 lid=0 tim=1400932967839161 hv=4139184264 ad='deedc608' select grantee#,privilege#,nvl(col#,0),max(mod(nvl(option$,0),2))from objauth$ where obj#=:1 group by grantee#,privilege#,nvl(col#,0) order by grantee# END OF STMT EXEC #7:c=0,e=61,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1400932967839158 FETCH #7:c=0,e=29,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=4,tim=1400932967839225 FETCH #7:c=0,e=3,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1400932967839253 STAT #7 id=1 cnt=2 pid=0 pos=1 obj=0 op='SORT GROUP BY (cr=8 pr=1 pw=0 time=160 us)' STAT #7 id=2 cnt=2 pid=1 pos=1 obj=57 op='TABLE ACCESS BY INDEX ROWID OBJAUTH$ (cr=8 pr=1 pw=0 time=113 us)' STAT #7 id=3 cnt=2 pid=2 pos=1 obj=103 op='INDEX RANGE SCAN I_OBJAUTH1 (cr=6 pr=0 pw=0 time=43 us)' BINDS #4: kkscoacd Bind#0 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0 kxsbbbfp=7f5b4afb6a80 bln=22 avl=02 flg=05 value=18 ===================== PARSING IN CURSOR #4 len=169 dep=1 uid=0 oct=3 lid=0 tim=1400932967839460 hv=1173719687 ad='ddb76e68' select col#, grantee#, privilege#,max(mod(nvl(option$,0),2)) from objauth$ where obj#=:1 and col# is not null group by privilege#, col#, grantee# order by col#, grantee# END OF STMT EXEC #4:c=0,e=68,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1400932967839458 FETCH #4:c=0,e=21,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=4,tim=1400932967839518 STAT #4 id=1 cnt=0 pid=0 pos=1 obj=0 op='SORT GROUP BY (cr=8 pr=0 pw=0 time=110 us)' STAT #4 id=2 cnt=0 pid=1 pos=1 obj=57 op='TABLE ACCESS BY INDEX ROWID OBJAUTH$ (cr=8 pr=0 pw=0 time=72 us)' STAT #4 id=3 cnt=0 pid=2 pos=1 obj=103 op='INDEX RANGE SCAN I_OBJAUTH1 (cr=8 pr=0 pw=0 time=67 us)' BINDS #3: kkscoacd Bind#0 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0 kxsbbbfp=7f5b4afb40c0 bln=22 avl=02 flg=05 value=18 ===================== PARSING IN CURSOR #3 len=151 dep=1 uid=0 oct=3 lid=0 tim=1400932967839663 hv=4139184264 ad='deedc608' select grantee#,privilege#,nvl(col#,0),max(mod(nvl(option$,0),2))from objauth$ where obj#=:1 group by grantee#,privilege#,nvl(col#,0) order by grantee# END OF STMT EXEC #3:c=0,e=62,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1400932967839660 WAIT #3: nam='db file sequential read' ela= 20 file#=1 block#=53841 blocks=1 obj#=-1 tim=1400932967839771 FETCH #3:c=0,e=93,p=1,cr=3,cu=0,mis=0,r=0,dep=1,og=4,tim=1400932967839800 ORA-00604: error occurred at recursive SQL level 1 ORA-08103: object no longer exists EXEC #1:c=471928,e=1720329,p=273,cr=3927,cu=157,mis=0,r=0,dep=0,og=1,tim=1400932968820893 ERROR #1:err=1092 tim=438031590
我们可以看到,在访问某个block的适合出问题了,而且该block后面连续几个都是损坏的。我尝试bbed copy修改了几个block都不行。最后发现其实这个递归SQL,可以想办法绕过去的,及通过修改Oracle 二进制文件的方法,可惜的是客户
的机器上已经跑了一个库了,无法停,因此这种方法也就作罢;当然或许还能通过gdb来实现。总的来讲比较麻烦。
考虑到他们本身具有dmp 备份,因此直接dul 抽取dmp 即可,把需要的表弄出来就完了。
如下是dul 抽取dump的步骤:
1. scan dump file: scan dump file spacedb/orabak/wxzx20150604.dmp; 2. cat dul.log|grep YZ_CAR_APPLY 3. 抽取需要的表数据(获得表的偏移量位置) unexp TABLE "YZ_CAR_APPLY" ("ID" NUMBER(20, 0) NOT NULL ENABLE, "APPLY_USERID" VARCHAR2(50), "APPLY_USERNAME" VARCHAR2(50), "DEPT_ID" VARCHAR2(50), "DEPT_NAME" VARCHAR2(50), "CAR_TYPE" VARCHAR2(50), "CAR_CODE" VARCHAR2(50), "RENSHU" NUMBER(3, 0), "SHIYOU" VARCHAR2(1000), "LEAVE_TIME" VARCHAR2(50), "LEAVE_ADDRESS" VARCHAR2(400), "BACK_TIME" VARCHAR2(50), "BACK_ADDRESS" VARCHAR2(400), "LINKMAN" VARCHAR2(50), "LINK_PHONE" VARCHAR2(40), "SUBFLAG" VARCHAR2(50), "USER_LIST" VARCHAR2(2000), "CAR_ID" VARCHAR2(50), "CRE_USERID" VARCHAR2(50), "CRE_DATE" VARCHAR2(50), "ORG_ID" VARCHAR2(50), "FLOW_TYPE" VARCHAR2(50), "FILE_TYPE" VARCHAR2(50), "TITLE" VARCHAR2(200), "FLOWCOURSE" VARCHAR2(4000), "SECRETARIAL_SIGN" VARCHAR2(4000), "SECRETARIAL_IDEA" VARCHAR2(4000), "OFFICELEADER_SIGN" VARCHAR2(4000), "OFFICELEADER_IDEA" VARCHAR2(4000), "SLEADER_SIGN" VARCHAR2(4000), "SLEADER_IDEA" VARCHAR2(4000), "OTHER_NOTION" VARCHAR2(4000), "DRIVER" VARCHAR2(50)) dump file /spacedb/orabak/wxzx20150604.dmp from 23934970472;
Related posts: