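Contact: mobile/WeChat (+86 17813235971), QQ (107644445)
Title: Reconstructing and reviewing a customer's self-directed database recovery from the alert log
Author: 惜分飞 © All rights reserved [No reproduction in any form without my consent; otherwise I reserve the right to pursue legal liability.]
A 12.1.0.2 database could not start normally after an unexpected power failure. This post uses the alert log to analyze the customer's entire sequence of operations (my own part of the recovery is not covered).
According to the alert log, the initial failure was caused by corrupt blocks in the control file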
Tue Sep 24 11:49:48 2024
alter database open
Tue Sep 24 11:49:48 2024
Ping without log force is disabled
.
Tue Sep 24 11:49:48 2024
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4715.trc:
ORA-01113: file 10 needs media recovery
ORA-01110: data file 10: '/u01/app/oracle/oradata/xifenfei.dbf'
ORA-1113 signalled during: alter database open...
alter database recover datafile '/u01/app/oracle/oradata/xifenfei.dbf'
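When an alert log excerpt is long, it can help to split it into timestamped entries before reading it. The following is a minimal Python sketch, assuming the timestamp header format seen in this log (`Tue Sep 24 11:49:48 2024`). `TS_RE` and `split_entries` are hypothetical names introduced here for illustration; they are not part of any Oracle tool.

```python
import re
from datetime import datetime

# Timestamp header as it appears in this alert log, e.g. "Tue Sep 24 11:49:48 2024".
# Assumption: English day/month abbreviations (default C locale).
TS_RE = re.compile(
    r"[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}"
)


def split_entries(text):
    """Return a list of (timestamp, message) pairs from alert-log text."""
    entries = []
    matches = list(TS_RE.finditer(text))
    for i, m in enumerate(matches):
        # The message runs from the end of this header to the next header.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        message = text[m.end():end].strip()
        stamp = datetime.strptime(m.group(0), "%a %b %d %H:%M:%S %Y")
        entries.append((stamp, message))
    return entries


sample = (
    "Tue Sep 24 11:49:48 2024 alter database open "
    "Tue Sep 24 11:49:48 2024 Ping without log force is disabled . "
    "Tue Sep 24 11:49:48 2024 Errors in file "
    "/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4715.trc: "
    "ORA-01113: file 10 needs media recovery"
)

for stamp, message in split_entries(sample):
    print(stamp.isoformat(), "|", message)
```

Text that appears before the first timestamp header is ignored by this sketch, so it is best fed an excerpt that starts at a header.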
The data files that could not be recovered normally were taken offline
Tue Sep 24 13:13:30 2024
Media Recovery Complete (orcl)
Completed: ALTER DATABASE RECOVER datafile 15
ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xifenfei.dbf' END BACKUP
ORA-1235 signalled during: ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xifenfei.dbf' END BACKUP...
ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xifenfei.dbf' offline
Completed: ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xifenfei.dbf' offline
Tue Sep 24 13:25:16 2024
ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xff.dbf' offline
Completed: ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xff.dbf' offline
The customer then tried to open the database, hit ORA-600 4193, and the open failed
Tue Sep 24 13:27:06 2024
Media Recovery Complete (orcl)
Completed: ALTER DATABASE RECOVER datafile 13
alter database open
Tue Sep 24 13:27:16 2024
Ping without log force is disabled
.
Tue Sep 24 13:27:16 2024
Beginning crash recovery of 1 threads
parallel recovery started with 7 processes
Tue Sep 24 13:27:16 2024
Started redo scan
Tue Sep 24 13:27:16 2024
Completed redo scan
read 67 KB redo, 0 data blocks need recovery
Tue Sep 24 13:27:16 2024
Started redo application at
Thread 1: logseq 7422, block 2, scn 119284797
Tue Sep 24 13:27:16 2024
Recovery of Online Redo Log: Thread 1 Group 3 Seq 7422 Reading mem 0
Mem# 0: /u01/app/oracle/oradata/orcl/redo03.log
Tue Sep 24 13:27:16 2024
Completed redo application of 0.00MB
Tue Sep 24 13:27:16 2024
Completed crash recovery at
Thread 1: logseq 7422, block 136, scn 119284798
0 data blocks read, 0 data blocks written, 67 redo k-bytes read
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Starting background process TMON
Tue Sep 24 13:27:16 2024
TMON started with pid=32, OS id=10617
Tue Sep 24 13:27:16 2024
Thread 1 advanced to log sequence 7423 (thread open)
Thread 1 opened at log sequence 7423
Current log# 1 seq# 7423 mem# 0: /u01/app/oracle/oradata/orcl/redo01.log
Successful open of redo thread 1
Tue Sep 24 13:27:16 2024
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Sep 24 13:27:16 2024
SMON: enabling cache recovery
Tue Sep 24 13:27:20 2024
[10553] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:6974064 end:6975474 diff:1410 ms (1.4 seconds)
Dictionary check beginning
Tablespace 'TEMP' #3 found in data dictionary, but not in the controlfile. Adding to controlfile.
File #10 is offline, but is part of an online tablespace.
data file 10: '/u01/app/oracle/oradata/tbs_data.dbf'
File #14 is offline, but is part of an online tablespace.
data file 14: '/u01/app/oracle/oradata/corsmf03.dbf'
Dictionary check complete
Verifying minimum file header compatibility (11g) for tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
Tue Sep 24 13:27:21 2024
SMON: enabling tx recovery
Tue Sep 24 13:27:21 2024
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
This condition can occur when a backup controlfile has been restored.
It may be necessary to add files to these tablespaces.
That can be done using the SQL statement:
ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
Alternatively, if these temporary tablespaces are no longer needed, then they can be dropped.
Empty temporary tablespace: TEMP
*********************************************************************
Updating character set in controlfile to AL32UTF8
Starting background process SMCO
Tue Sep 24 13:27:21 2024
SMCO started with pid=34, OS id=10632
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_10523.trc (incident=108129):
ORA-00600: internal error code, arguments: [4193], [21368], [21372], [], [], [], [], [], [], [], [], []
Incident details in:/u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_108129/orcl_smon_10523_i108129.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
…………
Tue Sep 24 13:27:24 2024
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_10553.trc:
ORA-00600: internal error code, arguments: [4193], [21652], [21539], [], []
Tue Sep 24 13:27:24 2024
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_10553.trc:
ORA-00600: internal error code, arguments: [4193], [21652], [21539], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 10553): terminating the instance due to error 600
Tue Sep 24 13:27:25 2024
Instance terminated by USER, pid = 10553
ORA-1092 signalled during: alter database open...
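The ORA-600 arguments are what identify the failure, so it can be handy to pull them out of the log text mechanically. Below is a minimal Python sketch that does this with a regular expression. `ORA600_RE` and `ora600_args` are hypothetical names introduced here. The reading of the two values after `4193` as an undo record sequence number and a redo record sequence number follows the commonly cited description of this error and should be treated as an assumption, not something the log itself states.

```python
import re

# Matches "ORA-00600: internal error code, arguments: [a], [b], ..."
# and captures the whole bracketed argument list.
ORA600_RE = re.compile(
    r"ORA-00600: internal error code, arguments: ((?:\[[^\]]*\](?:, )?)+)"
)


def ora600_args(text):
    """Return the non-empty arguments of every ORA-00600 found in text."""
    found = []
    for m in ORA600_RE.finditer(text):
        args = re.findall(r"\[([^\]]*)\]", m.group(1))
        found.append([a for a in args if a])  # drop the empty [] slots
    return found


sample = (
    "ORA-00600: internal error code, arguments: [4193], [21368], [21372], "
    "[], [], [], [], [], [], [], [], [] "
    "ORA-00600: internal error code, arguments: [4193], [21652], [21539], [], []"
)

for args in ora600_args(sample):
    if args[0] == "4193":
        # Assumption: [4193] is followed by undo seq# and redo seq#.
        undo_seq, redo_seq = int(args[1]), int(args[2])
        print(f"ORA-600 [4193]: undo seq {undo_seq} vs redo seq {redo_seq}")
```

Under that assumption, the mismatch between the two numbers in each pair is what the error is reporting.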
The customer recreated the control file, set the hidden parameter _allow_resetlogs_corruption, and tried to open the database with resetlogs, which failed with ORA-600 2662
Tue Sep 24 14:30:22 2024
alter database open RESETLOGS
Tue Sep 24 14:32:09 2024
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 119237645 time
Online log /u01/app/oracle/oradata/orcl/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/app/oracle/oradata/orcl/redo02.log: Thread 1 Group 2 was previously cleared
Online log /u01/app/oracle/oradata/orcl/redo03.log: Thread 1 Group 3 was previously cleared
Tue Sep 24 14:32:09 2024
Setting recovery target incarnation to 2
Tue Sep 24 14:32:09 2024
Ping without log force is disabled
.
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Tue Sep 24 14:32:09 2024
Warning - High Database SCN: Current SCN value is 119237648, threshold SCN value is 0
If you have not previously reported this warning on this database,
please notify Oracle Support so that additional diagnosis can be performed.
Starting background process TMON
Tue Sep 24 14:32:09 2024
TMON started with pid=25, OS id=15032
Tue Sep 24 14:32:09 2024
Assigning activation ID 1708301307 (0x65d29bfb)
Thread 1 opened at log sequence 1
Current log# 1 seq# 1 mem# 0: /u01/app/oracle/oradata/orcl/redo01.log
Successful open of redo thread 1
Tue Sep 24 14:32:09 2024
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Sep 24 14:32:09 2024
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_14937.trc (incident=122458):
ORA-00600: internal error code, arguments: [2662], [0], [119484861], [0], [119484868], [16777344]……
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_122458/orcl_ora_14937_i122458.trc
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_14937.trc (incident=122459):
………………
Tue Sep 24 14:32:16 2024
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_124802/orcl_ora_14937_i124802.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [0], [119484866], [0], [119484868], [16777344]……
ORA-00600: internal error code, arguments: [2662], [0], [119484865], [0], [119484868], [16777344]……
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [119484861], [0], [119484868], [16777344]……
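The arguments of ORA-600 2662 show how far the database SCN lags behind the SCN stored in a block. The sketch below decodes the first error in this log. It assumes the commonly documented argument layout (current SCN wrap, current SCN base, dependent SCN wrap, dependent SCN base, then the relative data block address), and it assumes the usual smallfile RDBA layout with the file number in the top 10 bits and the block number in the low 22 bits. `decode_2662` is a hypothetical helper name introduced here.

```python
def decode_2662(args):
    """Decode the five arguments that follow [2662] in an ORA-600 message.

    Assumed layout: [cur_wrap], [cur_base], [dep_wrap], [dep_base], [rdba].
    """
    cur_wrap, cur_base, dep_wrap, dep_base, rdba = (int(a) for a in args[:5])
    current_scn = (cur_wrap << 32) | cur_base      # SCN = wrap * 2^32 + base
    dependent_scn = (dep_wrap << 32) | dep_base
    return {
        "current_scn": current_scn,
        "dependent_scn": dependent_scn,
        "gap": dependent_scn - current_scn,        # how far the database SCN lags
        "rel_file": rdba >> 22,                    # top 10 bits: relative file#
        "block": rdba & 0x3FFFFF,                  # low 22 bits: block#
    }


# Arguments taken from the first ORA-600 [2662] in the log above.
info = decode_2662(["0", "119484861", "0", "119484868", "16777344"])
print(info)
# gap is 7, rel_file is 4, block is 128
```

Under these assumptions the database SCN is only 7 behind the block SCN. The gap shrinks across the successive errors in the log (119484861, 119484865, 119484866 against 119484868), which is consistent with the SCN advancing slightly on each attempt without catching up before the instance terminates.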
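The customer's own recovery attempt ended here, without success. The customer made no fundamental mistake along the way (such as damaging the resetlogs information in the data files), but also did not resolve either of the two ORA-600 errors
1. With some files offline, the customer opened the database without resetlogs, which avoided further damage to the resetlogs information of the offline files. However, the open failed with ORA-600 4193
2. Before forcing the database open, the customer recreated the control file. This avoided the risk that, after resetlogs, the resetlogs information in the headers of the offline data files would no longer match that of the other files (recreating the control file automatically brings the offline files back online)
3. The ORA-600 4193 raised when opening the database after the data files were first taken offline was never resolved. This error is usually caused by an undo problem, and it will most likely come up again during the later forced open
4. The forced open hit ORA-600 2662, which requires adjusting the SCN. Until this is resolved, the database cannot be opened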