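Contact: Mobile/WeChat (+86 17813235971) QQ (107644445)
Title: Using Flashback to Quickly Restore a Failed-Over Standby Database
Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal action.]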
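The customer's database architecture is a single instance plus Data Guard: the production database runs on a physical machine, and the standby runs in a virtualized environment (on mechanical disks, for cost reasons at the time). Today the physical machine suddenly went down, and the customer requested an emergency failover to the standby. The standby's alert log for the failover: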
Thu Aug 08 09:52:13 2024
Media Recovery Waiting for thread 1 sequence 189448 (in transit)
Recovery of Online Redo Log: Thread 1 Group 12 Seq 189448 Reading mem 0
Mem# 0: /oradata/xff/std_redo12.log
Thu Aug 08 09:52:13 2024
Archived Log entry 187514 added for thread 1 sequence 189447 ID 0x2e6bc37f dest 1:
Thu Aug 08 10:54:40 2024
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH force
Terminal Recovery: Stopping real time apply
Thu Aug 08 10:54:40 2024
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /u01/app/oracle/diag/rdbms/xffdg/xff/trace/xff_pr00_17876.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 34188310512
Thu Aug 08 10:54:43 2024
MRP0: Background Media Recovery process shutdown (xff)
Terminal Recovery: Stopped real time apply
Thu Aug 08 10:55:14 2024
Stopping background process MMNL
Stopping background process MMON
Thu Aug 08 10:55:46 2024
Background process MMON not dead after 30 seconds
Killing background process MMON
All dispatchers and shared servers shutdown
CLOSE: killing server sessions.
Active process 17691 user 'oracle' program 'oracle@xffDG (MMON)'
Active process 15077 user 'oracle' program 'oracle@xffDG'
Active process 17691 user 'oracle' program 'oracle@xffDG (MMON)'
Active process 11536 user 'oracle' program 'oracle@xffDG (M000)'
Active process 17691 user 'oracle' program 'oracle@xffDG (MMON)'
Active process 15077 user 'oracle' program 'oracle@xffDG'
Active process 11536 user 'oracle' program 'oracle@xffDG (M000)'
Active process 11536 user 'oracle' program 'oracle@xffDG (M000)'
Active process 11536 user 'oracle' program 'oracle@xffDG (M000)'
CLOSE: all sessions shutdown successfully.
Thu Aug 08 10:56:11 2024
SMON: disabling cache recovery
Attempt to do a Terminal Recovery (xff)
Media Recovery Start: Managed Standby Recovery (xff)
started logmerger process
Thu Aug 08 10:56:13 2024
Managed Standby Recovery not using Real Time Apply
Parallel Media Recovery started with 4 slaves
Media Recovery Waiting for thread 1 sequence 189448 (in transit)
Killing 4 processes with pids 17733,17729,17731,32533 (all RFS, wait for I/O) in order to disallow current and future RFS connections. Requested by OS process 15184
Thu Aug 08 10:56:16 2024
idle dispatcher 'D000' terminated, pid = (16, 1)
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Terminal Recovery timestamp is '08/08/2024 10:56:17'
Terminal Recovery: applying standby redo logs.
Terminal Recovery: thread 1 seq# 189448 redo required
Terminal Recovery:
Recovery of Online Redo Log: Thread 1 Group 12 Seq 189448 Reading mem 0
Mem# 0: /oradata/xff/std_redo12.log
Identified End-Of-Redo (failover) for thread 1 sequence 189448 at SCN 0xffff.ffffffff
Incomplete Recovery applied until change 34188310513 time 08/08/2024 11:32:41
Thu Aug 08 10:56:18 2024
Media Recovery Complete (xff)
Terminal Recovery: successful completion
Thu Aug 08 10:56:18 2024
ARCH: Archival stopped, error occurred. Will continue retrying
Forcing ARSCN to IRSCN for TR 7:4123539441
ORACLE Instance xff - Archival Error
Attempt to set limbo arscn 7:4123539441 irscn 7:4123539441
Resetting standby activation ID 778814335 (0x2e6bc37f)
ORA-16014: log 12 sequence# 189448 not archived, no available destinations
ORA-00312: online log 12 thread 1: '/oradata/xff/std_redo12.log'
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH force
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
ORA-16136 signalled during: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL...
Thu Aug 08 10:56:28 2024
ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE
ALTER DATABASE ACTIVATE [PHYSICAL] STANDBY DATABASE (xff)
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Thu Aug 08 10:56:28 2024
Archiver process freed from errors. No longer stopped
Standby terminal recovery start SCN: 34188310512
RESETLOGS after incomplete recovery UNTIL CHANGE 34188310513
Online log /oradata/xff/redo01.log: Thread 1 Group 1 was previously cleared
Online log /oradata/xff/redo02.log: Thread 1 Group 2 was previously cleared
Online log /oradata/xff/redo03.log: Thread 1 Group 3 was previously cleared
Online log /oradata/xff/redo04.log: Thread 1 Group 4 was previously cleared
Standby became primary SCN: 34188310511
Thu Aug 08 10:56:29 2024
Setting recovery target incarnation to 3
ACTIVATE STANDBY: Complete - Database mounted as primary
Completed: ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE
ARC1: Becoming the 'no SRL' ARCH
alter database open
Thu Aug 08 10:56:34 2024
Assigning activation ID 832379854 (0x319d1bce)
Thread 1 advanced to log sequence 2 (thread open)
Thread 1 opened at log sequence 2
Current log# 2 seq# 2 mem# 0: /oradata/xff/redo02.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Aug 08 10:56:34 2024
SMON: enabling cache recovery
Thu Aug 08 10:56:34 2024
ARC0: LGWR is scheduled to archive destination LOG_ARCHIVE_DEST_2 after log switch
Thu Aug 08 10:56:34 2024
NSA2 started with pid=14, OS id=15198
[15133] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:1087824580 end:1087828220 diff:3640 (36 seconds)
Dictionary check beginning
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Thu Aug 08 10:56:38 2024
Database Characterset is ZHS16GBK
Starting background process SMCO
Thu Aug 08 10:56:39 2024
SMCO started with pid=15, OS id=15200
Thread 1 advanced to log sequence 3 (LGWR switch)
Current log# 3 seq# 3 mem# 0: /oradata/xff/redo03.log
******************************************************************
LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2
******************************************************************
Thu Aug 08 10:56:40 2024
Archived Log entry 187515 added for thread 1 sequence 2 ID 0x319d1bce dest 1:
Starting background process QMNC
Thu Aug 08 10:56:43 2024
QMNC started with pid=17, OS id=15204
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: alter database open
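Unfortunately, the virtual machine's I/O performance was too poor for it to take over the workload. The hardware engineers urgently repaired the physical machine, the database on it started normally, and the customer switched the business straight back to the physical machine. The Data Guard environment now has to be restored (the customer has also migrated the virtual machine to SSD storage). First, restart the database on the virtual machine to the mount state: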
[oracle@xffDG ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Aug 8 20:06:30 2024
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 2.5655E+10 bytes
Fixed Size                  2265224 bytes
Variable Size            3892318072 bytes
Database Buffers         2.1743E+10 bytes
Redo Buffers               16896000 bytes
Database mounted.
SQL> select open_mode,database_role from v$database;
OPEN_MODE            DATABASE_ROLE
-------------------- ----------------
MOUNTED              PRIMARY
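Before flashing back, it may be worth confirming that Flashback Database is enabled on the virtual machine database and that the flashback logs reach back far enough. The queries below are a minimal sketch of such a check, not part of the original session; they assume they are run in the same mounted instance, and to_char is used only so that SQL*Plus does not print the 11-digit SCNs in scientific notation.

```sql
-- Sketch only: run in the mounted virtual machine database before the flashback.
-- FLASHBACK_ON should report YES for a plain "flashback database to scn" to work.
-- STANDBY_BECAME_PRIMARY_SCN is expected to match the value in the alert log.
select flashback_on,
       to_char(standby_became_primary_scn) as standby_became_primary_scn
  from v$database;

-- The flashback target must not be older than OLDEST_FLASHBACK_SCN.
select to_char(oldest_flashback_scn) as oldest_flashback_scn,
       oldest_flashback_time
  from v$flashback_database_log;
```

If the oldest flashback SCN is higher than the SCN at which the standby became primary, flashback cannot reach the pre-failover state and the standby would have to be rebuilt another way.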
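Flash back the database to an SCN before the standby failover. The alert log above reports "Standby became primary SCN: 34188310511", so the target has to be lower than that value; 34188310500 is used here: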
SQL> flashback database to scn 34188310500;
Flashback complete.
Thu Aug 08 20:09:40 2024
flashback database to scn 34188310500
Flashback Restore Start
Thu Aug 08 20:10:34 2024
Flashback Restore Complete
Flashback Media Recovery Start
Thu Aug 08 20:10:34 2024
Setting recovery target incarnation to 2
started logmerger process
Parallel Media Recovery started with 4 slaves
Flashback Media Recovery Log /oradata/fast_recovery_area/XFF/archivelog/2024_08_08/o1_mf_1_189448_mc8dzjxn_.arc
Thu Aug 08 20:10:35 2024
Identified End-Of-Redo (failover) for thread 1 sequence 189448 at SCN 0x7.f5c837f1
Incomplete Recovery applied until change 34188310501 time 08/08/2024 11:32:40
Flashback Media Recovery Complete
Setting recovery target incarnation to 3
Completed: flashback database to scn 34188310500
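After the flashback, one way to sanity-check the result is to look at the datafile header checkpoints and the incarnation history. The following is a hedged sketch that was not part of the original session; it assumes the instance is still mounted, and the exact values returned will depend on the environment (read-only or offline datafiles, for example, may show older checkpoints).

```sql
-- Sketch only: the online datafiles would be expected to share one checkpoint SCN
-- that is lower than the failover SCN reported in the alert log (34188310511).
select to_char(checkpoint_change#) as checkpoint_scn,
       count(*)                    as datafiles
  from v$datafile_header
 group by checkpoint_change#;

-- Incarnation history: the failover (RESETLOGS) created a new incarnation.
select incarnation#,
       to_char(resetlogs_change#) as resetlogs_scn,
       resetlogs_time,
       status
  from v$database_incarnation
 order by incarnation#;
```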
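Convert the virtual machine database to a physical standby. The conversion leaves the instance unmounted (hence the ORA-01507 below), so the instance has to be shut down and started in mount mode again: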
SQL> alter database convert to physical standby;
Database altered.
SQL> select database_role from v$database;
select database_role from v$database
*
ERROR at line 1:
ORA-01507: database not mounted
SQL> alter database mount;
alter database mount
*
ERROR at line 1:
ORA-00750: database has been previously mounted and dismounted
SQL> shutdown immediate;
ORA-01507: database not mounted
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 2.5655E+10 bytes
Fixed Size                  2265224 bytes
Variable Size            3892318072 bytes
Database Buffers         2.1743E+10 bytes
Redo Buffers               16896000 bytes
Database mounted.
SQL> select open_mode,database_role from v$database;
OPEN_MODE            DATABASE_ROLE
-------------------- ----------------
MOUNTED              PHYSICAL STANDBY
Thu Aug 08 20:10:46 2024
alter database convert to physical standby
ALTER DATABASE CONVERT TO PHYSICAL STANDBY (xff)
Flush standby redo logfile failed:1649
Clearing standby activation ID 832379854 (0x319d1bce)
The primary database controlfile was created using the 'MAXLOGFILES 16' clause.
There is space for up to 12 standby redo logfiles
Use the following SQL commands on the standby database to create standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 209715200;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 209715200;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 209715200;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 209715200;
ALTER DATABASE ADD STANDBY LOGFILE 'srl5.f' SIZE 209715200;
Shutting down archive processes
Archiving is disabled
Completed: alter database convert to physical standby
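The alert log above prints suggested ADD STANDBY LOGFILE commands. Whether they are actually needed can be checked by listing the standby redo logs that already exist (the earlier failover log shows /oradata/xff/std_redo12.log in use, so some were present). This is a small sketch under that assumption, run in the mounted standby; it was not part of the original session.

```sql
-- Sketch only: existing standby redo log groups and their sizes.
select group#, thread#, sequence#, bytes/1024/1024 as size_mb, status
  from v$standby_log
 order by group#;

-- Online redo log sizes for comparison; standby redo logs are normally
-- created with the same size as the online redo logs.
select group#, thread#, bytes/1024/1024 as size_mb
  from v$log
 order by group#;
```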
Open the standby read-only and start the MRP process with real-time apply:
SQL> alter database open read only;
Database altered.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
Database altered.
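Once MRP is started, the recovery can be checked from the standby and the redo transport from the primary. The queries below are a hedged sketch, not part of the original session; the use of destination 2 on the primary is an assumption and should be replaced by whichever LOG_ARCHIVE_DEST_n points at the standby.

```sql
-- Sketch only, on the standby: MRP0 should appear, typically with status
-- APPLYING_LOG or WAIT_FOR_LOG, and RFS processes should show incoming redo.
select process, status, thread#, sequence#, block#
  from v$managed_standby;

-- Transport and apply lag as seen by the standby.
select name, value, time_computed
  from v$dataguard_stats
 where name in ('transport lag', 'apply lag');

-- On the primary (physical machine); dest_id = 2 is an assumption.
select dest_id, status, error
  from v$archive_dest
 where dest_id = 2;
```

If the destination on the primary reports an error, it may need to be re-enabled there before redo starts flowing to the restored standby.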