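Contact: Mobile/WeChat (+86 17813235971) QQ (107644445)
Title: ERROR: diskgroup XXXX was not mounted
Author: 惜分飞 © All rights reserved [Reproduction in any form without my consent is prohibited; otherwise I reserve the right to pursue further legal action.]
The environment is a 2-node RAC running 10.2.0.5 on AIX. The system disk of node 2 failed, so node 1's system was imaged and the image was copied to node 2. The disk ordering on node 2 then no longer matched node 1. After the AIX engineers performed some related operations and node 1 was restarted, the datadg disk group could not be mounted: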
SQL> alter diskgroup datadg mount
Mon Jun 10 23:23:46 CST 2019
NOTE: cache registered group DATADG number=1 incarn=0x8cf61164
Mon Jun 10 23:23:46 CST 2019
NOTE: Hbeat: instance first (grp 1)
Mon Jun 10 23:23:50 CST 2019
NOTE: start heartbeating (grp 1)
Mon Jun 10 23:23:50 CST 2019
NOTE: cache dismounting group 1/0x8CF61164 (DATADG)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DATADG was not mounted
Checking the alert log for information about the datadg disk group turns up an earlier successful mount:
Tue Jan 29 19:21:45 CST 2019
NOTE: start heartbeating (grp 2)
NOTE: cache opening disk 0 of grp 2: DATADG_0000 path:/dev/rhdisk6
Tue Jan 29 19:21:45 CST 2019
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 2: DATADG_0001 path:/dev/rhdisk7
NOTE: cache opening disk 2 of grp 2: DATADG_0002 path:/dev/rhdisk8
NOTE: cache opening disk 3 of grp 2: DATADG_0003 path:/dev/rhdisk9
NOTE: cache mounting (first) group 2/0x60E59155 (DATADG)
* allocate domain 2, invalid = TRUE
Tue Jan 29 19:21:45 CST 2019
NOTE: attached to recovery domain 2
Tue Jan 29 19:21:45 CST 2019
NOTE: cache recovered group 2 to fcn 0.849668
Tue Jan 29 19:21:45 CST 2019
NOTE: LGWR attempting to mount thread 1 for disk group 2
NOTE: LGWR mounted thread 1 for disk group 2
NOTE: opening chunk 1 at fcn 0.849668 ABA
NOTE: seq=21 blk=5394
Tue Jan 29 19:21:46 CST 2019
NOTE: cache mounting group 2/0x60E59155 (DATADG) succeeded
SUCCESS: diskgroup DATADG was mounted
This log shows that the datadg disk group is made up of four disks, rhdisk6 through rhdisk9. The information for these disks was then queried.
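When the alert log is long, the disk membership can be pulled out of it mechanically instead of by eye. The following is a minimal sketch, not part of the original diagnosis: the function name `disks_in_log` and the `SAMPLE` string are hypothetical, and the regular expression assumes the "cache opening disk N of grp G: NAME path:PATH" wording seen in the 10.2 log above.

```python
import re

# Matches alert-log entries such as:
#   NOTE: cache opening disk 1 of grp 2: DATADG_0001 path:/dev/rhdisk7
# The pattern is an assumption based on the 10.2 log format shown above.
OPEN_DISK = re.compile(
    r"cache opening disk (\d+) of grp (\d+): (\S+) path:(\S+)"
)


def disks_in_log(log_text):
    """Return {disk_name: (disk_number, group_number, path)} found in log_text."""
    found = {}
    for match in OPEN_DISK.finditer(log_text):
        disk_no, grp_no, name, path = match.groups()
        found[name] = (int(disk_no), int(grp_no), path)
    return found


# Hypothetical sample, copied from the successful mount quoted above.
SAMPLE = (
    "NOTE: cache opening disk 0 of grp 2: DATADG_0000 path:/dev/rhdisk6\n"
    "NOTE: cache opening disk 1 of grp 2: DATADG_0001 path:/dev/rhdisk7\n"
    "NOTE: cache opening disk 2 of grp 2: DATADG_0002 path:/dev/rhdisk8\n"
    "NOTE: cache opening disk 3 of grp 2: DATADG_0003 path:/dev/rhdisk9\n"
)

if __name__ == "__main__":
    for name, (disk_no, grp_no, path) in sorted(disks_in_log(SAMPLE).items()):
        print(name, disk_no, grp_no, path)
```

Because the search runs over the whole text with `finditer`, it works whether the log is kept one entry per line or has been flattened into a single line.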
The query confirmed that rhdisk7 was the abnormal disk, so the disk was analyzed with kfed:
D:\BaiduNetdiskDownload\xifenfei>kfed read rhdisk7.dd
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 34 ; 0x001: 0x22
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 49407 ; 0x004: blk=49407
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 58396 ; 0x010: 0x0000e41c
kfbh.fcn.wrap: 131072 ; 0x014: 0x00020000
kfbh.spare1: 4294967064 ; 0x018: 0xffffff18
kfbh.spare2: 2105310074 ; 0x01c: 0x7d7c7b7a
005918A00 00002200 0000C0FF 00000000 00000000 [."..............]
005918A10 0000E41C 00020000 FFFFFF18 7D7C7B7A [............z{|}]
005918A20 00000000 00000000 00000000 00000000 [................]
Repeat 253 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

D:\BaiduNetdiskDownload\xifenfei>kfed read rhdisk7.dd blkn=1
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
006EF8A00 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

D:\BaiduNetdiskDownload\xifenfei>kfed read rhdisk7.dd blkn=2|more
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 33554432 ; 0x004: blk=33554432
kfbh.block.obj: 16777344 ; 0x008: file=128
kfbh.check: 3844041089 ; 0x00c: 0xe51f6981
kfbh.fcn.base: 1297484544 ; 0x010: 0x4d560b00
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdatb10.aunum: 0 ; 0x000: 0x00000000
kfdatb10.shrink: 49153 ; 0x004: 0xc001
kfdatb10.ub2pad: 20555 ; 0x006: 0x504b
kfdatb10.auinfo[0].link.next: 2048 ; 0x008: 0x0800
kfdatb10.auinfo[0].link.prev: 2048 ; 0x00a: 0x0800
kfdatb10.auinfo[0].free: 0 ; 0x00c: 0x0000
kfdatb10.auinfo[0].total: 49153 ; 0x00e: 0xc001
kfdatb10.auinfo[1].link.next: 4096 ; 0x010: 0x1000
kfdatb10.auinfo[1].link.prev: 4096 ; 0x012: 0x1000
kfdatb10.auinfo[1].free: 0 ; 0x014: 0x0000
kfdatb10.auinfo[1].total: 0 ; 0x016: 0x0000
kfdatb10.auinfo[2].link.next: 6144 ; 0x018: 0x1800
kfdatb10.auinfo[2].link.prev: 6144 ; 0x01a: 0x1800
kfdatb10.auinfo[2].free: 0 ; 0x01c: 0x0000
kfdatb10.auinfo[2].total: 0 ; 0x01e: 0x0000
kfdatb10.auinfo[3].link.next: 8192 ; 0x020: 0x2000
kfdatb10.auinfo[3].link.prev: 8192 ; 0x022: 0x2000
kfdatb10.auinfo[3].free: 0 ; 0x024: 0x0000
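Next, the possible extent of the damage was compared against a healthy disk. On AIX, ASM disk blocks have a recognizable signature: they generally begin with 0082. The disks were opened with a tool and searched for this marker for comparison.
Normal disk:
Abnormal disk: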
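The same comparison can be approximated with a small script run against dd images of a healthy disk and the suspect disk. This is only a sketch under stated assumptions: the function name `scan_markers` is hypothetical, the 4096-byte block size is assumed (it is not stated in the post), and the marker bytes 0x00 0x82 correspond to kfbh.endian and kfbh.hard as printed by kfed for block 2 above. Only metadata blocks are expected to carry the marker, so the result is meaningful only when compared against the same block range on a healthy disk.

```python
BLOCK_SIZE = 4096        # assumed ASM metadata block size; adjust if different
MARKER = b"\x00\x82"     # kfbh.endian=0x00, kfbh.hard=0x82 (see kfed output for block 2)


def scan_markers(path, max_blocks=1024, block_size=BLOCK_SIZE):
    """Split the first max_blocks blocks of an image into (hits, misses).

    hits   -- block numbers whose first two bytes equal MARKER
    misses -- block numbers that start with anything else
    """
    hits, misses = [], []
    with open(path, "rb") as image:
        for blkn in range(max_blocks):
            block = image.read(block_size)
            if len(block) < block_size:
                break  # reached the end of the image
            if block[:2] == MARKER:
                hits.append(blkn)
            else:
                misses.append(blkn)
    return hits, misses
```

Running it on both images and comparing the `misses` lists for the same leading range gives a rough picture of how far the damage extends beyond blocks 0 and 1.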
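Based on the analysis above, the rough assessment is that the metadata damage on rhdisk7 is not limited to blocks 0 and 1, so repairing the disk by hand and continuing to use it is unlikely to work. Since the customer's database is small, the chosen approach was to copy the data files, redo logs, and control files directly out to a file system and then open the database from the local file system.
Luck was on our side: the recovery was complete, with zero data loss.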