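Contact: QQ (5163721)
Title: Examples of fixing a CRS that will not start in 11.2 (1): repairing a damaged node with the gpnp profile from a healthy node
Author: Lunar © All rights reserved [This article may be reposted, provided the source address is credited with a link; otherwise legal action will be pursued.]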
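One node of an 11.2 RAC that lives on an SSD would no longer boot its OS, and after tinkering with it for ages I still could not bring it back.
The SSD was bought in 2013, so it is only two years old... I don't know whether the drive is the cause, but either way I'm speechless.
Meanwhile, the Oracle 8.0.6 on RedHat 2 sitting on another portable hard disk more than ten years old still works.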
.
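In this environment, I copied an old backup of node 1 from another portable hard disk onto the SSD and tried to repair the whole RAC.
Node 1 only needs its IP addresses changed to match my current VirtualBox configuration, and node 2 is healthy, so nothing drastic is needed.
Only two things are required:
1. At the OS level, fix node 1's network configuration:
/etc/hosts /etc/sysconfig/network-scripts/ifcfg-eth1 /etc/sysconfig/network
2. Pass node 2's gpnp profile to node 1.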
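For the first item, a quick sanity check on the edited interface file can catch a wrong subnet before the cluster is started. The following is only a sketch: the helper names `ifcfg_ipaddr` and `in_prefix` are made up for illustration, and it assumes the file uses the usual `IPADDR=` key.

```shell
# Print the IPADDR value from an ifcfg-style file (KEY=VALUE lines, quotes optional).
ifcfg_ipaddr() {
    sed -n 's/^IPADDR=//p' "$1" | tr -d "\"'" | head -n 1
}

# Succeed when address $1 starts with the three-octet prefix $2.
in_prefix() {
    case "$1" in
        "$2".*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

On node 1 it could be used as `in_prefix "$(ifcfg_ipaddr /etc/sysconfig/network-scripts/ifcfg-eth1)" 192.168.20 || echo "eth1 is not in the private subnet"`.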
.
The details are as follows:
1. Stop CRS on both nodes first; node 2's profile.xml is then copied to node 1 (step 6).
Confirm that CRS is down on node 2:
[root@RAC2 ~]# ps -ef|grep d.bin
root      5026  3697  0 22:49 pts/4    00:00:00 grep d.bin
[root@RAC2 ~]#
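The check above means reading the ps output by eye and ignoring the grep line. As a small sketch (the function name `crs_daemons` is hypothetical, not an Oracle tool), the same filter can be wrapped so that its exit status says whether any `*d.bin` process is still alive:

```shell
# Read `ps -ef` output on stdin and print only lines that name a *d.bin
# process, dropping any grep process. Exit status is non-zero when
# nothing is left, i.e. when the stack looks fully stopped.
crs_daemons() {
    grep 'd\.bin' | grep -v 'grep'
}
```

Typical use would be `ps -ef | crs_daemons || echo "no clusterware daemons running"`.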
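2. Confirm that node 2's current gpnp profile is correct:
The key point here is that the private network address should be in the 192.168.20.0 subnet, i.e.:
<gpnp:Network id="net2" Adapter="eth1" IP="192.168.20.0" Use="cluster_interconnect"/>
As the output below shows, this is correct.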
[root@RAC2 ~]# gpnptool get
Warning: some command line parameters were defaulted. Resulting command line:
         /u01/11.2.0/grid/bin/gpnptool.bin get -o-
Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
GPnP service is not running on localhost. Found locally cached profile...
<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="12" ClusterUId="4c62524e6a7dff62ffdae005dc6a08d6" ClusterName="racdb" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="192.168.56.0" Adapter="eth0" Use="public,cluster_interconnect"/><gpnp:Network id="net2" Adapter="eth1" IP="192.168.20.0" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/asm*" SPFile="+DATA/racdb/asmparameterfile/registry.253.814453247"/><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>s+dn2dL74Wsg58TpBl1wukYt3JM=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>hGmm6px+trv2uYThXOlkXUvuMjSJng7ZXgcWdwGeOugAXWRd58f/cHvHbioeKi2XK0kcUnh5OW2a9Mlhpy52Xi8+QdZdHNh5DSZ02HggiEJf0o0T29TZJr2mafTYyNKqzhHRv3aidAxwrPjLPr80rk6tEhB60hY9Ew+G15Do7D4=</ds:SignatureValue></ds:Signature></gpnp:GPnP-Profile>
Success.
Error CLSGPNP_NO_DAEMON getting profile.
[root@RAC2 ~]#
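Note that when none of the CRS processes are running, gpnptool returns the profile from its own local cache (my guess is that this so-called cache is simply read from the profile saved on disk).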
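Because the profile comes back as one very long XML line, the network entries are hard to spot. A minimal sketch for pulling them out with plain text tools follows; the function names `gpnp_networks` and `gpnp_net_ip` are invented here, and the approach assumes each `gpnp:Network` element is self-contained on one line, as in the output shown. It is text matching, not real XML parsing.

```shell
# Read a GPnP profile on stdin and print each <gpnp:Network .../> element on its own line.
gpnp_networks() {
    grep -o '<gpnp:Network [^>]*>'
}

# Print the IP attribute of the network whose id is $1 (for example net2).
# Works regardless of attribute order inside the element.
gpnp_net_ip() {
    gpnp_networks | grep "id=\"$1\"" | sed -n 's/.* IP="\([^"]*\)".*/\1/p'
}
```

For example, `gpnptool get | gpnp_net_ip net2` should print only the subnet recorded for the private interconnect.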
.
3. Check node 1's current gpnp profile. Note that its net2 entry is wrong:
<gpnp:Network id="net2" IP="192.168.88.0" Adapter="eth1" Use="cluster_interconnect"/>
[root@RAC1 bin]# ./gpnptool get
Warning: some command line parameters were defaulted. Resulting command line:
         ./gpnptool.bin get -o-
Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
GPnP service is not running on localhost. Found locally cached profile...
<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="8" ClusterUId="4c62524e6a7dff62ffdae005dc6a08d6" ClusterName="racdb" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="192.168.56.0" Adapter="eth0" Use="public"/><gpnp:Network id="net2" IP="192.168.88.0" Adapter="eth1" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/asm*" SPFile="+DATA/racdb/asmparameterfile/registry.253.814453247"/><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>oNh4/PhSSqqj2jARHJa0GCNngHE=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>WThZxj2jGML+m+UrGrPBGZztS5xw6KJMb4eFL5l2MURoPAZCso5Ld9uTJ/taYpZAnamwYNYvkpouc/g/PuTI1WJGc2IBzFmF0ECnWXGGEPcS/8Sm4iiVvkyAYc/kNUA/DLzCvq1hozOopVgmQgfJZ/B+WVP423mBrnPFKYaFRKY=</ds:SignatureValue></ds:Signature></gpnp:GPnP-Profile>
Success.
Error CLSGPNP_NO_DAEMON getting profile.
[root@RAC1 bin]#
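Spotting the differences between two such one-line profiles by eye is error-prone. One possible way to compare them, sketched with a hypothetical helper `gpnp_summary` that works on profiles saved to files, is to reduce each profile to its sequence number and network entries and then diff the two summaries:

```shell
# Print a short summary of a saved GPnP profile ($1 = file path):
# the ProfileSequence value first, then one <gpnp:Network .../> element per line.
gpnp_summary() {
    sed -n 's/.*ProfileSequence="\([0-9]*\)".*/ProfileSequence=\1/p' "$1"
    grep -o '<gpnp:Network [^>]*>' "$1"
}
```

Saving both nodes' profiles and running `gpnp_summary` on each, then `diff` on the two results, makes the differences stand out. For the two profiles in this post that would be the sequence number (8 on node 1, 12 on node 2), the `Use` value of net1, and the net2 subnet.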
4. Confirm that CRS is completely down on node 1:
[root@RAC1 bin]# ps -ef|grep d.bin
root      3523  2865  0 22:48 pts/1    00:00:00 grep d.bin
[root@RAC1 bin]#
5. Back up node 1's current gpnp profile:
[root@RAC1 bin]# mkdir /home/oracle/gpnp
[root@RAC1 bin]# export GPNPDIR=/home/oracle/gpnp
[root@RAC1 bin]# env|grep GPNPDIR
GPNPDIR=/home/oracle/gpnp
[root@RAC1 bin]# ./gpnptool get -o=$GPNPDIR/profile.original
Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
GPnP service is not running on localhost. Found locally cached profile...
Resulting profile written to "/home/oracle/gpnp/profile.original".
Success.
Error CLSGPNP_NO_DAEMON getting profile.
[root@RAC1 bin]# ll /home/oracle/gpnp/profile.original
-rw-r--r-- 1 root root 1878 Apr 11 22:52 /home/oracle/gpnp/profile.original
[root@RAC1 bin]#
6. Copy node 2's gpnp profile to node 1:
[root@RAC2 ~]# ll $GRID_HOME/gpnp/rac2
total 8
drwxr-x--T 6 grid oinstall 4096 May  3  2013 wallets
drwxr-x--- 3 grid oinstall 4096 May  3  2013 profiles
[root@RAC2 ~]# scp $GRID_HOME/gpnp/rac2/profiles/peer/profile.xml rac1:$GRID_HOME/gpnp/rac1/profiles/peer/profile.xml
root@rac1's password:
profile.xml                                   100% 1900     1.9KB/s   00:00
[root@RAC2 ~]#
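The scp in this step overwrites node 1's on-disk profile.xml in place. As a hedged sketch, two small helpers could be wrapped around that step: one keeps a timestamped copy of the file being overwritten, the other checks afterwards that two copies are byte-identical. The names `backup_file` and `same_content` are hypothetical.

```shell
# Copy file $1 into directory $2 under a timestamped name and print the new path.
# cp -p keeps the original mode and timestamps (and ownership when run as root).
backup_file() {
    mkdir -p "$2" || return 1
    dest="$2/$(basename "$1").$(date +%Y%m%d%H%M%S)"
    cp -p "$1" "$dest" && printf '%s\n' "$dest"
}

# Succeed when files $1 and $2 have the same checksum and size.
same_content() {
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}
```

On node 1, `backup_file $GRID_HOME/gpnp/rac1/profiles/peer/profile.xml /home/oracle/gpnp` before the copy, and `same_content` against a copy of node 2's file afterwards, would cover both ends of the step.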
7. Start CRS on node 1 and node 2 (a normal start is enough):
[root@RAC1 bin]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@RAC1 bin]#
8. As shown below, node 1 now starts normally:
[root@RAC1 ~]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMDATA.dg
               ONLINE  ONLINE       rac1
ora.DATA.dg
               ONLINE  ONLINE       rac1
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1
ora.LISTENER_DG.lsnr
               ONLINE  ONLINE       rac1
ora.asm
               ONLINE  ONLINE       rac1                     Started
ora.gsd
               OFFLINE OFFLINE      rac1
ora.net1.network
               ONLINE  ONLINE       rac1
ora.net2.network
               ONLINE  ONLINE       rac1
ora.ons
               ONLINE  ONLINE       rac1
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac1
ora.cvu
      1        ONLINE  ONLINE       rac1
ora.oc4j
      1        ONLINE  ONLINE       rac1
ora.r-dg1-vip.vip
      1        ONLINE  ONLINE       rac1
ora.r-dg2-vip.vip
      1        ONLINE  INTERMEDIATE rac1                     FAILED OVER
ora.rac1.vip
      1        ONLINE  ONLINE       rac1
ora.rac2.vip
      1        ONLINE  INTERMEDIATE rac1                     FAILED OVER
ora.racdb.db
      1        ONLINE  ONLINE       rac1                     Open
      2        ONLINE  OFFLINE
ora.scan1.vip
      1        ONLINE  ONLINE       rac1
[root@RAC1 ~]# gpnptool get
Warning: some command line parameters were defaulted. Resulting command line:
         /u01/11.2.0/grid/bin/gpnptool.bin get -o-
<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="12" ClusterUId="4c62524e6a7dff62ffdae005dc6a08d6" ClusterName="racdb" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="192.168.56.0" Adapter="eth0" Use="public,cluster_interconnect"/><gpnp:Network id="net2" Adapter="eth1" IP="192.168.20.0" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/asm*" SPFile="+DATA/racdb/asmparameterfile/registry.253.814453247"/><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>s+dn2dL74Wsg58TpBl1wukYt3JM=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>hGmm6px+trv2uYThXOlkXUvuMjSJng7ZXgcWdwGeOugAXWRd58f/cHvHbioeKi2XK0kcUnh5OW2a9Mlhpy52Xi8+QdZdHNh5DSZ02HggiEJf0o0T29TZJr2mafTYyNKqzhHRv3aidAxwrPjLPr80rk6tEhB60hY9Ew+G15Do7D4=</ds:SignatureValue></ds:Signature></gpnp:GPnP-Profile>
Success.
[root@RAC1 ~]#
Node 1's network information is now correct as well.
Next, start node 2:
[root@RAC2 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@RAC2 ~]#
Everything is now healthy except network2 (the network used for ADG, whose configuration I have not updated yet).
[root@RAC2 ~]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMDATA.dg
               ONLINE  ONLINE       rac1
               ONLINE  ONLINE       rac2
ora.DATA.dg
               ONLINE  ONLINE       rac1
               ONLINE  ONLINE       rac2
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1
               ONLINE  ONLINE       rac2
ora.LISTENER_DG.lsnr
               ONLINE  ONLINE       rac1
               ONLINE  OFFLINE      rac2
ora.asm
               ONLINE  ONLINE       rac1                     Started
               ONLINE  ONLINE       rac2                     Started
ora.gsd
               OFFLINE OFFLINE      rac1
               OFFLINE OFFLINE      rac2
ora.net1.network
               ONLINE  ONLINE       rac1
               ONLINE  ONLINE       rac2
ora.net2.network
               ONLINE  ONLINE       rac1
               ONLINE  OFFLINE      rac2
ora.ons
               ONLINE  ONLINE       rac1
               ONLINE  ONLINE       rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac1
ora.cvu
      1        ONLINE  ONLINE       rac1
ora.oc4j
      1        ONLINE  ONLINE       rac1
ora.r-dg1-vip.vip
      1        ONLINE  ONLINE       rac1
ora.r-dg2-vip.vip
      1        ONLINE  INTERMEDIATE rac1                     FAILED OVER
ora.rac1.vip
      1        ONLINE  ONLINE       rac1
ora.rac2.vip
      1        ONLINE  ONLINE       rac2
ora.racdb.db
      1        ONLINE  ONLINE       rac1                     Open
      2        ONLINE  ONLINE       rac2                     Open
ora.scan1.vip
      1        ONLINE  ONLINE       rac1
[root@RAC2 ~]#
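With this many resources it is easy to miss the one that did not come up. A rough sketch of a filter that lists only resources whose target is ONLINE but whose state is something else follows. The function name `crs_not_online` is hypothetical, and the parsing assumes the usual multi-line layout of `crsctl status res -t`: the resource name alone on a line, then one indented row per server or instance.

```shell
# Read `crsctl status res -t` output on stdin and print "resource state server"
# for every row whose TARGET is ONLINE but whose STATE is not.
crs_not_online() {
    awk '
        /^ora\./ { res = $1; next }            # remember the current resource name
        NF >= 2 && res != "" {
            i = ($1 ~ /^[0-9]+$/) ? 2 : 1      # cluster rows start with an instance number
            target = $i; state = $(i + 1); server = $(i + 2)
            if (target == "ONLINE" && state != "ONLINE")
                print res, state, server
        }
    '
}
```

Run as `crsctl status res -t | crs_not_online`. Resources that are deliberately disabled, such as ora.gsd with target OFFLINE, are left out on purpose.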