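Contact: mobile/WeChat (+86 17813235971), QQ (107644445)
Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal action.]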
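A customer runs an Oracle Exadata X3-2 quarter rack (high-capacity disks). They reported very slow performance caused by a faulty flash card; service returned to normal after the cell holding the faulty card was temporarily shut down.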
Version information
[root@oa0cel03 ~]# imageinfo
Kernel version: 2.6.32-400.11.1.el5uek #1 SMP Thu Nov 22 03:29:09 PST 2012 x86_64
Cell version: OSS_11.2.3.2.1_LINUX.X64_130109
Cell rpm version: cell-11.2.3.2.1_LINUX.X64_130109-1
Active image version: 11.2.3.2.1.130109
Active image activated: 2013-06-27 02:24:19 -0700
Active image status: success
Active system partition on device: /dev/md6
Active software partition on device: /dev/md8
In partition rollback: Impossible
Cell boot usb partition: /dev/sdm1
Cell boot usb version: 11.2.3.2.1.130109
Inactive image version: 11.2.3.2.0.120713
Inactive image activated: 2012-10-14 06:46:16 -0700
Inactive image status: success
Inactive system partition on device: /dev/md5
Inactive software partition on device: /dev/md7
[root@oa0cel03 ~]# cellcli
CellCLI: Release 11.2.3.2.1 - Production on Thu Jun 20 18:28:37 CST 2024
Copyright (c) 2007, 2012, Oracle. All rights reserved.
Cell Efficiency Ratio: 3,617
CellCLI> list cell detail
         name:                   oa0cel03
         bbuTempThreshold:       60
         bbuChargeThreshold:     800
         bmcType:                IPMI
         cellVersion:            OSS_11.2.3.2.1_LINUX.X64_130109
         cpuCount:               24
         diagHistoryDays:        7
         fanCount:               8/8
         fanStatus:              normal
         flashCacheMode:         WriteBack
         id:                     1238FM507A
         interconnectCount:      3
         interconnect1:          bondib0
         iormBoost:              0.0
         ipaddress1:             192.168.10.5/22
         kernelVersion:          2.6.32-400.11.1.el5uek
         locatorLEDStatus:       off
         makeModel:              Oracle Corporation SUN FIRE X4270 M3 SAS
         metricHistoryDays:      7
         notificationMethod:     snmp
         notificationPolicy:     critical,warning,clear
         offloadEfficiency:      3,616.5
         powerCount:             2/2
         powerStatus:            normal
         releaseVersion:         11.2.3.2.1
         releaseTrackingBug:     14522699
         snmpSubscriber:         host=oa0db02.qhsrmyy.com,port=3872,community=cell
                                 host=oa0db01.qhsrmyy.com,port=3872,community=cell
         status:                 online
         temperatureReading:     28.0
         temperatureStatus:      normal
         upTime:                 0 days, 3:49
         cellsrvStatus:          running
         msStatus:               running
         rsStatus:               running
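After the customer's first replacement the performance problem persisted, so as a first step the flash cache on this cell was flushed and its grid disks were set to inactive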
[root@oa0cel03 ~]# cellcli -e list metriccurrent attributes name,metricvalue where name like 'FC_BY_DIRTY.*'
         FC_BY_DIRTY     38,820 MB
[root@oa0cel03 ~]# cellcli -e "alter flashcache all flush"
Flash cache on FD_00_oa0cel03 successfully altered
Flash cache on FD_01_oa0cel03 successfully altered
Flash cache on FD_02_oa0cel03 successfully altered
Flash cache on FD_03_oa0cel03 successfully altered
Flash cache on FD_04_oa0cel03 successfully altered
Flash cache on FD_05_oa0cel03 successfully altered
Flash cache on FD_06_oa0cel03 successfully altered
Flash cache on FD_07_oa0cel03 successfully altered
Flash cache on FD_09_exastlx01 successfully altered
Flash cache on FD_10_exastlx01 successfully altered
Flash cache on FD_11_oa0cel03 skipped because FD_11_oa0cel03 is degraded
Flash cache on FD_12_oa0cel03 successfully altered
Flash cache on FD_13_oa0cel03 successfully altered
Flash cache on FD_14_oa0cel03 successfully altered
Flash cache on FD_15_oa0cel03 successfully altered
[root@oa0cel03 ~]# cellcli -e list metriccurrent attributes name,metricvalue where name like 'FC_BY_DIRTY.*'
         FC_BY_DIRTY     0.000 MB
[root@oa0cel03 ~]# cellcli -e "alter griddisk all inactive"
GridDisk DATA_oa0_CD_00_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_01_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_02_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_03_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_04_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_05_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_06_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_07_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_08_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_09_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_10_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_11_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_02_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_03_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_04_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_05_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_06_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_07_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_08_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_09_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_10_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_11_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_00_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_01_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_02_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_03_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_04_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_05_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_06_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_07_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_08_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_09_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_10_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_11_oa0cel03 successfully altered
[root@oa0cel03 ~]# cellcli -e list griddisk
         DATA_oa0_CD_00_oa0cel03         inactive
         DATA_oa0_CD_01_oa0cel03         inactive
         DATA_oa0_CD_02_oa0cel03         inactive
         DATA_oa0_CD_03_oa0cel03         inactive
         DATA_oa0_CD_04_oa0cel03         inactive
         DATA_oa0_CD_05_oa0cel03         inactive
         DATA_oa0_CD_06_oa0cel03         inactive
         DATA_oa0_CD_07_oa0cel03         inactive
         DATA_oa0_CD_08_oa0cel03         inactive
         DATA_oa0_CD_09_oa0cel03         inactive
         DATA_oa0_CD_10_oa0cel03         inactive
         DATA_oa0_CD_11_oa0cel03         inactive
         DBFS_DG_CD_02_oa0cel03          inactive
         DBFS_DG_CD_03_oa0cel03          inactive
         DBFS_DG_CD_04_oa0cel03          inactive
         DBFS_DG_CD_05_oa0cel03          inactive
         DBFS_DG_CD_06_oa0cel03          inactive
         DBFS_DG_CD_07_oa0cel03          inactive
         DBFS_DG_CD_08_oa0cel03          inactive
         DBFS_DG_CD_09_oa0cel03          inactive
         DBFS_DG_CD_10_oa0cel03          inactive
         DBFS_DG_CD_11_oa0cel03          inactive
         RECO_oa0_CD_00_oa0cel03         inactive
         RECO_oa0_CD_01_oa0cel03         inactive
         RECO_oa0_CD_02_oa0cel03         inactive
         RECO_oa0_CD_03_oa0cel03         inactive
         RECO_oa0_CD_04_oa0cel03         inactive
         RECO_oa0_CD_05_oa0cel03         inactive
         RECO_oa0_CD_06_oa0cel03         inactive
         RECO_oa0_CD_07_oa0cel03         inactive
         RECO_oa0_CD_08_oa0cel03         inactive
         RECO_oa0_CD_09_oa0cel03         inactive
         RECO_oa0_CD_10_oa0cel03         inactive
         RECO_oa0_CD_11_oa0cel03         inactive
[root@oa0cel03 ~]# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
         DATA_oa0_CD_00_oa0cel03         OFFLINE         Yes
         DATA_oa0_CD_01_oa0cel03         OFFLINE         Yes
         DATA_oa0_CD_02_oa0cel03         OFFLINE         Yes
         DATA_oa0_CD_03_oa0cel03         OFFLINE         Yes
         DATA_oa0_CD_04_oa0cel03         OFFLINE         Yes
         DATA_oa0_CD_05_oa0cel03         OFFLINE         Yes
         DATA_oa0_CD_06_oa0cel03         OFFLINE         Yes
         DATA_oa0_CD_07_oa0cel03         OFFLINE         Yes
         DATA_oa0_CD_08_oa0cel03         OFFLINE         Yes
         DATA_oa0_CD_09_oa0cel03         OFFLINE         Yes
         DATA_oa0_CD_10_oa0cel03         OFFLINE         Yes
         DATA_oa0_CD_11_oa0cel03         OFFLINE         Yes
         DBFS_DG_CD_02_oa0cel03          OFFLINE         Yes
         DBFS_DG_CD_03_oa0cel03          OFFLINE         Yes
         DBFS_DG_CD_04_oa0cel03          OFFLINE         Yes
         DBFS_DG_CD_05_oa0cel03          OFFLINE         Yes
         DBFS_DG_CD_06_oa0cel03          OFFLINE         Yes
         DBFS_DG_CD_07_oa0cel03          OFFLINE         Yes
         DBFS_DG_CD_08_oa0cel03          OFFLINE         Yes
         DBFS_DG_CD_09_oa0cel03          OFFLINE         Yes
         DBFS_DG_CD_10_oa0cel03          OFFLINE         Yes
         DBFS_DG_CD_11_oa0cel03          OFFLINE         Yes
         RECO_oa0_CD_00_oa0cel03         OFFLINE         Yes
         RECO_oa0_CD_01_oa0cel03         OFFLINE         Yes
         RECO_oa0_CD_02_oa0cel03         OFFLINE         Yes
         RECO_oa0_CD_03_oa0cel03         OFFLINE         Yes
         RECO_oa0_CD_04_oa0cel03         OFFLINE         Yes
         RECO_oa0_CD_05_oa0cel03         OFFLINE         Yes
         RECO_oa0_CD_06_oa0cel03         OFFLINE         Yes
         RECO_oa0_CD_07_oa0cel03         OFFLINE         Yes
         RECO_oa0_CD_08_oa0cel03         OFFLINE         Yes
         RECO_oa0_CD_09_oa0cel03         OFFLINE         Yes
         RECO_oa0_CD_10_oa0cel03         OFFLINE         Yes
         RECO_oa0_CD_11_oa0cel03         OFFLINE         Yes
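The asmdeactivationoutcome column in the last listing indicates whether ASM can tolerate a grid disk going offline, and it is usually worth checking before running alter griddisk all inactive rather than after. The following is a minimal sketch of such a check, not something taken from the original session: unsafe_griddisks is a hypothetical helper name, and it assumes its input has the three whitespace-separated columns shown above (name, asmmodestatus, asmdeactivationoutcome).

```shell
#!/bin/sh
# Hypothetical helper (not part of CellCLI).
# Reads lines of the form "<name> <asmmodestatus> <asmdeactivationoutcome>"
# on stdin, e.g. the output of:
#   cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
# Prints the name of every grid disk whose outcome is not "Yes" and
# returns a non-zero status if at least one such disk was found.
unsafe_griddisks() {
    awk 'NF >= 3 && $3 != "Yes" { print $1; bad = 1 } END { exit bad }'
}
```

On a cell it could be used as a guard, for example `cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome | unsafe_griddisks && cellcli -e "alter griddisk all inactive"`, so that the deactivation only runs when every grid disk reports Yes.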
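The customer kept trying replacement cards and eventually confirmed that card slot 4 itself was damaged, so that slot was abandoned and the flash cache rebuilt without it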
[root@oa0cel03 ~]# cellcli -e list celldisk
         CD_00_oa0cel03          normal
         CD_01_oa0cel03          normal
         CD_02_oa0cel03          normal
         CD_03_oa0cel03          normal
         CD_04_oa0cel03          normal
         CD_05_oa0cel03          normal
         CD_06_oa0cel03          normal
         CD_07_oa0cel03          normal
         CD_08_oa0cel03          normal
         CD_09_oa0cel03          normal
         CD_10_oa0cel03          normal
         CD_11_oa0cel03          normal
         FD_00_oa0cel03          not present
         FD_01_oa0cel03          not present
         FD_02_oa0cel03          not present
         FD_03_oa0cel03          not present
         FD_04_oa0cel03          normal
         FD_05_oa0cel03          normal
         FD_06_oa0cel03          normal
         FD_07_oa0cel03          normal
         FD_08_oa0cel03          normal
         FD_09_oa0cel03          normal
         FD_10_oa0cel03          normal
         FD_10_exastlx01         normal
         FD_12_oa0cel03          normal
         FD_13_oa0cel03          normal
         FD_14_oa0cel03          normal
         FD_15_oa0cel03          normal
In this list the name FD_10_exastlx01 is left over from an old card. It is visually distracting, so the cell disk is dropped and recreated
[root@oa0cel03 ~]# cellcli -e drop celldisk FD_10_exastlx01
CellDisk FD_10_exastlx01 successfully dropped
[root@oa0cel03 ~]# cellcli -e create celldisk all flashdisk
CellDisk FD_11_oa0cel03 successfully created
[root@oa0cel03 ~]# cellcli -e list celldisk
         CD_00_oa0cel03          normal
         CD_01_oa0cel03          normal
         CD_02_oa0cel03          normal
         CD_03_oa0cel03          normal
         CD_04_oa0cel03          normal
         CD_05_oa0cel03          normal
         CD_06_oa0cel03          normal
         CD_07_oa0cel03          normal
         CD_08_oa0cel03          normal
         CD_09_oa0cel03          normal
         CD_10_oa0cel03          normal
         CD_11_oa0cel03          normal
         FD_00_oa0cel03          not present
         FD_01_oa0cel03          not present
         FD_02_oa0cel03          not present
         FD_03_oa0cel03          not present
         FD_04_oa0cel03          normal
         FD_05_oa0cel03          normal
         FD_06_oa0cel03          normal
         FD_07_oa0cel03          normal
         FD_08_oa0cel03          normal
         FD_09_oa0cel03          normal
         FD_10_oa0cel03          normal
         FD_11_oa0cel03          normal
         FD_12_oa0cel03          normal
         FD_13_oa0cel03          normal
         FD_14_oa0cel03          normal
         FD_15_oa0cel03          normal
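Leftover names such as FD_10_exastlx01 are easy to miss in a long listing. As a rough sketch, and assuming cell disk names normally end in an underscore followed by the cell name as they do in this post, a small filter can point them out. foreign_celldisks is a hypothetical helper name, not a CellCLI feature.

```shell
#!/bin/sh
# Hypothetical helper (not part of CellCLI).
# Usage: cellcli -e list celldisk | foreign_celldisks <cell-name>
# Reads "<name> <status...>" lines on stdin and prints every cell disk
# name that does NOT end in "_<cell-name>".
foreign_celldisks() {
    cell=$1
    awk -v suffix="_$cell" '
        NF >= 2 {
            n = length($1)
            s = length(suffix)
            # Compare the tail of the name with the expected suffix.
            if (n < s || substr($1, n - s + 1) != suffix)
                print $1
        }'
}
```

For the first listing in this section, `cellcli -e list celldisk | foreign_celldisks oa0cel03` would single out FD_10_exastlx01; the helper only reports names and leaves the drop and recreate decision to the administrator.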
Drop the flash log and flash cache
[root@oa0cel03 ~]# cellcli -e drop flashlog
Flash log oa0cel03_FLASHLOG successfully dropped
[root@oa0cel03 ~]# cellcli -e drop flashcache
Flash cache oa0cel03_FLASHCACHE successfully dropped
Attempt to recreate the flash log and flash cache
[root@oa0cel03 ~]# cellcli -e create flashlog all size=512M
Flash log oa0cel03_FLASHLOG successfully created, but the following cell disks were degraded because their statuses are not normal: FD_00_oa0cel03, FD_01_oa0cel03, FD_02_oa0cel03, FD_03_oa0cel03
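Because some cell disks no longer have a physical flash disk behind them, the flash log cannot be created cleanly, so the corresponding cell disks have to be dropped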
[root@oa0cel03 ~]# cellcli -e drop celldisk FD_00_oa0cel03
CELL-04519: Cannot complete the drop of cell disk: FD_00_oa0cel03. Received error:
CELL-04516: LUN Object cannot be obtained for cell disk: FD_00_oa0cel03
Cell disks not dropped: FD_00_oa0cel03
-- force the drop
[root@oa0cel03 ~]# cellcli -e drop celldisk FD_00_oa0cel03 force
CellDisk FD_00_oa0cel03 successfully dropped
[root@oa0cel03 ~]# cellcli -e drop celldisk FD_01_oa0cel03 force
CellDisk FD_01_oa0cel03 successfully dropped
[root@oa0cel03 ~]# cellcli -e drop celldisk FD_02_oa0cel03 force
CellDisk FD_02_oa0cel03 successfully dropped
[root@oa0cel03 ~]# cellcli -e drop celldisk FD_03_oa0cel03 force
CellDisk FD_03_oa0cel03 successfully dropped
[root@oa0cel03 ~]# cellcli -e list celldisk
         CD_00_oa0cel03          normal
         CD_01_oa0cel03          normal
         CD_02_oa0cel03          normal
         CD_03_oa0cel03          normal
         CD_04_oa0cel03          normal
         CD_05_oa0cel03          normal
         CD_06_oa0cel03          normal
         CD_07_oa0cel03          normal
         CD_08_oa0cel03          normal
         CD_09_oa0cel03          normal
         CD_10_oa0cel03          normal
         CD_11_oa0cel03          normal
         FD_04_oa0cel03          normal
         FD_05_oa0cel03          normal
         FD_06_oa0cel03          normal
         FD_07_oa0cel03          normal
         FD_08_oa0cel03          normal
         FD_09_oa0cel03          normal
         FD_10_oa0cel03          normal
         FD_11_oa0cel03          normal
         FD_12_oa0cel03          normal
         FD_13_oa0cel03          normal
         FD_14_oa0cel03          normal
         FD_15_oa0cel03          normal
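The four forced drops above were typed one by one. A sketch of how the same commands could be derived from the list celldisk output is shown here; it only prints the commands for review and does not run them. drop_commands_for_missing is a hypothetical helper name, and the sketch assumes the status text is exactly "not present" as in the listings in this post.

```shell
#!/bin/sh
# Hypothetical helper (not part of CellCLI).
# Usage: cellcli -e list celldisk | drop_commands_for_missing
# Reads "<name> <status...>" lines on stdin and prints, WITHOUT executing,
# one forced drop command for each cell disk whose status is "not present".
drop_commands_for_missing() {
    awk '$2 == "not" && $3 == "present" {
        print "cellcli -e drop celldisk " $1 " force"
    }'
}
```

Printing instead of executing is a deliberate choice: a forced drop bypasses the check that produced CELL-04519, so each generated line deserves a look before it is pasted into the shell.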
Create the flash log and flash cache
[root@oa0cel03 ~]# cellcli -e create flashlog all size=512M
Flash log oa0cel03_FLASHLOG successfully created
[root@oa0cel03 ~]# cellcli -e list flashlog detail
         name:                   oa0cel03_FLASHLOG
         cellDisk:               …………
         creationTime:           2024-06-21T18:20:51+08:00
         degradedCelldisks:
         effectiveSize:          384M
         efficiency:             100.0
         id:                     f3ab3882-fa03-4f49-b0ca-879ef3f2ac05
         size:                   384M
         status:                 normal
[root@oa0cel03 ~]# cellcli -e create flashcache all
Flash cache oa0cel03_FLASHCACHE successfully created
[root@oa0cel03 ~]# cellcli -e list flashcache detail
         name:                   oa0cel03_FLASHCACHE
         cellDisk:               …………
         creationTime:           2024-06-21T18:21:24+08:00
         degradedCelldisks:
         effectiveCacheSize:     1116.5625G
         id:                     2195ac46-3021-461f-a6d5-5f64ff1da546
         size:                   1116.5625G
         status:                 normal
[root@oa0cel03 ~]# cellcli -e list cell detail | grep flashCacheMode
         flashCacheMode:         WriteBack
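The flash log was requested with size=512M but reports 384M, and the flash cache comes out at 1116.5625G. With FD_04 through FD_15 left, twelve flash cell disks remain, and dividing the reported sizes by twelve gives a per-disk share. The short sketch below does only that arithmetic. per_disk is a hypothetical helper name, and reading the result as "the space of the four dropped flash disks is simply missing" is my interpretation of the numbers, not something the output states.

```shell
#!/bin/sh
# Hypothetical helper: per_disk TOTAL COUNT prints TOTAL / COUNT
# with six decimal places.
per_disk() {
    awk -v total="$1" -v count="$2" 'BEGIN { printf "%.6f\n", total / count }'
}

per_disk 384 12         # flash log, M per remaining flash disk
per_disk 1116.5625 12   # flash cache, G per remaining flash disk
```

If the per-disk flash log share is multiplied by sixteen it lands back on the requested 512M, which is consistent with a cell that has lost four of sixteen flash disks.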
Activate the grid disks so that this cell's grid disks rejoin the ASM disk groups
[root@oa0cel03 ~]# cellcli -e "alter griddisk all active"
GridDisk DATA_oa0_CD_00_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_01_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_02_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_03_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_04_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_05_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_06_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_07_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_08_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_09_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_10_oa0cel03 successfully altered
GridDisk DATA_oa0_CD_11_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_02_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_03_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_04_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_05_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_06_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_07_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_08_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_09_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_10_oa0cel03 successfully altered
GridDisk DBFS_DG_CD_11_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_00_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_01_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_02_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_03_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_04_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_05_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_06_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_07_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_08_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_09_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_10_oa0cel03 successfully altered
GridDisk RECO_oa0_CD_11_oa0cel03 successfully altered
[root@oa0cel03 ~]# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome,status
         DATA_oa0_CD_00_oa0cel03         SYNCING         Yes     active
         DATA_oa0_CD_01_oa0cel03         SYNCING         Yes     active
         DATA_oa0_CD_02_oa0cel03         SYNCING         Yes     active
         DATA_oa0_CD_03_oa0cel03         SYNCING         Yes     active
         DATA_oa0_CD_04_oa0cel03         SYNCING         Yes     active
         DATA_oa0_CD_05_oa0cel03         SYNCING         Yes     active
         DATA_oa0_CD_06_oa0cel03         SYNCING         Yes     active
         DATA_oa0_CD_07_oa0cel03         SYNCING         Yes     active
         DATA_oa0_CD_08_oa0cel03         SYNCING         Yes     active
         DATA_oa0_CD_09_oa0cel03         SYNCING         Yes     active
         DATA_oa0_CD_10_oa0cel03         SYNCING         Yes     active
         DATA_oa0_CD_11_oa0cel03         SYNCING         Yes     active
         DBFS_DG_CD_02_oa0cel03          ONLINE          Yes     active
         DBFS_DG_CD_03_oa0cel03          ONLINE          Yes     active
         DBFS_DG_CD_04_oa0cel03          ONLINE          Yes     active
         DBFS_DG_CD_05_oa0cel03          ONLINE          Yes     active
         DBFS_DG_CD_06_oa0cel03          ONLINE          Yes     active
         DBFS_DG_CD_07_oa0cel03          ONLINE          Yes     active
         DBFS_DG_CD_08_oa0cel03          ONLINE          Yes     active
         DBFS_DG_CD_09_oa0cel03          ONLINE          Yes     active
         DBFS_DG_CD_10_oa0cel03          ONLINE          Yes     active
         DBFS_DG_CD_11_oa0cel03          ONLINE          Yes     active
         RECO_oa0_CD_00_oa0cel03         SYNCING         Yes     active
         RECO_oa0_CD_01_oa0cel03         SYNCING         Yes     active
         RECO_oa0_CD_02_oa0cel03         SYNCING         Yes     active
         RECO_oa0_CD_03_oa0cel03         SYNCING         Yes     active
         RECO_oa0_CD_04_oa0cel03         SYNCING         Yes     active
         RECO_oa0_CD_05_oa0cel03         SYNCING         Yes     active
         RECO_oa0_CD_06_oa0cel03         SYNCING         Yes     active
         RECO_oa0_CD_07_oa0cel03         SYNCING         Yes     active
         RECO_oa0_CD_08_oa0cel03         SYNCING         Yes     active
         RECO_oa0_CD_09_oa0cel03         SYNCING         Yes     active
         RECO_oa0_CD_10_oa0cel03         SYNCING         Yes     active
         RECO_oa0_CD_11_oa0cel03         SYNCING         Yes     active
[root@oa0cel03 ~]# cellcli -e list metriccurrent attributes name,metricvalue where name like 'FC_BY_DIRTY.*'
         FC_BY_DIRTY     585 MB
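FC_BY_DIRTY is read several times in this post: before the flush, after it, and again once the grid disks are back. Because the value is printed with a thousands separator (38,820 MB), it needs a small cleanup before it can be compared in a script. The following is a minimal sketch under that assumption; dirty_mb is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not part of CellCLI).
# Usage: cellcli -e list metriccurrent attributes name,metricvalue \
#            where name like 'FC_BY_DIRTY.*' | dirty_mb
# Reads lines such as "FC_BY_DIRTY 38,820 MB" on stdin and prints the
# numeric value with the thousands separators removed.
dirty_mb() {
    awk '$1 == "FC_BY_DIRTY" { gsub(",", "", $2); print $2 }'
}
```

A value of 0.000 after alter flashcache all flush suggests the write-back cache holds no unwritten data, while the 585 MB seen at the end shows dirty blocks accumulating again once the cell is serving I/O.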