In this 11gR2 setup the voting disk and the OCR are stored in the same disk group, so they are recovered together.
View the existing OCR backups and the current voting disk
[grid@rac1 admin]$ ocrconfig -showbackup
rac1     2019/05/08 12:27:51     /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr
rac1     2019/05/07 15:32:37     /u01/app/11.2.0/grid/cdata/rac-cluster/backup01.ocr
rac1     2019/05/07 15:32:37     /u01/app/11.2.0/grid/cdata/rac-cluster/day.ocr
rac1     2019/05/07 15:32:37     /u01/app/11.2.0/grid/cdata/rac-cluster/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available
[grid@rac1 admin]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   88e9adcf20db4f88bfba7ac8848ff68b (ORCL:VDKBACK) [OCR]
Located 1 voting disk(s).
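When several backups are listed, the newest one is usually the first candidate for a restore. The following is a minimal sketch, assuming the four-column layout shown above (node, date, time, path). The function name latest_ocr_backup is hypothetical and not part of any Oracle tool.

```shell
# Hypothetical helper: read `ocrconfig -showbackup` output on stdin and
# print the path of the most recent backup.
# Assumption: every backup line has exactly four fields (node, date, time, path)
# and the timestamps are zero-padded, so a plain reverse sort orders them.
latest_ocr_backup() {
    awk 'NF == 4 && $4 ~ /\.ocr$/ { print $2, $3, $4 }' |
        sort -r |
        head -n 1 |
        awk '{ print $3 }'
}
```

It could be used as `ocrconfig -showbackup | latest_ocr_backup`. Message lines such as PROT-25 are skipped because they do not have four fields ending in .ocr.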
Stop the database
[grid@rac1 admin]$ srvctl stop database -d racdb -o immediate
Shut down the cluster
[root@rac1 ~]# crsctl stop cluster -all -f
Find the physical device behind the ASM disk used by disk group OCR
[grid@rac1 admin]$ oracleasm querydisk -d VDKBACK
Disk "VDKBACK" is a valid ASM disk on device /dev/sdf1[8,81]
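Because the next step overwrites this device, it is safer to extract the device path from the command output than to retype it. A minimal sketch, assuming the output format shown in the transcript above (other oracleasm versions may print it differently). The name asm_disk_device is hypothetical.

```shell
# Hypothetical helper: read `oracleasm querydisk -d <NAME>` output on stdin
# and print the device path, e.g. /dev/sdf1 from
#   Disk "VDKBACK" is a valid ASM disk on device /dev/sdf1[8,81]
# Assumption: the path is followed directly by "[major,minor]".
asm_disk_device() {
    sed -n 's|.* on device \(/dev/[^[]*\)\[.*|\1|p'
}
```

For example, `oracleasm querydisk -d VDKBACK | asm_disk_device` would print the device path, or nothing if the line does not match.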
Simulate the failure by overwriting the start of the device
[root@rac1 ~]# dd if=/dev/zero of=/dev/sdf1 bs=1024K count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.006116 seconds, 171 MB/s
[root@rac1 ~]#
Start the cluster
crsctl start cluster -all
The command hangs and only returns after a long time
[root@rac1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
[root@rac1 ~]# crsctl start cluster -all
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac2'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac2'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac2'
CRS-2674: Start of 'ora.diskmon' on 'rac1' failed
CRS-2679: Attempting to clean 'ora.diskmon' on 'rac1'
CRS-2674: Start of 'ora.diskmon' on 'rac2' failed
CRS-2679: Attempting to clean 'ora.diskmon' on 'rac2'
CRS-2681: Clean of 'ora.diskmon' on 'rac1' succeeded
CRS-2681: Clean of 'ora.diskmon' on 'rac2' succeeded
CRS-4404: The following nodes did not reply within the allotted time:
rac1, rac2
Trying crsctl start crs does not help either
[root@rac1 ~]# crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
Check logs
Nothing abnormal shows up in more /var/log/messages
more $ORACLE_HOME/log/rac1/cssd/ocssd.log
The voting file errors show up in the CSSD log
[root@rac1 ~]# tail -30 $ORACLE_HOME/log/rac1/cssd/ocssd.log
2019-05-08 16:34:29.389: [ CLSF][1146157376]checksum failed for disk:ORCL:VDKOCR1:
2019-05-08 16:34:29.389: [ CLSF][1146157376]Read ASM header off dev:ORCL:VDKOCR1:0:0
2019-05-08 16:34:29.389: [ SKGFD][1146157376]Lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: closing handle 0x72f7190 for disk :ORCL:VDKOCR1:
2019-05-08 16:34:29.389: [ CLSF][1146157376]Read ASM header off dev:ORCL:VDKOCR2:0:0
2019-05-08 16:34:29.390: [ SKGFD][1146157376]Lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: closing handle 0x72f7b30 for disk :ORCL:VDKOCR2:
2019-05-08 16:34:29.401: [ CLSF][1146157376]Read ASM header off dev:ORCL:VDKVOTE:0:0
2019-05-08 16:34:29.401: [ SKGFD][1146157376]Lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: closing handle 0x72f84d0 for disk :ORCL:VDKVOTE:
2019-05-08 16:34:29.401: [ CSSD][1146157376]clssnmvDiskVerify: file is not a voting file, cannot recognize on-disk signature for a voting
2019-05-08 16:34:29.401: [ CSSD][1146157376]clssnmvDiskVerify: file is not a voting file, cannot recognize on-disk signature for a voting
2019-05-08 16:34:29.401: [ CSSD][1146157376]clssnmvDiskVerify: file is not a voting file, cannot recognize on-disk signature for a voting
2019-05-08 16:34:29.401: [ CSSD][1146157376]clssnmvDiskVerify: file is not a voting file, cannot recognize on-disk signature for a voting
2019-05-08 16:34:29.401: [ CSSD][1146157376]clssnmvDiskVerify: file is not a voting file, cannot recognize on-disk signature for a voting
2019-05-08 16:34:29.401: [ CSSD][1146157376]clssnmvDiskVerify: Successful discovery of 0 disks
2019-05-08 16:34:29.401: [ CSSD][1146157376]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2019-05-08 16:34:29.401: [ CSSD][1146157376]clssnmvFindInitialConfigs: No voting files found
2019-05-08 16:34:29.401: [ CSSD][1146157376]###################################
2019-05-08 16:34:29.401: [ CSSD][1146157376]clssscExit: CSSD signal 11 in thread clssnmvDDiscThread
2019-05-08 16:34:29.401: [ CSSD][1146157376]###################################
2019-05-08 16:34:29.401: [ CSSD][1146157376] ----- Call Stack Trace -----
2019-05-08 16:34:29.401: [ CSSD][1135667520]clssgmClientShutdown: total iocapables 0
2019-05-08 16:34:29.401: [ CSSD][1135667520]clssgmClientShutdown: graceful shutdown completed.
2019-05-08 16:34:29.401: [ CSSD][1146157376]calling call entry argument values in hex
2019-05-08 16:34:29.402: [ CSSD][1146157376]location type point (? means dubious value)
2019-05-08 16:34:29.402: [ CSSD][1146157376]-------------------- -------- -------------------- ----------------------------
[root@rac1 ~]#
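The key line in this log is the discovery count: CSSD found 0 voting disks. A minimal sketch for pulling that number out of a long log, assuming the message wording shown above. The function name voting_disks_discovered is hypothetical.

```shell
# Hypothetical helper: read ocssd.log text on stdin and print the disk count
# from the last "Successful discovery of N disks" message.
# Assumption: the message text matches the 11.2.0.1 log excerpt above.
voting_disks_discovered() {
    sed -n 's/.*clssnmvDiskVerify: Successful discovery of \([0-9][0-9]*\) disks.*/\1/p' |
        tail -n 1
}
```

For example, `voting_disks_discovered < $ORACLE_HOME/log/rac1/cssd/ocssd.log` prints the count from the most recent discovery attempt, and prints nothing if no such message is present.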
[root@rac1 ~]# /etc/init.d/oracleasm scandisks
Scanning the system for Oracle ASMLib disks: [ OK ]
[root@rac1 ~]# /etc/init.d/oracleasm listdisks
VDKDATA
VDKOCR1
VDKOCR2
VDKVOTE
The ASM disk VDKBACK, which held the OCR, is missing from the list
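With more disks, comparing the list by eye gets error-prone. A minimal sketch that reports which expected disk labels are absent from the listdisks output. The function name missing_asm_disks and the idea of passing the expected labels as arguments are assumptions for illustration.

```shell
# Hypothetical helper: read `oracleasm listdisks` output on stdin and print
# every expected label (given as arguments) that is not in the list.
missing_asm_disks() {
    present=$(cat)
    for disk in "$@"; do
        # -x: match the whole line, -F: treat the label as a fixed string
        printf '%s\n' "$present" | grep -qxF -- "$disk" || printf '%s\n' "$disk"
    done
}
```

For example, `/etc/init.d/oracleasm listdisks | missing_asm_disks VDKBACK VDKDATA VDKOCR1 VDKOCR2 VDKVOTE` would print only VDKBACK for the listing above.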
Recreate the ASM disk
[root@rac1 ~]# /usr/sbin/oracleasm createdisk VDKBACK /dev/sdf1
Writing disk header: done
Instantiating disk: done
[root@rac1 ~]#
Stop the cluster stack
[root@rac1 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Start the cluster stack in exclusive mode with -excl -nocrs, which starts the ASM instance but not CRS
[root@rac1 ~]# crsctl start crs -excl -nocrs
[root@rac1 ~]# crsctl start crs -excl
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2679: Attempting to clean 'ora.diskmon' on 'rac1'
CRS-2681: Clean of 'ora.diskmon' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded
[root@rac1 ~]# crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.
[root@rac1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager
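Before touching the disk group it helps to confirm that the stack is in the expected state: High Availability Services and CSS online, CRS not reachable. A minimal sketch that checks for the three message codes seen in the transcript above. The function names has_code and in_exclusive_mode_without_crs are hypothetical, and relying on these exact codes is an assumption based on this 11.2.0.1 output.

```shell
# Hypothetical helper: succeed if the given text contains a line that
# starts with the given message code followed by a colon.
has_code() {
    printf '%s\n' "$1" | grep -q "^$2:"
}

# Hypothetical check: read `crsctl check crs` output on stdin and return 0
# when OHAS (CRS-4638) and CSS (CRS-4529) are online while CRS is not
# reachable (CRS-4535), as in the transcript above.
in_exclusive_mode_without_crs() {
    status=$(cat)
    has_code "$status" CRS-4638 &&
        has_code "$status" CRS-4529 &&
        has_code "$status" CRS-4535
}
```

For example, `crsctl check crs | in_exclusive_mode_without_crs && echo "ready for restore"`. The earlier failed state, which reported CRS-4530 instead of CRS-4529, does not pass this check.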
Recreate the disk group that held the OCR and the voting disk:
Note: run this as the grid user.
[grid@rac1 admin]$ sqlplus "/as sysasm"
SQL*Plus: Release 11.2.0.1.0 Production on Wed May 8 17:03:35 2019
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> col path for a50
SQL> select path,header_status from v$asm_disk;
PATH                                               HEADER_STATU
-------------------------------------------------- ------------
ORCL:VDKBACK                                       PROVISIONED
ORCL:VDKDATA                                       MEMBER
ORCL:VDKVOTE                                       MEMBER
ORCL:VDKOCR2                                       MEMBER
ORCL:VDKOCR1                                       MEMBER
SQL> create diskgroup OCR EXTERNAL REDUNDANCY DISK 'ORCL:VDKBACK' ;
Diskgroup created.
With the OCR disk group created, restoring the OCR from a backup fails with an error
[root@rac1 ~]# ocrconfig -restore /u01/app/11.2.0/grid/cdata/rac-cluster/backup01.ocr
PROT-16: Internal Error
[root@rac1 ~]# ocrconfig -restore /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr
PROT-16: Internal Error
[root@rac1 ~]# ocrconfig -restore /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr
PROT-16: Internal Error
An online search turned up the following fix: drop the disk group and recreate it with explicit compatible.rdbms and compatible.asm attributes
[grid@rac1 admin]$ sqlplus "/as sysasm"
SQL*Plus: Release 11.2.0.1.0 Production on Wed May 8 17:38:49 2019
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> drop diskgroup OCR;
Diskgroup dropped.
SQL> create diskgroup OCR EXTERNAL REDUNDANCY DISK 'ORCL:VDKBACK' attribute 'compatible.rdbms' = '11.1.0.0.0','compatible.asm' = '11.1.0.0.0';
Diskgroup created.
After that, ocrconfig -restore /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr completes successfully
[root@rac1 ~]# ocrconfig -restore /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr
[root@rac1 ~]#
Recreate the voting disk in the OCR disk group
crsctl replace votedisk +OCR
Once the OCR and the voting disk are recovered, CRS and the other services start automatically when the stack is restarted. Verify the voting disk, then restart CRS
crsctl query css votedisk
[root@rac1 ~]# crsctl stop crs
[root@rac1 ~]# crsctl start crs
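The recovery sequence above can be collected into a dry-run helper that only prints the commands in order, so they can be reviewed before anything is executed. This is a sketch, not a tested recovery script. The function name print_recovery_plan and its two parameters are hypothetical, and the oracleasm createdisk and SQL*Plus disk group steps are left out because they are run interactively.

```shell
# Hypothetical dry-run helper: print (do not execute) the crsctl/ocrconfig
# commands used in this walkthrough, in order.
#   $1 = path of the OCR backup to restore
#   $2 = disk group name without the leading "+"
# Assumption: the ASM disk and the disk group have already been recreated.
print_recovery_plan() {
    backup=$1
    diskgroup=$2
    cat <<EOF
crsctl stop crs -f
crsctl start crs -excl -nocrs
ocrconfig -restore $backup
crsctl replace votedisk +$diskgroup
crsctl query css votedisk
crsctl stop crs
crsctl start crs
EOF
}
```

For example, `print_recovery_plan /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr OCR` lists the seven commands for this cluster. All of them are run as root on the node that holds the backup.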