Skip to main content

How to Recover Diskgroup in Case of Transient Failures



 Uses of  DISK_REPAIR_TIME and FAILGROUP_REPAIR_TIME Attribute

DISK_REPAIR_TIME:

DISK_REPAIR_TIME is an attribute for diskgroup and the default value is 3.6Hrs. If ASM unable to perform IO on any disk in the diskgroup it place taht disk offline and wait for time specified in DISK_REPAIR_TIME for the disks be back online. If the disk did not back online with in this time period Oracle ASM then drops that disk from the diskgroup. 

SQL> select name,value from v$asm_attribute where group_number=1 and name like '%disk_repair_time%';
NAME                                     VALUE
---------------------------------------- --------------------------------------------------
disk_repair_time                         3.6 h

Once you hit a failure and the disk is no more available, ASM marks this disk offline and time starts ticking until disk_repair_time value is reached. If the issue is not fixed and disk is still not available asm drops the disk from diskgroup. Once the disk is dropped rebalance operation is triggered, which may take longer to complete depending on many factors like power limit used, amount of data to rebalance etc..
and once the disk is available to server again you will add it back to the diskgroup and again rebalance operation will take place. This is definitely going to take longer time.
NOTE:
If you hit the issue and disk is dropped you can follow the instruction provided in How to Add Dropped Disks Back into Diskgroup to repair it.

If you know how long it may take in your environment to fix an issue like this you can avoid the disk drop adjusting the value of disk_repair_time attribute. So if disk_repair_time (3.6Hrs) not enough for your environment consider increasing it.

 Lets say its 72Hrs. In this case once the disk suffers failure ASM marks it offline and keeps tracking the changes on the disk. Say it took 50 Hrs to fix the disk issue. Disk is still marked offline but oracle ASM has all the changes of 50Hrs. Once  the disk is back you can simply bring it online and oracle has to just synchronize the changes resulting faster recovery of the disk and less administration effort.

How to Bring the Disk Online 

Once the disk is repaired you can put it back online using one of the following command.
ALTER DISKGROUP DATA ONLINE ALL;
ALTER DISKGROUP DATA ONLINE DISK DATA_001;
ALTER DISKGROUP DATA ONLINE DISKS IN FAILGROUP FG2;

NOTE: You can monitor the progress of online disk command in V$ASM_OPERATION view
SQL> SELECT GROUP_NUMBER, PASS, STATE FROM V$ASM_OPERATION;
 
GROUP_NUMBER PASS      STAT
------------ --------- ----
           1 RESYNC    RUN
           1 REBALANCE WAIT
           1 COMPACT   WAIT

    
NOTE: An offline operation does not generate a display in a V$ASM_OPERATION view query because it does not cause any rebalance to occur.
The drop operation but triggers rebalance and again can be monitored from V$ASM_OPERATION view

You can read more about Oracle ASM fast Mirror Resync in post Oracle ASM Fast Mirror Resync

How to increase DISK_REPAIR_TIME Attribute Value

ALTER DISKGROUP DATA SET ATTRIBUTE 'disk_repair_time' = '72.5h';
ALTER DISKGROUP DATA SET ATTRIBUTE 'disk_repair_time' = '4350m';

NOTE: You can specify the disk_group_repair time value either in Hours or in Minutes. Also note that the modified value will only be applicable for new failures. The above modification will not affect already offline disks

FAILGROUP_REPAIR_TIME

FAILGROUP_REPAIR_TIME: This attribute specifies a default repair time for the failure groups in the disk group. Its used when entire failure group has failed.
for Example if all the disks in a failgroup is offline for example (Say complete storage from one of the datacenter is offline).he default value is 24 hours (24h)


If default value 24h is not sufficient for your environment, you can change it using alter diskgroup command

ALTER DISKGROUP DATA SET ATTRIBUTE 'failgroup_repair_time' = '72.5h';


Comments

Popular posts from this blog

How to Power On/off Oracle Exadata Machine

<<Back to Exadata Main Page How to Power On/off Oracle Exadata Machine Oracle Exadata machines can be powered on/off either by pressing the power button on front of the server or by logging in to the ILOM interface. Powering on servers using  button on front of the server The power on sequence is as follows. 1. Start Rack, including switches  Note:- Ensure the switches have had power applied for a few minutes to complete power on  configuration before starting Exadata Storage Servers 2.Start Exadata Storage Servers  Note:- Ensure all Exadata Storage Servers complete the boot process before starting the   database servers 3. Start Database Servers Powering On Servers Remotely using ILOM The ILOM can be accessed using the Web console, the command-line interface (CLI), IPMI, or SNMP. For example, to apply power to server dm01cel01 using IPMI, where dm01cel01-ilom is the host name of the ILOM for the serve...

ORA-28374: typed master key not found in wallet

<<Back to Oracle DB Security Main Page ORA-46665: master keys not activated for all PDBs during REKEY SQL> ADMINISTER KEY MANAGEMENT SET KEY FORCE KEYSTORE IDENTIFIED BY xxxx WITH BACKUP CONTAINER = ALL ; ADMINISTER KEY MANAGEMENT SET KEY FORCE KEYSTORE IDENTIFIED BY xxxx WITH BACKUP CONTAINER = ALL * ERROR at line 1: ORA-46665: master keys not activated for all PDBs during REKEY I found following in the trace file REKEY: Create Key in PDB 3 resulted in error 46658 *** 2019-02-06T15:27:04.667485+01:00 (CDB$ROOT(1)) REKEY: Activation of Key AdnU5OzNP08Qv1mIyXhP/64AAAAAAAAAAAAAAAAAAAAAAAAAAAAA in PDB 3 resulted in error 28374 REKEY: Keystore needs to be restored from the REKEY backup.Aborting REKEY! Cause: All this hassle started because I accidently deleted the wallet and all wallet backup files too and also forgot the keystore password. There was no way to restore the wallet back. Fortunately in my case the PDB which had encrypted data was supposed to be deco...

How to Find VIP of an Oracle RAC Cluster

<<Back to Oracle RAC Main Page How to Find Out VIP of an Oracle RAC Cluster Login clusterware owner (oracle) and execute the below command to find out the VIP hostname used in Oracle RAC $ olsnodes -i node1     node1-vip node2     node2-vip OR $ srvctl config nodeapps -viponly Network 1 exists Subnet IPv4: 10.0.0.0/255.255.0.0/bondeth0, static Subnet IPv6: Ping Targets: Network is enabled Network is individually enabled on nodes: Network is individually disabled on nodes: VIP exists: network number 1, hosting node node1 VIP Name: node1-vip VIP IPv4 Address: 10.0.0.1 VIP IPv6 Address: VIP is enabled. VIP is individually enabled on nodes: VIP is individually disabled on nodes: VIP exists: network number 1, hosting node node2 VIP Name: node2-vip VIP IPv4 Address: 10.0.0.2 VIP IPv6 Address: VIP is enabled. VIP is individually enabled on nodes: VIP is individually disabled on nodes:

ORA-46630: keystore cannot be created at the specified location

<<Back to DB Administration Main Page ORA-46630: keystore cannot be created at the specified location CDB011> ADMINISTER KEY MANAGEMENT CREATE KEYSTORE '+DATAC4/CDB01/wallet/' IDENTIFIED BY "xxxxxxx"; ADMINISTER KEY MANAGEMENT CREATE KEYSTORE '+DATAC4/CDB01/wallet/' IDENTIFIED BY "EncTest123" * ERROR at line 1: ORA-46630: keystore cannot be created at the specified location Cause  Creating a keystore at a location where there is already a keystore exists Solution To solve the problem, use a different location to create a keystore (use ENCRYPTION_WALLET_LOCATION in sqlnet.ora file to specify the keystore location), or move this ewallet.p12 file to some other location. Note: Oracle does not recommend deleting keystore file (ewallet.p12) that belongs to a database. If you have multiple keystores, you can choose to merge them rather than deleting either of them.

ORA-65104: operation not allowed on an inactive pluggable database alter pluggable database open

<<Back to DB Administration Main Page ORA-65104: operation not allowed on an inactive pluggable database SQL> alter pluggable database TEST_CLON open; alter pluggable database TEST_CLON open * ERROR at line 1: ORA-65104: operation not allowed on an inactive pluggable database Cause The pluggable database status was UNUSABLE. It was still being created or there was an error during the create operation. A PDB can only be opened if it is successfully created and its status is marked as NEW in cdb_pdbs.status column SQL> select PDB_NAME,STATUS from cdb_pdbs; PDB_NAME             STATUS -------------------- --------------------------- PDB$SEED             NORMAL TEST_CLON            UNUSABLE Solution:  Drop the PDB and create it again. Related Posts How to Clone Oracle PDB (Pluggable Database) with in the Same Container

ORA-16905: The member was not enabled yet

<<Back to Oracle DataGuard Main Page ORA-16905 Physical Standby Database is disabled DGMGRL> show configuration; Configuration - DG_ORCL1P   Protection Mode: MaxPerformance   Members:   ORCL1PP - Primary database     ORCL1PS - Physical standby database (disabled)       ORA-16905: The member was not enabled yet. Fast-Start Failover:  Disabled Configuration Status: SUCCESS   (status updated 58 seconds ago) DGMGRL> DGMGRL> enable database 'ORCL1PS'; Enabled. DGMGRL>  show configuration; Configuration - DG_ORCL1P   Protection Mode: MaxPerformance   Members:   ORCL1PP - Primary database     ORCL1PS - Physical standby database Fast-Start Failover:  Disabled Configuration Status: SUCCESS   (status updated 38 seconds ago)

How to Switch Log File from All Instances in RAC

<<Back to Oracle RAC Main Page Switch The Log File of All Instances in Oracle RAC. In many cases you need to switch the logfile of the database. You can switch logfile using alter system switch logfile command but if you want to switch the logfile from all the instances you need to execute the command on all the instances individually and therefore you must login on all the instances. You can avoid this and switch logfile of all instances by just running the below command from any of the instance in RAC database SQL> ALTER SYSTEM SWITCH ALL LOGFILE;   System altered.

Starting RMAN and connecting to Database

  <<Back to Oracle Backup & Recovery Main Page Starting RMAN and connecting to Database Starting RMAN and connecting to Database To start RMAN you need to set the environment and type rman and press enter. You can connect to database either using connect command or using command line option. using command line option localhost:$ export ORACLE_HOME=/ora_app/product/18c/dbd2 localhost:$ export PATH=$ORACLE_HOME/bin:$PATH localhost:$ export ORACLE_SID=ORCL1P localhost:$ rman target / Recovery Manager: Release 18.0.0.0.0 - Production on Sun Apr 4 08:11:01 2021 Version 18.11.0.0.0 Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved. connected to target database: ORCL1P (DBID=4215484517) RMAN> using connect option localhost:$ rman RMAN> connect target sys@ORCL1P  target database Password:******** connected to target database: ORCL1P (DBID=4215484517) NOTE: To use connect command you need to ensure that  you have proper TNS sentry...

How to Attach to a Datapump Job and Check Status of Export or Import

<<Back to Oracle DATAPUMP Main Page How to check the progress of  export or import Jobs You can attach to the export/import  job using ATTACH parameter of oracle datapump utility. Once you are attached to the job you check its status by typing STATUS command. Let us see how Step1>  Find the Export/Import Job Name You can find the datapump job information from  DBA_DATAPUMP_JOBS or  USER_DATAPUMP_JOBS view. SQL> SELECT OWNER_NAME,JOB_NAME,OPERATION,JOB_MODE,STATE from DBA_DATAPUMP_JOBS; OWNER_NAME JOB_NAME                       OPERATION            JOB_MODE   STATE ---------- ------------------------------ -------------------- ---------- ---------- SYSTEM     SYS_EXPORT_FULL_02          ...

ORA-46655: no valid keys in the file from which keys are to be imported

<<Back to DB Administration Main Page SQL> administer key management import encryption keys with secret "xxxx" from '/tmp/pdb02_tde_key.exp' force keystore identified by "xxxx" with backup; administer key management import encryption keys with secret "xxxxxx" from '/tmp/pdb02_tde_key.exp' force keystore identified by "xxxxxx" with backup * ERROR at line 1: ORA-46655: no valid keys in the file from which keys are to be imported Cause: Either the keys to be imported already present in the target database or correct container (PDB) is not set. Solution: In my case I got the error because I attempted to import the keys for newly plugged database PDB02 from CDB$ROOT container. To Solve the issue just switched to the correct container and re run the import. SQL> show con_name CON_NAME ------------------------------ CDB$ROOT <===Wrong Container selected  SQL> alter session set container=PDB02; Session alt...