ORACLE DB 的學習者們

2017年8月25日 星期五

ORACLE Data Recovery Advisor

Oracle Data Recovery Advisor (DRA) 可以用來偵測並且處理資料庫失效事件

SCENARIOS 1:非系統的 DATAFILE 遺失

首先進行備份

rman target /
 
backup as backupset tablespace uses ;
backup as backupset incremental level 0 database;
backup database plus archivelog delete all input;
list backup of tablespace uses;
shutdown immediate;
exit;
將資料庫關閉。將某一DATAFILE (非系統) 搬移到其他位置,然後啟動,將發生下列錯誤訊息
出現錯誤訊息。執行RMAN並登入,查詢錯誤訊息(LIST FAILURE)之細節。

呼叫DRA,找出修正方案

執行修正程序

SCENARIOS 2:非系統的 DATAFILE 損壞

首先進行備份

rman target /
 
backup as backupset tablespace noncrit;
backup as backupset incremental level 0 database;
backup database plus archivelog delete all input;
list backup of tablespace noncrit;
shutdown immediate;
exit;
然後使用編輯器去編輯DATAFILE,記得對第一列的DATAFILE做異動

編輯存檔後,當資料庫關閉時將發生錯誤

進入RMAN,透過DRA 查詢錯誤

查詢解決方案

查詢完畢,在RMAN 底下執行 REPAIR FAILURE 指令,將一步一步進行修復

RMAN> repair failure;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /oracle/app/oracle/diag/rdbms/prisedb/sedb/hm/reco_107603627.hm

contents of repair script:
   # restore and recover datafile
   restore datafile 14;
   recover datafile 14;

Do you really want to execute the above repair (enter YES or NO)? yes
executing repair script

Starting restore at 2017-08-25 23:05:25
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00014 to /oracle/app/oracle/oradata/sedb/noncrit.dbf
channel ORA_DISK_1: reading from backup piece /oracle/app/oracle/flash_recovery_area/PRISEDB/backupset/2017_08_25/o1_mf_nnndf_TAG20170825T205615_dt1wc0bs_.bkp
channel ORA_DISK_1: piece handle=/oracle/app/oracle/flash_recovery_area/PRISEDB/backupset/2017_08_25/o1_mf_nnndf_TAG20170825T205615_dt1wc0bs_.bkp tag=TAG20170825T205615
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15
Finished restore at 2017-08-25 23:05:40

Starting recover at 2017-08-25 23:05:40
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 401 is already on disk as file /oracle/app/oracle/oradata/archivelog/1_401_904775628.dbf
archived log for thread 1 with sequence 402 is already on disk as file /oracle/app/oracle/oradata/archivelog/1_402_904775628.dbf
archived log for thread 1 with sequence 403 is already on disk as file /oracle/app/oracle/oradata/archivelog/1_403_904775628.dbf
archived log file name=/oracle/app/oracle/oradata/archivelog/1_401_904775628.dbf thread=1 sequence=401
media recovery complete, elapsed time: 00:00:07
Finished recover at 2017-08-25 23:05:48
repair failure complete

Do you want to open the database (enter YES or NO)? yes
database opened

SCENARIOS 3:Incomplete Recovery with RMAN (Point-in-Time Recovery)

非完全的復原(Recovery)只能在資料庫處於 MOUNT 狀態下時進行,且須擁有資料庫及ARCHIVE LOG的備份才行。進行該復原只有兩個原因: Complete Recovery 不可能或無法實施;或是刻意回到過去特定時間點,故意忽略或遺失該時間點之後的資料變更。例如,某時間點不慎將某 TABLE 或甚至 TABLESPACE 刪除,或不慎進行了某個條件錯誤的DML,導致使用者資料被錯誤地刪改。

Incomplete Recovery 有以下四個步驟

  1. Mount the database
  2. Restore all the datafiles
  3. Recover the database until a certain point
  4. Open the database with reset logs
Incomplete Recovery 有三個選項

  1. Until time
  2. Until system change number
  3. Until log sequence number
前置工作:進行資料庫與ARCHIVE LOG 的備份

設定 NLS_DATE_FORMAT 環境參數,並進行備份

在UNIX下,執行

export NLS_DATE_FORMAT="dd-mm-yy hh24:mi:ss"

開兩個視窗,一個是RMAN,另一個是SQLPLUS,兩個都去設定前述的 NLS_DATE_FORMAT

RMAN

[oracle@db01 ~]$ rman target /
connected to target database: SEDB (DBID=1140755226)

RMAN> backup as compressed backupset database;
Starting backup at 26-08-17 15:30:36
using target database control file instead of recovery catalog
......
Finished Control File and SPFILE Autobackup at 26-08-17 15:33:16
SQLPLUS 以SYSTEM身分登入SQLPLUS CONSOLE,執行

SQL> alter system switch logfile;

RMAN 再回到RMAN視窗,備份ARCHIVED LOG

RMAN> backup archivelog all delete all input;

Starting backup at 26-08-17 15:44:12
current log archived
using channel ORA_DISK_1
channel ORA_DISK_1: starting archived log backup set
......
Finished Control File and SPFILE Autobackup at 26-08-17 15:45:24
SQLPLUS 以SYSTEM身分登入SQLPLUS CONSOLE,執行下列步驟

SQL> alter system switch logfile;

紀錄異動前時間點,並且插入一筆資料

SQL> select sysdate from dual;   --紀錄刪除前時間

SYSDATE               
-----------------    
26-08-17 15:47:57    
SQL> insert into system.ex16 values(sysdate);   --插入一筆資料

1 row created.

SQL> commit;

Commit complete.
SQL> drop tablespace noncrit including contents and datafiles;   --刪除TABLESPACE

Tablespace dropped.
RMAN

產生一個復原用的 CONTROLFILE,它的時間點為刪除TABLESPACE 之前,也就是剛剛執行 SELECT SYSDATE 的時間。

RMAN> run{
2> shutdown immediate;
3> startup mount;
4> set until time = '26-08-17 15:46:00';
5> restore controlfile to '/oracle/app/oracle/oradata/sedb/control03.ctl';
6> }
-- 15:47:57 之後,插入一筆,然後刪除 TABLESPACE
-- 步驟 5 : 將復原用的 CONTROLFILE 寫入到新的 CONTROLFILE 檔案 << CONTROLFILE03.CTL >>
database closed
database dismounted
Oracle instance shut down

connected to target database (not started)
Oracle instance started
database mounted

Total System Global Area     477073408 bytes

Fixed Size                     1337324 bytes
Variable Size                297797652 bytes
Database Buffers             171966464 bytes
Redo Buffers                   5971968 bytes

executing command: SET until clause

Starting restore at 26-08-17 16:04:07
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=18 device type=DISK
......
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 26-08-17 16:04:09
執行畫面如下
SQLPLUS 以SYSTEM身分登入SQLPLUS CONSOLE,執行下列步驟

SQL> alter system set control_files='/oracle/app/oracle/oradata/sedb/control03.ctl' scope=spfile;                                                     
System altered.

SQL> shutdown immediate;
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.

SQL> startup mount;
ORACLE instance started.

Total System Global Area  477073408 bytes
Fixed Size                  1337324 bytes
Variable Size             297797652 bytes
Database Buffers          171966464 bytes
Redo Buffers                5971968 bytes
Database mounted.
SQL> show parameter control_files

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
control_files                        string      /oracle/app/oracle/oradata/sed
                                                 b/control03.ctl


RMAN

執行前述的4 個步驟,以進行 INCOMPLETE RECOVERY。

RMAN> run {
2> allocate channel d1 type disk;
3> allocate channel d2 type disk;
4> set until time = '26-08-17 15:46:00';
5> restore database;
6> recover database;
7> alter database open resetlogs;
8> }

using target database control file instead of recovery catalog
allocated channel: d1
channel d1: SID=20 device type=DISK

allocated channel: d2
channel d2: SID=21 device type=DISK
....
executing command: SET until clause

Starting restore at 26-08-17 16:46:45
....
searching for all files in the recovery area
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /oracle/app/oracle/flash_recovery_area/PRISEDB/autobackup/2017_08_26/                                                                                        o1_mf_s_953049400_dt3z58tf_.bkp
.....
Finished restore at 26-08-17 16:48:44

Starting recover at 26-08-17 16:48:44

starting media recovery

archived log for thread 1 with sequence 408 is already on disk as file /oracle/a                                                                                        pp/oracle/oradata/archivelog/1_408_904775628.dbf
.....
media recovery complete, elapsed time: 00:00:02
Finished recover at 26-08-17 16:48:48

database opened
released channel: d1
released channel: d2

此時已經完成復原,回到刪除TABLESPACE 之前,而在刪除前INSERT的紀錄也因為復原而消失
檢查 V$LOG,確認 LOG SEQUENCE 是否已經重置

Block Recovery

一般而言,RMAN在備份時若遇到BLOCK ERROR時,會中斷備份。若在備份前設定BLOCK ERROR的可容許範圍,則在發生BLOCK ERROR時,RMAN不會被中斷。

若是出現BLOCK ERROR時,可採取的步驟如下:

RMAN> run {
  set maxcorrupt for datafile 7 to 100;
  backup datafile 7;
  }
  
  RMAN> block recover datafile 7 block 5;
  
  RMAN> block recover datafile 7 blocck 5,6,7 datafile 9 block 21,25;
  -- 一次復原數個BLOCK
  RMAN> block recover datafile 7 block 5 from backupset 1093;
  
  RMAN> block reselect * from v$database_block_corruption;

  RMAN> block recover corruption list until time sysdate - 7;

要查詢有哪些 BLOCK 有損壞,可透過下列 VIEW 來查詢

select * from v$database_block_corruption;

select * from v$backup_corruption;

select * from v$copy_corruption;

管理 RMAN 的PROCESS

RMAN的 PROCESS 可以由 V$PROCESS 和 V$SESSION 兩個VIEW做JOIN 來查詢

select sid, spid, client_info
from v$process p join v$session s on (p.addr = s.paddr)
where client_info like '%rman%';
但是上述查詢要透過欄位 CLIENT_INFO來查詢

透過 SET COMMAND ID 可以將欄位寫入註解資訊以利查詢

run
{
   set command id to 'bkup database';
   backup as compressed backupset database delete all input;
}
插入COMMAND ID 以後,可用下列方式查詢

select sid, spid, client_info
from v$process p join v$session s on (p.addr = s.paddr)
where client_info like '%id=%';
查詢顯示如下
另外一個可以查詢的動態VIEW 是 V$SESSION_LONGOPS,但在查詢以前,必須將參數 STATISTICS_LEVEL 設為 TYPICAL 或是 ALL

設為上述 LEVEL 以後,透過上述設定 RMAN 的 COMMAND ID,在執行RMAN的同時,可以查詢執行結果

select sid, serial#, opname, sofar, totalwork
from v$session_longops
where opname like 'RMAN%'
and sofar <> totalwork;
結果如下