9i中若dg的remove archive stuck卡住可能即便在Max Performance最大性能模式下也会导致Primary database 主库的local archive 本地归档无法正常进行, 进而导致Primary db hang挂起。
Solaris sparce 64 平台上的9.2.0.5 库 的相关日志:
Verifying the issue in the alert log Fri Oct 28 08:10:12 2011 ALTER SYSTEM SET log_archive_dest_1='location=/u24/app/oradata/temp_arch' SCOPE=BOTH; Fri Oct 28 08:10:51 2011 ORACLE Instance maclean - Can not allocate log, archival required Fri Oct 28 08:10:51 2011 ARCH: Connecting to console port... Thread 1 cannot allocate new log, sequence 18963 All online logs needed archiving Current log# 3 seq# 18962 mem# 0: /u23/app/oradata/maclean/redo3a.log Current log# 3 seq# 18962 mem# 1: /u24/app/oradata/maclean/redo3b.log Fri Oct 28 08:12:22 2011 ALTER SYSTEM SET log_archive_dest_1='location=/u04/app/oracle/admin/maclean mandatory' SCOPE=BOTH; <== changing archive destination multiple times Fri Oct 28 08:27:56 2011 Shutting down instance: further logons disabled Shutting down instance (immediate) <===== issued shut immediate License high water mark = 546 Fri Oct 28 08:30:55 2011 PMON failed to delete process, see PMON trace file Fri Oct 28 08:33:11 2011 Active call for process 19878 user 'oracle' program 'oracle@e02k72 (TNS V1-V3)' SHUTDOWN: waiting for active calls to complete. <=== looks there are active trans and waiting to complete Fri Oct 28 08:36:40 2011 Starting ORACLE instance (force) <=== issued startup force Fri Oct 28 08:37:03 2011 ALTER DATABASE MOUNT Fri Oct 28 08:37:08 2011 Successful mount of redo thread 1, with mount id 2399989231 Fri Oct 28 08:37:08 2011 Database mounted in Exclusive Mode. Completed: ALTER DATABASE MOUNT <=== mount completed Fri Oct 28 08:37:08 2011 ALTER DATABASE OPEN Fri Oct 28 08:37:08 2011 Beginning crash recovery of 1 threads <=== started crash recover as expected Fri Oct 28 08:37:08 2011 Started redo scan Fri Oct 28 08:37:08 2011 Completed redo scan 0 redo blocks read, 0 data blocks need recovery Fri Oct 28 08:37:08 2011 Started recovery at Thread 1: logseq 18962, block 409599, scn 2775.3390299666 Recovery of Online Redo Log: Thread 1 Group 3 Seq 18962 Reading mem 0 Mem# 0 errs 0: /u23/app/oradata/maclean/redo3a.log Mem# 1 errs 0: /u24/app/oradata/maclean/redo3b.log Fri Oct 28 08:37:08 2011 Completed redo application Fri Oct 28 08:37:08 2011 Ended recovery at Thread 1: logseq 18962, block 409599, scn 2775.3390319667 0 data blocks read, 0 data blocks written, 0 redo blocks read Crash recovery completed successfully Fri Oct 28 08:37:08 2011 LGWR: Primary database is in CLUSTER CONSISTENT mode Fri Oct 28 08:37:08 2011 ARCH: Evaluating archive log 1 thread 1 sequence 18959 ARCH: Beginning to archive log 1 thread 1 sequence 18959 Creating archive destination LOG_ARCHIVE_DEST_2: 'maclean_dr' Creating archive destination LOG_ARCHIVE_DEST_1: '/u04/app/oracle/admin/maclean/maclean_1_18959.arch' Fri Oct 28 08:41:12 2011 alter database open Fri Oct 28 08:41:12 2011 ORA-1154 signalled during: alter database open... Fri Oct 28 08:59:10 2011 ARCH: Completed archiving log 1 thread 1 sequence 18959 Fri Oct 28 08:59:10 2011 LGWR: Primary database is in CLUSTER CONSISTENT mode Thread 1 advanced to log sequence 18963 Thread 1 opened at log sequence 18963 Current log# 1 seq# 18963 mem# 0: /u23/app/oradata/maclean/redo1a.log Current log# 1 seq# 18963 mem# 1: /u24/app/oradata/maclean/redo1b.log Successful open of redo thread 1 Fri Oct 28 08:59:11 2011 LOG_CHECKPOINT_INTERVAL was set when MTTR advisory was switched on. Fri Oct 28 08:59:11 2011 ARC0: Evaluating archive log 2 thread 1 sequence 18960 ARC0: Beginning to archive log 2 thread 1 sequence 18960 Creating archive destination LOG_ARCHIVE_DEST_2: 'maclean_dr' Fri Oct 28 08:59:11 2011 SMON: enabling cache recovery Fri Oct 28 08:59:11 2011 Creating archive destination LOG_ARCHIVE_DEST_1: '/u04/app/oracle/admin/maclean/maclean_1_18960.arch' Fri Oct 28 08:59:12 2011 Undo Segment 1 Onlined Undo Segment 2 Onlined Undo Segment 3 Onlined Undo Segment 4 Onlined Undo Segment 5 Onlined Undo Segment 6 Onlined Undo Segment 7 Onlined Undo Segment 8 Onlined Undo Segment 9 Onlined Undo Segment 10 Onlined Successfully onlined Undo Tablespace 1. Fri Oct 28 08:59:12 2011 SMON: enabling tx recovery Fri Oct 28 08:59:12 2011 Database Characterset is AL32UTF8 Fri Oct 28 08:59:12 2011 SMON: about to recover undo segment 17 Fri Oct 28 08:59:12 2011 replication_dependency_tracking turned off (no async multimaster replication found) Fri Oct 28 08:59:12 2011 SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 18 SMON: mark undo segment 18 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 18 SMON: mark undo segment 18 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 18 SMON: mark undo segment 18 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: about to recover undo segment 17 SMON: mark undo segment 17 as available SMON: Parallel transaction recovery tried <=== transaction recovery Fri Oct 28 08:59:14 2011 Completed: ALTER DATABASE OPEN <=== database is now open Fri Oct 28 09:02:34 2011 This looks to be an expected behavior, as the database waited for the crash/transaction recovery to complete and it opened the database just after the transaction recovery. Note 414242.1 Database Hangs Because SMON is taking 100% CPU doing transaction recovery Startup Hang Also Please note that 9i is Sustaining Support as of now, so we cannot raise any bug/backport requests for this version of Oracle. So, with limited resources and information We will do our best to solve your issue. However, if there is no available workaround or fix for this version of Oracle, you may need to migrate to a supported version like 11.2.0.3 to solve this problem. Also please note that 9.2.0.5 is no more listed in the certification for Oracle Solaris on SPARC (64-bit) platform. So i request you to upgrade to at least 9.2.0.8 immediately, if upgrade to 11.2.0.3 is not feasible immediately. If it is up are you able to issue an "alter system switch logfile" command and it finish? yes now alter system switch logfile is working. but we had to restart database couple of times by shutdown abort.. we did it around 4-5 times.. why it concerns us a lot bcoz produciton instance was down for 2-3 hours. 3. Upload the output of the command SQL> show parameter archive. SQL> show parameter archive NAME TYPE ------------------------------------ -------------------------------- VALUE ------------------------------ archive_lag_target integer 0 log_archive_dest string log_archive_dest_1 string location=/u04/app/oracle/admin /maclean log_archive_dest_10 string NAME TYPE ------------------------------------ -------------------------------- VALUE ------------------------------ log_archive_dest_2 string service=maclean_dr reopen=60 log_archive_dest_3 string log_archive_dest_4 string log_archive_dest_5 string log_archive_dest_6 string NAME TYPE ------------------------------------ -------------------------------- VALUE ------------------------------ log_archive_dest_7 string log_archive_dest_8 string log_archive_dest_9 string log_archive_dest_state_1 string enable NAME TYPE ------------------------------------ -------------------------------- VALUE ------------------------------ log_archive_dest_state_10 string enable log_archive_dest_state_2 string enable log_archive_dest_state_3 string enable log_archive_dest_state_4 string enable log_archive_dest_state_5 string NAME TYPE ------------------------------------ -------------------------------- VALUE ------------------------------ enable log_archive_dest_state_6 string enable log_archive_dest_state_7 string enable log_archive_dest_state_8 string enable log_archive_dest_state_9 string enable NAME TYPE ------------------------------------ -------------------------------- VALUE ------------------------------ log_archive_duplex_dest string log_archive_format string maclean_%t_%s.arch log_archive_max_processes integer 2 log_archive_min_succeed_dest integer 1 log_archive_start boolean NAME TYPE ------------------------------------ -------------------------------- VALUE ------------------------------ TRUE log_archive_trace integer 0 remote_archive_enable string true standby_archive_dest string ?/dbs/arch SQL> Destination LOG_ARCHIVE_DEST_2 is in CLUSTER CONSISTENT mode Destination LOG_ARCHIVE_DEST_2 is in MAXIMUM PERFORMANCE mode Destination LOG_ARCHIVE_DEST_2 is in CLUSTER CONSISTENT mode Destination LOG_ARCHIVE_DEST_2 is in MAXIMUM PERFORMANCE mode This indicates that your Primary Init.ora parameter log_archive_dest_2 is not in sync with your standby destination. I presume your standby database maclean_dr is working properly? The standby dataguard database may not be in a correct state. Archive logs not being successfully transported. Possible password errors. Possible destination file errors, possible set up errors. standby was in sync. we even defer log_archive_dest_2 during issue to check if log switch works after that. but even after deferring the log_archive_dest_2 log switch didn't work .& database was in bad shape for 2-3 hours.. Fri Oct 28 08:35:32 2011 RFS: Possible network disconnect with primary database <<<<<DISCONNECT FROM PRIMARY - SHUTDOWN IMMEDIATE ON PRIMARY Closing latent archivelog for thread 1 sequence 18960 EOF located at block 47105 low SCN 2775:-904684681 next SCN 16:0 Latent archivelog '/u04/app/oracle/admin/maclean/maclean_1_18960.arch' If you wish to failover to this standby database, you should use the following command to manually register the archivelog for recovery: ALTER DATABASE REGISTER LOGFILE '/u04/app/oracle/admin/maclean/maclean_1_18960.arch'; Fri Oct 28 08:35:32 2011 Errors in file /u01/app/oracle/admin/maclean/udump/maclean_rfs_6173.trc: ORA-00367: checksum error in log file header <<<<<<<<<<<<<<<<<<<<<<CHECKSUM ERROR IN FILES ORA-00332: archived log is too small - may be incompletely archived ORA-00334: archived log: '/u04/app/oracle/admin/maclean/maclean_1_18960.arch' by 0954 the standby had caught up. Fri Oct 28 09:54:41 2011 Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18962.arch Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18963.arch Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18964.arch Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18965.arch Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18966.arch Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18967.arch Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18968.arch Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18969.arch Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18970.arch Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18971.arch Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18972.arch Media Recovery Log /u04/app/oracle/admin/maclean/maclean_1_18973.arch The following notes should help you to reduce some of the errors on the standby side. "Data Guard 9i ORA-332 Error When Registering Partial Archive Log After Primary Server Goes Down (Doc ID 233253.1)" "Note:14579.1 SOLARIS Configuring Keepalive on Solaris (SPARC): " This shows the value of the keep alive timer ndd /dev/tcp tcp_keepalive_interval You can change the parameter with: ndd -set /dev/tcp tcp_keepalive_interval <new-smaller-value> Current value for tcp_keepalive_interval is 7200000, Can you please suggest us the value for this parameter also let us know if this parameter requires server reboot or it can be done online. $ ndd /dev/tcp tcp_keepalive_interval 7200000 Fri Oct 28 04:59:21 2011 Creating archive destination LOG_ARCHIVE_DEST_1: '/u04/app/oracle/admin/maclean/maclean_1_18958.arch' Fri Oct 28 04:59:22 2011 Creating archive destination LOG_ARCHIVE_DEST_1: '/u04/app/oracle/admin/maclean/maclean_1_18957.arch' Fri Oct 28 05:01:48 2011 ORACLE Instance maclean - Can not allocate log, archival required Fri Oct 28 05:01:48 2011 ARCH: Connecting to console port... Thread 1 cannot allocate new log, sequence 18960 All online logs needed archiving Current log# 1 seq# 18959 mem# 0: /u23/app/oradata/maclean/redo1a.log Current log# 1 seq# 18959 mem# 1: /u24/app/oradata/maclean/redo1b.log Fri Oct 28 05:12:10 2011 ARCH: Completed archiving log 2 thread 1 sequence 18957 ARCH: Evaluating archive log 4 thread 1 sequence 18956 ARCH: Unable to archive log 4 thread 1 sequence 18956 Log actively being archived by another process Fri Oct 28 05:12:10 2011 Thread 1 advanced to log sequence 18960 Current log# 2 seq# 18960 mem# 0: /u23/app/oradata/maclean/redo2a.log Current log# 2 seq# 18960 mem# 1: /u24/app/oradata/maclean/redo2b.log Fri Oct 28 05:40:25 2011 ARC1: Completed archiving log 4 thread 1 sequence 18956 ARC1: Evaluating archive log 3 thread 1 sequence 18958 ARC1: Unable to archive log 3 thread 1 sequence 18958 Log actively being archived by another process ARC1: Evaluating archive log 1 thread 1 sequence 18959 ARC1: Beginning to archive log 1 thread 1 sequence 18959 Creating archive destination LOG_ARCHIVE_DEST_2: 'maclean_dr' Creating archive destination LOG_ARCHIVE_DEST_1: '/u04/app/oracle/admin/maclean/maclean_1_18959.arch' Fri Oct 28 05:40:27 2011 ARCH: Evaluating archive log 3 thread 1 sequence 18958 ARCH: Unable to archive log 3 thread 1 sequence 18958 Log actively being archived by another process ARCH: Evaluating archive log 1 thread 1 sequence 18959 ARCH: Unable to archive log 1 thread 1 sequence 18959 Log actively being archived by another process Fri Oct 28 05:40:27 2011 Thread 1 advanced to log sequence 18961 Current log# 4 seq# 18961 mem# 0: /u23/app/oradata/maclean/redo4a.log Current log# 4 seq# 18961 mem# 1: /u24/app/oradata/maclean/redo4b.log Fri Oct 28 06:11:45 2011 ORACLE Instance maclean - Can not allocate log, archival required Fri Oct 28 06:11:45 2011 ARCH: Connecting to console port... Thread 1 cannot allocate new log, sequence 18962 All online logs needed archiving Current log# 4 seq# 18961 mem# 0: /u23/app/oradata/maclean/redo4a.log Current log# 4 seq# 18961 mem# 1: /u24/app/oradata/maclean/redo4b.log Fri Oct 28 07:40:42 2011 Archive Log Stop Archiving is disabled Shutting down archive processes Archiving is disabled Archive Log Start Archiving is enabled ARCH: STARTING ARCH PROCESSES ARCH: STARTING ARCH PROCESSES COMPLETE Fri Oct 28 07:54:01 2011 ALTER SYSTEM SET log_archive_dest_state_2='defer' SCOPE=SPFILE; Fri Oct 28 07:54:09 2011 ALTER SYSTEM SET log_archive_dest_state_2='defer' SCOPE=BOTH; Fri Oct 28 08:03:52 2011 ALTER SYSTEM SET log_archive_dest_state_2='enable' SCOPE=BOTH; Fri Oct 28 08:06:20 2011 ARC0: Completed archiving log 3 thread 1 sequence 18958 Fri Oct 28 08:06:20 2011 ARCH: Evaluating archive log 1 thread 1 sequence 18959 ARCH: Unable to archive log 1 thread 1 sequence 18959 Log actively being archived by another process ARCH: Evaluating archive log 2 thread 1 sequence 18960 ARCH: Beginning to archive log 2 thread 1 sequence 18960 Creating archive destination LOG_ARCHIVE_DEST_2: 'maclean_dr' Fri Oct 28 08:06:20 2011 Thread 1 advanced to log sequence 18962 Current log# 3 seq# 18962 mem# 0: /u23/app/oradata/maclean/redo3a.log Current log# 3 seq# 18962 mem# 1: /u24/app/oradata/maclean/redo3b.log Fri Oct 28 08:06:20 2011 ARCH shutting down ARC0: Archival stopped ARC0: Shutdown aborted (current state is 3) ARC0: Evaluating archive log 1 thread 1 sequence 18959 ARC0: Unable to archive log 1 thread 1 sequence 18959 Log actively being archived by another process ARC0: Evaluating archive log 2 thread 1 sequence 18960 ARC0: Unable to archive log 2 thread 1 sequence 18960 Log actively being archived by another process Fri Oct 28 08:06:21 2011 ARC0: Evaluating archive log 4 thread 1 sequence 18961 Fri Oct 28 08:06:21 2011 Undo Segment 21 Onlined ARC0: Beginning to archive log 4 thread 1 sequence 18961 Creating archive destination LOG_ARCHIVE_DEST_2: 'maclean_dr' Fri Oct 28 08:06:21 2011 Undo Segment 22 Onlined Fri Oct 28 08:06:21 2011 Undo Segment 23 Onlined Fri Oct 28 08:06:21 2011 Undo Segment 24 Onlined Fri Oct 28 08:06:21 2011 Undo Segment 25 Onlined Fri Oct 28 08:06:21 2011 Undo Segment 26 Onlined Fri Oct 28 08:06:21 2011 Creating archive destination LOG_ARCHIVE_DEST_1: '/u04/app/oracle/admin/maclean/maclean_1_18960.arch' Fri Oct 28 08:06:21 2011 Creating archive destination LOG_ARCHIVE_DEST_1: '/u04/app/oracle/admin/maclean/maclean_1_18961.arch' Fri Oct 28 08:10:12 2011 ALTER SYSTEM SET log_archive_dest_1='location=/u24/app/oradata/temp_arch' SCOPE=BOTH; Fri Oct 28 08:10:51 2011 ORACLE Instance maclean - Can not allocate log, archival required Fri Oct 28 08:10:51 2011 ARCH: Connecting to console port... Thread 1 cannot allocate new log, sequence 18963 All online logs needed archiving Current log# 3 seq# 18962 mem# 0: /u23/app/oradata/maclean/redo3a.log Current log# 3 seq# 18962 mem# 1: /u24/app/oradata/maclean/redo3b.log Fri Oct 28 08:12:22 2011 ALTER SYSTEM SET log_archive_dest_1='location=/u04/app/oracle/admin/maclean mandatory' SCOPE=BOTH; Fri Oct 28 08:27:56 2011 Shutting down instance: further logons disabled Shutting down instance (immediate) License high water mark = 546 Fri Oct 28 08:30:55 2011 PMON failed to delete process, see PMON trace file Fri Oct 28 08:33:11 2011 Active call for process 19878 user 'oracle' program 'oracle@e02k72 (TNS V1-V3)' SHUTDOWN: waiting for active calls to complete. Fri Oct 28 08:36:40 2011 Starting ORACLE instance (force) License high water mark = 546 Instance terminated by USER, pid = 28373 FileName ---------------- alert_maclean.log
这个SR提交之后 , 一开始认可能是Solaris上tcp tcp_keepalive_interval 参数的问题, 后来esclated到serverity 1 后Oracle high availability support team接手, 发现是 因为 9i中若 remote archive stuck住 会导致本地归档无法正常, 造成数据库无响应。
Please refer the below Article which seems to match with your issue : Doc ID 260040.1 -- Refining Remote Archival Over a Slow Network with the ARCH Process . Please set the parameter "_LOG_ARCHIVE_CALLOUT" to 'LOCAL_FIRST=TRUE' and also increase the number of ARCH processes from the default value of 2 to 6. SQL> alter system set log_archive_max_processes=6 scope=both; As in Doc ID 260040.1 If the remote archiving goes into unknown state, it can cause primary hang as it doesn't attempts local archiving first in 9..2.0.5 It can still happen in max performance mode. The only workaround to prevent such problem is to set _LOG_ARCHIVE_CALLOUT" to 'LOCAL_FIRST=TRUE' With that parameter, it will do local archiving first before trying remote archiving preventing any impact of standby setup to primary database. You mentioned that you did try deferring remote archiving but it didn't help. If the destination is stuck (in network) already, deferring the destination may not help. But you rather restart instances. Action plan suggested). Please set _LOG_ARCHIVE_CALLOUT" to 'LOCAL_FIRST=TRUE' If problem happens again, get multiple system state dump with 10046 trace on alter system switch log file session. When the parameter "_LOG_ARCHIVE_CALLOUT" is set , 1 ARCH process is reserved for local archiving. So even if the network is slow or the Standby is down, the Primary will not be impacted. Setting the parameter "_LOG_ARCHIVE_CALLOUT" should not cause log shipping issues.
“_LOG_ARCHIVE_CALLOUT”(Secondly, you can change the above outlined archiver behavior, to make sure the local destination is archived first, causing the OPTIONAL destination to behave really optionally. If the following parameter is set then the ARCH process will begin archiving to the local destination first. Once the redo log has been completely and successfully archived to at least one local destination. )
For 9iR2 (9.2.0.5 and above):
_log_archive_callout=’LOCAL_FIRST=TRUE’
–OR–
alter system set “_LOG_ARCHIVE_CALLOUT”=’LOCAL_FIRST=TRUE’ scope=both;
In Oracle 10g, this parameter is replaced by ‘log_archive_local_first’, which defaults to TRUE.
In Oracle 11g, ‘log_archive_local_first’ has been deprecated.
这个隐藏参数可以控制ARCH归档进程优先完成本地归档local archive工作,这样就避免了本地重做日志长期无法归档造成数据库无响应。
文档《Refining Remote Archival Over a Slow Network with the ARCH Process》介绍了更多信息:
Purpose
When archiving locally and remotely using the ARCH process where the remote destination is across a saturated or slow network you can receive the following errors in the alert log:
ARC0: Evaluating archive log 2 thread 1 sequence 100
ARC0: Unable to archive log 2 thread 1 sequence 100
Log actively being archived by another process
If the ARCH process is unable to archive at the rate at which online logs are switched then it is possible for the primary database to suspend while waiting for archiving to complete. The following discussion describes how this can occur.
Default Behavior for 9iR2 and Below
The ARCH process sits in a very tight loop waiting for an update to the controlfile that states an online log needs to be archived. Once the update occurs the ARCH process builds a list of archive destinations that need to be serviced. Once this list is complete, the ARCH process will read a one megabyte chunk of data from the online log that is to be archived. This one megabyte chunk is then sent to the first destination in the list. When the write has completed, the same one megabyte chunk is written to the second destination. This continues until all of the data from the online log being archived has been written to all destinations. So it can be said that archiving is only as fast as the slowest destination.
A common misconception is that if the LOG_ARCHIVE_DEST_n parameter for a particular destination has the OPTIONAL attribute set, then that destination will not impede local archiving. This is true during error situations while archiving to that destination – e.g. a network disconnect error, but not during an archival over a slow network, which is not an error situation. In error situations, whether the destination is marked OPTIONAL or MANDATORY, Data Guard will close that destination and continue transmitting to all other valid destinations. Transmitting to the closed destination will be attempted again only after the time specified in the REOPEN attribute has expired and a log switch has occurred. This process will continue for the number of times specified by the MAX_FAILURE attribute. During this time, it is possible that the log writer process recycles through the available online redo log groups and tries to use the online redo log file which has not yet been transmitted successfully to the remote destination. If the destination is marked OPTIONAL, log writer will reuse the online redo log file for the next set of redo. If the destination is marked MANDATORY, log writer will not be able to reuse that online redo log file, and the primary database will delay processing until that online redo log file has been successfully transmitted to the remote destination.
However, the situation is very different if the transmission is being done over a slow network. In this case, no error is encountered and the destination is not closed. Transmission continues, but is very slow. Ultimately, with the unavailability of any more online redo log groups, Log writer may suspend because the archive process is taking a long time to complete its archival, including local archival.
Refining the Default Behavior
The following underscore parameter was introduced as of 9.2.0.5 to allow the DBA to change this default behavior:
_LOG_ARCHIVE_CALLOUT=’LOCAL_FIRST=TRUE’
This is a dynamic Parameter, so you can set it this Way:
SQL> alter system set “_LOG_ARCHIVE_CALLOUT”=’LOCAL_FIRST=TRUE’ scope=both;
If the above parameter is set then the ARCH process will begin archiving to the local destination first. Once the redo log has been completely and successfully archived to at least one local destination, it will then be transmitted to the remote destination. This is the default behavior beginning with Oracle Database 10g Release 1.
Starting in 9.2.0.7 patchsets, one ARCH process will begin acting as a ‘dedicated’ archiver, handling only local archival duties. It will not perform remote log shipping or service FAL requests. This is a backport of behavior from 10gR1 to 9iR2.
设置_LOG_ARCHIVE_CALLOUT=’LOCAL_FIRST=TRUE’ + ’log_archive_max_processes=6′ 一般可以解决该因为 归档状态异常而引起的实例无响应。
Leave a Reply