前2天有客户报一套10.2.0.3的数据库个别服务进程PGA使用量暴涨,疑似内存泄露(memory leak);遂提供on-site service,赶到用户现场时问题进程已经消失,系统内存使用量恢复正常,客户之前除了保留了v$process动态性能视图的信息外未抓取其他有用的诊断信息。
查看保存的v$process视图信息可以看到进程991714的PGA内存使用量达到13个G:
select spid,program,PGA_USED_MEM,PGA_ALLOC_MEM from v$process; SPID PROGRAM PGA_USED_MEM PGA_ALLOC_MEM ------------------------ ------------------------------------------------ ------------ ------------- 991714 oracleBTS@oam_app_a 14427510986 14432001786 oracle@oam_app_a@/oracle/product/10.2.0/dbs $ ulimit -a time(seconds) unlimited file(blocks) unlimited data(kbytes) unlimited stack(kbytes) 4194304 memory(kbytes) unlimited coredump(blocks) unlimited nofiles(descriptors) unlimited SQL> select x.ksppinm name,y.ksppstvl value 2 from sys.x$ksppi x, sys.x$ksppcv y 3 where x.inst_id=USERENV('Instance') 4 and y.inst_id = USERENV('Instance') 5 and x.indx = y.indx 6 and x.ksppinm like '%pga%'; pga_aggregate_target 200715200 NAME -------------------------------------------------------------------------------- VALUE -------------------------------------------------------------------------------- _pga_max_size 209715200 SQL> select x.ksppinm name,y.ksppstvl value 2 from sys.x$ksppi x, sys.x$ksppcv y 3 where x.inst_id=USERENV('Instance') 4 and y.inst_id = USERENV('Instance') 5 and x.indx = y.indx 6 and x.ksppinm like '%hash_join%'; NAME -------------------------------------------------------------------------------- VALUE -------------------------------------------------------------------------------- _hash_join_enabled TRUE SQL> show parameter work NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ fileio_network_adapters string workarea_size_policy string AUTO
可以看到该系统使用自动PGA管理且pga_aggregate_target参数值为较小的191M,查询隐藏参数_pga_max_size可发现该参数值也为191M。
从告警日志alert.log中找不到任何信息,单个服务进程PGA使用量达到13G居然没有报ORA-04030错误!
到实例的user_dump目录下ls -ltr了一把有意外收获,找到了该991714进程最近的trace文件:
Dump file /oracle/product/10.2.0/admin/BTS/udump/bts_ora_991714.trc Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production With the Partitioning, OLAP and Data Mining options ORACLE_HOME = /oracle/product/10.2.0 System name: AIX Node name: oam_app_a Release: 3 Version: 5 Machine: 0000E5DBD600 Instance name: BTS Redo thread mounted by this instance: 1 Oracle process number: 46 Unix process pid: 991714, image: oracleBTS@oam_app_a *** 2011-04-19 18:27:07.766 *** SERVICE NAME:(SYS$USERS) 2011-04-19 18:27:07.733 *** SESSION ID:(248.45987) 2011-04-19 18:27:07.733 WARNING: out of private memory [1]
以上trace文件中唯一有用的信息就是”WARNING: out of private memory [1]”了,但在metalink上搜索”out of private memory”找不到任何有用的信息;因为未发生ORA-04030错误,而用户也没有手动去收集过PGA Heap的堆使用情况,所以未产生任何对该进程PGA内存使用细节描述的转储信息文件,难以从dump信息中获取线索继续探索,线索断裂。
通过查询PGA状态历史记录视图可以发现在之前的2个快照时间窗口内,inused PGA总量为13G,之后PGA内存使用量又恢复正常:
select * from dba_hist_pgastat where name='total PGA inuse' order by value asc SNAP_ID DBID INSTANCE_NUMBER NAME VALUE ---------- ---------- --------------- ---------------------------------------------------------------- ---------- 16048 3731271451 1 total PGA inuse 1.4883E+10 16047 3731271451 1 total PGA inuse 1.4888E+10
查询在以上时间窗口中991714的活动会话记录,可以发现该进程一直处于”cursor: pin S wait on X”中;Oracle 10.2.0.2以后利用mutex来替代latch保护SQL游标,当硬解析发生时会出现”cursor: pin s wait on X”等待。这里可以看到需要硬解析的SQL的ID为”bp7y9fvtqra1h”,而且此处SQL_CHILD_NUMBER为-1,说明Oracle共享池中之前没有缓存该SQL游标,该问题进程是首次为该SQL语句执行硬解析。
SNAP_ID DBID INSTANCE_NUMBER SAMPLE_ID SAMPLE_TIME SESSION_ID SESSION_SERIAL# USER_ID SQL_ID SQL_CHILD_NUMBER SQL_PLAN_HASH_VALUE FORCE_MATCHING_SIGNATURE SQL_OPCODE PLSQL_ENTRY_OBJECT_ID PLSQL_ENTRY_SUBPROGRAM_ID PLSQL_OBJECT_ID PLSQL_SUBPROGRAM_ID SERVICE_HASH SESSION_TYPE SESSION_STATE QC_SESSION_ID QC_INSTANCE_ID BLOCKING_SESSION BLOCKING_SESSION_STATUS BLOCKING_SESSION_SERIAL# EVENT EVENT_ID SEQ# P1TEXT P1 P2TEXT P2 P3TEXT P3 WAIT_CLASS WAIT_CLASS_ID WAIT_TIME TIME_WAITED XID CURRENT_OBJ# CURRENT_FILE# CURRENT_BLOCK# PROGRAM MODULE ACTION CLIENT_ID
1 16047 3731271451 1 57716200 19-4月 -11 04.00.35.625 下午 248 45987 25 bp7y9fvtqra1h -1 0 0 0 3427055676 FOREGROUND WAITING UNKNOWN cursor: pin S wait on X 1729366244 1140 idn 4083918896 value 1.14676E+12 where|sleeps 21474836525 Concurrency 3875070507 0 9772 -1 0 0
2 16047 3731271451 1 57716210 19-4月 -11 04.00.45.629 下午 248 45987 25 bp7y9fvtqra1h -1 0 0 0 3427055676 FOREGROUND WAITING UNKNOWN cursor: pin S wait on X 1729366244 2137 idn 4083918896 value 1.14676E+12 where|sleeps 21474837522 Concurrency 3875070507 0 9773 -1 0 0
3 16047 3731271451 1 57716220 19-4月 -11 04.00.55.633 下午 248 45987 25 bp7y9fvtqra1h -1 0 0 0 3427055676 FOREGROUND WAITING UNKNOWN cursor: pin S wait on X 1729366244 3134 idn 4083918896 value 1.14676E+12 where|sleeps 21474838519 Concurrency 3875070507 0 9775 -1 0 0
4 16047 3731271451 1 57716230 19-4月 -11 04.01.05.637 下午 248 45987 25 bp7y9fvtqra1h -1 0 0 0 3427055676 FOREGROUND WAITING UNKNOWN cursor: pin S wait on X 1729366244 4126 idn 4083918896 value 1.14676E+12 where|sleeps 21474839511 Concurrency 3875070507 0 9772 -1 0 0
但无论是从SQL历史记录视图(Workload Repository)还是v$SQL中都无法找到该条语句,可以判断该问题进程最后也没能成功完成对该’bp7y9fvtqra1h’语句的解析!因为没有该SQL的记录,所以也无法了解该SQL的执行计划和workarea工作区的使用状况,线索再次断裂!
SQL> select * from dba_hist_sqltext where sql_id='bp7y9fvtqra1h'; no rows selected SQL> select * From v$sql where sql_id='bp7y9fvtqra1h'; no rows selected
通过查询进程内存使用总结历史视图dba_hist_process_mem_summary可以发现占用主要内存的分类是Other,猜测这些Other内存是用来解析游标使用的临时内存(call heap).
SQL> select snap_id,category,num_processes,non_zero_allocs,used_total,allocated_total,allocated_max from dba_hist_process_mem_summary where snap_id in (16047,16048,16049) order by allocated_total desc ,snap_id ; SNAP_ID CATEGORY NUM_PROCESSES NON_ZERO_ALLOCS USED_TOTAL ALLOCATED_TOTAL ALLOCATED_MAX ---------- --------------- ------------- --------------- ---------- --------------- ------------- 16047 Other 95 95 1.5062E+10 1.4665E+10 16048 Other 96 96 1.5053E+10 1.4664E+10 16049 Other 94 94 386117660 53794202 16047 Freeable 36 36 0 36372480 3670016 16048 Freeable 33 33 0 32112640 3670016 16047 SQL 78 31 27346656 29785568 28016616 16048 SQL 79 34 26812992 29240400 27885544 16049 Freeable 27 27 0 26738688 3670016 16049 SQL 77 31 591528 1242168 125816 16048 PL/SQL 96 96 272224 601040 68944 16049 PL/SQL 94 94 218104 536592 68944 16047 PL/SQL 95 95 212816 506536 68800
因为证据断裂,无法根据现有的信息为该服务进程过量使用PGA内存来定位Bug。到MOS上搜索找不到类似的Bug,但即使有也很难定论,因为没有heapdump的话即便提交SR,Oracle GCS(Global Customer Service)也不太愿定位到Bug。
那么如果出现以上类似的PGA内存泄露的问题,我们因当如何第一时间收集有用的信息,以供后续诊断呢?我在这里提供一些可选的方案:
- 定期收集系统内PGA/UGA的使用情况,具体可以使用脚本<Script To Monitor RDBMS Session UGA and PGA Current And Maximum Usage>
- 防御式地在系统级别设置4030 dump heapdump 536870917级别的event dump事件,虽然此案例中未发生ORA-04030错误,但不代表下一次也不发生
- 在问题发生时,第一时间使用oradebug PGA_DETAIL_GET命令填充v$process_memory_detail视图,以便了解问题进程PGA内存的使用细节
- 若觉得v$process_memory_detail动态性能视图的信息还不够丰富的话,也可以做systemstate 266和heapdump 536870917级别的dump
虽然以上我们介绍了一些针对PGA内存泄露问题发生时的诊断方法,但可能你还是不了解具体要如何实践,接下来我们通过实践来体会一下,首先我们特意构建一个PGA内存泄露的假象(测试说明,不要用于生产环境!!):
[oracle@rh2 ~]$ wc -l very_large.sql 18526 very_large.sql /* 这里very_large.sql是我们"精心"构造的一条万分复杂的SQL语句,解析该SQL语句需要消耗大量的PGA内存 !!* / SQL> @very_large /* 执行very_large SQL语句开始模拟内存泄露,将持续较长时间 */ SQL> @MEMORY_USAGE_SCRIPT.SQL Oracle Memory Usage Report: PGA And UGA Memory Usage Per Session Host........: rh2.oracle.com Name........: PROD1 Version.....: 11.2.0.2.0 Startup Time: 2011-04-20 19:41:32 Current Time: 2011.04.21-19:57:16 Worst possible value of concurrent PGA + UGA memory usage per session: SID AND SERIAL# USERNAME OR PROGRAM SUM(VALUE) SESSION START TIME ------------------- ------------------------------------------------ ---------- ------------------- 29,465 SYS 180444688 2011-04-21 19:52:16 152,987 SYS 67781616 2011-04-21 18:40:59 146,1 [email protected] (ARC3) 37598184 2011-04-20 19:41:43 19,5 [email protected] (ARC2) 36484072 2011-04-20 19:41:43 17,7 [email protected] (ARC0) 33141736 2011-04-20 19:41:42 145,1 [email protected] (ARC1) 19837928 2011-04-20 19:41:43 125,7 [email protected] (CJQ0) 15826432 2011-04-20 19:41:50 135,1 [email protected] (LGWR) 13480936 2011-04-20 19:41:33 131,1 [email protected] (LMS0) 11973608 2011-04-20 19:41:33 7,1 [email protected] (LMS1) 11973608 2011-04-20 19:41:33 6,1 [email protected] (LMD0) 11842536 2011-04-20 19:41:33 5,1 [email protected] (DIA0) 10580296 2011-04-20 19:41:33 25,57 SYS 9854112 2011-04-21 19:56:59 10,1 [email protected] (DBW0) 9105992 2011-04-20 19:41:33 136,1 [email protected] (SMON) 8777056 2011-04-20 19:41:33 140,1 [email protected] (MARK) 8565736 2011-04-20 19:41:33 130,1 [email protected] (LMON) 8238120 2011-04-20 19:41:33 138,1 [email protected] (MMON) 7215184 2011-04-20 19:41:33 31,1 [email protected] (SMCO) 7123896 2011-04-20 19:43:52 3,1 [email protected] (DIAG) 6730728 2011-04-20 19:41:33 16,1 [email protected] (RSMN) 5420008 2011-04-20 19:41:35 150,5 [email protected] (Q000) 5001608 2011-04-20 19:41:46 23,1 [email protected] (Q001) 3445984 2011-04-20 19:41:46 22,1 [email protected] (QMNC) 3314960 2011-04-20 19:41:45 12,1 [email protected] (RECO) 3249448 2011-04-20 19:41:33 11,1 [email protected] (CKPT) 3086120 2011-04-20 19:41:33 128,1 [email protected] (DBRM) 2667304 2011-04-20 19:41:33 14,1 [email protected] (MMNL) 2143208 2011-04-20 19:41:33 127,1 [email protected] (GEN0) 2012136 2011-04-20 19:41:33 158,183 SYS 1758344 2011-04-21 07:44:57 143,23 SYS 1692808 2011-04-21 07:45:01 142,1 [email protected] (LCK0) 1299288 2011-04-20 19:41:34 149,1 [email protected] (RCBG) 1160120 2011-04-20 19:41:45 33,59 [email protected] (W000) 963512 2011-04-21 19:55:14 4,1 [email protected] (PING) 898024 2011-04-20 19:41:33 126,1 [email protected] (PSP0) 832488 2011-04-20 19:41:32 13,1 [email protected] (ASMB) 832488 2011-04-20 19:41:33 134,1 [email protected] (MMAN) 832488 2011-04-20 19:41:33 144,1 [email protected] (O000) 832488 2011-04-20 19:41:36 129,1 [email protected] (ACMS) 832488 2011-04-20 19:41:33 133,1 [email protected] (RMS0) 832488 2011-04-20 19:41:33 1,1 [email protected] (PMON) 832488 2011-04-20 19:41:32 9,1 [email protected] (LMHB) 832488 2011-04-20 19:41:33 21,1 [email protected] (GTX0) 832488 2011-04-20 19:41:45 18,3 [email protected] (O001) 832488 2011-04-20 19:41:37 137,1 [email protected] (RBAL) 832488 2011-04-20 19:41:33 2,1 [email protected] (VKTM) 832488 2011-04-20 19:41:33 Worst possible total and average values of concurrent PGA + UGA memory usage: 564679192 bytes (total) and ~6007225 bytes (average), for ~47 sessions. Approximate value of current PGA + UGA memory usage per session: SID AND SERIAL# USERNAME OR PROGRAM SUM(VALUE) SESSION START TIME ------------------- ------------------------------------------------ ---------- ------------------- 29,465 SYS 178083824 2011-04-21 19:52:16 146,1 [email protected] (ARC3) 36484072 2011-04-20 19:41:43 19,5 [email protected] (ARC2) 35369960 2011-04-20 19:41:43 17,7 [email protected] (ARC0) 33141736 2011-04-20 19:41:42 145,1 [email protected] (ARC1) 19837928 2011-04-20 19:41:43 135,1 [email protected] (LGWR) 13480936 2011-04-20 19:41:33 7,1 [email protected] (LMS1) 11973608 2011-04-20 19:41:33 131,1 [email protected] (LMS0) 11973608 2011-04-20 19:41:33 6,1 [email protected] (LMD0) 11842536 2011-04-20 19:41:33 5,1 [email protected] (DIA0) 10580296 2011-04-20 19:41:33 10,1 [email protected] (DBW0) 8712776 2011-04-20 19:41:33 140,1 [email protected] (MARK) 8565736 2011-04-20 19:41:33 130,1 [email protected] (LMON) 8238120 2011-04-20 19:41:33 3,1 [email protected] (DIAG) 6730728 2011-04-20 19:41:33 152,987 SYS 6224040 2011-04-21 18:40:59 16,1 [email protected] (RSMN) 5420008 2011-04-20 19:41:35 125,7 [email protected] (CJQ0) 4854824 2011-04-20 19:41:50 25,57 SYS 4738504 2011-04-21 19:56:59 138,1 [email protected] (MMON) 4165448 2011-04-20 19:41:33 136,1 [email protected] (SMON) 3863504 2011-04-20 19:41:33 150,5 [email protected] (Q000) 3108848 2011-04-20 19:41:46 11,1 [email protected] (CKPT) 2561832 2011-04-20 19:41:33 12,1 [email protected] (RECO) 2538120 2011-04-20 19:41:33 31,1 [email protected] (SMCO) 2536376 2011-04-20 19:43:52 128,1 [email protected] (DBRM) 2339768 2011-04-20 19:41:33 23,1 [email protected] (Q001) 2339672 2011-04-20 19:41:46 22,1 [email protected] (QMNC) 2242336 2011-04-20 19:41:45 127,1 [email protected] (GEN0) 2012136 2011-04-20 19:41:33 14,1 [email protected] (MMNL) 1946600 2011-04-20 19:41:33 158,183 SYS 1692856 2011-04-21 07:44:57 143,23 SYS 1561784 2011-04-21 07:45:01 142,1 [email protected] (LCK0) 1299288 2011-04-20 19:41:34 149,1 [email protected] (RCBG) 1160120 2011-04-20 19:41:45 33,59 [email protected] (W000) 963512 2011-04-21 19:55:14 4,1 [email protected] (PING) 898024 2011-04-20 19:41:33 13,1 [email protected] (ASMB) 832488 2011-04-20 19:41:33 134,1 [email protected] (MMAN) 832488 2011-04-20 19:41:33 129,1 [email protected] (ACMS) 832488 2011-04-20 19:41:33 1,1 [email protected] (PMON) 832488 2011-04-20 19:41:32 9,1 [email protected] (LMHB) 832488 2011-04-20 19:41:33 21,1 [email protected] (GTX0) 832488 2011-04-20 19:41:45 18,3 [email protected] (O001) 832488 2011-04-20 19:41:37 137,1 [email protected] (RBAL) 832488 2011-04-20 19:41:33 126,1 [email protected] (PSP0) 832488 2011-04-20 19:41:32 133,1 [email protected] (RMS0) 832488 2011-04-20 19:41:33 2,1 [email protected] (VKTM) 832488 2011-04-20 19:41:33 144,1 [email protected] (O000) 832488 2011-04-20 19:41:36 Current total and average values of concurrent PGA + UGA memory usage: 463473320 bytes (total) and ~4930567 bytes (average), for ~47 sessions. Maximum value of PGA memory usage per session: SID AND SERIAL# USERNAME OR PROGRAM VALUE SESSION START TIME ------------------- ------------------------------------------------ ---------- ------------------- 29,465 SYS 177212856 2011-04-21 19:52:16 152,987 SYS 57208040 2011-04-21 18:40:59 146,1 [email protected] (ARC3) 37416168 2011-04-20 19:41:43 19,5 [email protected] (ARC2) 36302056 2011-04-20 19:41:43 17,7 [email protected] (ARC0) 32959720 2011-04-20 19:41:42 145,1 [email protected] (ARC1) 19655912 2011-04-20 19:41:43 135,1 [email protected] (LGWR) 13298920 2011-04-20 19:41:33 125,7 [email protected] (CJQ0) 13045176 2011-04-20 19:41:50 131,1 [email protected] (LMS0) 11791592 2011-04-20 19:41:33 7,1 [email protected] (LMS1) 11791592 2011-04-20 19:41:33 6,1 [email protected] (LMD0) 11660520 2011-04-20 19:41:33 5,1 [email protected] (DIA0) 10398280 2011-04-20 19:41:33 10,1 [email protected] (DBW0) 8923976 2011-04-20 19:41:33 140,1 [email protected] (MARK) 8383720 2011-04-20 19:41:33 130,1 [email protected] (LMON) 8056104 2011-04-20 19:41:33 31,1 [email protected] (SMCO) 6876392 2011-04-20 19:43:52 3,1 [email protected] (DIAG) 6548712 2011-04-20 19:41:33 25,57 SYS 6163896 2011-04-21 19:56:59 136,1 [email protected] (SMON) 5893352 2011-04-20 19:41:33 138,1 [email protected] (MMON) 5294872 2011-04-20 19:41:33 16,1 [email protected] (RSMN) 5237992 2011-04-20 19:41:35 150,5 [email protected] (Q000) 3910216 2011-04-20 19:41:46 11,1 [email protected] (CKPT) 2904104 2011-04-20 19:41:33 23,1 [email protected] (Q001) 2551016 2011-04-20 19:41:46 22,1 [email protected] (QMNC) 2485480 2011-04-20 19:41:45 12,1 [email protected] (RECO) 2485480 2011-04-20 19:41:33 128,1 [email protected] (DBRM) 2223336 2011-04-20 19:41:33 14,1 [email protected] (MMNL) 1961192 2011-04-20 19:41:33 127,1 [email protected] (GEN0) 1830120 2011-04-20 19:41:33 158,183 SYS 1510840 2011-04-21 07:44:57 143,23 SYS 1445304 2011-04-21 07:45:01 142,1 [email protected] (LCK0) 1117272 2011-04-20 19:41:34 149,1 [email protected] (RCBG) 912616 2011-04-20 19:41:45 33,59 [email protected] (W000) 716008 2011-04-21 19:55:14 4,1 [email protected] (PING) 716008 2011-04-20 19:41:33 144,1 [email protected] (O000) 650472 2011-04-20 19:41:36 137,1 [email protected] (RBAL) 650472 2011-04-20 19:41:33 134,1 [email protected] (MMAN) 650472 2011-04-20 19:41:33 133,1 [email protected] (RMS0) 650472 2011-04-20 19:41:33 129,1 [email protected] (ACMS) 650472 2011-04-20 19:41:33 126,1 [email protected] (PSP0) 650472 2011-04-20 19:41:32 21,1 [email protected] (GTX0) 650472 2011-04-20 19:41:45 18,3 [email protected] (O001) 650472 2011-04-20 19:41:37 13,1 [email protected] (ASMB) 650472 2011-04-20 19:41:33 9,1 [email protected] (LMHB) 650472 2011-04-20 19:41:33 2,1 [email protected] (VKTM) 650472 2011-04-20 19:41:33 1,1 [email protected] (PMON) 650472 2011-04-20 19:41:32 Worst possible total and average values of concurrent PGA memory usage: 528694504 bytes (total) and ~11248819 bytes (average), for ~47 sessions. Maximum value of UGA memory usage per session: SID AND SERIAL# USERNAME OR PROGRAM VALUE SESSION START TIME ------------------- ------------------------------------------------ ---------- ------------------- 152,987 SYS 10573576 2011-04-21 18:40:59 25,57 SYS 3690216 2011-04-21 19:56:59 29,465 SYS 3231832 2011-04-21 19:52:16 136,1 [email protected] (SMON) 2883704 2011-04-20 19:41:33 125,7 [email protected] (CJQ0) 2781256 2011-04-20 19:41:50 138,1 [email protected] (MMON) 1920312 2011-04-20 19:41:33 150,5 [email protected] (Q000) 1091392 2011-04-20 19:41:46 23,1 [email protected] (Q001) 894968 2011-04-20 19:41:46 22,1 [email protected] (QMNC) 829480 2011-04-20 19:41:45 12,1 [email protected] (RECO) 763968 2011-04-20 19:41:33 128,1 [email protected] (DBRM) 443968 2011-04-20 19:41:33 158,183 SYS 247504 2011-04-21 07:44:57 149,1 [email protected] (RCBG) 247504 2011-04-20 19:41:45 143,23 SYS 247504 2011-04-21 07:45:01 33,59 [email protected] (W000) 247504 2011-04-21 19:55:14 31,1 [email protected] (SMCO) 247504 2011-04-20 19:43:52 146,1 [email protected] (ARC3) 182016 2011-04-20 19:41:43 145,1 [email protected] (ARC1) 182016 2011-04-20 19:41:43 144,1 [email protected] (O000) 182016 2011-04-20 19:41:36 142,1 [email protected] (LCK0) 182016 2011-04-20 19:41:34 140,1 [email protected] (MARK) 182016 2011-04-20 19:41:33 137,1 [email protected] (RBAL) 182016 2011-04-20 19:41:33 135,1 [email protected] (LGWR) 182016 2011-04-20 19:41:33 134,1 [email protected] (MMAN) 182016 2011-04-20 19:41:33 133,1 [email protected] (RMS0) 182016 2011-04-20 19:41:33 131,1 [email protected] (LMS0) 182016 2011-04-20 19:41:33 130,1 [email protected] (LMON) 182016 2011-04-20 19:41:33 129,1 [email protected] (ACMS) 182016 2011-04-20 19:41:33 127,1 [email protected] (GEN0) 182016 2011-04-20 19:41:33 126,1 [email protected] (PSP0) 182016 2011-04-20 19:41:32 21,1 [email protected] (GTX0) 182016 2011-04-20 19:41:45 19,5 [email protected] (ARC2) 182016 2011-04-20 19:41:43 18,3 [email protected] (O001) 182016 2011-04-20 19:41:37 17,7 [email protected] (ARC0) 182016 2011-04-20 19:41:42 16,1 [email protected] (RSMN) 182016 2011-04-20 19:41:35 14,1 [email protected] (MMNL) 182016 2011-04-20 19:41:33 13,1 [email protected] (ASMB) 182016 2011-04-20 19:41:33 11,1 [email protected] (CKPT) 182016 2011-04-20 19:41:33 10,1 [email protected] (DBW0) 182016 2011-04-20 19:41:33 9,1 [email protected] (LMHB) 182016 2011-04-20 19:41:33 7,1 [email protected] (LMS1) 182016 2011-04-20 19:41:33 6,1 [email protected] (LMD0) 182016 2011-04-20 19:41:33 5,1 [email protected] (DIA0) 182016 2011-04-20 19:41:33 4,1 [email protected] (PING) 182016 2011-04-20 19:41:33 3,1 [email protected] (DIAG) 182016 2011-04-20 19:41:33 2,1 [email protected] (VKTM) 182016 2011-04-20 19:41:33 1,1 [email protected] (PMON) 182016 2011-04-20 19:41:32 Worst possible total and average values of concurrent UGA memory usage: 35984688 bytes (total) and ~765631 bytes (average), for ~47 sessions. Current value of PGA memory usage per session: SID AND SERIAL# USERNAME OR PROGRAM VALUE SESSION START TIME ------------------- ------------------------------------------------ ---------- ------------------- 29,465 SYS 177802680 2011-04-21 19:52:16 146,1 [email protected] (ARC3) 36302056 2011-04-20 19:41:43 19,5 [email protected] (ARC2) 35187944 2011-04-20 19:41:43 17,7 [email protected] (ARC0) 32959720 2011-04-20 19:41:42 145,1 [email protected] (ARC1) 19655912 2011-04-20 19:41:43 135,1 [email protected] (LGWR) 13298920 2011-04-20 19:41:33 131,1 [email protected] (LMS0) 11791592 2011-04-20 19:41:33 7,1 [email protected] (LMS1) 11791592 2011-04-20 19:41:33 6,1 [email protected] (LMD0) 11660520 2011-04-20 19:41:33 5,1 [email protected] (DIA0) 10398280 2011-04-20 19:41:33 10,1 [email protected] (DBW0) 8530760 2011-04-20 19:41:33 140,1 [email protected] (MARK) 8383720 2011-04-20 19:41:33 130,1 [email protected] (LMON) 8056104 2011-04-20 19:41:33 3,1 [email protected] (DIAG) 6548712 2011-04-20 19:41:33 16,1 [email protected] (RSMN) 5237992 2011-04-20 19:41:35 152,987 SYS 4582632 2011-04-21 18:40:59 125,7 [email protected] (CJQ0) 3935672 2011-04-20 19:41:50 25,57 SYS 3787544 2011-04-21 19:56:59 136,1 [email protected] (SMON) 3140840 2011-04-20 19:41:33 138,1 [email protected] (MMON) 3066648 2011-04-20 19:41:33 150,5 [email protected] (Q000) 2468424 2011-04-20 19:41:46 11,1 [email protected] (CKPT) 2379816 2011-04-20 19:41:33 31,1 [email protected] (SMCO) 2288872 2011-04-20 19:43:52 12,1 [email protected] (RECO) 2223336 2011-04-20 19:41:33 128,1 [email protected] (DBRM) 2092264 2011-04-20 19:41:33 23,1 [email protected] (Q001) 1961192 2011-04-20 19:41:46 22,1 [email protected] (QMNC) 1961192 2011-04-20 19:41:45 127,1 [email protected] (GEN0) 1830120 2011-04-20 19:41:33 14,1 [email protected] (MMNL) 1764584 2011-04-20 19:41:33 158,183 SYS 1510840 2011-04-21 07:44:57 143,23 SYS 1379768 2011-04-21 07:45:01 142,1 [email protected] (LCK0) 1117272 2011-04-20 19:41:34 149,1 [email protected] (RCBG) 912616 2011-04-20 19:41:45 33,59 [email protected] (W000) 716008 2011-04-21 19:55:14 4,1 [email protected] (PING) 716008 2011-04-20 19:41:33 144,1 [email protected] (O000) 650472 2011-04-20 19:41:36 137,1 [email protected] (RBAL) 650472 2011-04-20 19:41:33 134,1 [email protected] (MMAN) 650472 2011-04-20 19:41:33 133,1 [email protected] (RMS0) 650472 2011-04-20 19:41:33 129,1 [email protected] (ACMS) 650472 2011-04-20 19:41:33 126,1 [email protected] (PSP0) 650472 2011-04-20 19:41:32 21,1 [email protected] (GTX0) 650472 2011-04-20 19:41:45 18,3 [email protected] (O001) 650472 2011-04-20 19:41:37 13,1 [email protected] (ASMB) 650472 2011-04-20 19:41:33 9,1 [email protected] (LMHB) 650472 2011-04-20 19:41:33 2,1 [email protected] (VKTM) 650472 2011-04-20 19:41:33 1,1 [email protected] (PMON) 650472 2011-04-20 19:41:32 Current total and average values of concurrent PGA memory usage: 449247816 bytes (total) and ~9558464 bytes (average), for ~47 sessions. Current value of UGA memory usage per session: SID AND SERIAL# USERNAME OR PROGRAM VALUE SESSION START TIME ------------------- ------------------------------------------------ ---------- ------------------- 152,987 SYS 1641408 2011-04-21 18:40:59 138,1 [email protected] (MMON) 1098800 2011-04-20 19:41:33 25,57 SYS 950960 2011-04-21 19:56:59 125,7 [email protected] (CJQ0) 919152 2011-04-20 19:41:50 136,1 [email protected] (SMON) 722664 2011-04-20 19:41:33 150,5 [email protected] (Q000) 640424 2011-04-20 19:41:46 23,1 [email protected] (Q001) 378480 2011-04-20 19:41:46 12,1 [email protected] (RECO) 314784 2011-04-20 19:41:33 29,465 SYS 281144 2011-04-21 19:52:16 22,1 [email protected] (QMNC) 281144 2011-04-20 19:41:45 149,1 [email protected] (RCBG) 247504 2011-04-20 19:41:45 128,1 [email protected] (DBRM) 247504 2011-04-20 19:41:33 33,59 [email protected] (W000) 247504 2011-04-21 19:55:14 31,1 [email protected] (SMCO) 247504 2011-04-20 19:43:52 158,183 SYS 182016 2011-04-21 07:44:57 146,1 [email protected] (ARC3) 182016 2011-04-20 19:41:43 145,1 [email protected] (ARC1) 182016 2011-04-20 19:41:43 144,1 [email protected] (O000) 182016 2011-04-20 19:41:36 143,23 SYS 182016 2011-04-21 07:45:01 142,1 [email protected] (LCK0) 182016 2011-04-20 19:41:34 140,1 [email protected] (MARK) 182016 2011-04-20 19:41:33 137,1 [email protected] (RBAL) 182016 2011-04-20 19:41:33 135,1 [email protected] (LGWR) 182016 2011-04-20 19:41:33 134,1 [email protected] (MMAN) 182016 2011-04-20 19:41:33 133,1 [email protected] (RMS0) 182016 2011-04-20 19:41:33 131,1 [email protected] (LMS0) 182016 2011-04-20 19:41:33 130,1 [email protected] (LMON) 182016 2011-04-20 19:41:33 129,1 [email protected] (ACMS) 182016 2011-04-20 19:41:33 127,1 [email protected] (GEN0) 182016 2011-04-20 19:41:33 126,1 [email protected] (PSP0) 182016 2011-04-20 19:41:32 21,1 [email protected] (GTX0) 182016 2011-04-20 19:41:45 19,5 [email protected] (ARC2) 182016 2011-04-20 19:41:43 18,3 [email protected] (O001) 182016 2011-04-20 19:41:37 17,7 [email protected] (ARC0) 182016 2011-04-20 19:41:42 16,1 [email protected] (RSMN) 182016 2011-04-20 19:41:35 14,1 [email protected] (MMNL) 182016 2011-04-20 19:41:33 13,1 [email protected] (ASMB) 182016 2011-04-20 19:41:33 11,1 [email protected] (CKPT) 182016 2011-04-20 19:41:33 10,1 [email protected] (DBW0) 182016 2011-04-20 19:41:33 9,1 [email protected] (LMHB) 182016 2011-04-20 19:41:33 7,1 [email protected] (LMS1) 182016 2011-04-20 19:41:33 6,1 [email protected] (LMD0) 182016 2011-04-20 19:41:33 5,1 [email protected] (DIA0) 182016 2011-04-20 19:41:33 4,1 [email protected] (PING) 182016 2011-04-20 19:41:33 3,1 [email protected] (DIAG) 182016 2011-04-20 19:41:33 2,1 [email protected] (VKTM) 182016 2011-04-20 19:41:33 1,1 [email protected] (PMON) 182016 2011-04-20 19:41:32 Current total and average values of concurrent UGA memory usage: 14225504 bytes (total) and ~302670 bytes (average), for ~47 sessions. Current SGA structure sizings: Total System Global Area 939495424 bytes Fixed Size 2232088 bytes Variable Size 398459112 bytes Database Buffers 532676608 bytes Redo Buffers 6127616 bytes Some initialization parameter values at instance startup: large_pool_size=0 pga_aggregate_target=0 sga_target=0 shared_pool_size=0 sort_area_size=65536 streams_pool_size=0 Current Time: 2011.04.21-19:57:16 /* 可以从以上输出看到sid,serial=29,465会话的PGA内存使用量异常,达到了170M, 虽然跟以上案例中的PGA泄露情况比较不算什么 */ /* 使用sid和serial定位到具体的操作系统进程号 */ SQL> select spid,pid,PGA_USED_MEM,PGA_MAX_MEM from v$process 2 where addr=(select paddr from v$session where sid=&1 and serial#=&2); Enter value for 1: 29 Enter value for 2: 465 old 2: where addr=(select paddr from v$session where sid=&1 and serial#=&2) new 2: where addr=(select paddr from v$session where sid=29 and serial#=465) SPID PID PGA_USED_MEM PGA_MAX_MEM ------------------------ ---------- ------------ ----------- 26932 48 129716228 130034996 1 row selected. SQL> oradebug setospid 26932; Oracle pid: 48, Unix process pid: 26932, image: [email protected] (TNS V1-V3) SQL> oradebug dump heapdump 536870917; Statement processed. SQL> oradebug dump processstate 10; Statement processed. SQL> oradebug tracefile_name; /s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ora_26932.trc /* 接下来对堆转储文件进行分析,通过grep可以找出其中较大的SubHEAP子堆 */ [oracle@rh2 ~]$ egrep "HEAP DUMP heap name|Total heap size|Permanent space" /s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ora_26932.trc HEAP DUMP heap name="session heap" desc=0x2ac0b37a67f8 Total heap size = 130840 Permanent space = 62680 HEAP DUMP heap name="Alloc environm" desc=0x2ac0b37ce090 Total heap size = 4040 Permanent space = 4040 HEAP DUMP heap name="PLS UGA hp" desc=0x2ac0b37be7f0 Total heap size = 1992 Permanent space = 1080 HEAP DUMP heap name="koh-kghu sessi" desc=0x2ac0b37cf660 Total heap size = 1128 Permanent space = 80 HEAP DUMP heap name="pga heap" desc=0xb7c8ba0 Total heap size = 2689432 Permanent space = 660560 HEAP DUMP heap name="Alloc environm" desc=0x2ac0b35ba5c8 Total heap size = 1706816 Permanent space = 464 HEAP DUMP heap name="Alloc server h" desc=0x2ac0b35b9000 Total heap size = 1704400 Permanent space = 1694816 HEAP DUMP heap name="diag pga" desc=0x2ac0b32537e0 Total heap size = 65448 Permanent space = 3672 HEAP DUMP heap name="KFK_IO_SUBHEAP" desc=0x2ac0b35eb2b0 Total heap size = 10992 Permanent space = 80 HEAP DUMP heap name="peshm.c:Proces" desc=0x2ac0b35e7ad0 Total heap size = 4000 Permanent space = 80 HEAP DUMP heap name="KSFQ heap" desc=0x2ac0b35c6d70 Total heap size = 3256 Permanent space = 3256 HEAP DUMP heap name="top call heap" desc=0xb7ce3c0 Total heap size =155918560 Permanent space = 448 HEAP DUMP heap name="callheap" desc=0xb7cd4c0 Total heap size =152906784 Permanent space = 80 HEAP DUMP heap name="TCHK^30c42b7a" desc=0x2ac0b378ff48 Total heap size =151414512 Permanent space = 80 HEAP DUMP heap name="kggec.c.kggfa" desc=0x2ac0b4e76ec8 Total heap size = 1016 Permanent space = 736 HEAP DUMP heap name="kxs-heap-c" desc=0x2ac0b37800c0 Total heap size = 1489464 Permanent space = 1485928 HEAP DUMP heap name="top uga heap" desc=0xb7ce5e0 Total heap size = 131024 Permanent space = 80 HEAP DUMP heap name="session heap" desc=0x2ac0b37a67f8 Total heap size = 130840 Permanent space = 62680 HEAP DUMP heap name="Alloc environm" desc=0x2ac0b37ce090 Total heap size = 4040 Permanent space = 4040 HEAP DUMP heap name="PLS UGA hp" desc=0x2ac0b37be7f0 Total heap size = 1992 Permanent space = 1080 HEAP DUMP heap name="koh-kghu sessi" desc=0x2ac0b37cf660 Total heap size = 1128 Permanent space = 80 HEAP DUMP heap name="SQLA^30c42b7a" desc=0x6f4c3ab8 Total heap size = 4919904 Permanent space = 80 HEAP DUMP heap name="KGLH0^30c42b7a" desc=0x6ef44290 Total heap size = 4032 Permanent space = 2648 /* 以上heapdump表明TCHK^30c42b7a子堆占用了PGA中绝大多数的内存, 其中子堆的包含结构为PGA->top call heap -> call heap -> TCHK */ /* 接着processstate dump还可以让我们了解该问题进程的最近活动历史,及之前所运行的SQL语句 以便进一步诊断,以下为其调用堆栈 */ ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1900<-sspuser()+112<-__sighandler()<-qcsfccc()+206 <-qcsIsColInFro()+309<-qcsRslvColWithinQbc()+179<-qcsWeakColRslv()+94<-qcsRslvName()+2541<-qcsridn()+105 <-qcsraic()+455<-qcspqbDescendents()+527<-qcspqb()+260<-qcspqbDescendents()+2744<-qcspqb()+260 <-kkmdrv()+182<-opiSem()+1947<-opiprs()+293<-__PGOSF632_kksParseChildCursor()+572<-rpiswu2()+1618 <-kksLoadChild()+5167<-kxsGetRuntimeLock()+2066<-kksfbc()+14527<-kkspsc0()+2025<-kksParseCursor()+144 <-opiosq0()+2027<-kpooprx()+274<-kpoal8()+800<-opiodr()+910<-ttcpip()+2289<-opitsk()+1670<-opiino()+966 <-opiodr()+910<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+252<-main()+201<-__libc_start_main()+244 <-_start()+36
通过以上获取的信息"call heap-> TCHK"可以从MOS查找到11.2上的”Bug 11782790: ORA-4030 IS GENERATED IN HARD PARSE"和"Bug 12360198: ORA-04030 (TCHK^82665CD8,CHEDEF : QCUATC)";虽然我们这里故意为之而非真实的Bug,也可以看出"TCHK subheap"在11.2中充当SQL解析时的临时调用子堆(parse call subheap)。
to be continued.....
Leave a Reply