Probe how does your PGA consume

前2天有客户报一套10.2.0.3的数据库个别服务进程PGA使用量暴涨,疑似内存泄露(memory leak);遂提供on-site service,赶到用户现场时问题进程已经消失,系统内存使用量恢复正常,客户之前除了保留了v$process动态性能视图的信息外未抓取其他有用的诊断信息。

查看保存的v$process视图信息可以看到进程991714的PGA内存使用量达到13个G:

select spid,program,PGA_USED_MEM,PGA_ALLOC_MEM from v$process;
SPID                     PROGRAM                                          PGA_USED_MEM PGA_ALLOC_MEM
------------------------ ------------------------------------------------ ------------ -------------
991714                         oracleBTS@oam_app_a             14427510986 14432001786

oracle@oam_app_a@/oracle/product/10.2.0/dbs $ ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        4194304
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) unlimited

SQL> select x.ksppinm name,y.ksppstvl value
  2  from sys.x$ksppi x, sys.x$ksppcv y
  3  where x.inst_id=USERENV('Instance')
  4  and y.inst_id = USERENV('Instance')
  5  and x.indx = y.indx
  6  and x.ksppinm like '%pga%';

pga_aggregate_target
200715200

NAME
--------------------------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
_pga_max_size
209715200

SQL> select x.ksppinm name,y.ksppstvl value
  2  from sys.x$ksppi x, sys.x$ksppcv y
  3  where x.inst_id=USERENV('Instance')
  4  and y.inst_id = USERENV('Instance')
  5  and x.indx = y.indx
  6  and x.ksppinm like '%hash_join%';

NAME
--------------------------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
_hash_join_enabled
TRUE

SQL> show parameter work

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fileio_network_adapters              string
workarea_size_policy                 string      AUTO

可以看到该系统使用自动PGA管理且pga_aggregate_target参数值为较小的191M,查询隐藏参数_pga_max_size可发现该参数值也为191M。

从告警日志alert.log中找不到任何信息,单个服务进程PGA使用量达到13G居然没有报ORA-04030错误!

到实例的user_dump目录下ls -ltr了一把有意外收获,找到了该991714进程最近的trace文件:

Dump file /oracle/product/10.2.0/admin/BTS/udump/bts_ora_991714.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /oracle/product/10.2.0
System name:    AIX
Node name:      oam_app_a
Release:        3
Version:        5
Machine:        0000E5DBD600
Instance name: BTS
Redo thread mounted by this instance: 1
Oracle process number: 46
Unix process pid: 991714, image: oracleBTS@oam_app_a

*** 2011-04-19 18:27:07.766
*** SERVICE NAME:(SYS$USERS) 2011-04-19 18:27:07.733
*** SESSION ID:(248.45987) 2011-04-19 18:27:07.733
WARNING: out of private memory [1]

以上trace文件中唯一有用的信息就是”WARNING: out of private memory [1]”了,但在metalink上搜索”out of private memory”找不到任何有用的信息;因为未发生ORA-04030错误,而用户也没有手动去收集过PGA Heap的堆使用情况,所以未产生任何对该进程PGA内存使用细节描述的转储信息文件,难以从dump信息中获取线索继续探索,线索断裂。

通过查询PGA状态历史记录视图可以发现在之前的2个快照时间窗口内,inused PGA总量为13G,之后PGA内存使用量又恢复正常:

select * from dba_hist_pgastat where name='total PGA inuse' order by value asc
 SNAP_ID       DBID INSTANCE_NUMBER NAME                                                                  VALUE
---------- ---------- --------------- ---------------------------------------------------------------- ----------
     16048 3731271451               1 total PGA inuse                                                  1.4883E+10
     16047 3731271451               1 total PGA inuse                                                  1.4888E+10

查询在以上时间窗口中991714的活动会话记录,可以发现该进程一直处于”cursor: pin S wait on X”中;Oracle 10.2.0.2以后利用mutex来替代latch保护SQL游标,当硬解析发生时会出现”cursor: pin s wait on X”等待。这里可以看到需要硬解析的SQL的ID为”bp7y9fvtqra1h”,而且此处SQL_CHILD_NUMBER为-1,说明Oracle共享池中之前没有缓存该SQL游标,该问题进程是首次为该SQL语句执行硬解析。

     	SNAP_ID	DBID	INSTANCE_NUMBER	SAMPLE_ID	SAMPLE_TIME	SESSION_ID	SESSION_SERIAL#	USER_ID	SQL_ID	SQL_CHILD_NUMBER	SQL_PLAN_HASH_VALUE	FORCE_MATCHING_SIGNATURE	SQL_OPCODE	PLSQL_ENTRY_OBJECT_ID	PLSQL_ENTRY_SUBPROGRAM_ID	PLSQL_OBJECT_ID	PLSQL_SUBPROGRAM_ID	SERVICE_HASH	SESSION_TYPE	SESSION_STATE	QC_SESSION_ID	QC_INSTANCE_ID	BLOCKING_SESSION	BLOCKING_SESSION_STATUS	BLOCKING_SESSION_SERIAL#	EVENT	EVENT_ID	SEQ#	P1TEXT	P1	P2TEXT	P2	P3TEXT	P3	WAIT_CLASS	WAIT_CLASS_ID	WAIT_TIME	TIME_WAITED	XID	CURRENT_OBJ#	CURRENT_FILE#	CURRENT_BLOCK#	PROGRAM	MODULE	ACTION	CLIENT_ID
1	16047	3731271451	1	57716200	19-4月 -11 04.00.35.625 下午	248	45987	25	bp7y9fvtqra1h	-1	0	0	0					3427055676	FOREGROUND	WAITING				UNKNOWN		cursor: pin S wait on X	1729366244	1140	idn	4083918896	value	1.14676E+12	where|sleeps	21474836525	Concurrency	3875070507	0	9772		-1	0	0
2	16047	3731271451	1	57716210	19-4月 -11 04.00.45.629 下午	248	45987	25	bp7y9fvtqra1h	-1	0	0	0					3427055676	FOREGROUND	WAITING				UNKNOWN		cursor: pin S wait on X	1729366244	2137	idn	4083918896	value	1.14676E+12	where|sleeps	21474837522	Concurrency	3875070507	0	9773		-1	0	0
3	16047	3731271451	1	57716220	19-4月 -11 04.00.55.633 下午	248	45987	25	bp7y9fvtqra1h	-1	0	0	0					3427055676	FOREGROUND	WAITING				UNKNOWN		cursor: pin S wait on X	1729366244	3134	idn	4083918896	value	1.14676E+12	where|sleeps	21474838519	Concurrency	3875070507	0	9775		-1	0	0
4	16047	3731271451	1	57716230	19-4月 -11 04.01.05.637 下午	248	45987	25	bp7y9fvtqra1h	-1	0	0	0					3427055676	FOREGROUND	WAITING				UNKNOWN		cursor: pin S wait on X	1729366244	4126	idn	4083918896	value	1.14676E+12	where|sleeps	21474839511	Concurrency	3875070507	0	9772		-1	0	0

但无论是从SQL历史记录视图(Workload Repository)还是v$SQL中都无法找到该条语句,可以判断该问题进程最后也没能成功完成对该’bp7y9fvtqra1h’语句的解析!因为没有该SQL的记录,所以也无法了解该SQL的执行计划和workarea工作区的使用状况,线索再次断裂!

SQL> select * from dba_hist_sqltext where sql_id='bp7y9fvtqra1h';

no rows selected

SQL> select * From v$sql where sql_id='bp7y9fvtqra1h';

no rows selected

通过查询进程内存使用总结历史视图dba_hist_process_mem_summary可以发现占用主要内存的分类是Other,猜测这些Other内存是用来解析游标使用的临时内存(call heap).

SQL> select snap_id,category,num_processes,non_zero_allocs,used_total,allocated_total,allocated_max from dba_hist_process_mem_summary where snap_id in (16047,16048,16049) order by allocated_total desc ,snap_id ;

   SNAP_ID CATEGORY        NUM_PROCESSES NON_ZERO_ALLOCS USED_TOTAL ALLOCATED_TOTAL ALLOCATED_MAX
---------- --------------- ------------- --------------- ---------- --------------- -------------
     16047 Other                      95              95                 1.5062E+10    1.4665E+10
     16048 Other                      96              96                 1.5053E+10    1.4664E+10
     16049 Other                      94              94                  386117660      53794202
     16047 Freeable                   36              36          0        36372480       3670016
     16048 Freeable                   33              33          0        32112640       3670016
     16047 SQL                        78              31   27346656        29785568      28016616
     16048 SQL                        79              34   26812992        29240400      27885544
     16049 Freeable                   27              27          0        26738688       3670016
     16049 SQL                        77              31     591528         1242168        125816
     16048 PL/SQL                     96              96     272224          601040         68944
     16049 PL/SQL                     94              94     218104          536592         68944
     16047 PL/SQL                     95              95     212816          506536         68800

因为证据断裂,无法根据现有的信息为该服务进程过量使用PGA内存来定位Bug。到MOS上搜索找不到类似的Bug,但即使有也很难定论,因为没有heapdump的话即便提交SR,Oracle GCS(Global Customer Service)也不太愿定位到Bug。

那么如果出现以上类似的PGA内存泄露的问题,我们因当如何第一时间收集有用的信息,以供后续诊断呢?我在这里提供一些可选的方案:

  1. 定期收集系统内PGA/UGA的使用情况,具体可以使用脚本<Script To Monitor RDBMS Session UGA and PGA Current And Maximum Usage>
  2. 防御式地在系统级别设置4030 dump heapdump 536870917级别的event dump事件,虽然此案例中未发生ORA-04030错误,但不代表下一次也不发生
  3. 在问题发生时,第一时间使用oradebug PGA_DETAIL_GET命令填充v$process_memory_detail视图,以便了解问题进程PGA内存的使用细节
  4. 若觉得v$process_memory_detail动态性能视图的信息还不够丰富的话,也可以做systemstate 266和heapdump 536870917级别的dump

虽然以上我们介绍了一些针对PGA内存泄露问题发生时的诊断方法,但可能你还是不了解具体要如何实践,接下来我们通过实践来体会一下,首先我们特意构建一个PGA内存泄露的假象(测试说明,不要用于生产环境!!):

[oracle@rh2 ~]$ wc -l very_large.sql 
18526 very_large.sql

/* 这里very_large.sql是我们"精心"构造的一条万分复杂的SQL语句,解析该SQL语句需要消耗大量的PGA内存 !!* /

SQL> @very_large

/* 执行very_large SQL语句开始模拟内存泄露,将持续较长时间  */

SQL> @MEMORY_USAGE_SCRIPT.SQL

Oracle Memory Usage Report: PGA And UGA Memory Usage Per Session

Host........: rh2.oracle.com

Name........: PROD1

Version.....: 11.2.0.2.0

Startup Time: 2011-04-20 19:41:32

Current Time: 2011.04.21-19:57:16

Worst possible value of concurrent PGA + UGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                              SUM(VALUE) SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465              SYS                                               180444688 2011-04-21 19:52:16
152,987             SYS                                                67781616 2011-04-21 18:40:59
146,1               [email protected] (ARC3)                       37598184 2011-04-20 19:41:43
19,5                [email protected] (ARC2)                       36484072 2011-04-20 19:41:43
17,7                [email protected] (ARC0)                       33141736 2011-04-20 19:41:42
145,1               [email protected] (ARC1)                       19837928 2011-04-20 19:41:43
125,7               [email protected] (CJQ0)                       15826432 2011-04-20 19:41:50
135,1               [email protected] (LGWR)                       13480936 2011-04-20 19:41:33
131,1               [email protected] (LMS0)                       11973608 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                       11973608 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                       11842536 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                       10580296 2011-04-20 19:41:33
25,57               SYS                                                 9854112 2011-04-21 19:56:59
10,1                [email protected] (DBW0)                        9105992 2011-04-20 19:41:33
136,1               [email protected] (SMON)                        8777056 2011-04-20 19:41:33
140,1               [email protected] (MARK)                        8565736 2011-04-20 19:41:33
130,1               [email protected] (LMON)                        8238120 2011-04-20 19:41:33
138,1               [email protected] (MMON)                        7215184 2011-04-20 19:41:33
31,1                [email protected] (SMCO)                        7123896 2011-04-20 19:43:52
3,1                 [email protected] (DIAG)                        6730728 2011-04-20 19:41:33
16,1                [email protected] (RSMN)                        5420008 2011-04-20 19:41:35
150,5               [email protected] (Q000)                        5001608 2011-04-20 19:41:46
23,1                [email protected] (Q001)                        3445984 2011-04-20 19:41:46
22,1                [email protected] (QMNC)                        3314960 2011-04-20 19:41:45
12,1                [email protected] (RECO)                        3249448 2011-04-20 19:41:33
11,1                [email protected] (CKPT)                        3086120 2011-04-20 19:41:33
128,1               [email protected] (DBRM)                        2667304 2011-04-20 19:41:33
14,1                [email protected] (MMNL)                        2143208 2011-04-20 19:41:33
127,1               [email protected] (GEN0)                        2012136 2011-04-20 19:41:33
158,183             SYS                                                 1758344 2011-04-21 07:44:57
143,23              SYS                                                 1692808 2011-04-21 07:45:01
142,1               [email protected] (LCK0)                        1299288 2011-04-20 19:41:34
149,1               [email protected] (RCBG)                        1160120 2011-04-20 19:41:45
33,59               [email protected] (W000)                         963512 2011-04-21 19:55:14
4,1                 [email protected] (PING)                         898024 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         832488 2011-04-20 19:41:32
13,1                [email protected] (ASMB)                         832488 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         832488 2011-04-20 19:41:33
144,1               [email protected] (O000)                         832488 2011-04-20 19:41:36
129,1               [email protected] (ACMS)                         832488 2011-04-20 19:41:33
133,1               [email protected] (RMS0)                         832488 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         832488 2011-04-20 19:41:32
9,1                 [email protected] (LMHB)                         832488 2011-04-20 19:41:33
21,1                [email protected] (GTX0)                         832488 2011-04-20 19:41:45
18,3                [email protected] (O001)                         832488 2011-04-20 19:41:37
137,1               [email protected] (RBAL)                         832488 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         832488 2011-04-20 19:41:33

Worst possible total and average values of concurrent PGA + UGA memory usage:

564679192 bytes (total) and ~6007225 bytes (average), for ~47 sessions.

Approximate value of current PGA + UGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                              SUM(VALUE) SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465              SYS                                               178083824 2011-04-21 19:52:16
146,1               [email protected] (ARC3)                       36484072 2011-04-20 19:41:43
19,5                [email protected] (ARC2)                       35369960 2011-04-20 19:41:43
17,7                [email protected] (ARC0)                       33141736 2011-04-20 19:41:42
145,1               [email protected] (ARC1)                       19837928 2011-04-20 19:41:43
135,1               [email protected] (LGWR)                       13480936 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                       11973608 2011-04-20 19:41:33
131,1               [email protected] (LMS0)                       11973608 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                       11842536 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                       10580296 2011-04-20 19:41:33
10,1                [email protected] (DBW0)                        8712776 2011-04-20 19:41:33
140,1               [email protected] (MARK)                        8565736 2011-04-20 19:41:33
130,1               [email protected] (LMON)                        8238120 2011-04-20 19:41:33
3,1                 [email protected] (DIAG)                        6730728 2011-04-20 19:41:33
152,987             SYS                                                 6224040 2011-04-21 18:40:59
16,1                [email protected] (RSMN)                        5420008 2011-04-20 19:41:35
125,7               [email protected] (CJQ0)                        4854824 2011-04-20 19:41:50
25,57               SYS                                                 4738504 2011-04-21 19:56:59
138,1               [email protected] (MMON)                        4165448 2011-04-20 19:41:33
136,1               [email protected] (SMON)                        3863504 2011-04-20 19:41:33
150,5               [email protected] (Q000)                        3108848 2011-04-20 19:41:46
11,1                [email protected] (CKPT)                        2561832 2011-04-20 19:41:33
12,1                [email protected] (RECO)                        2538120 2011-04-20 19:41:33
31,1                [email protected] (SMCO)                        2536376 2011-04-20 19:43:52
128,1               [email protected] (DBRM)                        2339768 2011-04-20 19:41:33
23,1                [email protected] (Q001)                        2339672 2011-04-20 19:41:46
22,1                [email protected] (QMNC)                        2242336 2011-04-20 19:41:45
127,1               [email protected] (GEN0)                        2012136 2011-04-20 19:41:33
14,1                [email protected] (MMNL)                        1946600 2011-04-20 19:41:33
158,183             SYS                                                 1692856 2011-04-21 07:44:57
143,23              SYS                                                 1561784 2011-04-21 07:45:01
142,1               [email protected] (LCK0)                        1299288 2011-04-20 19:41:34
149,1               [email protected] (RCBG)                        1160120 2011-04-20 19:41:45
33,59               [email protected] (W000)                         963512 2011-04-21 19:55:14
4,1                 [email protected] (PING)                         898024 2011-04-20 19:41:33
13,1                [email protected] (ASMB)                         832488 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         832488 2011-04-20 19:41:33
129,1               [email protected] (ACMS)                         832488 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         832488 2011-04-20 19:41:32
9,1                 [email protected] (LMHB)                         832488 2011-04-20 19:41:33
21,1                [email protected] (GTX0)                         832488 2011-04-20 19:41:45
18,3                [email protected] (O001)                         832488 2011-04-20 19:41:37
137,1               [email protected] (RBAL)                         832488 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         832488 2011-04-20 19:41:32
133,1               [email protected] (RMS0)                         832488 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         832488 2011-04-20 19:41:33
144,1               [email protected] (O000)                         832488 2011-04-20 19:41:36

Current total and average values of concurrent PGA + UGA memory usage:

463473320 bytes (total) and ~4930567 bytes (average), for ~47 sessions.

Maximum value of PGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                                   VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465              SYS                                               177212856 2011-04-21 19:52:16
152,987             SYS                                                57208040 2011-04-21 18:40:59
146,1               [email protected] (ARC3)                       37416168 2011-04-20 19:41:43
19,5                [email protected] (ARC2)                       36302056 2011-04-20 19:41:43
17,7                [email protected] (ARC0)                       32959720 2011-04-20 19:41:42
145,1               [email protected] (ARC1)                       19655912 2011-04-20 19:41:43
135,1               [email protected] (LGWR)                       13298920 2011-04-20 19:41:33
125,7               [email protected] (CJQ0)                       13045176 2011-04-20 19:41:50
131,1               [email protected] (LMS0)                       11791592 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                       11791592 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                       11660520 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                       10398280 2011-04-20 19:41:33
10,1                [email protected] (DBW0)                        8923976 2011-04-20 19:41:33
140,1               [email protected] (MARK)                        8383720 2011-04-20 19:41:33
130,1               [email protected] (LMON)                        8056104 2011-04-20 19:41:33
31,1                [email protected] (SMCO)                        6876392 2011-04-20 19:43:52
3,1                 [email protected] (DIAG)                        6548712 2011-04-20 19:41:33
25,57               SYS                                                 6163896 2011-04-21 19:56:59
136,1               [email protected] (SMON)                        5893352 2011-04-20 19:41:33
138,1               [email protected] (MMON)                        5294872 2011-04-20 19:41:33
16,1                [email protected] (RSMN)                        5237992 2011-04-20 19:41:35
150,5               [email protected] (Q000)                        3910216 2011-04-20 19:41:46
11,1                [email protected] (CKPT)                        2904104 2011-04-20 19:41:33
23,1                [email protected] (Q001)                        2551016 2011-04-20 19:41:46
22,1                [email protected] (QMNC)                        2485480 2011-04-20 19:41:45
12,1                [email protected] (RECO)                        2485480 2011-04-20 19:41:33
128,1               [email protected] (DBRM)                        2223336 2011-04-20 19:41:33
14,1                [email protected] (MMNL)                        1961192 2011-04-20 19:41:33
127,1               [email protected] (GEN0)                        1830120 2011-04-20 19:41:33
158,183             SYS                                                 1510840 2011-04-21 07:44:57
143,23              SYS                                                 1445304 2011-04-21 07:45:01
142,1               [email protected] (LCK0)                        1117272 2011-04-20 19:41:34
149,1               [email protected] (RCBG)                         912616 2011-04-20 19:41:45
33,59               [email protected] (W000)                         716008 2011-04-21 19:55:14
4,1                 [email protected] (PING)                         716008 2011-04-20 19:41:33
144,1               [email protected] (O000)                         650472 2011-04-20 19:41:36
137,1               [email protected] (RBAL)                         650472 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         650472 2011-04-20 19:41:33
133,1               [email protected] (RMS0)                         650472 2011-04-20 19:41:33
129,1               [email protected] (ACMS)                         650472 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         650472 2011-04-20 19:41:32
21,1                [email protected] (GTX0)                         650472 2011-04-20 19:41:45
18,3                [email protected] (O001)                         650472 2011-04-20 19:41:37
13,1                [email protected] (ASMB)                         650472 2011-04-20 19:41:33
9,1                 [email protected] (LMHB)                         650472 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         650472 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         650472 2011-04-20 19:41:32

Worst possible total and average values of concurrent PGA memory usage:

528694504 bytes (total) and ~11248819 bytes (average), for ~47 sessions.

Maximum value of UGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                                   VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
152,987             SYS                                                10573576 2011-04-21 18:40:59
25,57               SYS                                                 3690216 2011-04-21 19:56:59
29,465              SYS                                                 3231832 2011-04-21 19:52:16
136,1               [email protected] (SMON)                        2883704 2011-04-20 19:41:33
125,7               [email protected] (CJQ0)                        2781256 2011-04-20 19:41:50
138,1               [email protected] (MMON)                        1920312 2011-04-20 19:41:33
150,5               [email protected] (Q000)                        1091392 2011-04-20 19:41:46
23,1                [email protected] (Q001)                         894968 2011-04-20 19:41:46
22,1                [email protected] (QMNC)                         829480 2011-04-20 19:41:45
12,1                [email protected] (RECO)                         763968 2011-04-20 19:41:33
128,1               [email protected] (DBRM)                         443968 2011-04-20 19:41:33
158,183             SYS                                                  247504 2011-04-21 07:44:57
149,1               [email protected] (RCBG)                         247504 2011-04-20 19:41:45
143,23              SYS                                                  247504 2011-04-21 07:45:01
33,59               [email protected] (W000)                         247504 2011-04-21 19:55:14
31,1                [email protected] (SMCO)                         247504 2011-04-20 19:43:52
146,1               [email protected] (ARC3)                         182016 2011-04-20 19:41:43
145,1               [email protected] (ARC1)                         182016 2011-04-20 19:41:43
144,1               [email protected] (O000)                         182016 2011-04-20 19:41:36
142,1               [email protected] (LCK0)                         182016 2011-04-20 19:41:34
140,1               [email protected] (MARK)                         182016 2011-04-20 19:41:33
137,1               [email protected] (RBAL)                         182016 2011-04-20 19:41:33
135,1               [email protected] (LGWR)                         182016 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         182016 2011-04-20 19:41:33
133,1               [email protected] (RMS0)                         182016 2011-04-20 19:41:33
131,1               [email protected] (LMS0)                         182016 2011-04-20 19:41:33
130,1               [email protected] (LMON)                         182016 2011-04-20 19:41:33
129,1               [email protected] (ACMS)                         182016 2011-04-20 19:41:33
127,1               [email protected] (GEN0)                         182016 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         182016 2011-04-20 19:41:32
21,1                [email protected] (GTX0)                         182016 2011-04-20 19:41:45
19,5                [email protected] (ARC2)                         182016 2011-04-20 19:41:43
18,3                [email protected] (O001)                         182016 2011-04-20 19:41:37
17,7                [email protected] (ARC0)                         182016 2011-04-20 19:41:42
16,1                [email protected] (RSMN)                         182016 2011-04-20 19:41:35
14,1                [email protected] (MMNL)                         182016 2011-04-20 19:41:33
13,1                [email protected] (ASMB)                         182016 2011-04-20 19:41:33
11,1                [email protected] (CKPT)                         182016 2011-04-20 19:41:33
10,1                [email protected] (DBW0)                         182016 2011-04-20 19:41:33
9,1                 [email protected] (LMHB)                         182016 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                         182016 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                         182016 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                         182016 2011-04-20 19:41:33
4,1                 [email protected] (PING)                         182016 2011-04-20 19:41:33
3,1                 [email protected] (DIAG)                         182016 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         182016 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         182016 2011-04-20 19:41:32

Worst possible total and average values of concurrent UGA memory usage:

35984688 bytes (total) and ~765631 bytes (average), for ~47 sessions.

Current value of PGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                                   VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
29,465              SYS                                               177802680 2011-04-21 19:52:16
146,1               [email protected] (ARC3)                       36302056 2011-04-20 19:41:43
19,5                [email protected] (ARC2)                       35187944 2011-04-20 19:41:43
17,7                [email protected] (ARC0)                       32959720 2011-04-20 19:41:42
145,1               [email protected] (ARC1)                       19655912 2011-04-20 19:41:43
135,1               [email protected] (LGWR)                       13298920 2011-04-20 19:41:33
131,1               [email protected] (LMS0)                       11791592 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                       11791592 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                       11660520 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                       10398280 2011-04-20 19:41:33
10,1                [email protected] (DBW0)                        8530760 2011-04-20 19:41:33
140,1               [email protected] (MARK)                        8383720 2011-04-20 19:41:33
130,1               [email protected] (LMON)                        8056104 2011-04-20 19:41:33
3,1                 [email protected] (DIAG)                        6548712 2011-04-20 19:41:33
16,1                [email protected] (RSMN)                        5237992 2011-04-20 19:41:35
152,987             SYS                                                 4582632 2011-04-21 18:40:59
125,7               [email protected] (CJQ0)                        3935672 2011-04-20 19:41:50
25,57               SYS                                                 3787544 2011-04-21 19:56:59
136,1               [email protected] (SMON)                        3140840 2011-04-20 19:41:33
138,1               [email protected] (MMON)                        3066648 2011-04-20 19:41:33
150,5               [email protected] (Q000)                        2468424 2011-04-20 19:41:46
11,1                [email protected] (CKPT)                        2379816 2011-04-20 19:41:33
31,1                [email protected] (SMCO)                        2288872 2011-04-20 19:43:52
12,1                [email protected] (RECO)                        2223336 2011-04-20 19:41:33
128,1               [email protected] (DBRM)                        2092264 2011-04-20 19:41:33
23,1                [email protected] (Q001)                        1961192 2011-04-20 19:41:46
22,1                [email protected] (QMNC)                        1961192 2011-04-20 19:41:45
127,1               [email protected] (GEN0)                        1830120 2011-04-20 19:41:33
14,1                [email protected] (MMNL)                        1764584 2011-04-20 19:41:33
158,183             SYS                                                 1510840 2011-04-21 07:44:57
143,23              SYS                                                 1379768 2011-04-21 07:45:01
142,1               [email protected] (LCK0)                        1117272 2011-04-20 19:41:34
149,1               [email protected] (RCBG)                         912616 2011-04-20 19:41:45
33,59               [email protected] (W000)                         716008 2011-04-21 19:55:14
4,1                 [email protected] (PING)                         716008 2011-04-20 19:41:33
144,1               [email protected] (O000)                         650472 2011-04-20 19:41:36
137,1               [email protected] (RBAL)                         650472 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         650472 2011-04-20 19:41:33
133,1               [email protected] (RMS0)                         650472 2011-04-20 19:41:33
129,1               [email protected] (ACMS)                         650472 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         650472 2011-04-20 19:41:32
21,1                [email protected] (GTX0)                         650472 2011-04-20 19:41:45
18,3                [email protected] (O001)                         650472 2011-04-20 19:41:37
13,1                [email protected] (ASMB)                         650472 2011-04-20 19:41:33
9,1                 [email protected] (LMHB)                         650472 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         650472 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         650472 2011-04-20 19:41:32

Current total and average values of concurrent PGA memory usage:

449247816 bytes (total) and ~9558464 bytes (average), for ~47 sessions.

Current value of UGA memory usage per session:

SID AND SERIAL#     USERNAME OR PROGRAM                                   VALUE SESSION START TIME
------------------- ------------------------------------------------ ---------- -------------------
152,987             SYS                                                 1641408 2011-04-21 18:40:59
138,1               [email protected] (MMON)                        1098800 2011-04-20 19:41:33
25,57               SYS                                                  950960 2011-04-21 19:56:59
125,7               [email protected] (CJQ0)                         919152 2011-04-20 19:41:50
136,1               [email protected] (SMON)                         722664 2011-04-20 19:41:33
150,5               [email protected] (Q000)                         640424 2011-04-20 19:41:46
23,1                [email protected] (Q001)                         378480 2011-04-20 19:41:46
12,1                [email protected] (RECO)                         314784 2011-04-20 19:41:33
29,465              SYS                                                  281144 2011-04-21 19:52:16
22,1                [email protected] (QMNC)                         281144 2011-04-20 19:41:45
149,1               [email protected] (RCBG)                         247504 2011-04-20 19:41:45
128,1               [email protected] (DBRM)                         247504 2011-04-20 19:41:33
33,59               [email protected] (W000)                         247504 2011-04-21 19:55:14
31,1                [email protected] (SMCO)                         247504 2011-04-20 19:43:52
158,183             SYS                                                  182016 2011-04-21 07:44:57
146,1               [email protected] (ARC3)                         182016 2011-04-20 19:41:43
145,1               [email protected] (ARC1)                         182016 2011-04-20 19:41:43
144,1               [email protected] (O000)                         182016 2011-04-20 19:41:36
143,23              SYS                                                  182016 2011-04-21 07:45:01
142,1               [email protected] (LCK0)                         182016 2011-04-20 19:41:34
140,1               [email protected] (MARK)                         182016 2011-04-20 19:41:33
137,1               [email protected] (RBAL)                         182016 2011-04-20 19:41:33
135,1               [email protected] (LGWR)                         182016 2011-04-20 19:41:33
134,1               [email protected] (MMAN)                         182016 2011-04-20 19:41:33
133,1               [email protected] (RMS0)                         182016 2011-04-20 19:41:33
131,1               [email protected] (LMS0)                         182016 2011-04-20 19:41:33
130,1               [email protected] (LMON)                         182016 2011-04-20 19:41:33
129,1               [email protected] (ACMS)                         182016 2011-04-20 19:41:33
127,1               [email protected] (GEN0)                         182016 2011-04-20 19:41:33
126,1               [email protected] (PSP0)                         182016 2011-04-20 19:41:32
21,1                [email protected] (GTX0)                         182016 2011-04-20 19:41:45
19,5                [email protected] (ARC2)                         182016 2011-04-20 19:41:43
18,3                [email protected] (O001)                         182016 2011-04-20 19:41:37
17,7                [email protected] (ARC0)                         182016 2011-04-20 19:41:42
16,1                [email protected] (RSMN)                         182016 2011-04-20 19:41:35
14,1                [email protected] (MMNL)                         182016 2011-04-20 19:41:33
13,1                [email protected] (ASMB)                         182016 2011-04-20 19:41:33
11,1                [email protected] (CKPT)                         182016 2011-04-20 19:41:33
10,1                [email protected] (DBW0)                         182016 2011-04-20 19:41:33
9,1                 [email protected] (LMHB)                         182016 2011-04-20 19:41:33
7,1                 [email protected] (LMS1)                         182016 2011-04-20 19:41:33
6,1                 [email protected] (LMD0)                         182016 2011-04-20 19:41:33
5,1                 [email protected] (DIA0)                         182016 2011-04-20 19:41:33
4,1                 [email protected] (PING)                         182016 2011-04-20 19:41:33
3,1                 [email protected] (DIAG)                         182016 2011-04-20 19:41:33
2,1                 [email protected] (VKTM)                         182016 2011-04-20 19:41:33
1,1                 [email protected] (PMON)                         182016 2011-04-20 19:41:32

Current total and average values of concurrent UGA memory usage:

14225504 bytes (total) and ~302670 bytes (average), for ~47 sessions.

Current SGA structure sizings:

Total System Global Area  939495424 bytes
Fixed Size                  2232088 bytes
Variable Size             398459112 bytes
Database Buffers          532676608 bytes
Redo Buffers                6127616 bytes

Some initialization parameter values at instance startup:

large_pool_size=0
pga_aggregate_target=0
sga_target=0
shared_pool_size=0
sort_area_size=65536
streams_pool_size=0

Current Time: 2011.04.21-19:57:16

/* 可以从以上输出看到sid,serial=29,465会话的PGA内存使用量异常,达到了170M,
    虽然跟以上案例中的PGA泄露情况比较不算什么  */

/* 使用sid和serial定位到具体的操作系统进程号 */

SQL> select spid,pid,PGA_USED_MEM,PGA_MAX_MEM from v$process
  2  where addr=(select paddr from v$session where sid=&1 and serial#=&2);
Enter value for 1: 29
Enter value for 2: 465
old   2: where addr=(select paddr from v$session where sid=&1 and serial#=&2)
new   2: where addr=(select paddr from v$session where sid=29 and serial#=465)

SPID                            PID PGA_USED_MEM PGA_MAX_MEM
------------------------ ---------- ------------ -----------
26932                            48    129716228   130034996

1 row selected.

SQL> oradebug setospid 26932;
Oracle pid: 48, Unix process pid: 26932, image: [email protected] (TNS V1-V3)

SQL> oradebug dump heapdump 536870917;
Statement processed.

SQL> oradebug dump processstate 10;
Statement processed.

SQL> oradebug tracefile_name;
/s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ora_26932.trc

/* 接下来对堆转储文件进行分析,通过grep可以找出其中较大的SubHEAP子堆  */

[oracle@rh2 ~]$ egrep "HEAP DUMP heap name|Total heap size|Permanent space"
/s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ora_26932.trc

HEAP DUMP heap name="session heap"  desc=0x2ac0b37a67f8
Total heap size    =   130840
Permanent space    =    62680
HEAP DUMP heap name="Alloc environm"  desc=0x2ac0b37ce090
Total heap size    =     4040
Permanent space    =     4040
HEAP DUMP heap name="PLS UGA hp"  desc=0x2ac0b37be7f0
Total heap size    =     1992
Permanent space    =     1080
HEAP DUMP heap name="koh-kghu sessi"  desc=0x2ac0b37cf660
Total heap size    =     1128
Permanent space    =       80
HEAP DUMP heap name="pga heap"  desc=0xb7c8ba0
Total heap size    =  2689432
Permanent space    =   660560
HEAP DUMP heap name="Alloc environm"  desc=0x2ac0b35ba5c8
Total heap size    =  1706816
Permanent space    =      464
HEAP DUMP heap name="Alloc server h"  desc=0x2ac0b35b9000
Total heap size    =  1704400
Permanent space    =  1694816
HEAP DUMP heap name="diag pga"  desc=0x2ac0b32537e0
Total heap size    =    65448
Permanent space    =     3672
HEAP DUMP heap name="KFK_IO_SUBHEAP"  desc=0x2ac0b35eb2b0
Total heap size    =    10992
Permanent space    =       80
HEAP DUMP heap name="peshm.c:Proces"  desc=0x2ac0b35e7ad0
Total heap size    =     4000
Permanent space    =       80
HEAP DUMP heap name="KSFQ heap"  desc=0x2ac0b35c6d70
Total heap size    =     3256
Permanent space    =     3256
HEAP DUMP heap name="top call heap"  desc=0xb7ce3c0
Total heap size    =155918560
Permanent space    =      448
HEAP DUMP heap name="callheap"  desc=0xb7cd4c0
Total heap size    =152906784
Permanent space    =       80
HEAP DUMP heap name="TCHK^30c42b7a"  desc=0x2ac0b378ff48
Total heap size    =151414512
Permanent space    =       80
HEAP DUMP heap name="kggec.c.kggfa"  desc=0x2ac0b4e76ec8
Total heap size    =     1016
Permanent space    =      736
HEAP DUMP heap name="kxs-heap-c"  desc=0x2ac0b37800c0
Total heap size    =  1489464
Permanent space    =  1485928
HEAP DUMP heap name="top uga heap"  desc=0xb7ce5e0
Total heap size    =   131024
Permanent space    =       80
HEAP DUMP heap name="session heap"  desc=0x2ac0b37a67f8
Total heap size    =   130840
Permanent space    =    62680
HEAP DUMP heap name="Alloc environm"  desc=0x2ac0b37ce090
Total heap size    =     4040
Permanent space    =     4040
HEAP DUMP heap name="PLS UGA hp"  desc=0x2ac0b37be7f0
Total heap size    =     1992
Permanent space    =     1080
HEAP DUMP heap name="koh-kghu sessi"  desc=0x2ac0b37cf660
Total heap size    =     1128
Permanent space    =       80
HEAP DUMP heap name="SQLA^30c42b7a"  desc=0x6f4c3ab8
Total heap size    =  4919904
Permanent space    =       80
HEAP DUMP heap name="KGLH0^30c42b7a"  desc=0x6ef44290
Total heap size    =     4032
Permanent space    =     2648


/* 以上heapdump表明TCHK^30c42b7a子堆占用了PGA中绝大多数的内存,
    其中子堆的包含结构为PGA->top call heap -> call heap  -> TCHK 
*/

/* 接着processstate dump还可以让我们了解该问题进程的最近活动历史,及之前所运行的SQL语句
    以便进一步诊断,以下为其调用堆栈      */

ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1900<-sspuser()+112<-__sighandler()<-qcsfccc()+206
<-qcsIsColInFro()+309<-qcsRslvColWithinQbc()+179<-qcsWeakColRslv()+94<-qcsRslvName()+2541<-qcsridn()+105
<-qcsraic()+455<-qcspqbDescendents()+527<-qcspqb()+260<-qcspqbDescendents()+2744<-qcspqb()+260
<-kkmdrv()+182<-opiSem()+1947<-opiprs()+293<-__PGOSF632_kksParseChildCursor()+572<-rpiswu2()+1618
<-kksLoadChild()+5167<-kxsGetRuntimeLock()+2066<-kksfbc()+14527<-kkspsc0()+2025<-kksParseCursor()+144
<-opiosq0()+2027<-kpooprx()+274<-kpoal8()+800<-opiodr()+910<-ttcpip()+2289<-opitsk()+1670<-opiino()+966
<-opiodr()+910<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+252<-main()+201<-__libc_start_main()+244
<-_start()+36

通过以上获取的信息"call heap-> TCHK"可以从MOS查找到11.2上的”Bug 11782790: ORA-4030 IS GENERATED IN HARD PARSE"和"Bug 12360198: ORA-04030 (TCHK^82665CD8,CHEDEF : QCUATC)";虽然我们这里故意为之而非真实的Bug,也可以看出"TCHK subheap"在11.2中充当SQL解析时的临时调用子堆(parse call subheap)。

to be continued.....


Posted

in

by

Tags:

Comments

5 responses to “Probe how does your PGA consume”

  1. admin Avatar
    admin

    Hdr: 12360198 11.2.0.2 RDBMS 11.2.0.2 PRODID-5 PORTID-226 ORA-4030
    Abstract: ORA-4030 (TCHK^82665CD8,CHEDEF : QCUATC)

    PROBLEM:
    ——–
    Customer is hit with frequent ORA-4030 (TCHK^82665cd8,chedef : qcuatc) ,
    whenever he runs the query .

    DIAGNOSTIC ANALYSIS:
    ——————–
    Requested to have the heapdump level 536870917 set to extract the trace file

    Since its around 6 GB , the awk script is running forever.

    I tried to get the sample output using grep command it shows at many area the
    size is over allocated.

    TCHK^82665cd8 ds=0x2ad1cca8ff48 dsprt=0xb7cd520
    TCHK^82665cd8 ds=0x2ad1cca8ff48 dsprt=0xb7cd520
    TCHK^82665cd8 ds=0x2ad1cca8ff48 dsprt=0xb7cd520
    TCHK^82665cd8 ds=0x2ad1cca8ff48 dsprt=0xb7cd520
    TCHK^82665cd8 ds=0x2ad1cca8ff48 dsprt=0xb7cd520
    TCHK^82665cd8 ds=0x2ad1cca8ff48 dsprt=0xb7cd520
    TCHK^82665cd8 ds=0x2ad1cca8ff48 dsprt=0xb7cd520
    4051 MB, 1 heap: “TCHK^82665cd8 ” ##### —- High Value .

    3463 MB, 90298598 chunks: “chedef : qcuatc ” ##### — High Value
    3463 MB, 90298598 chunks: “chedef : qcuatc ” ##### — High
    Value.

    From the customer update :
    ========================

    nls_comp string LINGUISTIC
    nls_sort string BINARY_CI

    They have set the above settings in their database , they have used the
    workaround as specified in the Bug 9055646, closed as duplicated of
    unpublished Bug 9279524

    Set NLS_SORT=BINARY and NLS_COMP=BINARY.

    They had no luck , since the Bug 9055646, closed as duplicated of unpublished
    Bug 9279524 is already fixed on 11.2.0.2 , the issue is not relevant here .

    ++ customer also tried the workaround as per the bug which was shown in the
    release 11.2.0.2

    bug 11782790

    alter session set “_push_join_union_view” = false;
    alter session set “_optimizer_push_pred_cost_based” = false;
    alter session set “_optimizer_multi_level_push_pred” = false;
    alter session set “_optimizer_cost_based_transformation” = off;

    no luck on that as well .

    WORKAROUND:
    ———–
    None

    RELATED BUGS:
    ————-
    bug 11782790
    Bug 9279524 ++ this is fixed in 11.2.0.2

    REPRODUCIBILITY:
    —————-
    100 % at customer site

    TEST CASE:
    ———-
    None

    STACK TRACE:
    ————
    calling call entry argument values in hex

    location type point (? means dubious value)

    ——————– ——– ——————–
    —————————-
    mmap(offset=207937536, len=8192) failed with errno=12 for the file
    oracleTRAINING1
    mmap(offset=207937536, len=8192) failed with errno=12 for the file
    oracleTRAINING1
    skdstdst()+36 call kgdsdst() 000000000 ? 000000000 ?
    7FFFD44DD398 ? 000000001 ?
    000000001 ? 000000002 ?
    ksedst1()+98 call skdstdst() 000000000 ? 000000000 ?
    7FFFD44DD398 ? 000000001 ?
    000000000 ? 000000002 ?
    ksedst()+34 call ksedst1() 000000000 ? 000000001 ?
    7FFFD44DD398 ? 000000001 ?
    000000000 ? 000000002 ?
    dbkedDefDump()+2741 call ksedst() 000000000 ? 000000001 ?
    7FFFD44DD398 ? 000000001 ?
    000000000 ? 000000002 ?
    ksedmp()+36 call dbkedDefDump() 000000003 ? 000000002 ?
    7FFFD44DD398 ? 000000001 ?
    000000000 ? 000000002 ?
    0000000004708F80 call ksedmp() 000000003 ? 000000002 ?
    7FFFD44DD398 ? 000000001 ?
    000000000 ? 000000002 ?

  2. admin Avatar
    admin

    Hdr: 11782790 11.2.0.2 RDBMS 11.2.0.2 QRY OPTIMIZER PRODID-5 PORTID-226 ORA-4030
    Abstract: ORA-4030 IS GENERATED IN HARD PARSE

    *** 02/17/11 12:32 am ***
    —-

    PROBLEM:
    ——–
    The area of kxs-heap-c grows by executing SELECT sentence,
    and ORA-4030 is generated finally.

    *** 21:51:44.896
    63% 3039 MB, 150022 chunks: “permanent memory ” SQL
    kxs-heap-c ds=0x2aeecf7301e0 dsprt=0xb7cf3c0
    9% 447 MB, 11711738 chunks: “kkqctCollectColStats : st ”
    TCHK^72b863a6 ds=0x2aeecf73ff48 dsprt=0xb7ce4c0

    DIAGNOSTIC ANALYSIS:
    ——————–
    It was confirmed that the error had occurred at state=SYNTAX
    from Cursor dump.

    In this SELECT sentence, RIGHT OUTER JOIN, LEFT OUTER JOIN,
    and UNION ALL are used, and 12 tables are united.

    Following information is requested from the customer now.

    alter session set events ‘10261 trace name context forever, level 512000’;
    alter session set events ‘600 heapdump(536870941) errorstack(3)’;

    WORKAROUND:
    ———–
    n/a

    RELATED BUGS:
    ————-
    9914628

    REPRODUCIBILITY:
    —————-
    Rep? Platform RDBMS Ver.
    ——- ———– ——————
    Y(100%) Linux x86-64 11.2.0.2(customer)

    TEST CASE:
    ———-
    n/a

    STACK TRACE:
    ————
    It seems not to be output normally.

    mmap(offset=207941632, len=8192) failed with errno=12 for the file
    oracledmdb3
    mmap(offset=207941632, len=8192) failed with errno=12 for the file
    oracledmdb3
    skdstdst()+36 call kgdsdst() 000000000 ? 000000000 ?
    7FFF90BC0178 ? 000000001 ?
    000000001 ? 000000002 ?
    ksedst1()+98 call skdstdst() 000000000 ? 000000000 ?
    7FFF90BC0178 ? 000000001 ?
    000000000 ? 000000002 ?
    ksedst()+34 call ksedst1() 000000000 ? 000000001 ?
    7FFF90BC0178 ? 000000001 ?
    000000000 ? 000000002 ?
    dbkedDefDump()+2741 call ksedst() 000000000 ? 000000001 ?
    7FFF90BC0178 ? 000000001 ?
    000000000 ? 000000002 ?
    ksedmp()+36 call dbkedDefDump() 000000003 ? 000000002 ?
    7FFF90BC0178 ? 000000001 ?
    000000000 ? 000000002 ?
    0000000004714C90 call ksedmp() 000000003 ? 000000002 ?
    7FFF90BC0178 ? 000000001 ?
    000000000 ? 000000002 ?

    It won’t work well.

    I assume customer require several minutes before getting ORA-4030,
    so prepare 3 terminal and

    1. connect to db
    2.
    alter session set events ‘10235 level 65536’;
    3.
    prepare to run select.sql
    4. connect as sysdba from other terminal
    5.
    oradebug setsospid 6. run select.sql and wait for a while to see process size
    getting huge.
    7. take pstack against server process many times
    8. oradebug dump heapdump 0x20000001
    take same heapdump twice

    And I recommend you to gather object definition and statistics,
    problem will be reproduced without data.
    *** 02/17/11 01:10 am ***
    Query join about 10 tables/views, ansi left outer join, union all is used.

    Truning off cost based query transformation will be workaround.

    RELEASE NOTES:
    ]]Reuse of CBQT cost annotations does no duplicate query block traversals.
    REDISCOVERY INFORMATION:
    If parsing of complex, deeply nested queries with cost-based transformations
    is using up a lot of cpu and memory and possibly failing with out of memory
    ORA-4030 or ORA-4031 errors and disabling reuse of cost annotations avoids
    the problem, then you may be encountering this bug.
    WORKAROUND:
    Disable reuse of query block cost annotations using the hidden parameter
    _optimizer_reuse_cost_annotations = FALSE.

  3. maclean Avatar

    sort subheap dump

    alter session set workarea_size_policy=MANUAL;
    alter session set sort_area_size=52428800;
    alter session set sort_area_size=52428800;
    alter session set sort_area_retained_size=0;
    alter session set sort_area_retained_size=0;

    /* sort begin here! */

    select * from
    (select rownum rn from dual connect by level <900000) order by 1;

    oradebug setospid
    oradebug dump heapdump 536870917;

    [oracle@rh2 trace]$ egrep "HEAP DUMP heap name|Total heap size|Permanent space" /s01/orabase/diag/rdbms/prod/PROD1/trace/PROD1_ora_18647.trc
    HEAP DUMP heap name="session heap" desc=0x2b0f807367f8
    Total heap size = 65537368
    Permanent space = 52144
    HEAP DUMP heap name="kxs-heap-w" desc=0x2b0f80750500
    Total heap size = 62005672
    Permanent space = 43724488
    HEAP DUMP heap name="sort subheap" desc=0x2b0f80816258
    Total heap size = 18264512
    Permanent space = 18250032
    HEAP DUMP heap name="kxs-heap-p" desc=0x2b0f807510d0
    Total heap size = 7688
    Permanent space = 304
    HEAP DUMP heap name="kxs-heap-p" desc=0x2b0f80750e90
    Total heap size = 7592
    Permanent space = 640
    HEAP DUMP heap name="kxs-heap-p" desc=0x2b0f80751310
    Total heap size = 4952
    Permanent space = 208
    HEAP DUMP heap name="Alloc environm" desc=0x2b0f8075e090
    Total heap size = 4040
    Permanent space = 4040
    HEAP DUMP heap name="pga heap" desc=0xb7c8ba0
    Total heap size = 985528
    Permanent space = 660560
    HEAP DUMP heap name="diag pga" desc=0x2b0f801ea7e0
    Total heap size = 65448
    Permanent space = 3672
    HEAP DUMP heap name="Alloc environm" desc=0x2b0f805515c8
    Total heap size = 20560
    Permanent space = 464
    HEAP DUMP heap name="Alloc server h" desc=0x2b0f80550000
    Total heap size = 18104
    Permanent space = 1296
    HEAP DUMP heap name="KFK_IO_SUBHEAP" desc=0x2b0f805822b0
    Total heap size = 10992
    Permanent space = 80
    HEAP DUMP heap name="peshm.c:Proces" desc=0x2b0f8057ead0
    Total heap size = 4000
    Permanent space = 80
    HEAP DUMP heap name="KSFQ heap" desc=0x2b0f8055dd70
    Total heap size = 3256
    Permanent space = 3256
    HEAP DUMP heap name="top call heap" desc=0xb7ce3c0
    Total heap size = 524096
    Permanent space = 736
    HEAP DUMP heap name="callheap" desc=0xb7cd4c0
    Total heap size = 968
    Permanent space = 80
    HEAP DUMP heap name="top uga heap" desc=0xb7ce5e0
    Total heap size = 65577512
    Permanent space = 80
    HEAP DUMP heap name="session heap" desc=0x2b0f807367f8
    Total heap size = 65537368
    Permanent space = 52144
    HEAP DUMP heap name="kxs-heap-w" desc=0x2b0f80750500
    Total heap size = 62005672
    Permanent space = 43724488
    HEAP DUMP heap name="sort subheap" desc=0x2b0f80816258
    Total heap size = 18264512
    Permanent space = 18250032
    HEAP DUMP heap name="kxs-heap-p" desc=0x2b0f807510d0
    Total heap size = 7688
    Permanent space = 304
    HEAP DUMP heap name="kxs-heap-p" desc=0x2b0f80750e90
    Total heap size = 7592
    Permanent space = 640
    HEAP DUMP heap name="kxs-heap-p" desc=0x2b0f80751310
    Total heap size = 4952
    Permanent space = 208
    HEAP DUMP heap name="Alloc environm" desc=0x2b0f8075e090
    Total heap size = 4040
    Permanent space = 4040

    alter session set workarea_size_policy=MANUAL;
    alter session set sort_area_size=52428800;
    alter session set sort_area_size=52428800;
    alter session set sort_area_retained_size=52428800;
    alter session set sort_area_retained_size=52428800;

    / * another sort with large sort_area_retained_size */

    select * from
    (select rownum rn from dual connect by level <900000) order by 1;

    HEAP DUMP heap name="session heap" desc=0x2ba5848667f8
    Total heap size = 65471896
    Permanent space = 46768
    HEAP DUMP heap name="kxs-heap-w" desc=0x2ba584880500
    Total heap size = 62005672
    Permanent space = 43724488
    HEAP DUMP heap name="sort subheap" desc=0x2ba584886778
    Total heap size = 18264512
    Permanent space = 18250032
    HEAP DUMP heap name="Alloc environm" desc=0x2ba58488e090
    Total heap size = 4040
    Permanent space = 4040
    HEAP DUMP heap name="PLS UGA hp" desc=0x2ba58487e7f0
    Total heap size = 1992
    Permanent space = 1080
    HEAP DUMP heap name="koh-kghu sessi" desc=0x2ba58488f660
    Total heap size = 1128
    Permanent space = 80
    HEAP DUMP heap name="kxs-heap-p" desc=0x2ba5848801f0
    Total heap size = 984
    Permanent space = 80
    HEAP DUMP heap name="pga heap" desc=0xb7c8ba0
    Total heap size = 985528
    Permanent space = 660560
    HEAP DUMP heap name="diag pga" desc=0x2ba5843177e0
    Total heap size = 57232
    Permanent space = 3672
    HEAP DUMP heap name="Alloc environm" desc=0x2ba58467e5c8
    Total heap size = 12192
    Permanent space = 464
    HEAP DUMP heap name="Alloc server h" desc=0x2ba58467d000
    Total heap size = 9816
    Permanent space = 1296
    HEAP DUMP heap name="KFK_IO_SUBHEAP" desc=0x2ba5846af2b0
    Total heap size = 10992
    Permanent space = 80
    HEAP DUMP heap name="peshm.c:Proces" desc=0x2ba5846abad0
    Total heap size = 4000
    Permanent space = 80
    HEAP DUMP heap name="KSFQ heap" desc=0x2ba58468ad70
    Total heap size = 3256
    Permanent space = 3256
    HEAP DUMP heap name="top call heap" desc=0xb7ce3c0
    Total heap size = 65512
    Permanent space = 448
    HEAP DUMP heap name="callheap" desc=0xb7cd4c0
    Total heap size = 968
    Permanent space = 80
    HEAP DUMP heap name="top uga heap" desc=0xb7ce5e0
    Total heap size = 65512000
    Permanent space = 80
    HEAP DUMP heap name="session heap" desc=0x2ba5848667f8
    Total heap size = 65471896
    Permanent space = 46768
    HEAP DUMP heap name="kxs-heap-w" desc=0x2ba584880500
    Total heap size = 62005672
    Permanent space = 43724488
    HEAP DUMP heap name="sort subheap" desc=0x2ba584886778
    Total heap size = 18264512
    Permanent space = 18250032
    HEAP DUMP heap name="Alloc environm" desc=0x2ba58488e090
    Total heap size = 4040
    Permanent space = 4040
    HEAP DUMP heap name="PLS UGA hp" desc=0x2ba58487e7f0
    Total heap size = 1992
    Permanent space = 1080
    HEAP DUMP heap name="koh-kghu sessi" desc=0x2ba58488f660
    Total heap size = 1128
    Permanent space = 80
    HEAP DUMP heap name="kxs-heap-p" desc=0x2ba5848801f0
    Total heap size = 984
    Permanent space = 80

  4. admin Avatar
    admin

    1) pga_aggregate_target is a target, as opposed to a hard limit – so it isn’t unusual to go above
    that. 13G above that, now that’s unusual though! There IS an enhancement request in, to make
    a hard-limit setting, but that does not currently exist.
    2) There is a known bug in 10203 with certain statements burning up memory – bug 5947623 – however,
    the 10203/aix version of this patch is 64-bit, and the SR header says you are on 32-bit, so that isn’t
    an option….and 10203 is old enough that I can’t get a new version of the patch made.

    So at this point, there’s a couple of things we can do:
    1) Can you re-create this, or is it a one-time event that cannot be re-created to troubleshoot?
    2) Can you upgrade to 10205 so that you’re “current”, to see if it still happens there, and if it does,
    we’d be in position to file a new bug?

    Oh good (about being 64 bit) – I’ll update the SR header with that. Given that, I’d install the
    patch for – bug 5947623 – we can’t say for sure why it happened, but we can try to stop
    it from happening again with that patch! 🙂

  5. […] Maclean Slide:Upgrade 11.2.0.1 GI/CRS to 11.2.0.2 in Linux 了解Oracle RAC Brain Split Resolution Probe how does your PGA consume Does Rman Backup benefit from Large Pool? Oracle Enterprise Manager Cloud Control 12c(12.1) […]

Leave a Reply

Your email address will not be published. Required fields are marked *