有应用人员反映某套Linux上的11.2.0.1数据库系统中出现了UNION ALL后返回的结果集不正确的问题,我们具体分析下出现问题的其中一条语句:
SELECT MTL_SECONDARY_INVENTORIES.SECONDARY_INVENTORY_NAME, MTL_SECONDARY_INVENTORIES.ORGANIZATION_ID, MTL_SECONDARY_INVENTORIES.DESCRIPTION, MTL_SECONDARY_INVENTORIES.AVAILABILITY_TYPE, MTL_SECONDARY_INVENTORIES.MATERIAL_ACCOUNT, MTL_SECONDARY_INVENTORIES.MATERIAL_OVERHEAD_ACCOUNT, MTL_SECONDARY_INVENTORIES.RESOURCE_ACCOUNT, MTL_SECONDARY_INVENTORIES.OVERHEAD_ACCOUNT, MTL_SECONDARY_INVENTORIES.OUTSIDE_PROCESSING_ACCOUNT, MTL_SECONDARY_INVENTORIES.ASSET_INVENTORY, MTL_SECONDARY_INVENTORIES.EXPENSE_ACCOUNT, MTL_SECONDARY_INVENTORIES.ENCUMBRANCE_ACCOUNT, MTL_SECONDARY_INVENTORIES.ATTRIBUTE3, MTL_SECONDARY_INVENTORIES.ATTRIBUTE5, WORKFLOW_START_TIMES.WORKFLOW_START_TIME FROM REPEMEAERP.MTL_SECONDARY_INVENTORIES, REPEMEAERP.WORKFLOW_START_TIMES WHERE MTL_SECONDARY_INVENTORIES.DW_UPDATE_DT > TO_DATE('01/01/1900 00:00:00', 'MM/DD/YYYY HH24:MI:SS') AND MTL_SECONDARY_INVENTORIES.DW_UPDATE_DT <= WORKFLOW_START_TIMES.WORKFLOW_START_TIME AND WORKFLOW_START_TIMES.WORKFLOW_NAME = LTRIM(RTRIM('w_int_FreqBatch_EMEA')) /*以上是QUERY A*/ UNION ALL /*以下是QUERY B*/ SELECT DISTINCT 'WORKORDERS', MTL_SECONDARY_INVENTORIES.ORGANIZATION_ID, 'WORK ORDERS WITH WIP AS CATEGORY VALUE', 1, 0, 0, 0, 0, 0, 1, 0, 0, 'MOI', '0', WORKFLOW_START_TIMES.WORKFLOW_START_TIME FROM REPEMEAERP.MTL_SECONDARY_INVENTORIES, EIMMAINT.WORKFLOW_START_TIMES WHERE MTL_SECONDARY_INVENTORIES.DW_UPDATE_DT > TO_DATE('01/01/1900 00:00:00', 'MM/DD/YYYY HH24:MI:SS') AND MTL_SECONDARY_INVENTORIES.DW_UPDATE_DT <= WORKFLOW_START_TIMES.WORKFLOW_START_TIME AND WORKFLOW_START_TIMES.WORKFLOW_NAME = LTRIM(RTRIM('w_int_FreqBatch_EMEA')) / 138 rows selected.
以上查询语句中,QUERY A部分(也就是UNION ALL之前的SELECT语句)单独查询时返回返回69条记录,QUERY B部分单独查询时返回15记录,UNION ALL后返回的结果却是138条记录,而非84条记录。实际上这套系统也是最近才从10g迁移到11gr2上,之前在10g中同样的应用没有出过类似的问题,可以猜测是11g中新引入的某种特性存在可能引发wrong result的Bug。
具体思路虽然有了,但仍无法确定问题的关键所在;我们来看看该SQL的执行计划:
----------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7 | 2443 | 52 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 7 | 2443 | 52 (0)| 00:00:01 | |* 2 | TABLE ACCESS FULL | WORKFLOW_START_TIMES | 1 | 29 | 48 (0)| 00:00:01 | | 3 | VIEW | VW_JF_SET$9BAED2EA | 1 | 320 | 4 (0)| 00:00:01 | | 4 | UNION ALL PUSHED PREDICATE | | | | | | |* 5 | FILTER | | | | | | | 6 | TABLE ACCESS BY INDEX ROWID| MTL_SECONDARY_INVENTORIES | 3 | 336 | 2 (0)| 00:00:01 | |* 7 | INDEX RANGE SCAN | IDX_MTL_SECONDARY_INVENTORIES | 1 | | 1 (0)| 00:00:01 | |* 8 | FILTER | | | | | | | 9 | TABLE ACCESS BY INDEX ROWID| MTL_SECONDARY_INVENTORIES | 3 | 36 | 2 (0)| 00:00:01 | |* 10 | INDEX RANGE SCAN | IDX_MTL_SECONDARY_INVENTORIES | 1 | | 1 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("WORKFLOW_START_TIMES"."WORKFLOW_NAME"='w_int_FreqBatch_EMEA') 5 - filter(TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')<"WORKFLOW_START_TIMES"."WORKFLOW_START_TIME") 7 - access("MTL_SECONDARY_INVENTORIES"."DW_UPDATE_DT">TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "MTL_SECONDARY_INVENTORIES"."DW_UPDATE_DT"<="WORKFLOW_START_TIMES"."WORKFLOW_START_TIME" ) 8 - filter(TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')<"WORKFLOW_START_TIMES"."WORKFLOW_START_TIME") 10 - access("MTL_SECONDARY_INVENTORIES"."DW_UPDATE_DT">TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "MTL_SECONDARY_INVENTORIES"."DW_UPDATE_DT"<="WORKFLOW_START_TIMES"."WORKFLOW_START_TIME" )
你可能从以上执行计划中发现了两处十分陌生的字眼:UNION ALL PUSHED PREDICATE和VW_JF_SET$。它们是什么!?
先来说说JF,JF是join factorization的缩写,你可以把它翻译作链接因式分解,如果你学过离散数学或者数据库原理的话,那么这种在11.2.0.1中最新推出的基于成本的变换操作对你来说并不陌生。用公式的样式来表达大概是下面这样:
YYA,YYB和YYC是3个关联的数据对象亦或者是3个关联的结果集; (YYA JOIN YYB) UNION [ALL] (YYA JOIN YYC) 可以转换成为: YYA JOIN (YYB UNION [ALL] YYC)
这样做YYA部分只需要读取一次,还可以少做一次JOIN,听上去是挺不错的吧!
下面我们来看一个Oracle使用join factorization的十分简单的实例:
SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production PL/SQL Release 11.2.0.1.0 - Production CORE 11.2.0.1.0 Production TNS for Linux: Version 11.2.0.1.0 - Production NLSRTL Version 11.2.0.1.0 - Production SQL> drop table yya; drop table yya * ERROR at line 1: ORA-00942: table or view does not exist SQL> drop table yyb; drop table yyb * ERROR at line 1: ORA-00942: table or view does not exist SQL> create table yya as select rownum id1,rownum id2,rownum id3 from dual connect by level<=20000; Table created. SQL> create table yyb as select rownum id1,rownum id2,rownum id3 from dual connect by level<=20000; Table created. SQL> explain plan for 2 select * from yya ,yyb where yya.id1=yyb.id1 3 union all 4 select * from yya, yyb where yya.id1=yyb.id1; Explained. SQL> set linesize 100 pagesize 1400; SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------- Plan hash value: 744914999 ------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 40000 | 2500K| 49 (3)| 00:00:01 | |* 1 | HASH JOIN | | 40000 | 2500K| 49 (3)| 00:00:01 | | 2 | TABLE ACCESS FULL | YYA | 20000 | 234K| 16 (0)| 00:00:01 | | 3 | VIEW | VW_JF_SET$6E3F6682 | 40000 | 2031K| 32 (0)| 00:00:01 | | 4 | UNION-ALL | | | | | | | 5 | TABLE ACCESS FULL| YYB | 20000 | 761K| 16 (0)| 00:00:01 | | 6 | TABLE ACCESS FULL| YYB | 20000 | 761K| 16 (0)| 00:00:01 | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("YYA"."ID1"="ITEM_1") /*执行计划中出现了VW_JF_SET$F22B2A93,Oracle选择了使用join factorization,该执行计划总成本49*/ SQL> alter session set "_optimizer_join_factorization"=false; Session altered. /*隐藏参数_optimizer_join_factorization决定了优化器是否可以选用join factorization,现在我们禁用它*/ SQL> explain plan for 2 select * from yya join yyb on yya.id1=yyb.id1 3 union all 4 select * from yya join yyb on yya.id1=yyb.id1; Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------- Plan hash value: 3439541885 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 40000 | 1992K| 66 (52)| 00:00:01 | | 1 | UNION-ALL | | | | | | |* 2 | HASH JOIN | | 20000 | 996K| 33 (4)| 00:00:01 | | 3 | TABLE ACCESS FULL| YYA | 20000 | 234K| 16 (0)| 00:00:01 | | 4 | TABLE ACCESS FULL| YYB | 20000 | 761K| 16 (0)| 00:00:01 | |* 5 | HASH JOIN | | 20000 | 996K| 33 (4)| 00:00:01 | | 6 | TABLE ACCESS FULL| YYA | 20000 | 234K| 16 (0)| 00:00:01 | | 7 | TABLE ACCESS FULL| YYB | 20000 | 761K| 16 (0)| 00:00:01 | ---------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("YYA"."ID1"="YYB"."ID1") 5 - access("YYA"."ID1"="YYB"."ID1") /*禁用链接因式分解后,Oracle使用了常规的"笨办法",成本上升到66*/ /*有趣的是下面的测试*/ SQL> alter session set "_optimizer_join_factorization"=true; Session altered. SQL> create table yyc as select * from yyb; Table created. SQL> explain plan for 2 select * from yya,yyc where yya.id1=yyc.id1 3 union all 4 select * from yya,yyb where yya.id1=yyb.id1; Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------- Plan hash value: 4240055274 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 40000 | 1992K| 66 (52)| 00:00:01 | | 1 | UNION-ALL | | | | | | |* 2 | HASH JOIN | | 20000 | 996K| 33 (4)| 00:00:01 | | 3 | TABLE ACCESS FULL| YYA | 20000 | 234K| 16 (0)| 00:00:01 | | 4 | TABLE ACCESS FULL| YYC | 20000 | 761K| 16 (0)| 00:00:01 | |* 5 | HASH JOIN | | 20000 | 996K| 33 (4)| 00:00:01 | | 6 | TABLE ACCESS FULL| YYA | 20000 | 234K| 16 (0)| 00:00:01 | | 7 | TABLE ACCESS FULL| YYB | 20000 | 761K| 16 (0)| 00:00:01 | ---------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("YYA"."ID1"="YYC"."ID1") 5 - access("YYA"."ID1"="YYB"."ID1") /*confused,Oracle有什么理由在这里反而不用join factorization了呢?看起来短期内join factorization的实际应用还有待"商榷" */ /*10053事件能解释这一问题吗?*/ SQL> alter system flush shared_pool; System altered. SQL> oradebug setmypid; Statement processed. SQL> oradebug event 10053 trace name context forever,level 1; Statement processed. SQL> explain plan for 2 select * from yya join yyb on yya.id1=yyb.id1 3 union all 4 select * from yya join yyc on yya.id1=yyc.id1; Explained. SQL> oradebug event 10053 trace name context off; Statement processed. SQL> oradebug tracefile_name; /home/maclean/app/maclean/diag/rdbms/prod/PROD/trace/PROD_ora_7907.trc view /home/maclean/app/maclean/diag/rdbms/prod/PROD/trace/PROD_ora_7907.trc *********************************** Cost-Based Join Factorization *********************************** Join-Factorization on query block SET$1 (#1) JF: Using search type: exhaustive JF: Generate basic transformation units Validating JF unit: (branch: {2, 3} table: {YYA, YYA}) rejected: join predicates do not match JF: Generate transformation units from basic units JF: No state generated. /*优化器认为其链接谓词不符合使用join factorization的条件,JF题案被驳回,"悬案"!*/
join factorization是很棒的新技术,这点没错,但新技术往往又是horrible(可怕的),最近我常用这个词。我们的问题是不是这个新来的引起的呢?通过join factorization关键字检索MOS,可以发现一个今年(2010)3月出现的Bug 9504322,quote:
Hdr: 9504322 11.2.0.1 RDBMS 11.2.0.1 QRY OPTIMIZER PRODID-5 PORTID-226
Abstract: WRONG RESULTS WITH UNION_ALL AND INLINE VIEWS*** 03/24/10 05:38 am ***
PROBLEM:
--------
Wrong results on 11.2 for queries of type:SELECT * FROM
(
SELECT ... FROM view, table WHERE ...
UNION ALL
SELECT ... FROM view, table WHERE NOT ...
);DIAGNOSTIC ANALYSIS:
--------------------
Problem seen between 10.2.0.4 and 11.2.0.1.
If we remove the use of inline view the correct results are returned.WORKAROUND:
-----------
N/ARELATED BUGS:
-------------REPRODUCIBILITY:
----------------
It is reproducing on generic 11.2.0.1
呵呵,似乎有点眉目了,不过实践是检验真理的唯一标准:
SQL> alter session set "_optimizer_join_factorization"=true; Session altered. SELECT MTL_SECONDARY_INVENTORIES.SECONDARY_INVENTORY_NAME, MTL_SECONDARY_INVENTORIES.ORGANIZATION_ID, MTL_SECONDARY_INVENTORIES.DESCRIPTION, MTL_SECONDARY_INVENTORIES.AVAILABILITY_TYPE, MTL_SECONDARY_INVENTORIES.MATERIAL_ACCOUNT, MTL_SECONDARY_INVENTORIES.MATERIAL_OVERHEAD_ACCOUNT, MTL_SECONDARY_INVENTORIES.RESOURCE_ACCOUNT, MTL_SECONDARY_INVENTORIES.OVERHEAD_ACCOUNT, MTL_SECONDARY_INVENTORIES.OUTSIDE_PROCESSING_ACCOUNT, MTL_SECONDARY_INVENTORIES.ASSET_INVENTORY, MTL_SECONDARY_INVENTORIES.EXPENSE_ACCOUNT, MTL_SECONDARY_INVENTORIES.ENCUMBRANCE_ACCOUNT, MTL_SECONDARY_INVENTORIES.ATTRIBUTE3, MTL_SECONDARY_INVENTORIES.ATTRIBUTE5, WORKFLOW_START_TIMES.WORKFLOW_START_TIME FROM REPEMEAERP.MTL_SECONDARY_INVENTORIES, REPEMEAERP.WORKFLOW_START_TIMES WHERE MTL_SECONDARY_INVENTORIES.DW_UPDATE_DT > TO_DATE('01/01/1900 00:00:00', 'MM/DD/YYYY HH24:MI:SS') AND MTL_SECONDARY_INVENTORIES.DW_UPDATE_DT <= WORKFLOW_START_TIMES.WORKFLOW_START_TIME AND WORKFLOW_START_TIMES.WORKFLOW_NAME = LTRIM(RTRIM('w_int_FreqBatch_EMEA')) /*以上是QUERY A*/ UNION ALL /*以下是QUERY B*/ SELECT DISTINCT 'WORKORDERS', MTL_SECONDARY_INVENTORIES.ORGANIZATION_ID, 'WORK ORDERS WITH WIP AS CATEGORY VALUE', 1, 0, 0, 0, 0, 0, 1, 0, 0, 'MOI', '0', WORKFLOW_START_TIMES.WORKFLOW_START_TIME FROM REPEMEAERP.MTL_SECONDARY_INVENTORIES, EIMMAINT.WORKFLOW_START_TIMES WHERE MTL_SECONDARY_INVENTORIES.DW_UPDATE_DT > TO_DATE('01/01/1900 00:00:00', 'MM/DD/YYYY HH24:MI:SS') AND MTL_SECONDARY_INVENTORIES.DW_UPDATE_DT <= WORKFLOW_START_TIMES.WORKFLOW_START_TIME AND WORKFLOW_START_TIMES.WORKFLOW_NAME = LTRIM(RTRIM('w_int_FreqBatch_EMEA')) / 138 rows selected.
结果和我们猜想的大相径庭,join factorization并非罪魁,找不到终点让我们回到原点。
至此UNION ALL PUSHED PREDICATE有了极大的嫌疑,什么是PUSH PREDICATE?我把它叫做谓词前推,这玩样最早出现在10g上,但一直问题多多!它到底是何种OPERATION呢?让我们来看看下面的例子:
SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production PL/SQL Release 11.2.0.1.0 - Production CORE 11.2.0.1.0 Production TNS for Linux: Version 11.2.0.1.0 - Production NLSRTL Version 11.2.0.1.0 - Production SQL> create table youyus (t1 int,t2 varchar2(20)); Table created. SQL> alter table youyus add primary key(t1); Table altered. SQL> explain plan for 2 select * 3 from youyus 4 union all 5 select * from youyus; Explained. /*在之后的语句中将用到这个子查询*/ SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------- Plan hash value: 1959159425 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2 | 50 | 4 (50)| 00:00:01 | | 1 | UNION-ALL | | | | | | | 2 | TABLE ACCESS FULL| YOUYUS | 1 | 25 | 2 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL| YOUYUS | 1 | 25 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------- /*在之后的语句中将用到这个子查询,这里它的"原始"执行计划十分简单*/ SQL> explain plan for 2 select v2.t1, v2.t2 3 from (select t1 from youyus where rownum=1) v1, 4 (select * 5 from youyus 6 union all 7 select * from youyus) v2 8 where v1.t1 = v2.t1; Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------- Plan hash value: 2456530141 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 27 | 1 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 1 | 27 | 1 (0)| 00:00:01 | | 2 | VIEW | | 1 | 13 | 1 (0)| 00:00:01 | |* 3 | COUNT STOPKEY | | | | | | | 4 | INDEX FULL SCAN | SYS_C0010819 | 1 | 13 | 1 (0)| 00:00:01 | | 5 | VIEW | | 1 | 14 | 0 (0)| 00:00:01 | | 6 | UNION ALL PUSHED PREDICATE | | | | | | | 7 | TABLE ACCESS BY INDEX ROWID| YOUYUS | 1 | 25 | 0 (0)| 00:00:01 | |* 8 | INDEX UNIQUE SCAN | SYS_C0010819 | 1 | | 0 (0)| 00:00:01 | | 9 | TABLE ACCESS BY INDEX ROWID| YOUYUS | 1 | 25 | 0 (0)| 00:00:01 | |* 10 | INDEX UNIQUE SCAN | SYS_C0010819 | 1 | | 0 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter(ROWNUM=1) 8 - access("YOUYUS"."T1"="V1"."T1") 10 - access("YOUYUS"."T1"="V1"."T1") /* PUSHED PREDICATE将谓词逻辑前推到UNION ALL的子查询中,其优势在于可以避免全表扫描,利用索引*/ SQL> set linesize 100 pagesize 1400; SQL> SQL> explain plan for 2 select /*+ no_push_pred(v2) */ v2.t1, v2.t2 3 from (select t1 from youyus where rownum=1) v1, 4 (select * 5 from youyus 6 union all 7 select * from youyus) v2 8 where v1.t1 = v2.t1; Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------- Plan hash value: 2769827061 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 38 | 6 (17)| 00:00:01 | |* 1 | HASH JOIN | | 1 | 38 | 6 (17)| 00:00:01 | | 2 | VIEW | | 1 | 13 | 1 (0)| 00:00:01 | |* 3 | COUNT STOPKEY | | | | | | | 4 | INDEX FULL SCAN | SYS_C0010819 | 1 | 13 | 1 (0)| 00:00:01 | | 5 | VIEW | | 2 | 50 | 4 (0)| 00:00:01 | | 6 | UNION-ALL | | | | | | | 7 | TABLE ACCESS FULL| YOUYUS | 1 | 25 | 2 (0)| 00:00:01 | | 8 | TABLE ACCESS FULL| YOUYUS | 1 | 25 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("V1"."T1"="V2"."T1") 3 - filter(ROWNUM=1) /*no_push_pred hint让Oracle 放弃使用PUSHED PREDICATE,使用常规UNION-ALL操作后,子查询执行计划回归成全表扫描,整个计划成本上升*/
Leave a Reply