在调查一个逻辑从库(数据库版本10.2.0.4)的性能问题时,发现的一个sql语句:
select count(1) from (select a.ID from cms_article_compose a where A.ID IN (select aid from cms_article_keywords where upper(keyword) = upper('技术') or upper(keyword) = upper('技巧') or upper(keyword) = upper('评测')) AND (a.SITEMAP_ID_LEVEL_CODE between 8707375104 and 8724152319) and ((a.CATEGORY_ID_LEVEL_CODE between 21541945344 and 21558722559) or (a.CATEGORY_ID_LEVEL_CODE between 21592276992 and 21609054207) or (a.CATEGORY_ID_LEVEL_CODE between 21625831424 and 21642608639) or (a.CATEGORY_ID_LEVEL_CODE between 21659385856 and 21676163071)) and a.status = 5 and a.score >= 1 and a.id not in (select article_id from cms_block_article b where b.sitemap_id = 122 and b.template_id = 1626 and b.block_code = 'b_jszq' and b.type = 2 and sysdate between nvl(b.start_time, sysdate - 100) and nvl(b.end_time, sysdate + 100) ) and a.id not in (select a.ID from cms_article_compose a, cms_block_article b where a.status = 5 and b.sitemap_id = 122 and b.template_id = 1626 and b.block_code = 'b_jszq' and b.type = 1 and sysdate between nvl(b.start_time, sysdate - 100) and nvl(b.end_time, sysdate + 100) and b.article_id > -1 and a.id = b.article_id) ); COUNT(1) ---------- 966 select * from table(dbms_xplan.display_cursor(null,null,'typical iostats last')); ---------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------------------------------------------------- | 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:10.96 | 259K| |* 2 | HASH JOIN ANTI | | 1 | 1 | 312 (3)| 966 |00:00:10.96 | 259K| | 3 | NESTED LOOPS ANTI | | 1 | 1 | 307 (2)| 970 |00:00:10.95 | 259K| | 4 | NESTED LOOPS | | 1 | 1 | 305 (2)| 970 |00:00:10.79 | 245K| | 5 | SORT UNIQUE | | 1 | 51942 | 52 (2)| 81560 |00:00:00.98 | 342 | | 6 | INLIST ITERATOR | | 1 | | | 81584 |00:00:00.08 | 342 | |* 7 | INDEX RANGE SCAN | UK_CMS_ARTICLE_KEYWORDS | 3 | 51942 | 52 (2)| 81584 |00:00:00.01 | 342 | |* 8 | TABLE ACCESS BY INDEX ROWID| CMS_ARTICLE_COMPOSE | 81560 | 1 | 1 (0)| 970 |00:00:09.66 | 244K| |* 9 | INDEX UNIQUE SCAN | PK_ID | 81560 | 1 | 1 (0)| 81553 |00:00:02.75 | 163K| |* 10 | TABLE ACCESS BY INDEX ROWID | CMS_BLOCK_ARTICLE | 970 | 1 | 2 (0)| 0 |00:00:00.12 |14552| |* 11 | INDEX RANGE SCAN | IDX_CMS_BLOCK_ARTICLE | 970 | 33 | 1 (0)| 15520 |00:00:00.04 |1942 | | 12 | VIEW | VW_NSO_1 | 1 | 1 | 5 (0)| 16 |00:00:00.01 | 66 | | 13 | NESTED LOOPS | | 1 | 1 | 5 (0)| 16 |00:00:00.01 | 66 | |* 14 | TABLE ACCESS BY INDEX ROWID| CMS_BLOCK_ARTICLE | 1 | 1 | 4 (0)| 16 |00:00:00.01 | 16 | |* 15 | INDEX RANGE SCAN | IDX_CMS_BLOCK_ARTICLE | 1 | 33 | 2 (0)| 16 |00:00:00.01 | 3 | |* 16 | TABLE ACCESS BY INDEX ROWID| CMS_ARTICLE_COMPOSE | 16 | 1 | 1 (0)| 16 |00:00:00.01 | 50 | |* 17 | INDEX UNIQUE SCAN | PK_ID | 16 | 1 | 1 (0)| 16 |00:00:00.01 | 34 | --------------------------------------------------------------------------------------------------------------------------------
当然,这里逻辑IO偏高,主要是因为步骤8,9循环的次数太多了,主要的时间消耗也是发生在步骤8,9上了.当然,一开始我认为这里的问题和这个系统中Optimizer_Index_Caching=60有关,我在另一个系统中遇到过类似的问题,我在这篇文章http://blog.csdn.net/zhaosj1726/archive/2010/12/04/6054650.aspx中论及过(Optimizer_Index_Caching<>0,并且nl连接的内表使用唯一索引扫描的时候,索引访问的cost可能就会变成0,也就是说不管外表返回的card有多大,内表访问是完全免费的,导致优化器选择了性能很差的执行计划),一开始我没想细到:Optimizer_Index_Caching=60时,索引的访问代价确实可能按0计算了,可回访表的代价1还是有的呀,这里51942次,optimizer_index_cost_adj=50的情况下,访问内表的代价也有25971呢,这样和执行计划显示的cost相差很大的,不过一开始我没有细想这些,其实也是这里的执行计划的显示误导了我.一开始我关注的是步骤4实际返回了970行数据,而估算却是返回1行数据,注意这里的1很可能是代表0的,所以这个问题需要调查一下.
select a.ID from cms_article_compose a where A.ID IN (select aid from cms_article_keywords where upper(keyword) = upper('技术') or upper(keyword) = upper('技巧') or upper(keyword) = upper('评测')) AND (a.SITEMAP_ID_LEVEL_CODE between 8707375104 and 8724152319) and ((a.CATEGORY_ID_LEVEL_CODE between 21541945344 and 21558722559) or (a.CATEGORY_ID_LEVEL_CODE between 21592276992 and 21609054207) or (a.CATEGORY_ID_LEVEL_CODE between 21625831424 and 21642608639) or (a.CATEGORY_ID_LEVEL_CODE between 21659385856 and 21676163071)) and a.status = 5 and a.score >= 1; 这是那部分相应的sql语句. ------------------------------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------------------------------------------------------ |* 1 | HASH JOIN SEMI | | 1 | 1524 | 1260 (1)| 970 |00:00:00.58 | 12051 | |* 2 | TABLE ACCESS BY INDEX ROWID| CMS_ARTICLE_COMPOSE | 1 | 1601 | 1207 (1)| 20762 |00:00:00.17 | 11644 | |* 3 | INDEX RANGE SCAN | IND_ARTCOMPOSE_STATUS_SLEVEL | 1 | 4469 | 10 (0)| 26386 |00:00:00.03 | 87 | | 4 | INLIST ITERATOR | | 1 | | | 81586 |00:00:00.16 | 407 | |* 5 | INDEX RANGE SCAN | UK_CMS_ARTICLE_KEYWORDS | 3 | 51942 | 52 (2)| 81586 |00:00:00.08 | 407 | ------------------------------------------------------------------------------------------------------------------------------------
明显,这里的逻辑IO和执行时间都大幅的下降了.我们不去论及为什么作为一个sql的一部分时选择了nl + INDEX UNIQUE SCAN,而作为一个单独的sql时却选择了hash_sj.我们关注的是这里card估算是1524,而不是上面的1了.
select /*+ leading(b a) use_nl(a) index(a (id)) no_merge(b)*/a.ID from cms_article_compose a, (select distinct aid from cms_article_keywords where upper(keyword) = upper('技术') or upper(keyword) = upper('技巧') or upper(keyword) = upper('评测')) b where (a.SITEMAP_ID_LEVEL_CODE between 8707375104 and 8724152319) and ((a.CATEGORY_ID_LEVEL_CODE between 21541945344 and 21558722559) or (a.CATEGORY_ID_LEVEL_CODE between 21592276992 and 21609054207) or (a.CATEGORY_ID_LEVEL_CODE between 21625831424 and 21642608639) or (a.CATEGORY_ID_LEVEL_CODE between 21659385856 and 21676163071)) and a.status = 5 and a.score >= 1 and b.aid=a.id; ------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------------------------------------------------- | 1 | NESTED LOOPS | | 1 | 1601 | 26085 (1)| 970 |00:00:25.05 | 245K| | 2 | VIEW | | 1 | 51507 | 303 (2)| 81564 |00:00:05.38 | 342 | | 3 | HASH UNIQUE | | 1 | 51507 | 303 (2)| 81564 |00:00:04.64 | 342 | | 4 | INLIST ITERATOR | | 1 | | | 81588 |00:00:04.32 | 342 | |* 5 | INDEX RANGE SCAN | UK_CMS_ARTICLE_KEYWORDS | 3 | 51942 | 52 (2)| 81588 |00:00:02.45 | 342 | |* 6 | TABLE ACCESS BY INDEX ROWID| CMS_ARTICLE_COMPOSE | 81564 | 1 | 1 (0)| 970 |00:00:18.56 | 244K| |* 7 | INDEX UNIQUE SCAN | PK_ID | 81564 | 1 | 1 (0)| 81557 |00:00:08.07 | 163K| -------------------------------------------------------------------------------------------------------------------------------
这里是几乎相同的执行计划了(当然还是有些不同的,比如这里多了一个view行,这里是hash unique,而不是sort unique),这里的card估算是1601,也不是上面的1.其实,这里还有一件事情挺有意思的:同一个sql,不同的执行计划,card估算居然是不同的?看来还是需要做一个10053事件:
select count(1) from (select a6.ID from cms_article_compose a6 where a6.ID IN (select aid from cms_article_keywords where upper(keyword) = upper('技术') or upper(keyword) = upper('技巧') or upper(keyword) = upper('评测')) AND (a6.SITEMAP_ID_LEVEL_CODE between 8707375104 and 8724152319) and ((a6.CATEGORY_ID_LEVEL_CODE between 21541945344 and 21558722559) or (a6.CATEGORY_ID_LEVEL_CODE between 21592276992 and 21609054207) or (a6.CATEGORY_ID_LEVEL_CODE between 21625831424 and 21642608639) or (a6.CATEGORY_ID_LEVEL_CODE between 21659385856 and 21676163071)) and a6.status = 5 and a6.score >= 1 and a6.id not in (select article_id from cms_block_article b1 where b1.sitemap_id = 122 and b1.template_id = 1626 and b1.block_code = 'b_jszq' and b1.type = 2 and sysdate between nvl(b1.start_time, sysdate - 100) and nvl(b1.end_time, sysdate + 100) ) and a6.id not in (select a1.ID from cms_article_compose a1, cms_block_article b2 where a1.status = 5 and b2.sitemap_id = 122 and b2.template_id = 1626 and b2.block_code = 'b_jszq' and b2.type = 1 and sysdate between nvl(b2.start_time, sysdate - 100) and nvl(b2.end_time, sysdate + 100) and b2.article_id > -1 and a1.id = b2.article_id) );
我把sql稍微改写了一下,就是不同部分的同样对象名给了不同的别名,这样便于区分,因为10053 trace文件中都在论及某个对象时给出了别名的。============ Plan Table ============ ------------------------------------------------------------------+-----------------------------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | ------------------------------------------------------------------+-----------------------------------+ | 0 | SELECT STATEMENT | | | | 312 | | | 1 | SORT AGGREGATE | | 1 | 92 | | | | 2 | HASH JOIN ANTI | | 1 | 92 | 312 | 00:00:04 | | 3 | NESTED LOOPS ANTI | | 1 | 79 | 307 | 00:00:04 | | 4 | NESTED LOOPS | | 1 | 41 | 305 | 00:00:04 | | 5 | SORT UNIQUE | | 51K | 659K | 52 | 00:00:01 | | 6 | INLIST ITERATOR | | | | | | | 7 | INDEX RANGE SCAN | UK_CMS_ARTICLE_KEYWORDS| 51K | 659K | 52 | 00:00:01 | | 8 | TABLE ACCESS BY INDEX ROWID | CMS_ARTICLE_COMPOSE | 1 | 28 | 1 | 00:00:01 | | 9 | INDEX UNIQUE SCAN | PK_ID | 1 | | 1 | 00:00:01 | | 10 | TABLE ACCESS BY INDEX ROWID | CMS_BLOCK_ARTICLE | 1 | 38 | 2 | 00:00:01 | | 11 | INDEX RANGE SCAN | IDX_CMS_BLOCK_ARTICLE | 33 | | 1 | 00:00:01 | | 12 | VIEW | VW_NSO_1 | 1 | 13 | 5 | 00:00:01 | | 13 | NESTED LOOPS | | 1 | 47 | 5 | 00:00:01 | | 14 | TABLE ACCESS BY INDEX ROWID | CMS_BLOCK_ARTICLE | 1 | 38 | 4 | 00:00:01 | | 15 | INDEX RANGE SCAN | IDX_CMS_BLOCK_ARTICLE | 33 | | 2 | 00:00:01 | | 16 | TABLE ACCESS BY INDEX ROWID | CMS_ARTICLE_COMPOSE | 1 | 9 | 1 | 00:00:01 | | 17 | INDEX UNIQUE SCAN | PK_ID | 1 | | 1 | 00:00:01 | ------------------------------------------------------------------+-----------------------------------+
从后往前推(因为我们这里主要是关注这里为什么得到了1这个card,而不是这个执行计划是不是最优的)Final - All Rows Plan: Best join order: 7 Cost: 312.3564 Degree: 1 Card: 1.0000 Bytes: 92 Resc: 312.3564 Resc_io: 305.4525 Resc_cpu: 59150512 Resp: 312.3564 Resp_io: 305.4525 Resc_cpu: 59150512
我们往上找连接顺序7:Join order[7]: CMS_ARTICLE_KEYWORDS[CMS_ARTICLE_KEYWORDS]#3 CMS_ARTICLE_COMPOSE[A6]#0 CMS_BLOCK_ARTICLE[B1]#1 VW_NSO_1[VW_NSO_1]#2 SORT resource Sort statistics Sort width: 2455 Area size: 1048576 Max Area size: 429496320 Degree: 1 Blocks to Sort: 159 Row size: 25 Total Rows: 51942 --这个是排序的行数 Initial runs: 2 Merge passes: 1 IO Cost / pass: 88 Total IO sort cost: 247 Total CPU sort cost: 49141986 Total Temp space used: 2515000***************Now joining: CMS_ARTICLE_COMPOSE[A6]#0***************NL Join Outer table: Card: 1.00 Cost: 304.35 Resp: 304.35 Degree: 1 Bytes: 13 Inner table: CMS_ARTICLE_COMPOSE Alias: A6
Join Card: 0.03 = outer (1.00) * inner (1601.23) * sel (1.9415e-05)
Best:: JoinMethod: NestedLoop Cost: 305.35 Degree: 1 Resp: 305.35 Card: 0.03 Bytes: 41
Best so far: Table#: 3 cost: 304.3459 card: 1.0000 bytes: 13 Table#: 0 cost: 305.3465 card: 0.0311 bytes: 41 Table#: 1 cost: 307.3506 card: 0.0311 bytes: 79 Table#: 2 cost: 312.3564 card: 0.0310 bytes: 92这里的顺序和执行计划中的顺序是一致的,而且代价也是一致的,注意这里第一次连接的时候,外表也就是CMS_ARTICLE_KEYWORDS表估算的card是1.再往前看单表访问路径处CMS_ARTICLE_KEYWORDS的card的估算: Table: CMS_ARTICLE_KEYWORDS Alias: CMS_ARTICLE_KEYWORDS Card: Original: 6592549 Rounded: 51942 Computed: 51942.02 Non Adjusted: 51942.02
Best:: AccessPath: IndexRange Index: UK_CMS_ARTICLE_KEYWORDS Cost: 51.61 Degree: 1 Resp: 51.61 Card: 51942.02 Bytes: 0注意这里,card还是对的:51942,这一点在上面的排序行数:51942处也可以得到佐证.可经过一个sort操作(sort unique)忽然就变为1了,为什么呢?感觉bug的可能性还是很大的.其实,这里无论是10053文件里的执行计划还是display_cursor显示的执行计划,它们的显示似乎都是有问题的,它们都是显示:sort unique后,card还是51942,cost还是52,似乎是nl后card变为了1,cost变成了305似的.但从10053文件看来,似乎不是这样的,应该是sort unique后:card=1,cost=304.35,它这里的304.35是如何计算出来的呢?就是cost(IndexRange UK_CMS_ARTICLE_KEYWORDS)+排序的io代价+排序的cpu代价: 51.61 + 247 + 49141986000/714000000/(10+ 8192/4096) = 304.3455( Using NOWORKLOAD Stats CPUSPEED: 714 millions instruction/sec IOTFRSPEED: 4096 bytes per millisecond (default is 4096) IOSEEKTIM: 10 milliseconds (default is 10) 使用8k大小的数据块)因为nl的外表返回1行,内表的访问成本是:OPTIMIZER PERCENT INDEX CACHING = 60 Access Path: index (UniqueScan) Index: PK_ID resc_io: 1.00 resc_cpu: 9339 ix_sel: 4.1692e-07 ix_sel_with_filters: 4.1692e-07 NL Join: Cost: 305.35 Resp: 305.35 Degree: 1所以nl连接后的总成本cost=304.35 + 1*1 = 305.35,nl连接后的card是0.03,显示为1.
总结一下:总的来说,就是sort unique后,card的估算忽然变成了1,实际却大得多,导致了随后的一系列问题.当然,这里执行计划的显示也存在一些问题,似乎没有正确的显示实际的估算情况.至于sort unique后为什么card的估算就变成1了,10053 trace文件没有给出任何的信息,也不清楚,我觉得bug的可能性还是有的.