日期:2014-05-16  浏览次数:20468 次

对NESTED LOOP进行Hash Join优化

在网上看到一篇不错的讨论贴 记录下来
大概意思是这样:
? 现有两个表:A表(3-4w记录) B表(1-2w记录)
? 要求:找出A表name字段 前面部分与B表name相同的记录

?

一般我们会这样写

select a.*,b.name from a,b
where substr(a.name,1,length(b.name))=b.name

?或

select a.*,b.name from a,b
where  a.name  like b.name||'%';

?但这样性能会出现严重问题

SQL> select a.*,b.object_name from TEST_OBJECT a, TEST_OBJ1 b
  2  where substr(a.object_name,1,length(b.object_name))=b.object_name;

已选择108612行。

已用时间:  00: 06: 51.24
执行计划
----------------------
Plan hash value: 661671976

----------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------
|   0 | SELECT STATEMENT   |             | 68676 |  7243K|  1745K  (1)| 05:49:05 |
|   1 |  NESTED LOOPS      |             | 68676 |  7243K|  1745K  (1)| 05:49:05 |
|   2 |   TABLE ACCESS FULL| TEST_OBJECT | 52544 |  4566K|   139   (1)| 00:00:02 |
|*  3 |   TABLE ACCESS FULL| TEST_OBJ1   |     1 |    19 |    33   (0)| 00:00:01 |
----------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("B"."OBJECT_NAME"=SUBSTR("A"."OBJECT_NAME",1,LENGTH("B"."OBJ
              ECT_NAME")))


统计信息
----------------------
          1  recursive calls
          0  db block gets
    9262734  consistent gets
          0  physical reads
          0  redo size
    4408357  bytes sent via SQL*Net to client
      80055  bytes received via SQL*Net from client
       7242  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     108612  rows processed

? ?因为系统分析时使用 NESTED LOOPS???连接, NESTED LOOPS?用小表做为驱动表时还是可以的,但使用大表做为驱动表那将是灾难的开始。

??? 那么就这样简单的一个语句还能如何优化?

?? 看下网友anlinew提供的解决方法

?

SQL> select a.*,b.object_name from TEST_OBJECT a, TEST_OBJ1 b
  2  where substr(a.object_name,1,length(b.object_name))=b.object_name
  3  and substr(a.object_name,1,4)=substr(b.object_name,1,4)   and length(b.object_name)>3
  5  union all
  6  select a.*,b.object_name from TEST_OBJECT a, TEST_OBJ1 b
  7  where substr(a.object_name,1,length(b.object_name))=b.object_name
  8  and length(b.object_name)<4;

已选择108612行。

已用时间:  00: 00: 21.54
执行计划
----------------------
Plan hash value: 4080738151

-----------------------------------------------
| Id  | Operation           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------
|   0 | SELECT STATEMENT    |             |  3468 |   365K| 90997 (100)| 00:18:12 |
|   1 |  UNION-ALL          |             |       |       |            |          |
|*  2 |   HASH JOIN         |             |    34 |  3672 |   176   (2)| 00:00:03 |
|*  3 |    TABLE ACCESS FULL| TEST_OBJ1   |   659 | 12521 |    35   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| TEST_OBJECT | 52544 |  4566K|   139   (1)| 00:00:02 |
|   5 |   NESTED LOOPS      |             |  3434 |   362K| 90821   (1)| 00:18:10 |
|*  6 |    TABLE ACCESS FULL| TEST_OBJ1   |   659 | 12521 |    35   (0)| 00:00:01 |
|*  7 |    TABLE ACCESS FULL| TEST_OBJECT |     5 |   445 |   138   (1)| 00:00:02 |
-----------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(SUBSTR("A"."OBJECT_NAME",1,4)=SUBSTR("B"."OBJECT_NAME",1,4))
       filter("B"."OBJECT_NAME"=SUBSTR("A"."OBJECT_NAME",1,LENGTH("B"."OBJE
              CT_NAME")))
   3 - filter(LENGTH("B"."OBJECT_NAME")>3)
   6 - filter(LENGTH("B"."OBJECT_NAME")<4)
   7 - filter("B"."OBJECT_NAME"=SUBSTR("A"."OBJECT_NAME",1,LENGTH("B"."OBJE
              CT_NAME")))


统计信息
----------------