日期:2014-05-16  浏览次数:20859 次

海量数据查询问题
数据库大概情况如下;
1,每天新建一张表,按照时间范围分为24个区,
2,每分钟向表里导入十万条数据,

目前测试时表中大概250万条数据,进行查询时用时需要20s,感觉有点太慢了,请教各位高手,应该如何优化呢?

------解决方案--------------------
试下改成With的方式呢
SQL code
WITH w1 AS (
SELECT b.CDRID, b.BASE_CDRID, b.START_TIME, b.REPORT_TIME, b.IMSI
  FROM CDR_DATA_A20111024 B
 WHERE ((B.START_TIME BETWEEN
       TO_DATE('2011-10-24 08:07:47', 'yyyy-mm-dd hh24:mi:ss') - 2 / 24 AND
       TO_DATE('2011-10-24 23:57:47', 'yyyy-mm-dd hh24:mi:ss')) AND
       (B.REPORT_TIME >
       TO_DATE('2011-10-24 08:07:47', 'yyyy-mm-dd hh24:mi:ss')))
   AND (B.IMSI IN ('460022517717646') OR
       B.PEER_NUMBER IN ('460022517717646')))
SELECT T1.START_TIME, T1.REPORT_TIME, T1.IMSI, T1.CDRID
  FROM W1 T1
 WHERE EXISTS (SELECT T2.CDRID
          FROM W1 T2
         WHERE T1. CDRID = T2.CDRID
            OR T1.BASE_CDRID = T2.BASE_CDRID)

------解决方案--------------------
你的sql需求是什么呢?表a的条件和表b的条件除了(b.cdrid = a.cdrid OR b.base_cdrid = a.cdrid)其他都一样的,因为表A和表B同一个表,(b.cdrid = a.cdrid OR b.base_cdrid = a.cdrid)在cdrid不为null的时候永远是true,所以你的sql等价:
SQL code

SELECT a.start_time,
       a.report_time,
       a.imsi,
       a.cdrid
  FROM cdr_data_a20111024 a
 WHERE (
           (a.start_time BETWEEN TO_DATE('2011-10-24 08:07:47', 'yyyy-mm-dd hh24:mi:ss') - 2 / 24 AND TO_DATE('2011-10-24 23:57:47', 'yyyy-mm-dd hh24:mi:ss')) AND
           (a.report_time > TO_DATE('2011-10-24 08:07:47', 'yyyy-mm-dd hh24:mi:ss'))
       )
   AND (a.imsi = '460022517717646' OR a.peer_number = '460022517717646')