Query takes too long to query with OR clause but their parts are very quick

问题: I have two tables with ~1M rows indexed by their Id's. the fallowing query... SELECT t.* FROM transactions t INNER JOIN integration it ON it.id_trans = t.id_trans WHERE...

问题:

I have two tables with ~1M rows indexed by their Id's.

the fallowing query...

SELECT t.* FROM transactions t
INNER JOIN integration it ON it.id_trans = t.id_trans
WHERE t.id_trans = '5440073'
OR it.id_integration = '439580587'

This query takes about 30s. But ...

SELECT ... WHERE t.id_trans = '5440073'

takes less than 100ms and

SELECT ... WHERE it.id_integration = '439580587'

also takes less than 100ms. Even

SELECT ... WHERE t.id_trans = '5440073'
UNION
SELECT ... WHERE it.id_integration = '439580587'

takes less then 100ms

Why does the OR clause takes so much time even if the parts being so fast?


回答1:

Why is OR so slow, but UNION is so fast?

Do you understand why UNION is fast? Because it can use two separate indexes to good advantage, and gather some result rows from each part of the UNION, then combine the results together.

But why can't OR do that? Simply put, the Optimizer is not smart enough to try that angle.

In your case, the tests are on different tables; this leads to radically different query plans (see EXPLAIN SELECT ...) for the two parts of the UNION. Each can be well optimized, so each is fast.

Assuming each part delivers only a few rows, the subsequent overhead of UNION is minor -- namely to gather the two small sets of row, dedup them (if you use UNION DISTINCT instead of UNION ALL), and deliver the results.

Meanwhile, the OR query effectively gather all combinations of the two tables, then filtered out based on the two parts of the OR. The intermediate stage may involve a huge temp table, only to have most of the rows tossed.

(Another example of inflate-deflate is JOINs + GROUP BY. The workarounds are different.)


回答2:

I would suggest writing the query using UNION ALL:

SELECT t.*
FROM transactions t
WHERE t.id_trans = '5440073'
UNION ALL
SELECT t.*
FROM transactions t JOIN
     integration it 
     ON it.id_trans = t.id_trans
WHERE t.id_trans <> '5440073' AND
      it.id_integration = '439580587';

Note: If the ids are really numbers (and not strings), then drop the single quotes. Mixing types can sometimes confuse the optimizer and prevent the use of indexes.

  • 发表于 2018-08-31 16:10
  • 阅读 ( 236 )
  • 分类:sof

条评论

请先 登录 后评论
不写代码的码农
小编

篇文章

作家榜 »

  1. 小编 文章
返回顶部
部分文章转自于网络,若有侵权请联系我们删除