问题:
I have two tables with ~1M rows indexed by their Id's.
the fallowing query...
SELECT t.* FROM transactions t
INNER JOIN integration it ON it.id_trans = t.id_trans
WHERE...
可以将文章内容翻译成中文,广告屏蔽插件会导致该功能失效:
问题:
I have two tables with ~1M rows indexed by their Id's.
the fallowing query...
SELECT t.* FROM transactions t
INNER JOIN integration it ON it.id_trans = t.id_trans
WHERE t.id_trans = '5440073'
OR it.id_integration = '439580587'
This query takes about 30s. But ...
SELECT ... WHERE t.id_trans = '5440073'
takes less than 100ms and
SELECT ... WHERE it.id_integration = '439580587'
also takes less than 100ms. Even
SELECT ... WHERE t.id_trans = '5440073'
UNION
SELECT ... WHERE it.id_integration = '439580587'
takes less then 100ms
Why does the OR
clause takes so much time even if the parts being so fast?
回答1:
Why is OR
so slow, but UNION
is so fast?
Do you understand why UNION
is fast? Because it can use two separate indexes to good advantage, and gather some result rows from each part of the UNION
, then combine the results together.
But why can't OR
do that? Simply put, the Optimizer is not smart enough to try that angle.
In your case, the tests are on different tables; this leads to radically different query plans (see EXPLAIN SELECT ...
) for the two parts of the UNION
. Each can be well optimized, so each is fast.
Assuming each part delivers only a few rows, the subsequent overhead of UNION
is minor -- namely to gather the two small sets of row, dedup them (if you use UNION DISTINCT
instead of UNION ALL
), and deliver the results.
Meanwhile, the OR
query effectively gather all combinations of the two tables, then filtered out based on the two parts of the OR
. The intermediate stage may involve a huge temp table, only to have most of the rows tossed.
(Another example of inflate-deflate is JOINs
+ GROUP BY
. The workarounds are different.)
回答2:
I would suggest writing the query using UNION ALL
:
SELECT t.*
FROM transactions t
WHERE t.id_trans = '5440073'
UNION ALL
SELECT t.*
FROM transactions t JOIN
integration it
ON it.id_trans = t.id_trans
WHERE t.id_trans <> '5440073' AND
it.id_integration = '439580587';
Note: If the ids are really numbers (and not strings), then drop the single quotes. Mixing types can sometimes confuse the optimizer and prevent the use of indexes.