GreatSQL Optimization Tip: Manually Implementing Predicate Pushdown
Introduction
Lately I keep hearing about using AI to optimize SQL, so today, on a whim, I tried it: I gave the AI the table structures, statistics, SQL statement, and execution plan. It produced a long list of suggestions covering indexes, statement rewrites, and parameter tuning. It looks comprehensive, but if you don't understand optimization theory and blindly apply its advice, the likely outcome is a furious flurry of moves followed by a 0-5 scoreline. So I'll keep diligently writing up optimization techniques myself; perhaps one day these case studies will end up as the AI's training diet.
The SQL Case
The case statement (the real business scenario is too complex, so a fragment is excerpted to illustrate this article's topic):
SELECT ta.*, tb.*
  FROM (SELECT *
          FROM (SELECT a.contactid,
                       a.subs_number,
                       a.log_time,
                       ROW_NUMBER() OVER(PARTITION BY a.contactid, a.subs_number ORDER BY a.log_time DESC) rn,
                       a.log_id
                  FROM a
                 WHERE a.contactid IS NOT NULL
                   AND a.log_time >= '2025-05-30 00:00:00'
                   AND a.log_time <= '2025-06-02') cc
         WHERE rn = 1) ta
  LEFT JOIN (SELECT b.*,
                    ROW_NUMBER() OVER(PARTITION BY b.basesn ORDER BY b.create_time DESC) rn
               FROM b
              WHERE b.create_time IS NOT NULL) tb
    ON ta.contactid = tb.basesn
   AND tb.rn = 1
Below are the test table structures supporting this case, along with test data matching the characteristics of the case SQL.
CREATE TABLE a(log_id BIGINT, CONTACTID INT, subs_number INT, log_time DATETIME, PRIMARY KEY (log_id), KEY idx_logtime(log_time));
CREATE TABLE b(id BIGINT PRIMARY KEY, basesn INT, create_time DATETIME, KEY idx_basesn(basesn));
delimiter //
CREATE OR REPLACE PROCEDURE P1() IS
BEGIN
  FOR i IN 1 .. 10000 LOOP
    INSERT INTO a(log_id, contactid, subs_number, log_time) VALUES(i, TRUNC(rand()*8000), TRUNC(rand()*9000), SYSDATE - rand()*90);
  END LOOP;
  FOR i IN 1 .. 1000000 LOOP
    INSERT INTO b(id, basesn, create_time) VALUES(i, TRUNC(rand()*800000), SYSDATE - rand()*90);
  END LOOP;
END;
//
delimiter ;
The statistics of the two tables are as follows:
greatsql> SHOW index FROM a;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| a | 0 | PRIMARY | 1 | log_id | A | 10000 | NULL | NULL | | BTREE | | | YES | NULL |
| a | 1 | idx_logtime | 1 | log_time | A | 9990 | NULL | NULL | YES | BTREE | | | YES | NULL |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
2 rows in set (0.00 sec)
greatsql> SHOW index FROM b;
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| b | 0 | PRIMARY | 1 | id | A | 916864 | NULL | NULL | | BTREE | | | YES | NULL |
| b | 1 | idx_basesn | 1 | basesn | A | 515268 | NULL | NULL | YES | BTREE | | | YES | NULL |
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
2 rows in set (0.00 sec)
Statement Analysis
The SQL has two derived tables, ta and tb, combined with a LEFT JOIN. Derived table ta is the left side of the outer join; its inner table a is filtered on log_time, which carries a single-column index. A two-to-three-day range hits a few hundred rows, and about a month hits several thousand to roughly ten thousand. Derived table tb is the right side of the outer join; its inner table b holds about a million rows, and the condition create_time IS NOT NULL filters out almost nothing. Both derived tables use the window function ROW_NUMBER().
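The rn = 1 filter over ROW_NUMBER() is the standard "latest row per group" pattern that both derived tables rely on. A minimal reproduction of what derived table ta computes, using Python with an in-memory SQLite database (which also supports window functions) as a stand-in for GreatSQL; the data values are made up:

```python
import sqlite3

# Toy reproduction of the ROW_NUMBER() dedup pattern in derived table ta:
# keep only the most recent log row per (contactid, subs_number) group.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a(log_id INTEGER PRIMARY KEY, contactid INT, subs_number INT, log_time TEXT);
INSERT INTO a VALUES
  (1, 100, 7, '2025-05-30 08:00:00'),
  (2, 100, 7, '2025-05-31 09:00:00'),  -- newer row in the same group
  (3, 200, 9, '2025-05-30 10:00:00');
""")
rows = conn.execute("""
SELECT contactid, subs_number, log_time
FROM (SELECT a.*,
             ROW_NUMBER() OVER (PARTITION BY contactid, subs_number
                                ORDER BY log_time DESC) AS rn
      FROM a)
WHERE rn = 1
ORDER BY contactid
""").fetchall()
print(rows)  # each group contributes exactly its latest row
```

Group (100, 7) contributes only log_id 2, its newest row; group (200, 9) contributes its single row.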
Execution Plan Analysis
The statement's actual execution plan is as follows:
EXPLAIN: -> Nested loop left join  (cost=22497.56 rows=0) (actual time=6181.328..6182.085 rows=331 loops=1)
    -> Filter: (cc.rn = 1)  (cost=1.21..35.91 rows=30) (actual time=2.712..2.794 rows=331 loops=1)
        -> Table scan on cc  (cost=2.50..2.50 rows=0) (actual time=2.704..2.752 rows=331 loops=1)
            -> Materialize  (cost=0.00..0.00 rows=0) (actual time=2.698..2.698 rows=331 loops=1)
                -> Window aggregate: row_number() OVER (PARTITION BY a.CONTACTID, a.subs_number ORDER BY a.log_time desc)  (actual time=2.189..2.323 rows=331 loops=1)
                    -> Sort: a.CONTACTID, a.subs_number, a.log_time DESC  (cost=149.21 rows=331) (actual time=2.168..2.190 rows=331 loops=1)
                        -> Filter: (a.CONTACTID is not null)  (cost=149.21 rows=331) (actual time=0.156..1.847 rows=331 loops=1)
                            -> Index range scan on a using idx_logtime over ('2025-05-30 00:00:00' <= log_time <= '2025-06-02 00:00:00'), with index condition: ((a.log_time >= TIMESTAMP'2025-05-30 00:00:00') and (a.log_time <= TIMESTAMP'2025-06-02 00:00:00'))  (cost=149.21 rows=331) (actual time=0.147..1.806 rows=331 loops=1)
    -> Index lookup on tb using <auto_key0> (basesn=cc.contactid, rn=1)  (cost=0.25..766.47 rows=3025) (actual time=18.668..18.668 rows=1 loops=331)
        -> Materialize  (cost=0.00..0.00 rows=0) (actual time=6178.570..6178.570 rows=1000000 loops=1)
            -> Window aggregate: row_number() OVER (PARTITION BY b.basesn ORDER BY b.create_time desc)  (actual time=2153.616..3469.381 rows=1000000 loops=1)
                -> Sort: b.basesn, b.create_time DESC  (cost=100382.85 rows=998296) (actual time=2153.598..2733.042 rows=1000000 loops=1)
                    -> Filter: (b.create_time is not null)  (cost=100382.85 rows=998296) (actual time=0.075..900.074 rows=1000000 loops=1)
                        -> Table scan on b  (cost=100382.85 rows=998296) (actual time=0.074..316.051 rows=1000000 loops=1)
1 row in set (6.22 sec)
The two derived tables ta and tb are joined with a nested loop. ta, the outer driving table, produces a 331-row result set; tb, the inner table, is probed 331 times. Neither of these steps is expensive.
The expensive part of this SQL is the three steps applied to tb's inner table b: sorting (Sort), computing the window function (Window aggregate), and materializing the result (Materialize). Running them over a million rows takes about 6 seconds; even though they execute only once, they dominate the statement's runtime. The question, then, is whether we can shrink the amount of data those steps must process.
From b's statistics, the join column basesn is quite selective, and the final result set is only 331 rows, so the join predicate would filter b very effectively. The optimizer's current behavior, however, is that because of the window aggregation, the outer query's join predicate cannot be pushed into the derived table tb. Knowing this, we can rewrite the statement so that the join column does the filtering.
Optimization Approach
The solution I came up with: deduplicate the outer query's CONTACTID column, join it into tb's inner query, and perform the Sort, Window aggregate, and Materialize steps only on rows that satisfy the join condition.
Why is adding this extra join equivalent to the original statement? Treat it as a thought exercise, and feel free to answer in the comments!
The rewritten statement, for reference:
SELECT ta.*, tb.*
  FROM (SELECT *
          FROM (SELECT a.contactid,
                       a.subs_number,
                       a.log_time,
                       ROW_NUMBER() OVER(PARTITION BY a.contactid, a.subs_number ORDER BY a.log_time DESC) rn,
                       a.log_id
                  FROM a
                 WHERE a.contactid IS NOT NULL
                   AND a.log_time >= '2025-05-30 00:00:00'
                   AND a.log_time <= '2025-06-02') cc
         WHERE rn = 1) ta
  LEFT JOIN (SELECT b.*,
                    ROW_NUMBER() OVER(PARTITION BY b.basesn ORDER BY b.create_time DESC) rn
               FROM b
               JOIN (SELECT DISTINCT CONTACTID
                       FROM a
                      WHERE CONTACTID IS NOT NULL
                        AND LOG_TIME >= '2025-05-30 00:00:00'
                        AND LOG_TIME <= '2025-06-02') a1
                 ON a1.CONTACTID = b.basesn
              WHERE b.create_time IS NOT NULL) tb
    ON ta.contactid = tb.basesn
   AND tb.rn = 1
The execution plan of the rewritten statement is as follows:
EXPLAIN: -> Nested loop left join  (cost=111.18 rows=0) (actual time=14.846..15.281 rows=331 loops=1)
    -> Filter: (cc.rn = 1)  (cost=1.21..35.91 rows=30) (actual time=2.668..2.747 rows=331 loops=1)
        -> Table scan on cc  (cost=2.50..2.50 rows=0) (actual time=2.636..2.683 rows=331 loops=1)
            -> Materialize  (cost=0.00..0.00 rows=0) (actual time=2.630..2.630 rows=331 loops=1)
                -> Window aggregate: row_number() OVER (PARTITION BY a.CONTACTID, a.subs_number ORDER BY a.log_time desc)  (actual time=2.214..2.356 rows=331 loops=1)
                    -> Sort: a.CONTACTID, a.subs_number, a.log_time DESC  (cost=149.21 rows=331) (actual time=2.173..2.198 rows=331 loops=1)
                        -> Filter: (a.CONTACTID is not null)  (cost=149.21 rows=331) (actual time=0.089..1.784 rows=331 loops=1)
                            -> Index range scan on a using idx_logtime over ('2025-05-30 00:00:00' <= log_time <= '2025-06-02 00:00:00'), with index condition: ((a.log_time >= TIMESTAMP'2025-05-30 00:00:00') and (a.log_time <= TIMESTAMP'2025-06-02 00:00:00'))  (cost=149.21 rows=331) (actual time=0.071..1.730 rows=331 loops=1)
    -> Index lookup on tb using <auto_key0> (basesn=cc.contactid, rn=1)  (cost=0.25..2.57 rows=10) (actual time=0.037..0.038 rows=1 loops=331)
        -> Materialize  (cost=0.00..0.00 rows=0) (actual time=12.159..12.159 rows=382 loops=1)
            -> Window aggregate: row_number() OVER (PARTITION BY b.basesn ORDER BY b.create_time desc)  (actual time=11.614..11.781 rows=382 loops=1)
                -> Sort: b.basesn, b.create_time DESC  (actual time=11.608..11.636 rows=382 loops=1)
                    -> Stream results  (cost=237.31 rows=518) (actual time=1.673..11.394 rows=382 loops=1)
                        -> Nested loop inner join  (cost=237.31 rows=518) (actual time=1.670..11.247 rows=382 loops=1)
                            -> Filter: (a1.CONTACTID is not null)  (cost=214.40..35.91 rows=297) (actual time=1.430..1.545 rows=321 loops=1)
                                -> Table scan on a1  (cost=215.02..221.21 rows=298) (actual time=1.429..1.502 rows=321 loops=1)
                                    -> Materialize  (cost=215.00..215.00 rows=298) (actual time=1.428..1.428 rows=321 loops=1)
                                        -> Table scan on <temporary>  (cost=179.02..185.21 rows=298) (actual time=1.303..1.349 rows=321 loops=1)
                                            -> Temporary table with deduplication  (cost=179.00..179.00 rows=298) (actual time=1.302..1.302 rows=321 loops=1)
                                                -> Filter: (a.CONTACTID is not null)  (cost=149.21 rows=298) (actual time=0.110..1.143 rows=331 loops=1)
                                                    -> Index range scan on a using idx_logtime over ('2025-05-30 00:00:00' <= log_time <= '2025-06-02 00:00:00'), with index condition: ((a.log_time >= TIMESTAMP'2025-05-30 00:00:00') and (a.log_time <= TIMESTAMP'2025-06-02 00:00:00'))  (cost=149.21 rows=331) (actual time=0.108..1.108 rows=331 loops=1)
                            -> Filter: (b.create_time is not null)  (cost=0.48 rows=2) (actual time=0.028..0.030 rows=1 loops=321)
                                -> Index lookup on b using idx_basesn (basesn=a1.CONTACTID)  (cost=0.48 rows=2) (actual time=0.027..0.029 rows=1 loops=321)
1 row in set (0.03 sec)
The rewritten SQL finishes in 0.03s versus the original 6.2s, a performance improvement of roughly 200x. The statement looks a bit more complex on the surface, but overall execution efficiency improves enormously.
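As for the equivalence question posed earlier, the intuition is that ROW_NUMBER() partitions by basesn, so discarding entire basesn groups that the outer join can never match leaves the ranking inside every surviving group unchanged. A toy sanity check, again using Python with in-memory SQLite as a stand-in for GreatSQL (the tables and values here are made up, not the production schema):

```python
import sqlite3

# Miniature check that pre-joining the deduplicated outer keys into the derived
# table does not change which rn = 1 rows survive the final join: ROW_NUMBER()
# partitions by basesn, so dropping whole basesn groups the outer query can
# never match leaves the ranking inside every remaining group untouched.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE b(id INTEGER PRIMARY KEY, basesn INT, create_time TEXT);
INSERT INTO b VALUES
  (1, 100, '2025-05-01'), (2, 100, '2025-05-02'),
  (3, 200, '2025-05-03'), (4, 300, '2025-05-04');
CREATE TABLE keys(contactid INT);  -- stands in for SELECT DISTINCT contactid FROM a ...
INSERT INTO keys VALUES (100), (200);
""")
original = conn.execute("""
SELECT tb.id, tb.basesn FROM keys k
LEFT JOIN (SELECT b.*, ROW_NUMBER() OVER (PARTITION BY basesn
                                          ORDER BY create_time DESC) rn
           FROM b) tb
ON k.contactid = tb.basesn AND tb.rn = 1
ORDER BY tb.basesn
""").fetchall()
rewritten = conn.execute("""
SELECT tb.id, tb.basesn FROM keys k
LEFT JOIN (SELECT b.*, ROW_NUMBER() OVER (PARTITION BY basesn
                                          ORDER BY create_time DESC) rn
           FROM b JOIN keys k2 ON k2.contactid = b.basesn) tb
ON k.contactid = tb.basesn AND tb.rn = 1
ORDER BY tb.basesn
""").fetchall()
print(original == rewritten)  # True: both forms return the same rows
```

This only demonstrates the idea on one dataset, of course; it is not a proof.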
Summary
The core idea of SQL optimization is to reduce I/O. Every optimization technique revolves around this theme; they are merely the many forms it takes depending on the specific SQL. This case is no exception.
By manually rewriting the SQL to implement predicate pushdown, we reduced the amount of data the inner table has to process and thereby improved performance.
Of course, we hope GreatSQL's optimizer will implement this transformation in the future, pushing the predicate down automatically so the SQL runs efficiently without being rewritten.
Every database optimizer has its blind spots. What we tuning DBAs need to do is understand those blind spots and then use the resources at hand to help the optimizer find a good execution plan and improve SQL performance.

























