淺析MySQL數(shù)據(jù)碎片的產(chǎn)生
原創(chuàng)【51CTO獨(dú)家譯文】本文淺析MySQL數(shù)據(jù)碎片的產(chǎn)生:定義,時(shí)間及成因。
MySQL列表,包括MyISAM和InnoDB這兩種最常見(jiàn)的類型,而根據(jù)經(jīng)驗(yàn)來(lái)說(shuō),其碎片的產(chǎn)生及消除都是隨機(jī)的。碎片會(huì)在你的表格中留下明顯的空白,而這會(huì)給列表掃描工作帶來(lái)相當(dāng)大的困擾。對(duì)你的列表進(jìn)行優(yōu)化,這樣會(huì)使列表的全面及分區(qū)掃描工作進(jìn)行得更有效率。
碎片——實(shí)例
MySQL具有相當(dāng)多不同種類的存儲(chǔ)引擎來(lái)實(shí)現(xiàn)列表中的數(shù)據(jù)存儲(chǔ)功能。每當(dāng)MySQL從你的列表中刪除了一行內(nèi)容,該段空間就會(huì)被留空。而在一段時(shí)間內(nèi)的大量刪除操作,會(huì)使這種留空的空間變得比存儲(chǔ)列表內(nèi)容所使用的空間更大。當(dāng)MySQL對(duì)數(shù)據(jù)進(jìn)行掃描時(shí),它掃描的對(duì)象實(shí)際是列表的容量需求上限,也就是數(shù)據(jù)被寫(xiě)入的區(qū)域中處于峰值位置的部分。如果進(jìn)行新的插入操作,MySQL將嘗試?yán)眠@些留空的區(qū)域,但仍然無(wú)法將其徹底占用。
這種額外的破碎的存儲(chǔ)空間在讀取效率方面比正常占用的空間要低得多。讓我們看一個(gè)實(shí)例。
我們將創(chuàng)建一個(gè)數(shù)據(jù)庫(kù)(有時(shí)也稱其為大綱)及一個(gè)測(cè)試用的列表:
- (root@localhost) [test]> create database frag_test;
- Query OK, 1 row affected (0.03 sec)
- (root@localhost) [test]> use frag_test;
- Database changed
- (root@localhost) [frag_test]> create table frag_test (c1 varchar(64));
- Query OK, 0 rows affected (0.05 sec)
現(xiàn)在讓我們?cè)诹斜碇屑尤肴缦聨仔?
- (root@localhost) [frag_test]> insert into frag_test values ('this is row 1');
- Query OK, 1 row affected (0.01 sec)
- (root@localhost) [frag_test]> insert into frag_test values ('this is row 2');
- Query OK, 1 row affected (0.00 sec)
- (root@localhost) [frag_test]> insert into frag_test values ('this is row 3');
- Query OK, 1 row affected (0.00 sec)
現(xiàn)在我們進(jìn)行碎片查看:
- (root@localhost) [frag_test]> show table status from frag_test\G;
- *************************** 1. row ***************************
- Name: frag_test
- Engine: MyISAM
- Version: 10
- Row_format: Dynamic
- Rows: 3
- Avg_row_length: 20
- Data_length: 60
- Max_data_length: 281474976710655
- Index_length: 1024
- Data_free: 0
- Auto_increment: NULL
- Create_time: 2011-02-23 14:55:27
- Update_time: 2011-02-23 15:06:55
- Check_time: NULL
- Collation: latin1_swedish_ci
- Checksum: NULL
- Create_options:
- Comment:
- 1 row in set (0.00 sec)
現(xiàn)在我們刪除一行,并再次檢測(cè):
- (root@localhost) [frag_test]> delete from frag_test where c1 = 'this is row 2';
- Query OK, 1 row affected (0.00 sec)
- (root@localhost) [frag_test]> show table status from frag_test\G;
- *************************** 1. row ***************************
- Name: frag_test
- Engine: MyISAM
- Version: 10
- Row_format: Dynamic
- Rows: 2
- Avg_row_length: 20
- Data_length: 60
- Max_data_length: 281474976710655
- Index_length: 1024
- Data_free: 20
- Auto_increment: NULL
- Create_time: 2011-02-23 14:55:27
- Update_time: 2011-02-23 15:07:49
- Check_time: NULL
- Collation: latin1_swedish_ci
- Checksum: NULL
- Create_options:
- Comment:
- 1 row in set (0.00 sec)
需要注意的是,“data_free”一欄顯示出了我們刪除第二行后所產(chǎn)生的留空空間。想象一下如果你有兩萬(wàn)行指令的話,結(jié)果是什么樣的。以此推算,它們將耗費(fèi)四十萬(wàn)字節(jié)的存儲(chǔ)空間?,F(xiàn)在如果你將兩萬(wàn)條命令行刪到只剩一行,列表中有用的內(nèi)容將只占二十字節(jié),但MySQL在讀取中會(huì)仍然將其視同于一個(gè)容量為四十萬(wàn)字節(jié)的列表進(jìn)行處理,并且除二十字節(jié)以外,其它空間都被白白浪費(fèi)了。
清理碎片
幸運(yùn)的是一旦你鎖定了這一問(wèn)題,MySQL提供了一種簡(jiǎn)便的修正方法。這就是所謂的優(yōu)化列表,具體內(nèi)容如下:
- (root@localhost) [frag_test]> optimize table frag_test;
- +---------------------+----------+----------+----------+
- | Table | Op | Msg_type | Msg_text |
- +---------------------+----------+----------+----------+
- | frag_test.frag_test | optimize | status | OK |
- +---------------------+----------+----------+----------+
- 1 row in set (0.00 sec)
- (root@localhost) [frag_test]> show table status from frag_test\G;
- *************************** 1. row ***************************
- Name: frag_test
- Engine: MyISAM
- Version: 10
- Row_format: Dynamic
- Rows: 2
- Avg_row_length: 20
- Data_length: 40
- Max_data_length: 281474976710655
- Index_length: 1024
- Data_free: 0
- Auto_increment: NULL
- Create_time: 2011-02-23 14:55:27
- Update_time: 2011-02-23 15:11:05
- Check_time: 2011-02-23 15:11:05
- Collation: latin1_swedish_ci
- Checksum: NULL
- Create_options:
- Comment:
- 1 row in set (0.00 sec)
性能考量
“優(yōu)化列表”功能在進(jìn)行中會(huì)對(duì)整個(gè)列表進(jìn)行鎖定。對(duì)于小型列表,這一功能的效果非常好,因?yàn)檎麄€(gè)列表的讀取和修改速度都會(huì)很快。但對(duì)于那些體積巨大的列表來(lái)說(shuō),這一過(guò)程將消耗很長(zhǎng)時(shí)間,并且其間會(huì)中斷或減少可用的應(yīng)用程序數(shù)量。怎么辦?
再一次,MySQL幸運(yùn)地提供了一項(xiàng)堪稱偉大的功能,名為“主-主復(fù)制”。在這種配置之下,你的后臺(tái)數(shù)據(jù)庫(kù)實(shí)際上成為兩個(gè)單獨(dú)的數(shù)據(jù)庫(kù),一個(gè)主動(dòng)可調(diào)用的,一個(gè)被動(dòng)可調(diào)整的。這兩個(gè)數(shù)據(jù)庫(kù)在各方面來(lái)說(shuō)都是完全相同的。要實(shí)現(xiàn)各種在線操作——包括“優(yōu)化列表”操作——只需在你的被動(dòng)數(shù)據(jù)庫(kù)中即可進(jìn)行。這將不會(huì)對(duì)你的應(yīng)用程序造成絲毫影響。一旦優(yōu)化操作完成,主、被動(dòng)數(shù)據(jù)庫(kù)將互相轉(zhuǎn)換,以便應(yīng)用程序直接指向二號(hào)數(shù)據(jù)庫(kù),對(duì)還未進(jìn)行優(yōu)化的主動(dòng)數(shù)據(jù)庫(kù)部分自動(dòng)開(kāi)始優(yōu)化工作。
這時(shí),兩套數(shù)據(jù)庫(kù)的角色已經(jīng)互換,而應(yīng)用程序也將順利指向二號(hào)數(shù)據(jù)庫(kù),執(zhí)行與在一號(hào)數(shù)據(jù)庫(kù)上相同的列表優(yōu)化。而現(xiàn)在主動(dòng)已經(jīng)轉(zhuǎn)換為被動(dòng),因此不會(huì)中斷主要任務(wù)處理。
其它命令
顯示你數(shù)據(jù)庫(kù)中存在碎片的全部列表:
- (root@localhost) [(none)]> select table_schema, table_name, data_free, engine from information_schema.tables where table_schema not in ('information_schema', 'mysql') and data_free > 0;
- +--------------+-----------------------------+-----------+--------+
- | table_schema | table_name | data_free | engine |
- +--------------+-----------------------------+-----------+--------+
- | aitc | wp_comments | 346536 | MyISAM |
- | aitc | wp_options | 64308 | MyISAM |
- | aitc | wp_postmeta | 124 | MyISAM |
- | cactidb | poller_item | 160 | MyISAM |
- | cactidb | poller_output | 384 | MyISAM |
- | drupal | sessions | 30976 | MyISAM |
- | drupal | users | 92 | MyISAM |
- | drupal | variable | 20 | MyISAM |
- | gg | wp_comments | 232 | MyISAM |
- | gg | wp_options | 696 | MyISAM |
- | gg | wp_postmeta | 560 | MyISAM |
- | ihi | wp_comments | 536 | MyISAM |
- | ihi | wp_options | 444 | MyISAM |
- | ihi | wp_postmeta | 288 | MyISAM |
- | ihi | wp_redirection_items | 1292 | MyISAM |
- | ihi | wp_redirection_logs | 140352 | MyISAM |
- | nds | wp_comments | 4704 | MyISAM |
- | nds | wp_options | 150580 | MyISAM |
- | nds | wp_postmeta | 76 | MyISAM |
- | oos | wp_comments | 317124 | MyISAM |
- | oos | wp_options | 88196 | MyISAM |
- | oos | wp_postmeta | 76 | MyISAM |
- | phplist | phplist_listuser | 252 | MyISAM |
- | phplist | phplist_sendprocess | 52 | MyISAM |
- | phplist | phplist_user_user | 32248 | MyISAM |
- | phplist | phplist_user_user_attribute | 120 | MyISAM |
- | phplist | phplist_user_user_history | 288 | MyISAM |
- | phplist | phplist_usermessage | 1428 | MyISAM |
- | pn_nds | nuke_session_info | 12916 | MyISAM |
- | psa | exp_event | 10024 | MyISAM |
- | test | active_sessions | 30144 | MyISAM |
- +--------------+-----------------------------+-----------+--------+
- 31 rows in set (0.26 sec)
如果你更改了某個(gè)列表的存儲(chǔ)引擎,你也應(yīng)該對(duì)這一列表進(jìn)行碎片清理。這是因?yàn)镸ySQL的工作原理導(dǎo)致其必須讀取整個(gè)列表,然后利用新的存儲(chǔ)引擎將內(nèi)容寫(xiě)回磁盤(pán),而在此過(guò)程中碎片所在的位置及影響到的數(shù)據(jù)都對(duì)執(zhí)行效率造成了嚴(yán)重的不良影響。
上述情況如下所示:
- (root@localhost) [frag_test]> alter table frag_test engine = innodb;
- Query OK, 2 rows affected (0.17 sec)
- Records: 2 Duplicates: 0 Warnings: 0
- (root@localhost) [frag_test]> show table status from frag_test
- -> \G;
- *************************** 1. row ***************************
- Name: frag_test
- Engine: InnoDB
- Version: 10
- Row_format: Compact
- Rows: 2
- Avg_row_length: 8192
- Data_length: 16384
- Max_data_length: 0
- Index_length: 0
- Data_free: 0
- Auto_increment: NULL
- Create_time: 2011-02-23 15:41:12
- Update_time: NULL
- Check_time: NULL
- Collation: latin1_swedish_ci
- Checksum: NULL
- Create_options:
- Comment: InnoDB free: 7168 kB
- 1 row in set (0.00 sec)
結(jié)論
如果你發(fā)現(xiàn)一些列表中包含了大量的數(shù)據(jù)留空現(xiàn)象,那么對(duì)其進(jìn)行優(yōu)化是絕對(duì)值得的,因?yàn)檫@一過(guò)程會(huì)大大提升列表的讀取性能及應(yīng)用表現(xiàn)。
原文地址:http://www.databasejournal.com/features/mysql/article.php/3927871/article.htm
【編輯推薦】