十分鐘搭建實(shí)驗(yàn)分布式數(shù)據(jù)庫環(huán)境
劃水了好久,今天來跟大家分享一下如何用一臺(tái)筆記本,構(gòu)建一套分布式數(shù)據(jù)庫的實(shí)驗(yàn)環(huán)境吧。我們使用docker 和 postgres xl 來完成。
各位讀者老爺們扣Q上車,Let's Go!!!!
Postgres XL 簡(jiǎn)介
什么是Postgres-XL
XL的意思是:eXtensible Lattice,可以擴(kuò)展的格子,即將PostgreSQL應(yīng)用在多機(jī)器上的分布式數(shù)據(jù)庫的形象化表達(dá)。
Postgres-XL 是一個(gè)完全滿足ACID的、開源的、可方便進(jìn)行水平擴(kuò)展的、多租戶安全的、基于PostgreSQL的數(shù)據(jù)庫解決方案。
Postgres-XL 可非常靈活的應(yīng)付各種負(fù)載,比如:
- OLAP(通過MPP并行化)
- OLTP
- OLAP & OLTP
- 操作數(shù)據(jù)存儲(chǔ)
- Key-value存儲(chǔ),包括JSON格式
不同的應(yīng)用場(chǎng)景:
- 支持商業(yè)智能應(yīng)用(數(shù)據(jù)倉庫&數(shù)據(jù)集市),因?yàn)镻GXL支持MPP(Massively Parallel Processing)
- Web2.0,數(shù)據(jù)庫擴(kuò)容的解決方案
- 遺留系統(tǒng)的數(shù)據(jù)庫擴(kuò)容的解決方案
- 新應(yīng)用,可以先使用PostgreSQL,之后隨著數(shù)據(jù)庫變大使用PGXL擴(kuò)容
PGXL底層為PostgreSQL,這意味著它支持所有支持PostgresSQL類型的驅(qū)動(dòng),包括:JDBC, ODBC, OLE DB, Python, Ruby, perl DBI, Tcl, and Erlang.
PostgreSQL與Postgres-XL
- 1994年,Postgre95發(fā)布,開源。
- 1996年,PostgreSQL繼承了Postgre95,發(fā)布。
- 2010年,Postgres-XC發(fā)布。
- 2012年,前PGXC核心開發(fā)者創(chuàng)建StormDB公司,進(jìn)行了一些改進(jìn),包括對(duì)MPP并行化的性能改進(jìn)和多租戶安全。
- 2013年,TransLattice收購了StormDB。
- 2014年,將項(xiàng)目開源,命名為Postgres-XL。
Postgres-XC與Postgres-XL
PGXL的架構(gòu)師和開發(fā)者 很多都是以前做PGXC的,PGXL的部分代碼是從PGXC移植過來的。
比起功能性,PGXL更強(qiáng)調(diào)穩(wěn)定性, 正確性和性能.
PGXL增加了一些重要的性能提升,比如MPP和replan avoidance on the data nodes,這些都是PGXC沒有的。
PGXC目前集中在OLTP的業(yè)務(wù)上面,PGXL則更加靈活,可以應(yīng)用于很多不同種類的業(yè)務(wù)上,比如可以用在大數(shù)據(jù)處理領(lǐng)域,除此,在多租戶的環(huán)境中,PGXL也更加安全。
PGXL的社區(qū)非常開放。
架構(gòu)
GXL有三個(gè)主要組件,分別是GTM,Coordinator(CN)和Datanode(DN)。
GTM(Gloable Transaction Manager)負(fù)責(zé)提供事務(wù)的ACID屬性;Datanode負(fù)責(zé)存儲(chǔ)表的數(shù)據(jù)和本地執(zhí)行由Coordinator派發(fā)的SQL任務(wù);Coordinator負(fù)責(zé)處理每個(gè)來自Application的SQL任務(wù),并且決定由哪個(gè)Datanode執(zhí)行,然后將任務(wù)計(jì)劃派發(fā)給相應(yīng)的Datanode,根據(jù)需要收集結(jié)果返還給Application;
Postgres XL on Docker
我們采用一個(gè)GTM,2臺(tái)CN,2臺(tái)DN,結(jié)構(gòu)如下圖所示:
docker-compose.yml
配置文件如下所示,執(zhí)行 docker-compose up,啟動(dòng)集群
version: "3"
services:
db_gtm_1:
environment:
- PG_HOST=0.0.0.0
- PG_NODE=gtm_1
- PG_PORT=6666
# - PG_PASSWORD=dafei1288
build: .
# image: z_db_gtm_1
command: docker-cmd-gtm
entrypoint: docker-entrypoint-gtm
volumes:
- db_gtm_1:/var/lib/postgresql
networks:
- db_a
healthcheck:
test: ["CMD", "docker-healthcheck-gtm"]
db_coord_1:
ports:
- "25432:5432"
environment:
- PG_GTM_HOST=db_gtm_1
- PG_GTM_PORT=6666
- PG_HOST=0.0.0.0
- PG_NODE=coord_1
- PG_PORT=5432
# - PG_PASSWORD=dafei1288
build: .
# privileged: true
# image: z_db_coord_1
command: docker-cmd-coord
entrypoint: docker-entrypoint-coord
volumes:
- db_coord_1:/var/lib/postgresql
depends_on:
- db_gtm_1
networks:
- db_a
- db_b
healthcheck:
test: ["CMD", "docker-healthcheck-coord"]
db_coord_2:
ports:
- "25433:5432"
environment:
- PG_GTM_HOST=db_gtm_1
- PG_GTM_PORT=6666
- PG_HOST=0.0.0.0
- PG_NODE=coord_2
- PG_PORT=5432
# - PG_PASSWORD=dafei1288
build: .
# privileged: true
# image: z_db_coord_2
command: docker-cmd-coord
entrypoint: docker-entrypoint-coord
volumes:
- db_coord_2:/var/lib/postgresql
depends_on:
- db_gtm_1
networks:
- db_a
- db_b
healthcheck:
test: ["CMD", "docker-healthcheck-coord"]
db_data_1:
ports:
- "25434:5432"
environment:
- PG_GTM_HOST=db_gtm_1
- PG_GTM_PORT=6666
- PG_HOST=0.0.0.0
- PG_NODE=data_1
- PG_PORT=5432
# - PG_PASSWORD=dafei1288
build: .
# image: z_db_data_1
command: docker-cmd-data
entrypoint: docker-entrypoint-data
depends_on:
- db_gtm_1
volumes:
- db_data_1:/var/lib/postgresql
networks:
- db_a
healthcheck:
test: ["CMD", "docker-healthcheck-data"]
db_data_2:
ports:
- "25435:5432"
environment:
- PG_GTM_HOST=db_gtm_1
- PG_GTM_PORT=6666
- PG_HOST=0.0.0.0
- PG_NODE=data_2
- PG_PORT=5432
# - PG_PASSWORD=dafei1288
build: .
# image: z_db_data_2
command: docker-cmd-data
entrypoint: docker-entrypoint-data
depends_on:
- db_gtm_1
volumes:
- db_data_2:/var/lib/postgresql
networks:
- db_a
healthcheck:
test: ["CMD", "docker-healthcheck-data"]
pgpool:
# image: smirart/pgpool:latest
image: postdock/pgpool:latest
ports:
- "8686:8686"
- "8687:5432"
- "8688:9898"
# environment:
# - PG_PASSWORD=dafei1288
volumes:
- ./pgpool.conf:/var/pgpool_configs/pgpool.conf
restart: always
networks:
- db_b
volumes:
db_gtm_1: {}
db_coord_1: {}
db_coord_2: {}
db_data_1: {}
db_data_2: {}
networks:
db_a:
internal: true
db_b:
internal: true
如果有需要,可以開啟gppool,也可以注釋掉,不影響使用
pgpool.conf
listen_addresses = '*'
port = 5432
# pool_passwd = 'dafei1288'
socket_dir = '/tmp'
pcp_listen_addresses = '*'
pcp_port = 9898
pcp_socket_dir = '/tmp'
listen_backlog_multiplier = 2
serialize_accept = off
replication_mode = on
load_balance_mode = on
backend_hostname0 = 'db_coord_1'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/data0'
backend_flag0 = 'ALWAYS_MASTER'
backend_hostname1 = 'db_coord_2'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/data1'
backend_flag1 = 'ALLOW_TO_FAILOVER'
health_check_period0 = 0
health_check_timeout0 = 20
health_check_user0 = '_healthcheck'
health_check_password0 = ''
health_check_database0 = ''
health_check_max_retries0 = 0
health_check_retry_delay0 = 1
connect_timeout0 = 10000
實(shí)驗(yàn)結(jié)果
本實(shí)驗(yàn)工程 fork自 https://github.com/tiredpixel/z.2020-10-22.postgres-xl-docker,由于原鏡像已設(shè)置為只讀,并且執(zhí)行會(huì)出一些奇奇怪怪的錯(cuò)誤,于是我就整理了一番,項(xiàng)目已托管到全球最大同仁網(wǎng)站gayhub,網(wǎng)址如下:
https://github.com/dafei1288/postgres-xl-docker
參考列表
https://blog.csdn.net/yeruby/article/details/49004329
https://github.com/tiredpixel/z.2020-10-22.postgres-xl-docker