Erlang并行快速排序?qū)崙?zhàn)
本節(jié)我將用Erlang多進(jìn)程方式實(shí)現(xiàn)快速排序,快速排序采用的是分治的思想,即將原問(wèn)題分解為若干個(gè)規(guī)模更小但結(jié)構(gòu)與原問(wèn)題相似的子問(wèn)題。遞歸地解這些子問(wèn)題,然后將這些子問(wèn)題的解組合為原問(wèn)題的解。通過(guò)對(duì)比快速排序的串行算法,我們將此串行算法改進(jìn)為并行算法。
首先,來(lái)看看快速排序的串行算法:
1.從數(shù)列中挑出一個(gè)元素,稱為 "基準(zhǔn)"(pivot);
2.重新排序數(shù)列,所有元素比基準(zhǔn)值小的擺放在基準(zhǔn)前面,所有元素比基準(zhǔn)值大的擺在基準(zhǔn)的后面(相同的數(shù)可以到任一邊)。在這個(gè)分割結(jié)束之后,該基準(zhǔn)就處于數(shù)列的中間位置。這個(gè)稱為分割(partition)操作;
3.遞歸地(recursive)把小于基準(zhǔn)值元素的子數(shù)列和大于基準(zhǔn)值元素的子數(shù)列排序。
此思想網(wǎng)上到處都有,現(xiàn)在來(lái)看看快速排序的并行算法:
其中的一個(gè)簡(jiǎn)單思想是,對(duì)每次劃分過(guò)后得到的兩個(gè)序列分別使用兩個(gè)處理器完成遞歸排序。例如對(duì)一個(gè)長(zhǎng)為n的序列,首先劃分得到兩個(gè)長(zhǎng)為n/2的序列,將其交給兩個(gè)處理器分別處理;而后進(jìn)一步劃分得到四個(gè)長(zhǎng)為n/4的序列,再分別交給四個(gè)處理器處理:如此遞歸下去最終得到排序好的序列。
當(dāng)然,這里是理想的劃分情況,如果劃分步驟不能達(dá)到平均分配的目的,那么排序效率會(huì)相對(duì)較差。
跟上節(jié)講得并行枚舉排序進(jìn)程調(diào)用的區(qū)別在于,枚舉排序一次性將數(shù)據(jù)傳給所有節(jié)點(diǎn)進(jìn)程,而快速排序進(jìn)程調(diào)用在分割數(shù)據(jù)的過(guò)程中呈現(xiàn)的是樹(shù)狀形式。
并行快排的偽代碼如下:
- //快速排序的并行算法
- 輸入:無(wú)序數(shù)組data[1,n],使用的處理器個(gè)數(shù)為2^m
- 輸出:有序數(shù)組data[1,n]
- Begin
- para_quicksort(data,1,n,m,0)
- End
- procedure para_quicksort(data,i,j,m,id)
- Begin
- if (j-i) <=k or m=0 then
- Pid call quicksort(data,i,j)
- else
- Pid:r = patition(data,i,j)
- Pid send data[r+1,m-1] to Pid+2 ^(m-1)
- para_quicksort(data,i,r-1,m-1,id)
- par_quicksort(data,r+1,j,m-1,id+2^(m-1) )
- Pid+2 ^ (m-1) send data[r+1,m-1] back to Pid
- end if
- End
現(xiàn)在,就來(lái)看看Erlang代碼實(shí)現(xiàn)并行快排:
設(shè)置啟動(dòng)接口,啟動(dòng)多個(gè)進(jìn)程之后,在總控端調(diào)用Dict = store(['node1@pc1305','node2@pc1305']...)存儲(chǔ)各節(jié)點(diǎn)進(jìn)程的字典表,然后調(diào)用start([3,4,512,1,2,5,……],2,Dict)進(jìn)行快排,其中第二參數(shù)表示需要最多2^2=4個(gè)進(jìn)程來(lái)完成任務(wù),總控端也作為一個(gè)排序的節(jié)點(diǎn)。
- %% 啟動(dòng)排序時(shí),需存儲(chǔ)各節(jié)點(diǎn)信息:使用Dict = store(['node1@pc1305','node2@pc1305']...)方法
- %%然后,便可啟動(dòng)start(Data,M,Dict) 進(jìn)行排序,M為啟動(dòng)的進(jìn)程個(gè)數(shù)指標(biāo),Dict為上述存儲(chǔ)的節(jié)點(diǎn)
- -module(para_qsort).
- -export([para_qsort/3,start/3,store/1,lookup/2]).
- store(L) -> store(L,[]).
- store([],Ret) -> Ret;
- store(L,Ret) ->
- Value = lists:nth(1,L),
- Key = length(Ret)+1,
- io:format("Key=~p Value=~p~n",[Key,Value]),
- New = [{Key,Value} | Ret],
- store(lists:delete(Value,L) , New).
- lookup(Key,Dict) ->
- {K,V} = lists:nth(1,Dict),
- if
- K =:= Key ->
- V;
- K =/= Key ->
- Filter = lists:delete({K,V},Dict),
- lookup(Key,Filter)
- end.
- start(Data,M,Dict) ->
- register(monitor, spawn(fun() -> wait([]) end)),
- put(monitor, node()),
- NewDict = [{monitor, get(monitor)} | Dict],
- para_qsort(Data,M,NewDict).
wait/1函數(shù)主要對(duì)各節(jié)點(diǎn)排序好的結(jié)果進(jìn)行歸并整理,然后輸出最終結(jié)果,代碼如下
- %%服務(wù)器節(jié)點(diǎn)監(jiān)聽(tīng)
- wait(Ret) ->
- Len1 = parser(Ret)+length(Ret)-1,
- Len2 = len(Ret),
- if
- (Len1 =:= Len2) and (Len2 =/=0) ->
- SortRet= merge(Ret,[],1),
- io:format("Para qsort result:~n",[]),
- io:format("~p~n",[SortRet]);
- (Len1 =/= Len2) or (Len2 =:= 0) ->
- receive
- {Node,Pid,L} ->
- {Data,I,J} = L,
- Temp = [{Node,Data,I,J}|Ret],
- wait(Temp)
- end;
- die ->
- quit
- end.
- len([]) -> 0;
- len([H|T])->
- {Node,Data,I,J} = H,
- length(Data).
- parser([]) -> 0;
- parser([H|T]) ->
- {Node,Data,I,J}=H,
- J-I+1+parser(T).
- merge([],Ret,Len) -> Ret;
- merge(T,Ret,Start) ->
- H = lists:nth(1,T),
- {Node,Data,I,J} = H,
- if
- I =:= Start ->
- ListBetween = lists:sublist(Data,I, J-I+1),
- case (I=:=1) of
- true ->
- Temp = lists:append(Ret,ListBetween);
- false ->
- Pivo = lists:append(Ret,[lists:nth(I-1,Data)]),
- Temp = lists:append(Pivo,ListBetween)
- end,
- io:format("~n-----<< ~p >> processing data[~p,~p]~n-------------------------Result=~p~n",[Node,I,J,Temp]),
- Len = Start+J-I+2,
- NewData = lists:delete(H,T),
- merge(NewData,Temp,Len);
- I =/= Start ->
- NewData = lists:delete(H,T),
- Temp = lists:append(NewData,[H]),
- merge(Temp,Ret,Start)
- end.
para_qsort完成對(duì)排序任務(wù)的分發(fā),代碼如下:
- para_qsort(Data,M,Dict) ->
- %%進(jìn)程個(gè)數(shù)最多為2^M次方
- I = 1,
- J = length(Data),
- ID = 0,
- para_qsort(Data,I,J,M,ID,Dict).
- para_qsort(Data,I,J, M, ID,Dict) ->
- %% 閥值,列表排序個(gè)數(shù)小于K時(shí)直接調(diào)用qsort直接快排
- K = 4,
- Value1 = (J-I) < K,
- Value2 = (M =:= 0),
- Value = Value1 or Value2,
- if
- Value =:= true ->
- %io:format("~nCurrent node: (~p) ~nData= ~p I=~p J=~p~n",[node(),Data,I,J]),
- ListBefore = lists:sublist(Data, I-1),
- ListBetween = lists:sublist(Data,I, J-I+1),
- ListAfter = lists:sublist(Data,J+1,length(Data)-J),
- Ret = lists:sort(ListBetween),
- Temp = lists:append(ListBefore,Ret),
- NewData = lists:append(Temp, ListAfter),
- {monitor, lookup(monitor,Dict) } ! {node(),self(),{NewData,I,J} },
- io:format("-------------------------------------------------------------");
- Value =:= false ->
- {NewData,R} = partition(Data,I,J),
- send(NewData, R+1, J, M-1,ID + math:pow(2 ,(M-1)), Dict ),
- para_qsort(NewData,I,R-1,M-1,ID, Dict)
- end.
其中patition用于將data按基準(zhǔn)值劃分為兩部分,代碼如下:
- partition(Data, K, L) ->
- Pivo= lists:nth(L, Data),
- ListBefore = lists:sublist(Data, K-1),
- ListBetween = lists:sublist(Data,K, L-K),
- ListAfter = lists:sublist(Data,L+1,length(Data)-L),
- Left = [X|| X <- ListBetween,X =<Pivo],
- Right = [X || X <- ListBetween,X > Pivo],
- Ret = lists:append(Left,[Pivo|Right]),
- Temp = lists:append(ListBefore,Ret),
- NewData = lists:append(Temp, ListAfter),
- Len = length(ListBefore)+length(Left)+1,
- {NewData, Len}.
任務(wù)分割完后,會(huì)將此任務(wù)發(fā)送給下個(gè)節(jié)點(diǎn)進(jìn)程進(jìn)行處理,send代碼如下:
- send( Data, I, J, M,No ,Dict) ->
- ID = trunc(No),
- Node = lookup(ID,Dict),
- Pid = spawn(Node, fun() -> loop() end),
- Pid ! {node(),self(), {Data,I,J,M,ID,Dict}}.
- %%客戶節(jié)點(diǎn)N準(zhǔn)備接受排序數(shù)據(jù)
- loop() ->
- receive
- {Node,Pid,die} ->
- disconnect_node(Node),
- io:format("Node (~p) disconnected~n",[Node]),
- quit;
- {Node,Pid,L} ->
- {Data,I,J,M,ID,Dict} = L,
- %io:format("4----:Current node:~p server node:~p~n",[node(),lookup(monitor,Dict)]),
- para_qsort(Data,I,J,M,ID,Dict)
- %loop()
- end.
思想比較簡(jiǎn)單,我只是沒(méi)對(duì)它進(jìn)行進(jìn)一步優(yōu)化,最終運(yùn)行結(jié)果如下:
大家可以看到,此data的劃分就是不均勻的,node1、node3只處理了一個(gè)數(shù)據(jù),node2處理了3個(gè)數(shù)據(jù),其余的都是server處理的,因此排序的效率問(wèn)題很差。此程序只是展示如何使用Erlang實(shí)現(xiàn)并行快排,并不是學(xué)習(xí)如何優(yōu)化問(wèn)題。
大致就說(shuō)這么多了,以后我將實(shí)現(xiàn)PSRS并行算法的Erlang實(shí)現(xiàn),這個(gè)算法Erlang實(shí)現(xiàn)起來(lái)比較麻煩,大家可以先去看看。
原文:http://www.cnblogs.com/itfreer/archive/2012/05/14/Erlang_in_practise_quick-sort.html
【編輯推薦】