Author: lihgong (Q.Q)
Board: MATLAB
Title: [Notes] Matlab Distributed Computing
Date: Wed Aug 16 09:30:59 2006
Test environment
Computer1: P4 3.4G, 1G RAM, Win XP SP2, Matlab 2006a
  computer name: QQMOU
  Job-Manager name: QQMOU_jobmanager
  Worker name: QQMOU_worker
Computer2: P4 2.8G, 1G RAM, Win XP SP2, Matlab 2006a
  computer name: mychat-7b73b358
  Worker name: QQMOU_worker1
Test results
Dist-comp: 131.18 sec
Computer1: 226.42 sec
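Conclusion
Dist-computing really does speed things up.
However, the dist-computing protocol itself carries a fairly heavy overhead.
When the amount of computation is very large, the protocol overhead becomes negligible.
The results above show that running on two computers of similar speed cut the run time by about 42% (226.42 sec down to 131.18 sec, roughly a 1.73x speedup).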
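For anyone who wants to redo the arithmetic behind the 42% figure, here is a small sketch using only the two timings reported above. The variable names are mine, and the overhead estimate assumes both machines are equally fast and the work splits evenly, which is not quite true here (Computer2 is a 2.8G P4), so treat that last number as a rough figure that lumps protocol overhead together with the slower second machine.

```matlab
% Timings reported in this post (seconds)
t_single  = 226.42;   % Computer1 alone
t_dist    = 131.18;   % two workers through the job manager
n_workers = 2;

% Reduction in wall-clock time, as a percentage of the single-machine time
time_saved_pct = (t_single - t_dist) / t_single * 100;

% Classic speedup ratio and parallel efficiency
speedup    = t_single / t_dist;
efficiency = speedup / n_workers;

% ASSUMPTION: identical machines and a perfectly even split, so the ideal
% distributed time would be t_single / n_workers. Whatever is left over is
% attributed to overhead (protocol cost plus the slower second machine).
overhead_est = t_dist - t_single / n_workers;

fprintf('time saved : %.2f %%\n', time_saved_pct);
fprintf('speedup    : %.3fx\n', speedup);
fprintf('efficiency : %.1f %%\n', efficiency * 100);
fprintf('overhead   : %.2f sec (rough estimate)\n', overhead_est);
```

With these inputs the time saved is about 42%, which corresponds to a speedup of about 1.73x out of an ideal 2x.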
----
Building the cluster
1. Set up the MDCE service (install it, then start it)
$MATLAB\toolbox\distcomp\bin\win32\mdce install
$MATLAB\toolbox\distcomp\bin\win32\mdce start
2. Set up the job manager (only one machine needs this)
$MATLAB\toolbox\distcomp\bin\win32\startjobmanager.bat -name QQMOU_jobmanager -clean -v
ps. My job manager runs on Computer 1.
QQMOU_jobmanager is just a name; pick whatever you like.
3. Set up a worker (do this on every machine)
Computer 1:
$MATLAB\toolbox\distcomp\bin\win32\startworker.bat -name QQMOU_worker -jobmanager QQMOU_jobmanager -jobmanagerhost QQMOU -clean -v
Computer 2:
$MATLAB\toolbox\distcomp\bin\win32\startworker.bat -name QQMOU_worker1 -jobmanager QQMOU_jobmanager -jobmanagerhost QQMOU -clean -v
4. Check the setup (can be done on any node)
$MATLAB\toolbox\distcomp\bin\win32\nodestatus.bat -infolevel 2
----
Run the following function once with dist-computing and once on a single machine:
[fun.m]
function x = fun(times)
x = 0;
for i=1:times
    x = x + sum(randn(1,10000));
    x = x / 10000;
end
Single-machine version
tic % start the stopwatch
% run the same workload 12 times, one after another
for k = 1:12
    fun(30000);
end
toc % stop the stopwatch and report the elapsed time
dist-computing version
tic % start the stopwatch
% job manager settings
jobmanager_name = 'QQMOU_jobmanager';
jobmanager_hostname = 'QQMOU';
% connect to the job manager first
jm = findResource('scheduler', 'type', 'jobmanager', ...
    'name',jobmanager_name,'LookupURL',jobmanager_hostname);
% create a job inside the job manager
j = createJob(jm);
% share the files the workers need
set(j,'FileDependencies',{'fun.m'})
% describe the tasks in the job
for k = 1:12
    createTask(j, @fun, 1, {30000});
end
% submit the job
submit(j);
waitForState(j);
% fetch the computed results
results = getAllOutputArguments(j)
% the results come back as a cell array
% cell2mat can convert them back to a matrix
% delete the finished job
destroy(j);
toc
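The comments in the script above note that the results come back as a cell array and that cell2mat can turn them into a matrix. A minimal sketch of that step follows. The values are made up and stand in for a real getAllOutputArguments(j) call, so this runs without a cluster. It assumes one row per task and one column per output argument, which matches a job whose tasks each return a single scalar.

```matlab
% HYPOTHETICAL stand-in for: results = getAllOutputArguments(j);
% Three tasks, one scalar output each (the real job above has 12 tasks).
results = {0.11; -0.42; 0.07};

% Convert the 3x1 cell array of scalars into a 3x1 double column vector
values = cell2mat(results);

% From here on, ordinary numeric operations work
mean_value = mean(values);

disp(size(values));
disp(mean_value);
```

Note that cell2mat only works like this when every task returns an output of compatible size (all scalars here). If tasks return arrays of different sizes, keep the cell array and index it with results{k} instead.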
----
Reference
[1] Matlab Online Help, Distributed Computing Toolbox
[2] Distributed Computing Toolbox User's Guide
http://www.mathworks.com/access/helpdesk/help/pdf_doc/distcomp/distcomp.pdf
[3] MATLAB Distributed Computing Engine System Administrator's Guide
http://www.mathworks.com/access/helpdesk/help/pdf_doc/mdce/mdce.pdf
[4] "Stage 2: Configure Your Cluster"
http://0rz.net/be1Jx
[5] "Stage 4: Test Your Distributed Computing Environment"
http://0rz.net/011Hq
--
※ Origin: PTT (批踢踢实业坊, ptt.cc)
◆ From: 140.113.128.237
1F: push sunev: Quick question... is 2006a alone enough for this? 08/16 21:01
2F: → sunev: No other additional software needed? 08/16 21:02
3F: → lihgong: Just 2006a + Distributed Computing Toolbox 08/16 22:15
4F: → lihgong: references [3] [4] [5] are the key ones for building the cluster 08/16 22:16
5F: → lihgong: Before using it, turn off all firewalls; you will run into fewer problems 08/16 22:16
6F: push sunev: Mm.... thanks for sharing 08/19 03:01
※ Edited: lihgong from: 140.113.236.184 (05/21 13:18)