Author: hardman1110 (笨小孩)
Board: C_and_CPP
Title: [Question] OpenMP speedup problem
Time: Wed May 11 11:56:37 2016
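Platform: (Ex: VC++, GCC, Linux, ...)
VC2015
Library Used: (Ex: OpenGL, ...)
OpenCV, OpenMP
Question:
I am accumulating a sum inside a loop. I have already added reduction and
related settings, but performance is still worse than without OpenMP, and
sometimes the program even stalls for a while.
Input:
ROIs, image data, and the corresponding weights
Expected Output:
Speed multiplied >> the CPU is an i5-3210M
Wrong Output:
Slower than the original
Code: (please use the paste site from the pinned post, and remember to format it)
Only the core code is pasted here. The rest is just image I/O and the
weight computation, which have nothing to do with OpenMP.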
-------------
float tempValueNoMP = 0;
// Accumulator for the OpenMP version. Its declaration is not shown in the
// original snippet; float is assumed here to match tempValueNoMP.
float tempValue = 0;
double t0 = 0;
double t1 = 0;
double t2 = 0;

t0 = (double)getTickCount();
// without OpenMP
for (int k = 0; k < lSize; k++)
{
    int xMin = 0, xMax = 0, yMin = 0, yMax = 0;
    xMin = ROI[j].x + Position[i][k].x;
    xMax = xMin + Position[i][k].width;
    yMin = ROI[j].y + Position[i][k].y;
    yMax = yMin + Position[i][k].height;
    tempValueNoMP += Weight[i][k] *
        (data.at<float>(yMin, xMin) + data.at<float>(yMax, xMax) -
         data.at<float>(yMin, xMax) - data.at<float>(yMax, xMin));
}
t0 = ((double)getTickCount() - t0) / getTickFrequency();

t1 = (double)getTickCount();
// with OpenMP
#pragma omp parallel reduction(+:tempValue)
{
    #pragma omp for
    for (int k = 0; k < lSize; k++)
    {
        int xMin = 0, xMax = 0, yMin = 0, yMax = 0;
        xMin = ROI[j].x + Position[i][k].x;
        xMax = xMin + Position[i][k].width;
        yMin = ROI[j].y + Position[i][k].y;
        yMax = yMin + Position[i][k].height;
        tempValue += Weight[i][k] *
            (data.at<float>(yMin, xMin) + data.at<float>(yMax, xMax) -
             data.at<float>(yMin, xMax) - data.at<float>(yMax, xMin));
    }
}
t1 = ((double)getTickCount() - t1) / getTickFrequency();

// serial time minus OpenMP time, in seconds
printf("%.3f\n", t0 - t1);
Supplement:
I compared tempValue and tempValueNoMP, and the results are the same.
But t0 is always smaller than t1, and the program also feels sluggish after the change.
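A side note on the t0/t1 comparison: each version is timed once, so the OpenMP figure may include one-time costs such as the runtime starting its worker threads. Whether that accounts for the gap here is only an assumption. A minimal sketch of a fairer measurement follows; `meanSeconds` is a hypothetical helper, not part of OpenCV or OpenMP, and it uses std::chrono rather than getTickCount so that it stands alone.

```cpp
#include <chrono>

// Hypothetical helper: runs work() once untimed as a warm-up, then
// returns the mean wall-clock seconds per call over `repeats` timed calls.
template <typename Work>
double meanSeconds(Work work, int repeats)
{
    work();  // warm-up call, kept outside the timed span

    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r)
    {
        work();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    return elapsed.count() / repeats;
}
```

Each loop from the post could be wrapped in a lambda and passed to this helper with the same `repeats`, so that both versions are averaged over the same number of runs before they are compared.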
--
※ Origin: 批踢踢实业坊 (ptt.cc), From: 211.72.181.189
※ Article URL: https://webptt.com/cn.aspx?n=bbs/C_and_CPP/M.1462939000.A.DC8.html
※ Edited: hardman1110 (211.72.181.189), 05/11/2016 12:01:18
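1F: push KJFC: It seems to act strangely once a user-defined class is involved 05/11 14:34
2F: → KJFC: I was wrong, temp only holds the multiplied value 05/11 14:40
3F: → KJFC: plus the multiplied value 05/11 14:40
4F: → hardman1110: KJFC, do you mean changing it to ( +*:tempvalue)? 05/11 17:03
5F: → hardman1110: What is the reason for that change? I multiply by the weight first and then add, don't I? 05/11 17:04
6F: push KJFC: No, I meant that I was the one who was wrong 05/12 09:11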
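For reference, the weighted four-corner sum discussed in this thread can be sketched as a standalone function. This is not a diagnosis of the post, only an illustration under stated assumptions: the image is modelled as a row-major std::vector<float> instead of cv::Mat, and `Box`, `cornerTerm`, `weightedSumSerial`, `weightedSumOmp` and `minParallelSize` are hypothetical names. The OpenMP version uses the combined `parallel for` form with `reduction(+:sum)` plus an `if` clause, so a loop shorter than the threshold stays on one thread. That short loops are the cause of the slowdown reported above is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one rectangle, already offset by the ROI origin.
struct Box { int xMin, yMin, xMax, yMax; };

// Four-corner difference on a row-major table with `cols` columns.
inline float cornerTerm(const std::vector<float>& table, int cols, const Box& b)
{
    const auto at = [&](int y, int x) {
        return table[static_cast<std::size_t>(y) * cols + x];
    };
    return at(b.yMin, b.xMin) + at(b.yMax, b.xMax)
         - at(b.yMin, b.xMax) - at(b.yMax, b.xMin);
}

// Serial reference version.
float weightedSumSerial(const std::vector<float>& table, int cols,
                        const std::vector<Box>& boxes,
                        const std::vector<float>& weights)
{
    float sum = 0.0f;
    const int n = static_cast<int>(boxes.size());
    for (int k = 0; k < n; ++k)
    {
        sum += weights[k] * cornerTerm(table, cols, boxes[k]);
    }
    return sum;
}

// OpenMP version: each thread keeps a private partial sum that is combined
// at the end; the if clause runs the loop on one thread when n is small.
float weightedSumOmp(const std::vector<float>& table, int cols,
                     const std::vector<Box>& boxes,
                     const std::vector<float>& weights,
                     int minParallelSize)
{
    float sum = 0.0f;
    const int n = static_cast<int>(boxes.size());
    #pragma omp parallel for reduction(+:sum) if(n >= minParallelSize)
    for (int k = 0; k < n; ++k)
    {
        sum += weights[k] * cornerTerm(table, cols, boxes[k]);
    }
    return sum;
}
```

When the compiler's OpenMP switch is off, the pragma is ignored and both functions do the same serial work. With float accumulators a parallel reduction adds the terms in a different order, so in general the two results can differ in the last bits even when both are correct. A suitable value for `minParallelSize` has to be found by measurement on the target machine.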