[Question] CUDA shared-memory

Time: Tue Oct 3 10:12:05 2017

Platform: Win10

Compiler + target environment: VC2017

Library Used: CUDA 9.0

Question: I want to use shared memory to speed up a kernel. I use threadIdx to assign the data in parallel, and I call __syncthreads to synchronize, but the data still differs from what the loop version produces (the result is wrong). Is my usage wrong? Also, is it possible to single-step in VC and inspect CUDA variables?

Input: pointers to one-dimensional input and output arrays

Expected Output: the data values should be identical for USE_SHARED_MEM = 0 and USE_SHARED_MEM = 1

Wrong Output: github: https://github.com/ChiFang/question/blob/master/CUDA_SharedMem.cu
USE_SHARED_MEM = 1 makes the final result wrong, which means the data values differ (the rest of the program is exactly the same).

Code:

    #define USE_SHARED_MEM 1

    __global__ void kernal_test(const int a_RangeUpScale, const int *a_CostData, int *a_Input)
    {
        // Get the work index of the current element to be processed
        int y = blockIdx.x*blockDim.x + threadIdx.x; // position of this thread in the array

    #if USE_SHARED_MEM == 1
        __shared__ int Buff[32];
    #else
        int Buff[32];
    #endif

        // Do the operation
        for (int x = 1; (x < g_ImgWidth_CUDA); x++)
        {
            int TmpPos = y*Area + (x-1)*a_RangeUpScale;

    #if USE_SHARED_MEM == 1
            // Synchronize to make sure the sub-matrices are loaded before starting the computation
            __syncthreads();

            if (threadIdx.x < 32)
            {
                Buff[threadIdx.x] = a_CostSmooth[TmpPos + threadIdx.x];
            }

            // Synchronize to make sure the sub-matrices are loaded before starting the computation
            __syncthreads();
    #else
            for (int cnt = 0; cnt < 32 ;cnt++)
            {
                Buff[cnt] = a_CostSmooth[TmpPos + cnt];
            }
    #endif

            // use Buff to do something
        }
    }

Supplement: grid size = 8, block size = 135, so the thread id will certainly exceed 32.

--

※ Origin: 批踢踢實業坊 (ptt.cc), From: 114.34.230.27 ※ Article URL: https://webptt.com/m.aspx?n=bbs/C_and_CPP/M.1506996728.A.C64.html

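1^F：→ johnjohnlin: "do something" usually needs another __syncthreads inside it 10/03 10:30

2^F：→ hardman1110: What is the reason? Doesn't the earlier synchronization count? Confused = = 10/03 10:40

3^F：→ johnjohnlin: When your loop wraps around, it writes to shared memory again 10/03 10:44

4^F：→ hardman1110: After "do something" the values are not modified any more 10/03 11:27

5^F：→ hardman1110: That is why I synchronize at the very start 10/03 11:28

6^F：→ hardman1110: Even when the loop wraps around, it synchronizes once more, doesn't it? 10/03 11:32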

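7^F：push a1u1usul3: During "do something", some threads finish early and go on to overwrite the values, while 10/03 12:16

8^F：→ a1u1usul3: other threads have not finished and still need the old values, which have been overwritten 10/03 12:16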
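
The race described in the two comments above can be sketched as follows. This is only a minimal illustration, assuming every thread of the block reads the same 32-element tile in each iteration; the names `tile_sum`, `TILE`, and `run_tile_sum` and the data layout are hypothetical and do not come from the original post.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int TILE = 32;  // hypothetical tile size, same as Buff[32] in the post

// Assumption: every thread of the block needs the SAME tile in iteration x,
// so the tile is genuinely shared. Each thread sums all tiles.
__global__ void tile_sum(const int *in, int num_tiles, int *out)
{
    __shared__ int buff[TILE];
    int acc = 0;

    for (int x = 0; x < num_tiles; ++x) {
        // Cooperative load: the first TILE threads write one element each.
        if (threadIdx.x < TILE) {
            buff[threadIdx.x] = in[x * TILE + threadIdx.x];
        }
        __syncthreads();  // barrier 1: tile fully written before any read

        for (int i = 0; i < TILE; ++i) {
            acc += buff[i];
        }
        __syncthreads();  // barrier 2: all reads done before the next overwrite
    }
    out[threadIdx.x] = acc;
}

// Host helper (hypothetical): launches one block of `threads` threads
// (threads must be >= TILE) and checks every thread's sum.
bool run_tile_sum(int num_tiles, int threads)
{
    std::vector<int> h_in(static_cast<size_t>(num_tiles) * TILE);
    int expected = 0;
    for (size_t i = 0; i < h_in.size(); ++i) {
        h_in[i] = static_cast<int>(i % 7);
        expected += h_in[i];
    }

    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, h_in.size() * sizeof(int));
    cudaMalloc(&d_out, threads * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), h_in.size() * sizeof(int), cudaMemcpyHostToDevice);

    tile_sum<<<1, threads>>>(d_in, num_tiles, d_out);

    std::vector<int> h_out(threads, -1);
    cudaMemcpy(h_out.data(), d_out, threads * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    bool ok = true;
    for (int v : h_out) ok = ok && (v == expected);
    return ok;
}
```

Without barrier 2, a fast thread could start the next iteration and overwrite `buff` while a slower thread is still reading the old tile. Both barriers sit outside the `if`, because every thread of the block has to reach each `__syncthreads()`.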

9^F：→ hardman1110: So I only need to synchronize right before using the values? 10/03 12:37

10^F：→ hardman1110: And also synchronize before assigning the values 10/03 13:00

※ Edited: hardman1110 (114.34.230.27), 10/03/2017 13:24:40

11^F：→ hardman1110: I have tried synchronizing both before and after the assignment, but the result is still wrong (dizzy 10/03 13:26

※ Edited: hardman1110 (114.34.230.27), 10/03/2017 13:31:48

12^F：→ johnjohnlin: Then does "do something" contain a break or something similar? 10/03 14:00

13^F：→ johnjohnlin: BTW, blockDim.x = 135 is a very poor choice; avoid it if you can 10/03 14:01
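
On the block-size remark: a warp is 32 threads, so a block of 135 threads occupies five warps, the last of which has only 7 active threads (135 = 4 × 32 + 7). A common alternative is a block size that is a multiple of 32, a grid rounded up to cover all items, and an index guard in the kernel. The sketch below assumes one thread per row; `double_rows`, `blocks_for`, `run_double_rows`, and `n` are hypothetical names, not part of the original code.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: one thread per row, doubling each element.
__global__ void double_rows(int n, const int *in, int *out)
{
    int y = blockIdx.x * blockDim.x + threadIdx.x;
    if (y < n) {  // guard: the last block may contain spare threads
        out[y] = 2 * in[y];
    }
}

// Number of blocks needed to cover n items with the given block size.
int blocks_for(int n, int block_size)
{
    return (n + block_size - 1) / block_size;
}

// Host helper (hypothetical): runs the kernel with 128-thread blocks
// and verifies every output element.
bool run_double_rows(int n)
{
    const int block_size = 128;  // multiple of the warp size (32)
    std::vector<int> h_in(n), h_out(n, 0);
    for (int i = 0; i < n; ++i) h_in[i] = i;

    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    double_rows<<<blocks_for(n, block_size), block_size>>>(n, d_in, d_out);

    cudaMemcpy(h_out.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    for (int i = 0; i < n; ++i) {
        if (h_out[i] != 2 * i) return false;
    }
    return true;
}
```

With the 8 × 135 = 1080 threads from the post, this launch would use 9 blocks of 128 threads. If the kernel also calls `__syncthreads()`, the barrier has to stay outside the `if (y < n)` guard so that all threads of the block reach it.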

14^F：→ a1u1usul3: Better to paste the code on codepad 10/03 14:09

15^F：→ a1u1usul3: Also, could your code simply be logically wrong? 10/03 14:19

16^F：→ a1u1usul3: Without shared memory, each thread takes 32 elements from its own TmpPos 10/03 14:19

17^F：→ a1u1usul3: into private memory, and TmpPos is different for every thread 10/03 14:20

※ Edited: hardman1110 (114.34.230.27), 10/03/2017 14:21:21

18^F：→ a1u1usul3: But with shared memory, 32 threads each take a single element from their own TmpPos 10/03 14:22

19^F：→ a1u1usul3: into shared memory 10/03 14:23
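
To make the mismatch above concrete: in the shared-memory branch of the post, thread t writes `a_CostSmooth[TmpPos + t]` with its own `TmpPos`, so `Buff` ends up holding one element from each of 32 different windows, and threads 32 to 134 load nothing. The sketch below shows two variants that do agree, assuming each thread needs its own 32-element window. All names (`sum_private`, `sum_shared_columns`, `WINDOW`, `BLOCK`, `stride`) are hypothetical, and the per-thread column layout is just one possible way to keep per-thread data in shared memory, not something proposed in the thread.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int WINDOW = 32;   // elements each thread needs, as in Buff[32]
constexpr int BLOCK  = 128;  // hypothetical block size (launch must match)

// Variant A: private buffer, each thread copies its own window.
__global__ void sum_private(const int *in, int stride, int n, int *out)
{
    int y = blockIdx.x * blockDim.x + threadIdx.x;
    if (y >= n) return;
    int buff[WINDOW];
    for (int i = 0; i < WINDOW; ++i) buff[i] = in[y * stride + i];
    int acc = 0;
    for (int i = 0; i < WINDOW; ++i) acc += buff[i];
    out[y] = acc;
}

// Variant B: shared memory with one column per thread. No barrier is needed
// because a thread only ever touches its own column.
__global__ void sum_shared_columns(const int *in, int stride, int n, int *out)
{
    __shared__ int buff[WINDOW][BLOCK];
    int y = blockIdx.x * blockDim.x + threadIdx.x;
    if (y >= n) return;
    for (int i = 0; i < WINDOW; ++i) buff[i][threadIdx.x] = in[y * stride + i];
    int acc = 0;
    for (int i = 0; i < WINDOW; ++i) acc += buff[i][threadIdx.x];
    out[y] = acc;
}

// Host helper (hypothetical): checks both variants against a CPU reference.
// Requires stride >= WINDOW.
bool variants_agree(int n, int stride)
{
    std::vector<int> h_in(static_cast<size_t>(n) * stride);
    for (size_t i = 0; i < h_in.size(); ++i) h_in[i] = static_cast<int>(i % 11);

    int *d_in = nullptr, *d_a = nullptr, *d_b = nullptr;
    cudaMalloc(&d_in, h_in.size() * sizeof(int));
    cudaMalloc(&d_a, n * sizeof(int));
    cudaMalloc(&d_b, n * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), h_in.size() * sizeof(int), cudaMemcpyHostToDevice);

    int grid = (n + BLOCK - 1) / BLOCK;
    sum_private<<<grid, BLOCK>>>(d_in, stride, n, d_a);
    sum_shared_columns<<<grid, BLOCK>>>(d_in, stride, n, d_b);

    std::vector<int> h_a(n, -1), h_b(n, -2);
    cudaMemcpy(h_a.data(), d_a, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaMemcpy(h_b.data(), d_b, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_a);
    cudaFree(d_b);

    for (int y = 0; y < n; ++y) {
        int ref = 0;
        for (int i = 0; i < WINDOW; ++i) ref += h_in[static_cast<size_t>(y) * stride + i];
        if (h_a[y] != ref || h_b[y] != ref) return false;
    }
    return true;
}
```

Variant B only moves where the per-thread data lives. Since no element is reused by more than one thread, it does not reduce global-memory traffic.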

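20^F：→ hardman1110: a1, I have added a GitHub link to an easier-to-read version 10/03 14:24

21^F：→ hardman1110: Here I simply wanted multiple threads to assign the values at the same time instead of running a loop 10/03 14:25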

22^F：→ a1u1usul3: The point of shared memory is that data shared by multiple threads is not duplicated 10/03 14:27

23^F：→ a1u1usul3: Your data is not actually shared, is it @@? 10/03 14:27

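24^F：→ hardman1110: I get it now~ Sorry, y is indeed what I split across threads, so y is different for every thread 10/03 14:28

25^F：→ hardman1110: For pure sharing, it feels like registers are enough; the array is not large 10/03 14:30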

26^F：→ a1u1usul3: Shared within a thread -> register; shared between threads -> shared memory 10/03 14:31

27^F：→ a1u1usul3: Shared between blocks -> global memory 10/03 14:32

28^F：→ a1u1usul3: There also seems to be the trendy shuffle, apparently for sharing between threads 10/03 14:36
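
As an illustration of the shuffle mentioned above: warp shuffle functions let the threads of one warp exchange register values directly, without going through shared memory. A minimal sketch of a warp-wide sum, assuming CUDA 9.0 or later (where `__shfl_down_sync` is available) and a single full warp of 32 threads; `warp_sum`, `warp_sum_kernel`, and `run_warp_sum` are hypothetical names.

```cuda
#include <cuda_runtime.h>

// Sum one value per lane across a full warp using register-to-register shuffles.
// After the loop, lane 0 holds the total of all 32 lanes.
__inline__ __device__ int warp_sum(int v)
{
    for (int offset = 16; offset > 0; offset >>= 1) {
        v += __shfl_down_sync(0xffffffffu, v, offset);
    }
    return v;
}

// Assumes a launch with exactly one warp (32 threads).
__global__ void warp_sum_kernel(const int *in, int *out)
{
    int total = warp_sum(in[threadIdx.x]);
    if (threadIdx.x == 0) {
        *out = total;
    }
}

// Host helper (hypothetical): sums the integers 0..31 on the device.
int run_warp_sum()
{
    int h_in[32];
    for (int i = 0; i < 32; ++i) h_in[i] = i;

    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    warp_sum_kernel<<<1, 32>>>(d_in, d_out);

    int result = -1;
    cudaMemcpy(&result, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

Shuffles only work within a warp, so sharing across warps of a block still needs shared memory.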

29^F：→ hardman1110: To speed things up further, it seems surface memory can also be used for reads and writes? 10/03 14:55

30^F：→ hardman1110: Thanks, everyone, for the pointers 10/03 14:56
