[Question] frame-pointer and performance

Time: Mon Nov 8 22:11:06 2021

Hi everyone. I recently found out that LLVM IR has a function attribute called frame-pointer, and that it affects performance.

At O3 the default is currently none, whereas if you obtain unoptimized LLVM IR with something like clang -emit-llvm -Xclang -disable-O0-optnone, it is set to all.

As far as I know, with none the frame pointer is no longer saved, which "in theory" should make the program perform a bit better, since it reduces register usage.

In my tests, when both use the O3 sequence, frame-pointer=none does indeed perform better.

But!! With my own optimization sequence:

frame-pointer=none gives a runtime of about 8 sec
frame-pointer=all gives a runtime of 3.8 sec

That is a huge difference!

I then compiled both to assembly. They do differ, but the none version is shorter and has far fewer memory accesses, yet its performance is worse.

I understand that the number of instructions does not determine performance, but as far as I know, not saving and not using the frame pointer should be faster, right? Some of my other IR files even get better when switched from all to none; only a few specific IR files get worse.

The source code I am testing is insertion sort.

https://imgur.com/nqexaZb
https://imgur.com/hKigtrh
https://imgur.com/FR3qETS
https://imgur.com/5119ek8
https://imgur.com/g3RHx98

These are the differences in the assembly code; they do not seem related to the logic of insertion sort itself.

Adding the perf results:

frame-pointer=all

 Performance counter stats for './190_all' (10 runs):

          142666      cache-misses        #  0.020 % of all cache refs   ( +- 5.01% )
       698701320      cache-references                                   ( +- 0.71% )
          234781      branch-misses                                      ( +- 0.44% )
     13059296783      cycles                                             ( +- 0.16% )
     59991967735      instructions        #  4.59 insn per cycle         ( +- 0.05% )

     3.417880975 seconds time elapsed                                    ( +- 0.26% )

frame-pointer=none

 Performance counter stats for './190_none' (10 runs):

          352932      cache-misses        #  0.046 % of all cache refs   ( +- 2.58% )
       770977710      cache-references                                   ( +- 0.81% )
          260282      branch-misses                                      ( +- 0.33% )
     30052057516      cycles                                             ( +- 0.05% )
     60037013675      instructions        #  2.00 insn per cycle         ( +- 0.05% )

     7.921856465 seconds time elapsed                                    ( +- 0.05% )

It looks like branch-misses are about 10% higher, and insn per cycle drops to less than half..

--

※ Origin: PTT (ptt.cc), from: 114.43.59.118 (Taiwan)
※ Article URL: https://webptt.com/cn.aspx?n=bbs/CompilerDev/M.1636380668.A.B00.html

1F: push sonicyang: The runtime is shorter but the performance is worse? Did I miss something? 11/09 00:50

Sorry, I had the seconds mixed up.

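2F: → Lipraxde: The assembly is right in front of you; push a bit further and analyze it. 11/09 01:55

3F: → Lipraxde: LLVM can dump the IR / machine IR after each pass finishes, 11/09 01:55

4F: → Lipraxde: which helps a lot in understanding optimizations and why such patterns get generated. 11/09 01:55

5F: → Lipraxde: Also, I'm not sure whether you have read the System V ABI; if you are going to 11/09 01:55

6F: → Lipraxde: optimize at this depth, being familiar with the ABI is very important! 11/09 01:55

7F: → Lipraxde: Ah... I seem to have said some unrelated things. Back to your question: although the 11/09 02:04

8F: → Lipraxde: information given is a bit thin, judging from the gap in execution time and the differences in the 11/09 02:04

9F: → Lipraxde: codegen output, I'd guess it may be caused by the cache. 11/09 02:04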

I just used the Linux tool perf to compare the two. There is no difference in cache misses or cache references, but there is a significant difference in instructions per cycle: frame-pointer=all reaches 4.56 instructions per cycle, while frame-pointer=none reaches only 1.99 instructions per cycle.

※ Edited: shane87123 (114.43.59.118 Taiwan), 11/09/2021 02:17:11
※ Edited: shane87123 (114.43.59.118 Taiwan), 11/09/2021 02:32:19

10F: → Lipraxde: What about branch misses? 11/09 09:19

Added! Thanks a lot.

※ Edited: shane87123 (101.12.89.21 Taiwan), 11/09/2021 13:28:27
※ Edited: shane87123 (101.12.89.21 Taiwan), 11/09/2021 13:28:51

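11F: → Lipraxde: Because the version with frame-pointer enabled has extra push and move instructions, 11/09 16:50

12F: → Lipraxde: I would not recommend drawing conclusions directly from instructions per cycle. 11/09 16:50

13F: → Lipraxde: Also, I noticed one thing: the instruction counts for all and none 11/09 16:50

14F: → Lipraxde: are about the same; it is worth looking into why :) 11/09 16:50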
