[Question] length of a string after encode

Time: Sat Nov 17 09:26:23 2018

Hello everyone, this is my first post on this board, so thanks in advance for your guidance. perldoc says:

    When you run $octets = encode("utf8", $string) , then $octets might not be equal to $string. Though both contain the same data, the UTF8 flag for $octets is always off. When you encode anything, the UTF8 flag on the result is always off, even when it contains a completely valid utf8 string.

https://perldoc.perl.org/Encode.html

I tested the following three cases under perl 5.26.1, after confirming that the original strings are all UTF-8, but I cannot fully understand what length reports after encode.

1.
    $str1 = "中文";
    length( $str1 );      # 6: the UTF8 flag is off, so this counts bytes; each Chinese character takes 3 bytes
    Encode::_utf8_on( $str1 );
    length( $str1 );      # 2: the UTF8 flag is on, so this counts Chinese characters
    Encode::_utf8_off( $str1 );
    length( $str1 );      # 6: UTF8 flag off again, so bytes
    $str1_e = encode("UTF-8", $str1);
    length( $str1_e );    # 12: I don't know what is being counted to get 12

2.
    $str2 = "English";
    length( $str2 );      # 7: each English letter takes only 1 byte
    Encode::_utf8_on( $str2 );
    length( $str2 );      # 7: counting characters gives the same answer
    Encode::_utf8_off( $str2 );
    length( $str2 );      # 7 bytes
    $str2_e = encode("UTF-8", $str2);
    length( $str2_e );    # unchanged, still 7. Why?

3.
    $str3 = "ABC呢";
    length( $str3 );      # 6 bytes: English 1*3 = 3 bytes + Chinese 3*1 = 3 bytes
    Encode::_utf8_on( $str3 );
    length( $str3 );      # 4: 3 English letters + 1 Chinese character
    Encode::_utf8_off( $str3 );
    length( $str3 );      # 6, because 6 bytes
    $str3_e = encode("UTF-8", $str3);
    length( $str3_e );    # 9, so it looks like each Chinese character counts as 6 in this case?

After these tests, my main question is: what exactly does encode do, such that length then returns 6 per Chinese character? More fundamentally, when the $octets returned by encode is passed to length, what kind of data is it being treated as?

I would appreciate an explanation from the experts on this board. I can offer 200P as a small token of thanks. Thank you!

--

※ Origin: 批踢踢实业坊 (ptt.cc), From: 36.226.177.46
※ Article URL: https://webptt.com/cn.aspx?n=bbs/Perl/M.1542417987.A.BD4.html
※ kipi91718: forwarded to board ask 11/17 13:54
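
One reading that fits all three results in the post: the script was probably saved as UTF-8 and run without `use utf8`, so each string literal already holds raw UTF-8 bytes. encode then treats each byte as a separate character, and every byte in the range 0x80 to 0xFF becomes two bytes, while ASCII bytes stay as one byte. The sketch below rests on that assumption. The variable names are made up for illustration, and the bytes are written as \x escapes so the result does not depend on how the file is saved.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The six UTF-8 bytes of "中文", written out explicitly.
# This is what $str1 would hold if the source file is UTF-8
# and `use utf8` is NOT in effect (an assumption about the post).
my $bytes = "\xE4\xB8\xAD\xE6\x96\x87";
my $len_bytes = length($bytes);            # 6 bytes

# encode() sees 6 characters, each with a code point in 0x80..0xFF.
# Each of those takes 2 bytes in UTF-8, so the result is 6 * 2 = 12.
my $double     = encode("UTF-8", $bytes);
my $len_double = length($double);          # 12

# Decoding first turns the bytes into 2 real characters ...
my $chars     = decode("UTF-8", $bytes);
my $len_chars = length($chars);            # 2

# ... and encoding those characters gives the original 6 bytes back.
my $octets     = encode("UTF-8", $chars);
my $len_octets = length($octets);          # 6

# "ABC" plus the three UTF-8 bytes of "呢":
# 3 ASCII bytes stay 1 byte each, 3 high bytes become 2 each: 3 + 6 = 9.
my $mixed     = "ABC\xE5\x91\xA2";
my $len_mixed = length(encode("UTF-8", $mixed));      # 9

# Pure ASCII is unchanged by UTF-8 encoding.
my $len_ascii = length(encode("UTF-8", "English"));   # 7

print "bytes=$len_bytes double=$len_double chars=$len_chars ",
      "octets=$len_octets mixed=$len_mixed ascii=$len_ascii\n";
```

Under this reading, the 12 and the 9 are byte counts of doubly encoded data, not 6 per Chinese character. Decoding the input first (or adding `use utf8` for literals in a UTF-8 source file) and calling encode only on the resulting character string would give the expected 6 bytes for "中文".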

