Author: liyih ()
Board: Perl
Title: Re: [Question] Counting occurrences of a large number of strings
Time: Tue Aug 2 13:18:16 2011
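※ Quoting spider1216 (顺着感觉走):
: Sorry, I'm new to Perl.
: My problem: I have a word list of 404 words,
: and I want to check it against a text file whose content is made up of words from that list.
: I want to count which words from the list appear in my text file, and how many times each one appears.
: Could someone show me how to do this?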
==== FILE: dict ====
hello
you
how
do
are
==== FILE: sentence ====
How do you do ?
How are you ?
What is your name ?
==== FILE: count.pl ====
#!/usr/bin/perl
use strict;
use warnings;
my %dict = ();
my %unseen = ();
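#### Read the dictionary file
open( my $dict_fh, '<', 'dict' ) or die "Cannot open dict: $!";
while (<$dict_fh>) {
    chomp;
    next unless length;    # skip blank lines
    $dict{ lc($_) } = 0;
}
close($dict_fh);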
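#### Match words and count occurrences
open( my $fh, '<', 'sentence' ) or die "Cannot open sentence: $!";
while (<$fh>) {
    chomp;
    # grep { length } drops the empty leading field that split returns
    # when a line starts with a non-letter
    my @words = grep { length } split( /[^a-zA-Z]+/, $_ );
    foreach my $w (@words) {
        my $key = lc($w);
        if ( exists( $dict{$key} ) ) {
            $dict{$key}++;
        }
        else {
            $unseen{$key} = 0;
        }
    }
}
close($fh);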
#### Print the results
print "$_ => $dict{$_}\n" foreach ( sort keys %dict );
print "UNSEEN: ", join( ", ", sort keys %unseen ), "\n";
# perl count.pl
are => 1
do => 2
hello => 0
how => 2
you => 2
UNSEEN: is, name, what, your
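
If you want to try the same idea without creating the dict and sentence files first, here is a rough sketch that wraps the counting in a function and works on in-memory data. The name count_words and its arguments are made up for this example and are not part of count.pl above. It matches words with /[a-zA-Z]+/g instead of split, and it sorts the result by frequency, which may be handy with a 404-word list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_words(\@dict_words, \@lines) is a hypothetical helper for illustration.
# It returns two hash references: the count for each dictionary word,
# and the set of words that are not in the dictionary.
sub count_words {
    my ( $dict_words, $lines ) = @_;

    # Every dictionary word starts at 0 so that absent words are still reported.
    my %count = map { ( lc($_) => 0 ) } @$dict_words;
    my %unseen;

    foreach my $line (@$lines) {
        # Match runs of ASCII letters directly; unlike split, this never
        # yields an empty string for a line that starts with punctuation.
        foreach my $w ( map { lc($_) } $line =~ /[a-zA-Z]+/g ) {
            if ( exists $count{$w} ) { $count{$w}++ }
            else                     { $unseen{$w} = 1 }
        }
    }
    return ( \%count, \%unseen );
}

# Same sample data as the dict and sentence files in this post.
my @dict  = qw(hello you how do are);
my @lines = ( 'How do you do ?', 'How are you ?', 'What is your name ?' );
my ( $count, $unseen ) = count_words( \@dict, \@lines );

# Most frequent first; ties are broken alphabetically.
foreach my $w ( sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count ) {
    print "$w => $count->{$w}\n";
}
print "UNSEEN: ", join( ", ", sort keys %$unseen ), "\n";
```

With the sample data, the counts are the same as in the run above; only the order changes, with do, how and you (2 each) listed before are (1) and hello (0). Like count.pl, this only recognizes ASCII letters, so it would need a different pattern for words containing digits, apostrophes or non-English characters.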
--
※ Posted from: PTT Bulletin Board System (ptt.cc)
◆ From: 140.114.64.130
1F: push spider1216: Thanks for the lesson, I learned a lot ^^ 08/02 15:01
2F: push nicebb: nice 10/01 10:16