[Question] Global matching on a web page

Time: Mon Nov 2 14:50:53 2009

When matching against a file, adding /g is enough for a global match, but the same approach has no effect when I apply it to a web page. Does Mechanize need it written differently?

This continues my earlier question: I submit a query to the website and then extract part of the result. When there is more than one thing to match, only the first match is output. Is there a way to scan the whole page?

A concrete example: enter 2v69 (the id) in PDBsum
http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl
and extract the accession codes that follow "Uniprot".
The input here is "2v69"
The ideal output is "2v69, P00877, P00873"
but the current code only gets "2v69, P00877"; the later one is missed.

Current script:

#!/usr/bin/perl
use WWW::Mechanize;

my $file = "input.txt";
my $ofile = ">output.txt";
my $checkURL = "http://www.ebi.ac.uk/pdbsum/";

open FILE, $file or die "File open error!!";
open FILE2, $ofile or die "File open error!!";

my $mech = WWW::Mechanize -> new();
my $result;

while(<FILE>){
    chomp;
    $_=~ s/ //g;
    $mech -> get($checkURL);
    $mech -> submit_form(
        form_number => 1,
        fields => {
            template => "main.html",
            EBI => "TRUE",
            pdbcode => $_,
        },
    );
    if($mech->content=~/http:\/\/www.uniprot.org\/uniprot\/(\D\S\S\S\S\S)/msg)
    { print FILE2 "$_ , $1\n"; }
    else {print FILE2 "$_ \n"; }
}
close FILE;
close FILE2;

The important part is the last block, if($mech->content=~ ..., which matches against the page source.

--
Thanks again to the board members who have helped me before m(__ __)m
--

※ Origin: PTT (ptt.cc) ◆ From: 140.114.88.228

1F: Push freshroger: @arr = ( $mech->content = ...... ); 11/02 23:10

2F: → freshroger: for $entry (@arr) { $output .= ",$entry";} 11/02 23:10

3F: → freshroger: print FILE2 "$output\n"; 11/02 23:10

4F: → freshroger: Remember to add my $output; $output .= $_; before that 11/02 23:11

5F: → freshroger: If you only need a small amount of data this is OK; for a lot, I'd suggest downloading the dat file directly 11/02 23:14

6F: → freshroger: and parsing it all in one go :) 11/02 23:14

7F: Push freshroger: Here's a URL for reference: http://research.isb-sib.ch/ssmap/ 11/02 23:26
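
The suggestion in the comments relies on context: inside an if, a /g match runs in scalar context and succeeds only once, while the same match assigned to an array runs in list context and returns every capture. A minimal sketch of that idea, assuming a hard-coded string in place of $mech->content; the sample HTML, the helper name uniprot_ids and the 6-character accession pattern are illustrative assumptions, not taken from the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect every UniProt accession linked in the HTML.
# In list context a /g match returns the captures of ALL matches at once.
sub uniprot_ids {
    my ($html) = @_;
    my @ids = $html =~ m{http://www\.uniprot\.org/uniprot/(\w{6})}g;
    return @ids;
}

# Invented stand-in for $mech->content.
my $html = <<'HTML';
<a href="http://www.uniprot.org/uniprot/P00877">P00877</a>
<a href="http://www.uniprot.org/uniprot/P00873">P00873</a>
HTML

my $pdb    = "2v69";                               # one line of input.txt
my $output = join ", ", $pdb, uniprot_ids($html);  # id first, then all hits
print "$output\n";
```

In the real script the join line would sit inside the while(<FILE>) loop and write to FILE2 instead of STDOUT. If the page links the same accession more than once, duplicates would show up and might need filtering.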

Got it! Thanks :D
Adding a version I got working with while:

    if($mech->content=~/(http:\/\/www.uniprot.org\/uniprot\/\D\d\d\d\d\d)/ms)
    {
        my $line=$mech->content;
        print FILE2 "$_ , ";
        while ( $line =~ s/http:\/\/www.uniprot.org\/uniprot\/(\D\d\d\d\d\d)//ms)
        { print FILE2 "$1 "; }
        print FILE2 ",\n";
    }
    else {print FILE2 "$_ \n"; }
}

It is a pretty brute-force way to do it, though.. XD

※ Edited: adu From: 140.114.88.228 (11/03 09:37)
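
The while version above finds each link by deleting it from a copy of the page with s///. A less destructive variant, sketched under the same assumptions (the sample HTML is invented), uses m//g in scalar context: each successful match resumes where the previous one ended, so the string stays intact. The match position is tracked per variable, so the content is copied into $line first rather than matched directly against the method call.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented stand-in for: my $line = $mech->content;
my $line = <<'HTML';
<a href="http://www.uniprot.org/uniprot/P00877">P00877</a>
<a href="http://www.uniprot.org/uniprot/P00873">P00873</a>
HTML

my @found;
# Scalar-context /g: every iteration continues from pos($line),
# and the loop ends when no further match is found.
while ( $line =~ m{http://www\.uniprot\.org/uniprot/(\D\d{5})}g ) {
    push @found, $1;
}

print join(" ", @found), "\n";
```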

