Author: SABER101 (None)
Board: RegExp
Title: [Question] Extracting content from page source with PHP
Time: Thu Aug 4 11:41:09 2011
http://www.epinions.com/review/Canon_PowerShot_SX210_IS_Digital_Camera/content_536777166468
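I want to extract the Review section in the middle of the page at the URL above.
The Review sits where "body text" appears below:
<span class=rkr>
body text
<br>
I wrote the expression as <span class=rkr>\s(.*)\s<br>
and it captures nothing at all.
Then \s(.*)\s<br> captures about ninety percent of the article.
The passage it still misses is this one: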
Straight photography (Street, documentary, and environmental portraiture) is
primarily concerned with capturing images of people in uncontrived, naturally
lit and candid settings that evocatively depict or dramatically reveal some
aspect of the human condition. In addition to being a first rate general
purpose digicam, the nifty little SX210 is almost perfect for “straight”
photography - it is compact, responsive, unobtrusive, features a 14x zoom
(for a little extra standoff room) and dependably generates first rate images.
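I did account for the line breaks, so why can't it capture this passage, which is all on a single line?
And the only difference between the first and second expressions is <span class=rkr>,
yet with it everything disappears?????
This is really frustrating.
I hope someone can explain.
Thanks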
--
※ Origin: PTT (ptt.cc)
◆ From: 220.137.134.31
1F:→ scp958630:/<span class=rkr>(.*?)<b>Recommended:/s 08/04 11:53
2F:→ scp958630:Between <span class=rkr> and the text there is more than one \n 08/04 11:56
3F:→ scp958630:\s matches only one character; you can use \s*, or add the s modifier so 08/04 11:56
4F:→ scp958630:that . also matches \n 08/04 11:56
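The point in the comments above can be sketched in PHP. This is only an illustration: the $html string is a made-up stand-in for the page (assumed to have a newline, some spaces, and another newline after the span, as the thread suggests), and the variable names are hypothetical.

```php
<?php
// Hypothetical sample; the real epinions markup may differ.
$html = "<span class=rkr>\n  \nFirst paragraph.<br>\nSecond paragraph.<br>\n<b>Recommended:</b> Yes";

// Original attempt: a single \s consumes only the first "\n",
// and "." (without the s modifier) cannot cross the next newline.
$original = preg_match('/<span class=rkr>\s(.*)\s<br>/', $html, $m1);

// \s* absorbs the whole run of whitespace after the span,
// but "." still stops at a newline, so only the first line is captured.
$fixed = preg_match('/<span class=rkr>\s*(.*)\s*<br>/', $html, $m2);

// The s modifier lets "." match "\n"; the lazy .*? stops at the first
// "<b>Recommended:" marker, as in the pattern suggested in comment 1F.
$dotall = preg_match('/<span class=rkr>\s*(.*?)\s*<b>Recommended:/s', $html, $m3);

var_dump($original);  // number of matches for the original pattern
var_dump($m2[1]);     // capture using \s* only
var_dump($m3[1]);     // capture using \s* plus the s modifier
```

With this sample the original pattern does not match, the \s* version captures only the first line, and the s-modifier version captures both lines up to the marker.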
5F:→ SABER101:With the s modifier it starts capturing from the first <span class=rkr>, though 08/04 17:43
6F:→ blackkaku:<span class=rkr>\n +\n(.+<br>)$ and take \1 08/04 21:02
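A rough PHP sketch of the last two comments, under some assumptions: the page is assumed to contain an earlier <span class=rkr> before the review (one plausible reading of comment 5F), the $html sample is invented, and the m modifier is added here so that $ can match at the end of a line; the comment itself does not state which modifiers to use.

```php
<?php
// Hypothetical sample with two spans; only the second one holds the review.
$html = "<span class=rkr>Rating</span>\n"
      . "<span class=rkr>\n   \nReview text on one long line.<br>\n"
      . "<b>Recommended:</b> Yes\n";

// With the s modifier the match begins at the leftmost span,
// so the capture also contains the unrelated first span's content.
preg_match('/<span class=rkr>(.*?)<b>Recommended:/s', $html, $loose);

// Requiring newline, spaces, newline right after the span skips the
// first span. The m modifier (an assumption) makes $ match at line end.
preg_match('/<span class=rkr>\n +\n(.+<br>)$/m', $html, $tight);

var_dump($loose[1]);  // starts at the first span's content
var_dump($tight[1]);  // only the review line
```

If the real source uses \r\n line endings or a different amount of whitespace, the literal \n +\n part would need adjusting (for example to \s+).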