Python 2.x: A Simple Crawler with Regex Matching

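# coding=utf-8
import re
import urllib

# Fetch the category page (Python 2: urllib.urlopen returns a file-like object).
url = "http://www.shashou47.com/category/server/linux-centos/"
html = urllib.urlopen(url).read()

# Capture each post URL: the href value up to the "#respond" anchor.
res_tr1 = r'<a href="(.*?)#respond"'
m_th1 = re.findall(res_tr1, html)
for mm in m_th1:
    print mm

# Capture each post title: the link text between rel="bookmark"> and </a></h2>.
res_tr2 = r' rel="bookmark">(.*?)</a></h2>'
m_th2 = re.findall(res_tr2, html)
for mm in m_th2:
    # The page is served as UTF-8, so decode the byte string before printing.
    print unicode(mm, 'utf-8')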
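
To try the two patterns without hitting the network, here is a minimal sketch in Python 3 (an assumption; the script above targets Python 2.x). The SAMPLE_HTML string and the helper names fetch_html and extract_links_and_titles are made up for illustration, and the real page's markup may differ.

```python
import re
import urllib.request

# Hypothetical sample markup, invented for illustration only.
SAMPLE_HTML = (
    '<h2><a href="http://example.com/post-1/" rel="bookmark">First post</a></h2>\n'
    '<a href="http://example.com/post-1/#respond">Leave a comment</a>\n'
    '<h2><a href="http://example.com/post-2/" rel="bookmark">Second post</a></h2>\n'
    '<a href="http://example.com/post-2/#respond">Leave a comment</a>\n'
)

# The same two patterns as in the post, compiled once.
LINK_RE = re.compile(r'<a href="(.*?)#respond"')
TITLE_RE = re.compile(r' rel="bookmark">(.*?)</a></h2>')


def fetch_html(url):
    """Download a page and decode it as UTF-8 (hypothetical helper, Python 3)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def extract_links_and_titles(html):
    """Return (links, titles) found by the two patterns (hypothetical helper)."""
    return LINK_RE.findall(html), TITLE_RE.findall(html)


if __name__ == "__main__":
    links, titles = extract_links_and_titles(SAMPLE_HTML)
    for link in links:
        print(link)
    for title in titles:
        print(title)
```

Note that `.` does not match a newline by default, so each match stays on one line. If a post link and its "#respond" link sat on the same line, the lazy `(.*?)` would start at the first `<a href="` and capture everything up to `#respond`; a stricter pattern such as `[^"]*?` would be one way to guard against that.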

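Copyright notice: original article from this site, published by shashou47 on 2017-02-23, 384 characters in total.
When reposting, please credit: Python 2.x: A Simple Crawler with Regex Matching | 杀手47's Blog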


Comments: 2 (visitors: 1, blog owner: 1)

wei
2017-02-24 10:04 (first comment)

So you're learning Python too?
- shashou47 (Admin)
  2017-03-05 17:23 (reply 1)
  
  @wei I'm planning to tinker with a Raspberry Pi.
