17.splash_case02
Published: 2020-12-16 23:19:18 | Category: Encyclopedia | Source: compiled from the web
# Scrape Douban comments for the film 《我不是药神》 (Dying to Survive)
import csv
import time

import requests
from lxml import etree

# newline='' stops csv from writing blank rows on Windows; utf-8 keeps the Chinese text intact
with open('douban_comments.csv', 'w', newline='', encoding='utf-8') as fw:
    writer = csv.writer(fw)
    writer.writerow(['comment_time', 'comment_content'])
    for i in range(0, 20):
        # Alternative: fetch the page through a local Splash instance
        # url = 'http://localhost:8050/render.html?url=https://movie.douban.com/subject/26752088/comments?start={}&limit=20&sort=new_score&status=P&timeout=30&wait=0.5'.format(i*20)
        url = 'https://movie.douban.com/subject/26752088/comments?start={}&limit=20&sort=new_score&status=P'.format(i * 20)
        response = requests.get(url)
        tree = etree.HTML(response.text)
        comments = tree.xpath('//div[@class="comment"]')
        for item in comments:
            comment_time = item.xpath('./h3/span[2]/span[contains(@class,"comment-time")]/@title')[0]
            # Convert 'YYYY-MM-DD HH:MM:SS' (interpreted as local time) to a Unix timestamp
            comment_time = int(time.mktime(time.strptime(comment_time, '%Y-%m-%d %H:%M:%S')))
            comment_content = item.xpath('./p/span/text()')[0].strip()
            print(comment_time)
            print(comment_content)
            writer.writerow([comment_time, comment_content])
(Editor: 李大同)
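The commented-out URL in the script sends the request through Splash's `render.html` endpoint, but it pastes the Douban URL in unencoded, so the target's own `&limit=...&sort=...&status=...` would likely be parsed as Splash arguments rather than as part of the page URL. A minimal sketch of one way to avoid that, assuming Splash is listening on `localhost:8050` as in that line; the helper name `build_splash_url` and the two constants are hypothetical and not part of the original script:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: Splash runs locally on port 8050, as in the commented-out URL.
SPLASH_RENDER = 'http://localhost:8050/render.html'
DOUBAN_COMMENTS = 'https://movie.douban.com/subject/26752088/comments'


def build_splash_url(page, timeout=30, wait=0.5):
    """Hypothetical helper: wrap one comments page in a render.html request."""
    target = DOUBAN_COMMENTS + '?' + urlencode({
        'start': page * 20,
        'limit': 20,
        'sort': 'new_score',
        'status': 'P',
    })
    # urlencode percent-encodes `target`, so its '&' and '=' stay inside
    # the single `url` argument instead of leaking into Splash's own query.
    return SPLASH_RENDER + '?' + urlencode(
        {'url': target, 'timeout': timeout, 'wait': wait}
    )


if __name__ == '__main__':
    url = build_splash_url(0)
    print(url)
    # Round-trip check: list the argument names the server side would see.
    print(sorted(parse_qs(urlsplit(url).query)))
```

The returned string can be passed to `requests.get` in place of the direct Douban URL; the rest of the parsing loop stays the same.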