Python: scraping 顶点小说 (Dingdian Novels) with requests

Published: 2020-12-20 10:22:55 | Category: Python | Source: compiled from the web
Summary: this script uses requests, re, and lxml to fetch the chapter index of a novel on www.23us.so, then downloads each chapter and appends it to a text file.
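import re

import requests
from lxml import etree

# Base URL of the site's article tree, and the index page of the book to download.
start_url = 'https://www.23us.so/files/article/html'
url = start_url + '/10/10839/index.html'

# Fetch the index page and collect the numeric chapter IDs from its links.
response = requests.get(url).text
numbers_list = re.findall(r'\w\shref="' + start_url + r'/10/10839/(\d+)\.html', response, re.S)

# j = re.findall('<a href="' + ur + '/9/9579/9633139.html">(.*?)</a>', k, re.S)

x1 = url                        # index page URL
y1 = '//*[@class="L"]//text()'  # XPath for the chapter titles on the index page
novel_name = '剑来.txt'          # output file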


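def pares(x, y):
    """Fetch page x and return the nodes matched by XPath y."""
    m = requests.get(x)
    m.encoding = m.apparent_encoding  # let requests guess the page encoding
    um = etree.HTML(m.text)
    poo = um.xpath(y)
    return poo


def writecontext():
    """Append the current chapter's text nodes (global do) to the output file."""
    with open(novel_name, 'a', encoding='utf-8') as f:
        for i in do:
            f.write(str(i))
            print(i)


def writetitle():
    """Append the current chapter title (global o), then the chapter body."""
    with open(novel_name, 'a', encoding='utf-8') as f:
        f.write("\n\n" + o + "\n")
        print(o)
    writecontext()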
    
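doo = pares(x1, y1)  # chapter titles from the index page
e = 0
# Cap the number of chapters; also stop at the end of either list to avoid an IndexError.
while e < min(10000, len(numbers_list), len(doo)):
    x2 = start_url + "/10/10839/{}.html".format(numbers_list[e])
    y2 = '//*[@id="contents"]/text()'  # XPath for the chapter body
    do = pares(x2, y2)
    o = doo[e]
    e = e + 1
    writetitle()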
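
The chapter-ID extraction step in the script above can be tried offline, without requesting the site. The following is a minimal sketch, assuming the index page links each chapter as `<a href="…/10/10839/<id>.html">`. The sample HTML and the helper names `extract_chapter_ids` and `chapter_url` are hypothetical and not part of the original script; the real page markup may differ.

```python
import re

START_URL = "https://www.23us.so/files/article/html"
BOOK_PATH = "/10/10839"

# Hypothetical index-page fragment used only for this demonstration.
SAMPLE_INDEX = (
    '<td class="L"><a href="' + START_URL + BOOK_PATH + '/5001.html">Chapter 1</a></td>\n'
    '<td class="L"><a href="' + START_URL + BOOK_PATH + '/5002.html">Chapter 2</a></td>\n'
    '<a href="' + START_URL + '/9/9579/9633139.html">Another book</a>\n'
)


def extract_chapter_ids(html, start_url=START_URL, book_path=BOOK_PATH):
    """Return chapter IDs (as strings) for links under start_url + book_path."""
    # re.escape keeps the dots in the URL from matching arbitrary characters.
    pattern = r'href="' + re.escape(start_url + book_path) + r'/(\d+)\.html"'
    return re.findall(pattern, html)


def chapter_url(chapter_id, start_url=START_URL, book_path=BOOK_PATH):
    """Build the chapter page URL from an extracted ID."""
    return "{}{}/{}.html".format(start_url, book_path, chapter_id)


ids = extract_chapter_ids(SAMPLE_INDEX)
for chapter_id in ids:
    print(chapter_url(chapter_id))
```

Escaping the URL and the `.html` suffix makes the pattern stricter than a plain string concatenation, so links to other books on the same page are not picked up by accident.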
    
    

    
    

(Editor: 李大同)
