
A small script for downloading comics

Published: 2020-12-17 17:16:48 | Category: Python | Source: compiled from the web

Summary: 52php.cn shares below a small Python script, collected from the internet, that downloads comic images. It is provided for reference only.

#!/usr/bin/env python
# -*- coding:utf-8 -*-
"""
Copyright (c) 2015, The Sun Technology
This program downloads files from the internet
"""
import urllib2
import os
import time
from urllib2 import HTTPError
from bs4 import BeautifulSoup
from urlparse import urlparse

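# Local save directory template; despite the name, this is a filesystem path, not a URL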
BASE_URL="/Users/mac/Documents%s"

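# Split the path portion of a URL into (directory, file name)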
def get_file_name(req_url):
    path_obj=urlparse(req_url)
    return os.path.split(path_obj.path)

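# Create the local directory that mirrors the URL path, if it does not exist yet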
def get_save_path(save_dir):
    dirs=get_file_name(save_dir)
    save_path=BASE_URL%dirs[0]
    if not os.path.exists(save_path):
        os.mkdir(save_path)

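# Fetch a single file from file_url and write it to file_path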
def save_files(file_url,file_path):
    start=time.time()
    response=urllib2.urlopen(file_url)
    html=response.read()
    response.close()
    with open(file_path,"wb") as handler:
        handler.write(html)
    print "%s has been downloaded successfully "%file_url
    print "Total cost:%.3f ms"%(time.time()-start)

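# Walk ten consecutive comic pages and save the main comic image from each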
def download(url_path):
    start = 82  # comic id of the first page to fetch
    for pageNum in range(start,start+10):
        response=None
        try:
            combine_url=url_path%pageNum
            response=urllib2.urlopen(combine_url)
            page=response.read() if response.getcode()==200 else None
            """ Start parsing the HTML from web page"""
            if not page:
                return
            soup = BeautifulSoup(page,"html.parser")
            img_url=soup.find_all('img',id="main-comic")
            if not img_url:
                continue
            #parse the url to recover the scheme
            url_parse=urlparse(url_path)
            #the src attribute is protocol-relative,so prepend the scheme
            rebuild_url= url_parse.scheme+':'+img_url[0].get('src')
            #download comic from url
            get_name=get_file_name(rebuild_url)

            save_files(rebuild_url,BASE_URL%'/'.join(get_name))

        except HTTPError,e:
            print "An error has accour",e
            continue
        finally:
            if response is not None:
                response.close()

if __name__ == '__main__':
    req_url="http://explosm.net/comics/%s"
    get_save_path(req_url)
    download(req_url)
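
The script above targets Python 2 (urllib2, urlparse and print statements). As a rough sketch only, the same core step could be written for Python 3 with urllib.request; the page URL, the "main-comic" img id and the save directory below are assumptions carried over from the script above, not verified against the current site.

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
# A rough Python 3 adaptation of the same idea; the page URL, the "main-comic"
# img id and SAVE_DIR are assumptions carried over from the script above.
import os
from urllib.request import urlopen
from urllib.parse import urlparse
from bs4 import BeautifulSoup

SAVE_DIR = "/Users/mac/Documents/comics"   # hypothetical local save directory

def download_one(page_url):
    # Fetch the comic page and locate the main comic image.
    with urlopen(page_url) as response:
        soup = BeautifulSoup(response.read(), "html.parser")
    img = soup.find("img", id="main-comic")
    if img is None:
        return
    # The src attribute is protocol-relative ("//..."), so prepend the scheme.
    img_url = urlparse(page_url).scheme + ":" + img.get("src")
    file_name = os.path.basename(urlparse(img_url).path)
    os.makedirs(SAVE_DIR, exist_ok=True)
    with urlopen(img_url) as response, open(os.path.join(SAVE_DIR, file_name), "wb") as handler:
        handler.write(response.read())

if __name__ == "__main__":
    download_one("http://explosm.net/comics/82")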
