How to compress a very large file in Python
Published: 2020-12-20 13:45:38 Category: Python Source: compiled from the web
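I want to use Python to compress some files that may reach around 99 GB. What is the most efficient way to do this with the zipfile library? This is my sample code: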

    with gcs.open(zip_file_name, 'w', content_type=b'application/zip') as f:
        with zipfile.ZipFile(f, 'w') as z:
            for file in files:
                is_owner = (is_page_allowed_to_visitor(page, visitor)
                            or (file.owner_id == visitor.id))
                if is_owner:
                    file.show = True
                elif file.available_from:
                    if file.available_from > datetime.now():
                        file.show = False
                elif file.available_to:
                    if file.available_to < datetime.now():
                        file.show = False
                else:
                    file.show = True
                if file.show:
                    file_name = "/%s/%s" % (gcs_store.get_bucket_name(), file.gcs_name)
                    gcs_reader = gcs.open(file_name, 'r')
                    # Reads the whole file into memory before writing it to the archive
                    z.writestr('%s-%s' % (file.created_on, file.name),
                               gcs_reader.read())
                    gcs_reader.close()
    f.close()  # closing zip file

A few points to note:
1) I am hosting the files on Google App Engine, so I cannot use the zipfile.write() method. I can only get the file contents as bytes.

Thanks in advance.

Solution
I added a new method to the zipfile library. The enhanced zipfile library is open source and can be found on GitHub (EnhancedZipFile). The new method is inspired by the zipfile.write() and zipfile.writestr() methods:

    # Intended as a method of the ZipFile class inside zipfile.py, so ZipInfo,
    # ZIP64_LIMIT, ZIP_DEFLATED, zlib, crc32 and time are that module's own names.
    def writebuffered(self, zinfo_or_arcname, file_pointer, file_size,
                      compress_type=None):
        if not isinstance(zinfo_or_arcname, ZipInfo):
            zinfo = ZipInfo(filename=zinfo_or_arcname,
                            date_time=time.localtime(time.time())[:6])
            # Honor an explicit compress_type, else use the archive default
            if compress_type is None:
                zinfo.compress_type = self.compression
            else:
                zinfo.compress_type = compress_type
            if zinfo.filename[-1] == '/':
                zinfo.external_attr = 0o40775 << 16  # drwxrwxr-x
                zinfo.external_attr |= 0x10          # MS-DOS directory flag
            else:
                zinfo.external_attr = 0o600 << 16    # ?rw-------
        else:
            zinfo = zinfo_or_arcname

        zinfo.file_size = file_size           # Uncompressed size
        zinfo.header_offset = self.fp.tell()  # Start of header bytes
        self._writecheck(zinfo)
        self._didModify = True

        fp = file_pointer
        # Must overwrite CRC and sizes with correct data later
        zinfo.CRC = CRC = 0
        zinfo.compress_size = compress_size = 0
        # Compressed size can be larger than uncompressed size
        zip64 = self._allowZip64 and zinfo.file_size * 1.05 > ZIP64_LIMIT
        self.fp.write(zinfo.FileHeader(zip64))
        if zinfo.compress_type == ZIP_DEFLATED:
            cmpr = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
        else:
            cmpr = None

        # Read, checksum and compress the source in 8 KiB chunks
        file_size = 0
        while 1:
            buf = fp.read(1024 * 8)
            if not buf:
                break
            file_size = file_size + len(buf)
            CRC = crc32(buf, CRC) & 0xffffffff
            if cmpr:
                buf = cmpr.compress(buf)
                compress_size = compress_size + len(buf)
            self.fp.write(buf)
        if cmpr:
            buf = cmpr.flush()
            compress_size = compress_size + len(buf)
            self.fp.write(buf)
            zinfo.compress_size = compress_size
        else:
            zinfo.compress_size = file_size
        zinfo.CRC = CRC
        zinfo.file_size = file_size
        if not zip64 and self._allowZip64:
            if file_size > ZIP64_LIMIT:
                raise RuntimeError('File size has increased during compressing')
            if compress_size > ZIP64_LIMIT:
                raise RuntimeError('Compressed size larger than uncompressed size')
        # zipfile.write() seeks backwards here to rewrite the file header with
        # the correct CRC and sizes. This method does not seek, so those values
        # are only recorded on zinfo, which is used for the central directory.
        position = self.fp.tell()  # Preserve current position in file
        self.fp.flush()
        self.filelist.append(zinfo)
        self.NameToInfo[zinfo.filename] = zinfo

Notes
I am new to Python, so the code I wrote above may not be well optimized.
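For comparison, Python 3.6 and later can do the same chunked copy with the standard library alone, because `ZipFile.open(name, mode='w')` returns a writable file object for a single archive member. The sketch below is not the answer's method. It is a minimal illustration with assumptions: `add_stream` is a hypothetical helper name, and the two `io.BytesIO` objects stand in for the GCS reader and the GCS output file, which are assumed to behave like ordinary binary streams.

```python
import io
import shutil
import zipfile

CHUNK_SIZE = 8 * 1024  # same 8 KiB read size as the loop in writebuffered()


def add_stream(archive, arcname, source, chunk_size=CHUNK_SIZE):
    """Copy a readable binary stream into the archive one chunk at a time.

    Hypothetical helper: only one chunk is held in memory at once.
    """
    # force_zip64=True reserves ZIP64 fields, since the final size may
    # exceed 4 GiB and is not known before the copy starts.
    with archive.open(arcname, mode="w", force_zip64=True) as dest:
        shutil.copyfileobj(source, dest, chunk_size)


# Stand-ins (assumptions): any object with read(n) works as the source,
# and any writable binary file object works as the target.
source = io.BytesIO(b"example payload " * 100_000)
target = io.BytesIO()

with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    add_stream(archive, "payload.bin", source)

# Reopen the finished archive and check that the member round-trips.
with zipfile.ZipFile(io.BytesIO(target.getvalue())) as check:
    info = check.getinfo("payload.bin")
    round_trip_ok = check.read("payload.bin") == source.getvalue()
```

In the question's loop, this would replace the `z.writestr(..., gcs_reader.read())` call, so a 99 GB member is never loaded into memory in one piece.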
(Editor: 李大同)