
delphi – How do I eat bytes off the end of a stream?

Published: 2020-12-15 04:14:18 | Category: Big Data | Source: collected from the web
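I am fixing up a ZIP library class. Internally, nearly all ZIP implementations use DEFLATE compression (RFC1951).

The problem is that, in Delphi, I don't have access to any DEFLATE compression library. What we do have plenty of is ZLIB compression code (RFC1950). It even ships with Delphi, and there are half a dozen other implementations floating around.

Internally, ZLIB also uses DEFLATE for compression. So I want to do what everyone else has done: use the Delphi zlib library for its DEFLATE compression functionality.

The problem is that ZLIB adds a 2-byte prefix and a 4-byte trailer to the DEFLATEd data: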

[CMF]                1 byte
[FLG]                1 byte
[...deflate compressed data...]
[Adler-32 checksum]  4 bytes
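
To make the framing concrete, here is a minimal sketch that strips it from a buffer already held in memory. StripZlibFraming and the two constants are hypothetical names introduced only for this illustration. The sketch assumes a reasonably recent Delphi or Free Pascal (for TBytes) and a stream written without a preset dictionary; RFC1950 inserts a further 4-byte DICTID after FLG when the FDICT bit is set, which this sketch does not handle.

```pascal
program ZlibFraming;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils;

const
  ZLIB_HEADER_SIZE  = 2; // CMF + FLG
  ZLIB_TRAILER_SIZE = 4; // Adler-32 checksum

// Hypothetical helper: returns the bytes between the zlib header and trailer.
// Assumes FDICT is not set, so the header is exactly two bytes long.
function StripZlibFraming(const ZlibData: TBytes): TBytes;
var
  payloadLen: Integer;
begin
  payloadLen := Length(ZlibData) - ZLIB_HEADER_SIZE - ZLIB_TRAILER_SIZE;
  if payloadLen < 0 then
    raise Exception.Create('Buffer is too short to be a zlib stream');
  // Copy on a dynamic array is zero-based, so index 2 is the first payload byte
  Result := Copy(ZlibData, ZLIB_HEADER_SIZE, payloadLen);
end;

var
  framed, raw: TBytes;
begin
  SetLength(framed, 9);                 // SetLength zero-fills the new array
  framed[0] := $78;                     // CMF
  framed[1] := $9C;                     // FLG
  framed[2] := 1;                       // three stand-in payload bytes
  framed[3] := 2;
  framed[4] := 3;
  // framed[5..8] stay zero and stand in for the Adler-32 trailer
  raw := StripZlibFraming(framed);
  Writeln(Length(raw));                 // 9 - 2 - 4 = 3
end.
```

This only works when the whole compressed result fits in memory, which is exactly the limitation the rest of the question is trying to avoid.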

So what I need is a way to compress data using the standard TCompressionStream (or TZCompressionStream, or TZCompressionStreamEx, depending on which source code you are using):

procedure CompressDataToTargetStream(sourceStream: TStream; targetStream: TStream);
var
   compressor: TCompressionStream;
begin
   compressor := TCompressionStream.Create(clDefault,targetStream); //clDefault = CompressionLevel
   try
      compressor.CopyFrom(sourceStream,sourceStream.Size) //TStream exposes Size, not Length
   finally
      compressor.Free; 
   end;
end;

This works, except that it writes out the leading 2 bytes and the trailing 4 bytes; I need to strip those.

So I wrote a TByteEaterStream:

TByteEaterStream = class(TStream)
public
   constructor Create(TargetStream: TStream; 
         LeadingBytesToEat,TrailingBytesToEat: Integer);
end;

For example:

procedure CompressDataToTargetStream(sourceStream: TStream; targetStream: TStream);
var
   byteEaterStream: TByteEaterStream;
   compressor: TCompressionStream;
begin
   byteEaterStream := TByteEaterStream.Create(targetStream,2,4); //2 leading bytes,4 trailing bytes
   try
      compressor := TCompressionStream.Create(clDefault,byteEaterStream); //clDefault = CompressionLevel
      try
         compressor.CopyFrom(sourceStream,sourceStream.Size) //TStream exposes Size, not Length
      finally
         compressor.Free; 
      end;
   finally
      byteEaterStream.Free;
   end;
end;

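This stream overrides the Write method. Eating the first 2 bytes is trivial. The trick is eating the trailing 4 bytes.

The eater stream has a 4-byte array, and I always keep the last four bytes of every write in that buffer. When the EaterStream is destroyed, the trailing four bytes are destroyed with it.

The problem is that shuffling a few million writes through this buffer kills performance. The typical upstream usage is: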

for each of a million data rows
    stream.Write(s,Length(s)); //30-90 character string

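I absolutely do not want upstream users to have to signal that "the end is near". I just want it to be faster.

The question

Watching a stream of bytes go by, what is the best way to hold back the last four bytes, given that you never know which write will be the last one?

The code I am fixing wrote the entire compressed version into a TStringStream, and then grabbed all but 6 bytes of the 900 MB to get at the inner DEFLATE data: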

cs := TStringStream.Create('');
....write compressed data to cs
S := Copy(CS.DataString,3,Length(CS.DataString) - 6);

Except that this runs the user out of memory. Initially I changed it to write to a TFileStream, and then I could perform the same trick.
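
A rough sketch of that interim workaround, assuming the compressed data has already been staged in a seekable stream. CopyTrimmed and TrimmedSize are hypothetical names, and a TMemoryStream stands in for the TFileStream so the example runs without touching the disk.

```pascal
program TrimCopyDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;

// Hypothetical helper: copies Source into Target, skipping LeadingBytes at
// the start and TrailingBytes at the end. Source must be seekable.
procedure CopyTrimmed(Source, Target: TStream;
  LeadingBytes, TrailingBytes: Integer);
var
  payloadSize: Int64;
begin
  payloadSize := Source.Size - LeadingBytes - TrailingBytes;
  if payloadSize <= 0 then
    Exit; // nothing left once the framing is removed
  Source.Position := LeadingBytes;
  Target.CopyFrom(Source, payloadSize);
end;

// Demo wrapper: stages TotalBytes filler bytes, trims 2 + 4 bytes,
// and reports how many bytes reached the target.
function TrimmedSize(TotalBytes: Integer): Int64;
var
  staged, target: TMemoryStream;
  filler: Byte;
  i: Integer;
begin
  staged := TMemoryStream.Create;
  target := TMemoryStream.Create;
  try
    filler := 0;
    for i := 1 to TotalBytes do
      staged.WriteBuffer(filler, 1);
    CopyTrimmed(staged, target, 2, 4);
    Result := target.Size;
  finally
    target.Free;
    staged.Free;
  end;
end;

begin
  Writeln(TrimmedSize(100)); // 100 - 2 - 4 = 94
end.
```

This still needs intermediate storage for the whole compressed result; it only moves that storage from memory to wherever the staging stream lives.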

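But I want a better solution; a streaming solution. I want the data to land in the final stream already compressed, with no intermediate storage.

My implementation

Not that it helps, since I am not even asking that the solution use an adapter stream to do the trimming: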

TByteEaterStream = class(TStream)
private
    FTargetStream: TStream;
    FTargetStreamOwnership: TStreamOwnership;
    FLeadingBytesToEat: Integer;
    FTrailingBytesToEat: Integer;
    FLeadingBytesRemaining: Integer;

    FBuffer: array of Byte;
    FValidBufferLength: Integer;
public
    constructor Create(TargetStream: TStream; LeadingBytesToEat,TrailingBytesToEat: Integer; StreamOwnership: TStreamOwnership=soReference);
    destructor Destroy; override;

    class procedure SelfTest;

    procedure Flush;

    function Read(var Buffer; Count: Longint): Longint; override;
    function Write(const Buffer; Count: Longint): Longint; override;
    function Seek(Offset: Longint; Origin: Word): Longint; override;
end;

{ TByteEaterStream }

constructor TByteEaterStream.Create(TargetStream: TStream; LeadingBytesToEat,TrailingBytesToEat: Integer; StreamOwnership: TStreamOwnership=soReference);
begin
    inherited Create;

    //User requested state
    FTargetStream := TargetStream;
    FTargetStreamOwnership := StreamOwnership;
    FLeadingBytesToEat := LeadingBytesToEat;
    FTrailingBytesToEat := TrailingBytesToEat;

    //internal housekeeping
    FLeadingBytesRemaining := FLeadingBytesToEat;

    SetLength(FBuffer,FTrailingBytesToEat);
    FValidBufferLength := 0;
end;

destructor TByteEaterStream.Destroy;
begin
    if FTargetStreamOwnership = soOwned then
        FTargetStream.Free;
    FTargetStream := nil;

    inherited;
end;

procedure TByteEaterStream.Flush;
begin
    if FValidBufferLength > 0 then
    begin
        FTargetStream.Write(FBuffer[0],FValidBufferLength);
        FValidBufferLength  := 0;
    end;
end;

function TByteEaterStream.Write(const Buffer; Count: Integer): Longint;
var
    newStart: Pointer;
    totalCount: Integer;
    addIndex: Integer;
    bufferValidLength: Integer;
    bytesToWrite: Integer;
begin
    Result := Count;

    if Count = 0 then
        Exit;

    if FLeadingBytesRemaining > 0 then
    begin
        newStart := Addr(Buffer);
        Inc(PByte(newStart)); //typed-pointer increment; a Cardinal cast truncates 64-bit pointers
        Dec(Count);
        Dec(FLeadingBytesRemaining);
        Result := Self.Write(newStart^,Count)+1; //tell the upstream guy that we wrote it

        Exit;
    end;

    if FTrailingBytesToEat > 0 then
    begin
        if (Count < FTrailingBytesToEat) then
        begin
            //There's less bytes incoming than an entire buffer
            //But the buffer might overfloweth
            totalCount := FValidBufferLength+Count;

            //If it could all fit in the buffer,then let it
            if (totalCount <= FTrailingBytesToEat) then
            begin
                Move(Buffer,FBuffer[FValidBufferLength],Count);
                FValidBufferLength := totalCount;
            end
            else
            begin
                //We're going to overflow the buffer.

                //Purge from the buffer the amount that would get pushed
                FTargetStream.Write(FBuffer[0],totalCount-FTrailingBytesToEat);

                //Shuffle the buffer down (overlapped move)
                bufferValidLength := FValidBufferLength - (totalCount-FTrailingBytesToEat);
                Move(FBuffer[totalCount-FTrailingBytesToEat],FBuffer[0],bufferValidLength);

                addIndex := bufferValidLength; //where we will add the data to
                Move(Buffer,FBuffer[addIndex],Count);
                FValidBufferLength := bufferValidLength+Count; //the buffer is full again
            end;
        end
        else if (Count = FTrailingBytesToEat) then
        begin
            //The incoming bytes exactly fill the buffer. Flush what we have and eat the incoming amounts
            Flush;
            Move(Buffer,FBuffer[0],FTrailingBytesToEat);
            FValidBufferLength := FTrailingBytesToEat;
            Result := FTrailingBytesToEat; //we "wrote" n bytes
        end
        else
        begin
            //Count is greater than trailing buffer eat size
            Flush;

            //Write the data that definitely not to be eaten
            bytesToWrite := Count-FTrailingBytesToEat;
            FTargetStream.Write(Buffer,bytesToWrite);

            //Buffer the remainder
            newStart := Addr(Buffer);
            Inc(PByte(newStart),bytesToWrite);

            Move(newStart^,FBuffer[0],FTrailingBytesToEat);
            FValidBufferLength := FTrailingBytesToEat;
        end;
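    end
    else
        FTargetStream.Write(Buffer,Count); //no trailing bytes to hold back; pass the data straight through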
end;

function TByteEaterStream.Seek(Offset: Integer; Origin: Word): Longint;
begin
    //what does it mean if they want to seek around when i'm supposed to be eating data?
    //i don't know; so results are,by definition,undefined. Don't use at your own risk
    Result := FTargetStream.Seek(Offset,Origin);
end;

function TByteEaterStream.Read(var Buffer; Count: Integer): Longint;
begin
    //what does it mean if they want to read back bytes when i'm supposed to be eating data?
    //i don't know; so results are,undefined. Don't use at your own risk
    Result := FTargetStream.Read({var}Buffer,Count);
end;

class procedure TByteEaterStream.SelfTest;

    procedure CheckEquals(Expected,Actual: string; Message: string);
    begin
        if Actual <> Expected then
            raise Exception.CreateFmt('TByteEaterStream self-test failed. Expected "%s",but was "%s". Message: %s',[Expected,Actual,Message]);
    end;

    procedure Test(const InputString: string; ExpectedString: string);
    var
        s: TStringStream;
        eater: TByteEaterStream;
    begin
        s := TStringStream.Create('');
        try
            eater := TByteEaterStream.Create(s,2,4,soReference);
            try
                eater.Write(InputString[1],Length(InputString));
            finally
                eater.Free;
            end;
            CheckEquals(ExpectedString,s.DataString,InputString);
        finally
            s.Free;
        end;
    end;
begin
    Test('1','');
    Test('11','');
    Test('113','');
    Test('1133','');
    Test('11333','');
    Test('113333','');
    Test('11H3333','H');
    Test('11He3333','He');
    Test('11Hel3333','Hel');
    Test('11Hell3333','Hell');
    Test('11Hello3333','Hello');
    Test('11Hello,3333','Hello,');
    Test('11Hello,W3333','Hello,W');
    Test('11Hello,Wo3333','Hello,Wo');
    Test('11Hello,Wor3333','Hello,Wor');
    Test('11Hello,Worl3333','Hello,Worl');
    Test('11Hello,World3333','Hello,World');
    Test('11Hello,World!3333','Hello,World!');
end;

Solution

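You need to hold off writing bytes until you know for certain that they are not among the trailing bytes that must be eaten. That observation suggests buffering as the solution.

So, I propose this:

>Use a stream adapter that employs buffering.
>Eating the leading bytes is easy: just send the first two bytes to oblivion.
>After that, append incoming bytes to the buffer, and when it needs flushing, flush everything in the buffer except the last four bytes.
>When flushing, copy the four unflushed bytes to the start of the buffer so that you don't lose them.
>When the stream is closed, flush it as you would any buffered stream, using the same flushing technique as before so that the last four bytes are held back. At that point you know they are the last four bytes of the stream.

The one requirement of this approach is that the buffer must be larger than the number of trailing bytes to strip.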
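
The steps above could be sketched roughly as follows. TTrimmingStream, FlushKeepingTail and TrimViaStream are hypothetical names invented for this illustration, not part of the question's code or of any library. The sketch is write-only (Read and Seek are deliberately not overridden) and assumes a Delphi or Free Pascal version in which PByte is available.

```pascal
program TrailingHoldback;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;

type
  // Hypothetical write-only adapter that drops leading and trailing bytes.
  TTrimmingStream = class(TStream)
  private
    FTarget: TStream;
    FSkip: Integer;          // leading bytes still to discard
    FTail: Integer;          // trailing bytes to hold back
    FBuf: array of Byte;
    FUsed: Integer;          // valid bytes currently in FBuf
    procedure FlushKeepingTail;
  public
    constructor Create(Target: TStream; Leading, Trailing, BufSize: Integer);
    destructor Destroy; override;
    function Write(const Buffer; Count: Longint): Longint; override;
  end;

constructor TTrimmingStream.Create(Target: TStream;
  Leading, Trailing, BufSize: Integer);
begin
  inherited Create;
  FTarget := Target;
  FSkip := Leading;
  FTail := Trailing;
  if BufSize <= Trailing then
    BufSize := Trailing + 1; // the buffer must be larger than the tail
  SetLength(FBuf, BufSize);
end;

destructor TTrimmingStream.Destroy;
begin
  FlushKeepingTail; // whatever is still held back is the tail, so it is dropped
  inherited;
end;

// Writes everything except the last FTail bytes, then moves those to the front.
procedure TTrimmingStream.FlushKeepingTail;
var
  n: Integer;
begin
  n := FUsed - FTail;
  if n <= 0 then
    Exit;
  FTarget.WriteBuffer(FBuf[0], n);
  if FTail > 0 then
    Move(FBuf[n], FBuf[0], FTail);
  FUsed := FTail;
end;

function TTrimmingStream.Write(const Buffer; Count: Longint): Longint;
var
  src: PByte;
  n: Integer;
begin
  Result := Count; // report every byte as consumed, eaten or not
  src := @Buffer;
  if FSkip > 0 then
  begin
    n := FSkip;
    if n > Count then n := Count;
    Inc(src, n); Dec(Count, n); Dec(FSkip, n);
  end;
  while Count > 0 do
  begin
    if FUsed = Length(FBuf) then
      FlushKeepingTail;
    n := Length(FBuf) - FUsed; // room left in the buffer
    if n > Count then n := Count;
    Move(src^, FBuf[FUsed], n);
    Inc(FUsed, n); Inc(src, n); Dec(Count, n);
  end;
end;

// Demo wrapper: eats 2 leading and 4 trailing bytes of a non-empty S.
function TrimViaStream(const S: AnsiString; BufSize: Integer): string;
var
  dst: TStringStream;
  eater: TTrimmingStream;
begin
  dst := TStringStream.Create('');
  try
    eater := TTrimmingStream.Create(dst, 2, 4, BufSize);
    try
      eater.WriteBuffer(S[1], Length(S));
    finally
      eater.Free; // the final flush happens here
    end;
    Result := dst.DataString;
  finally
    dst.Free;
  end;
end;

begin
  Writeln(TrimViaStream('11Hello,World!3333', 8)); // prints Hello,World!
end.
```

With a buffer of, say, 64 KB, a 30-90 byte write costs a single Move into the buffer, and the shuffle of the four held-back bytes happens once per buffer fill rather than once per write. The 8-byte buffer in the demo is only there to force several intermediate flushes.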
