
scala – Akka Streams: reading multiple files

Published: 2020-12-16 18:41:58  Category: Security  Source: compiled from the web
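I have a list of files. I want to:

> Read all of them as a single Source.
> Read the files sequentially, one after another (no round-robin).
> Never require any file to be entirely in memory.
> Have an error reading a file collapse the stream.

It feels like this should work (Scala, akka-streams v2.4.7):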

val sources = Seq("file1","file2").map(new File(_)).map(f => FileIO.fromPath(f.toPath)
    .via(Framing.delimiter(ByteString(System.lineSeparator),10000,allowTruncation = true))
    .map(bs => bs.utf8String)
  )
val source = sources.reduce( (a,b) => Source.combine(a,b)(MergePreferred(_)) )
source.map(_ => 1).runWith(Sink.reduce[Int](_ + _)) // counting lines

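But this results in a compile error, because FileIO has a materialized value associated with it and Source.combine does not support that.

Mapping the materialized value away makes me wonder how file-read errors get handled, but it does compile: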

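val sources = Seq("file1","file2").map(new File(_)).map(f => FileIO.fromPath(f.toPath)
    .via(Framing.delimiter(ByteString(System.lineSeparator),10000,allowTruncation = true))
    .map(bs => bs.utf8String)
    .mapMaterializedValue(f => NotUsed.getInstance())
  )
val source = sources.reduce( (a,b) => Source.combine(a,b)(MergePreferred(_)) )
source.map(_ => 1).runWith(Sink.reduce[Int](_ + _))  // counting lines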

But it throws an IllegalArgumentException at runtime:

java.lang.IllegalArgumentException: requirement failed: The inlets [] and outlets [MergePreferred.out] must correspond to the inlets [MergePreferred.preferred] and outlets [MergePreferred.out]
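
One plausible reading of this exception is that MergePreferred has an extra `preferred` inlet in addition to its regular inlets, and Source.combine only wires up the regular ones, so `preferred` is left unconnected. A merge would also not guarantee file-by-file ordering. The sketch below is not from the original post: it uses in-memory sources as hypothetical stand-ins for the per-file sources and combines them with Concat, which consumes the first source fully before the second. The names `ConcatDemo` and `combinedInOrder` are made up for illustration.

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Materializer}
import akka.stream.scaladsl.{Concat, Source}

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object ConcatDemo {
  // Hypothetical helper: concatenates two in-memory sources and collects the elements in arrival order.
  def combinedInOrder()(implicit materializer: Materializer): Future[Vector[String]] = {
    // Stand-ins for the per-file line sources (made-up data)
    val first = Source(List("a1", "a2"))
    val second = Source(List("b1", "b2"))

    // Concat drains `first` completely before it pulls from `second`
    val combined = Source.combine(first, second)(Concat[String](_))
    combined.runFold(Vector.empty[String])(_ :+ _)
  }

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("concat-demo")
    // ActorMaterializer matches the 2.4.x API used in the question; newer Akka versions deprecate it
    implicit val materializer: ActorMaterializer = ActorMaterializer()

    try println(Await.result(combinedInOrder(), 10.seconds)) // prints Vector(a1, a2, b1, b2)
    finally system.terminate()
  }
}
```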

Solution

The code below is not as terse as it could be, so that the different concerns are clearly modularized.

// Imports this snippet needs; an implicit Materializer and ExecutionContext must also be in scope
import java.io.File
import java.nio.file.Path
import akka.stream.scaladsl.{FileIO, Flow, Framing, Keep, Sink, Source}
import akka.util.ByteString

// Given a stream of ByteStrings delimited by the system line separator, we can get lines represented as Strings
// (maximum frame length of 10000 bytes, as in the question)
val lines = Framing.delimiter(ByteString(System.lineSeparator), 10000, allowTruncation = true).map(bs => bs.utf8String)

// Given a stream of Paths, we read those files one after another and count the number of lines
val lineCounter = Flow[Path].flatMapConcat(path => FileIO.fromPath(path).via(lines)).fold(0L)((count, line) => count + 1).toMat(Sink.head)(Keep.right)

// Here's our test data source (replace paths with real paths)
val testFiles = Source(List("somePathToFile1", "somePathToFile2").map(new File(_).toPath))

// Runs the line counter over the test files; returns a Future containing the number of lines, which we print to the console when it completes
testFiles runWith lineCounter foreach println
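
For readers who want to try the flatMapConcat approach end to end, here is a minimal self-contained sketch. It is an illustration under stated assumptions, not the answer author's code: the object name `CountLinesDemo`, the helpers `tempFile` and `countLines`, and the temp-file contents are all hypothetical, and it splits on "\n" rather than the system line separator so the result does not depend on the platform.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Materializer}
import akka.stream.scaladsl.{FileIO, Framing, Source}
import akka.util.ByteString

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object CountLinesDemo {
  // Hypothetical helper: writes the given lines to a fresh temp file and returns its path.
  def tempFile(lines: String*): Path = {
    val path = Files.createTempFile("akka-demo", ".txt")
    Files.write(path, lines.mkString("\n").getBytes(StandardCharsets.UTF_8))
  }

  // Reads the files one after another and counts lines across all of them.
  def countLines(files: List[Path])(implicit materializer: Materializer): Future[Long] =
    Source(files)
      .flatMapConcat { path =>
        // Each file is streamed in chunks and framed into lines of at most 10000 bytes
        FileIO.fromPath(path)
          .via(Framing.delimiter(ByteString("\n"), 10000, allowTruncation = true))
      }
      .runFold(0L)((count, _) => count + 1)

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("count-lines-demo")
    // ActorMaterializer matches the 2.4.x API used in the question; newer Akka versions deprecate it
    implicit val materializer: ActorMaterializer = ActorMaterializer()

    val files = List(tempFile("a", "b"), tempFile("c", "d", "e"))
    try println(Await.result(countLines(files), 10.seconds)) // two lines plus three lines
    finally system.terminate()
  }
}
```

Because flatMapConcat only starts the next file's source after the previous one completes, the files are read strictly in order. A failure while reading any file should surface as a failed Future from `countLines`, which is where error handling such as `recover` could be attached.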

(Editor: 李大同)
