Scala: min/max of collections containing NaN (handling incomparability in orderings)
I just ran into a nasty bug as a result of the following behavior:
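    scala> List(1.0, 2.0, 3.0, Double.NaN).min
    res1: Double = NaN

    scala> List(1.0, Double.NaN).max
    res2: Double = NaN

I understand that for a pairwise comparison it may sometimes be preferable to have max(NaN, 0) = NaN, and this is probably why java.lang.Double.compare follows this convention (there seems to be an IEEE standard for that). For a collection, however, I really think this is a strange convention. After all, the collections above do contain valid numbers, and these numbers have a clear maximum and minimum. In my opinion, the idea that the maximum number of a collection is "not a number" is a contradiction: NaN is not a number, so it cannot be the maximum or minimum "number" of a collection, unless there are no valid numbers at all, in which case it makes perfect sense that the maximum "is not a number". Semantically, the min and max functions degenerate to a check of whether the collection contains a NaN. Since there are more appropriate ways to check for the existence of NaNs (e.g. collection.find(_.isNaN)), it would be great to keep a semantically meaningful min/max on collections.

So my question is: what is the best approach to get min/max to ignore the existence of NaNs? I see two possibilities:

1. Filter out NaNs before calling min/max. Since this requires handling the issue explicitly in every place and may incur a performance penalty, I would prefer something easier.
2. Use a NaN-aware Ordering that can be passed wherever it is needed. I tried the following:

    object NanAwareOrdering extends Ordering[Double] {
      def compare(x: Double, y: Double): Int = {
        if (x.isNaN) {
          +1 // without checking x, return y < x
        } else if (y.isNaN) {
          -1 // without checking y, return x < y
        } else {
          java.lang.Double.compare(x, y)
        }
      }
    }

However, this approach seems to depend on whether I am interested in the min or the max, i.e.:

    scala> List(1.0, Double.NaN).min(NanAwareOrdering)
    res7: Double = 1.0

    scala> List(1.0, Double.NaN).max(NanAwareOrdering)
    res8: Double = NaN

This means I would need two NanAwareOrderings, depending on whether I want the minimum or the maximum, which would rule out having a single implicit val. Therefore my question is: how can I define one ordering that handles both cases?

Update:

For the sake of completeness: while analyzing the issue I realized that the premise "degenerates to a check for NaN" is actually wrong. In fact, I think it is even uglier:

    scala> List(1.0, Double.NaN).min
    res1: Double = NaN

    scala> List(Double.NaN, 1.0).min
    res2: Double = 1.0

Solution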
Disclaimer: I'll add my own answer to the question in case anyone else is still interested in more details on the matter.
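Some theory...

It looks like this issue is more complex than I expected. As Alexey Romanov has already pointed out, the notion of incomparability would require the max/min functions to take a partial order. Unfortunately, Alexey is also right that a general max/min function based on a partial order does not make sense: think of the case where the partial ordering only defines relations within certain groups, while the groups themselves are completely independent of each other (for instance, the elements {a, b, c, d} with just the two relations a < b and c < d; we would have two max/min). In that regard one may even argue that, formally, max/min should always return two values, NaN and the respective valid min/max, since NaN is itself an extremal value in its own relation group.

So, because partial orders are too general/complex, the min/max functions take an Ordering. Unfortunately, a total order does not allow the notion of incomparability. Reviewing the three defining properties of a total order makes it pretty obvious that "ignoring NaN" is formally impossible:

1. If a <= b and b <= a, then a = b (antisymmetry).
2. If a <= b and b <= c, then a <= c (transitivity).
3. a <= b or b <= a (totality).

...and practice...

So when trying to come up with an Ordering implementation that gives the desired min/max behavior, it is clear that we have to violate something (and bear the consequences). The implementation of min/max/minBy/maxBy in TraversableOnce follows this pattern (for min):

    reduceLeft((x, y) => if (cmp.lteq(x, y)) x else y)

and uses gteq for the max variants. This gave me the idea of "left biasing" the comparison, i.e.:

    x   <comparison_operator> NaN   is always true, to keep x in the reduction
    NaN <comparison_operator> x     is always false, to inject x into the reduction

The resulting implementation of such a "left-biased" ordering looks like this:

    object BiasedOrdering extends Ordering[Double] {
      // this is inconsistent, but the same goes for Ordering.Double
      def compare(x: Double, y: Double): Int = java.lang.Double.compare(x, y)

      override def lteq(x: Double, y: Double): Boolean =
        if (x.isNaN && !y.isNaN) false
        else if (!x.isNaN && y.isNaN) true
        else if (x.isNaN && y.isNaN) true
        else compare(x, y) <= 0

      override def gteq(x: Double, y: Double): Boolean =
        if (x.isNaN && !y.isNaN) false
        else if (!x.isNaN && y.isNaN) true
        else if (x.isNaN && y.isNaN) true
        else compare(x, y) >= 0

      override def lt(x: Double, y: Double): Boolean =
        if (x.isNaN && !y.isNaN) false
        else if (!x.isNaN && y.isNaN) true
        else if (x.isNaN && y.isNaN) false
        else compare(x, y) < 0

      override def gt(x: Double, y: Double): Boolean =
        if (x.isNaN && !y.isNaN) false
        else if (!x.isNaN && y.isNaN) true
        else if (x.isNaN && y.isNaN) false
        else compare(x, y) > 0

      override def equiv(x: Double, y: Double): Boolean =
        if (x.isNaN && !y.isNaN) false
        else if (!x.isNaN && y.isNaN) true
        else if (x.isNaN && y.isNaN) true
        else compare(x, y) == 0
    }

...and analysis:

Currently I'm trying to find out:

- how this ordering compares to the default ordering.

I'm comparing it with Scala's default ordering Ordering.Double and with the following ordering, which is derived directly from java.lang.Double.compare:

    object OrderingDerivedFromCompare extends Ordering[Double] {
      def compare(x: Double, y: Double): Int = {
        java.lang.Double.compare(x, y)
      }
    }

One interesting property of Scala's default ordering Ordering.Double is that it overrides all comparison member functions with the language's native numerical comparison operators (<, <=, ==, >=, >), so the results are identical to comparing directly with these operators. The following shows all possible relations between a NaN and a valid number for the three orderings:

    Ordering.Double             0.0 >  NaN = false
    Ordering.Double             0.0 >= NaN = false
    Ordering.Double             0.0 == NaN = false
    Ordering.Double             0.0 <= NaN = false
    Ordering.Double             0.0 <  NaN = false
    OrderingDerivedFromCompare  0.0 >  NaN = false
    OrderingDerivedFromCompare  0.0 >= NaN = false
    OrderingDerivedFromCompare  0.0 == NaN = false
    OrderingDerivedFromCompare  0.0 <= NaN = true
    OrderingDerivedFromCompare  0.0 <  NaN = true
    BiasedOrdering              0.0 >  NaN = true
    BiasedOrdering              0.0 >= NaN = true
    BiasedOrdering              0.0 == NaN = true
    BiasedOrdering              0.0 <= NaN = true
    BiasedOrdering              0.0 <  NaN = true

    Ordering.Double             NaN >  0.0 = false
    Ordering.Double             NaN >= 0.0 = false
    Ordering.Double             NaN == 0.0 = false
    Ordering.Double             NaN <= 0.0 = false
    Ordering.Double             NaN <  0.0 = false
    OrderingDerivedFromCompare  NaN >  0.0 = true
    OrderingDerivedFromCompare  NaN >= 0.0 = true
    OrderingDerivedFromCompare  NaN == 0.0 = false
    OrderingDerivedFromCompare  NaN <= 0.0 = false
    OrderingDerivedFromCompare  NaN <  0.0 = false
    BiasedOrdering              NaN >  0.0 = false
    BiasedOrdering              NaN >= 0.0 = false
    BiasedOrdering              NaN == 0.0 = false
    BiasedOrdering              NaN <= 0.0 = false
    BiasedOrdering              NaN <  0.0 = false

    Ordering.Double             NaN >  NaN = false
    Ordering.Double             NaN >= NaN = false
    Ordering.Double             NaN == NaN = false
    Ordering.Double             NaN <= NaN = false
    Ordering.Double             NaN <  NaN = false
    OrderingDerivedFromCompare  NaN >  NaN = false
    OrderingDerivedFromCompare  NaN >= NaN = true
    OrderingDerivedFromCompare  NaN == NaN = true
    OrderingDerivedFromCompare  NaN <= NaN = true
    OrderingDerivedFromCompare  NaN <  NaN = false
    BiasedOrdering              NaN >  NaN = false
    BiasedOrdering              NaN >= NaN = true
    BiasedOrdering              NaN == NaN = true
    BiasedOrdering              NaN <= NaN = true
    BiasedOrdering              NaN <  NaN = false

We can see that:

- Only OrderingDerivedFromCompare fulfills the total order properties. Based on this result, the reasoning behind java.lang.Double.compare becomes clearer: it treats NaN as larger than every other value, which keeps the order total.

Now to the actual problem at hand, the min/max functions. For OrderingDerivedFromCompare it is now clear what we must get: NaN is simply the largest value, so it is returned as the max, irrespective of how the elements in the list are arranged:

    OrderingDerivedFromCompare  List(1.0, 2.0, 3.0, Double.NaN).min = 1.0
    OrderingDerivedFromCompare  List(Double.NaN, 1.0, 2.0, 3.0).min = 1.0
    OrderingDerivedFromCompare  List(1.0, 2.0, 3.0, Double.NaN).max = NaN
    OrderingDerivedFromCompare  List(Double.NaN, 1.0, 2.0, 3.0).max = NaN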
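The total-order claim above can also be checked mechanically rather than read off a table. The following is only a minimal sketch: `TotalOrderCheck`, `derived`, `ieeeLike`, `samples` and the three checker functions are hypothetical names introduced here, and `ieeeLike` is merely a stand-in for an ordering whose `lteq` uses the primitive `<=` operator, as the text describes for Ordering.Double. It tests antisymmetry, transitivity and totality of `lteq` over a small set of sample values that includes NaN.

```scala
object TotalOrderCheck {
  // Ordering that relies only on java.lang.Double.compare (NaN sorts last).
  val derived: Ordering[Double] = new Ordering[Double] {
    def compare(x: Double, y: Double): Int = java.lang.Double.compare(x, y)
  }

  // Hypothetical stand-in: lteq uses the primitive <= operator,
  // which is false whenever either operand is NaN.
  val ieeeLike: Ordering[Double] = new Ordering[Double] {
    def compare(x: Double, y: Double): Int = java.lang.Double.compare(x, y)
    override def lteq(x: Double, y: Double): Boolean = x <= y
  }

  // Sample values, chosen for illustration only.
  val samples: Seq[Double] =
    Seq(Double.NegativeInfinity, -1.0, 0.0, 1.0, Double.PositiveInfinity, Double.NaN)

  // Totality: for every pair, a <= b or b <= a.
  def isTotal(ord: Ordering[Double], xs: Seq[Double]): Boolean =
    xs.forall(a => xs.forall(b => ord.lteq(a, b) || ord.lteq(b, a)))

  // Transitivity: a <= b and b <= c implies a <= c.
  def isTransitive(ord: Ordering[Double], xs: Seq[Double]): Boolean =
    xs.forall(a => xs.forall(b => xs.forall(c =>
      !(ord.lteq(a, b) && ord.lteq(b, c)) || ord.lteq(a, c))))

  // Antisymmetry: a <= b and b <= a implies a and b are equivalent.
  def isAntisymmetric(ord: Ordering[Double], xs: Seq[Double]): Boolean =
    xs.forall(a => xs.forall(b =>
      !(ord.lteq(a, b) && ord.lteq(b, a)) || ord.equiv(a, b)))
}
```

On these samples, `isTotal` holds for `derived` and fails for `ieeeLike`, because `NaN <= NaN` is false under the primitive operator. A check over a handful of values is of course not a proof, only a quick way to spot violations.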
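Now to Scala's default ordering. I was deeply shocked to see that the situation is actually even more intricate than described in my question:

    Ordering.Double             List(1.0, 2.0, 3.0, Double.NaN).min = NaN
    Ordering.Double             List(Double.NaN, 1.0, 2.0, 3.0).min = 1.0
    Ordering.Double             List(1.0, 2.0, 3.0, Double.NaN).max = NaN
    Ordering.Double             List(Double.NaN, 1.0, 2.0, 3.0).max = 3.0

In fact, the order of the elements becomes relevant (because every comparison involving NaN in the reduceLeft returns false). "Left biasing" obviously solves this issue, leading to consistent results:

    BiasedOrdering              List(1.0, 2.0, 3.0, Double.NaN).min = 1.0
    BiasedOrdering              List(Double.NaN, 1.0, 2.0, 3.0).min = 1.0
    BiasedOrdering              List(1.0, 2.0, 3.0, Double.NaN).max = 3.0
    BiasedOrdering              List(Double.NaN, 1.0, 2.0, 3.0).max = 3.0

Unfortunately, I'm still not able to fully answer all questions here. Some remaining points are:

- Why is Scala's default ordering defined the way it is? Its current handling of NaNs seems to be pretty flawed. A very dangerous detail of Ordering.Double is that the compare function actually delegates to java.lang.Double.compare, while the comparison members are implemented with the language's native comparisons. This obviously leads to inconsistent results, for instance:

    Ordering.Double.compare(0.0, Double.NaN) == -1     // indicating 0.0 < NaN
    Ordering.Double.lt     (0.0, Double.NaN) == false  // contradiction

- What are potential drawbacks of the BiasedOrdering, apart from directly evaluating any of the contradicting comparisons? A quick check on sorted gave the following results, which did not reveal any trouble:

    Ordering.Double             List(1.0, 2.0, 3.0, Double.NaN).sorted = List(1.0, 2.0, 3.0, NaN)
    OrderingDerivedFromCompare  List(1.0, 2.0, 3.0, Double.NaN).sorted = List(1.0, 2.0, 3.0, NaN)
    BiasedOrdering              List(1.0, 2.0, 3.0, Double.NaN).sorted = List(1.0, 2.0, 3.0, NaN)

    Ordering.Double             List(Double.NaN, 1.0, 2.0, 3.0).sorted = List(1.0, 2.0, 3.0, NaN)
    OrderingDerivedFromCompare  List(Double.NaN, 1.0, 2.0, 3.0).sorted = List(1.0, 2.0, 3.0, NaN)
    BiasedOrdering              List(Double.NaN, 1.0, 2.0, 3.0).sorted = List(1.0, 2.0, 3.0, NaN)

For the time being I'll go with this left-biased ordering. But since the nature of the problem does not allow a flawless general solution: use with care!

Update

As for solutions based on an implicit class, as monkjack suggested, I like the following a lot (since it does not mess with (flawed?) total orders at all, but internally converts to a clean, totally ordered domain):

    implicit class MinMaxNanAware(t: TraversableOnce[Double]) {
      def nanAwareMin = t.minBy(x => if (x.isNaN) Double.PositiveInfinity else x)
      def nanAwareMax = t.maxBy(x => if (x.isNaN) Double.NegativeInfinity else x)
    }

    // and now we can simply use
    val goodMin = list.nanAwareMin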
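For comparison, the first possibility from the question (filtering NaNs before taking the min/max) can be sketched as below. This is only an illustration under my own assumptions, not part of the original answer: `NanIgnoring`, `minIgnoringNaN` and `maxIgnoringNaN` are hypothetical names, and returning an `Option` is a design choice made here so that the "no valid numbers at all" case (empty input or only NaNs) is explicit instead of yielding NaN or throwing.

```scala
object NanIgnoring {
  // Smallest non-NaN element, or None if the input is empty or all NaN.
  // Filtering happens lazily on the iterator, so no intermediate collection is built.
  def minIgnoringNaN(xs: Iterable[Double]): Option[Double] =
    xs.iterator.filterNot(_.isNaN).reduceOption((a, b) => math.min(a, b))

  // Largest non-NaN element, or None if the input is empty or all NaN.
  def maxIgnoringNaN(xs: Iterable[Double]): Option[Double] =
    xs.iterator.filterNot(_.isNaN).reduceOption((a, b) => math.max(a, b))
}
```

Because NaNs are removed before any comparison takes place, the result does not depend on the position of the NaN in the input, and no Ordering has to bend the total-order rules.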