python - Grouped argmax/argmin over partitioning indices in numpy

Published: 2020-12-20 13:40:47 Category: Python Source: collected from the web

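NumPy's ufuncs have a reduceat method, which runs them over contiguous partitions within an array. So instead of writing:

import numpy as np
a = np.array([4, 0, 6, 8, 0, 9, 8, 5, 4, 9])
split_at = [4, 5]
maxima = [max(subarray) for subarray in np.split(a, split_at)]

I can write:

maxima = np.maximum.reduceat(a, np.hstack([0, split_at]))

Both return the maximum values in the slices a[0:4], a[4:5] and a[5:10], namely [8, 0, 9].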
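As a quick check of that equivalence, the sketch below runs both versions side by side. The array values are an assumption made for illustration: ten elements, chosen so that split_at = [4, 5] produces the slices a[0:4], a[4:5] and a[5:10]. Note that reduceat takes the start offset of every partition, which is why a leading 0 is prepended to split_at.

```python
import numpy as np

# Illustrative array (an assumption): ten elements, so that split_at = [4, 5]
# yields the slices a[0:4], a[4:5] and a[5:10].
a = np.array([4, 0, 6, 8, 0, 9, 8, 5, 4, 9])
split_at = [4, 5]

# Loop-based version: one max per partition, computed in a Python loop.
maxima_loop = [int(sub.max()) for sub in np.split(a, split_at)]

# Vectorized version: reduceat expects the start offset of each partition.
starts = np.hstack([0, split_at])
maxima_reduceat = np.maximum.reduceat(a, starts)

print(maxima_loop)               # [8, 0, 9]
print(maxima_reduceat.tolist())  # [8, 0, 9]
```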

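I would like a similar function that performs argmax, with the caveat that I want only a single maximum index per partition: [3, 4, 5] for the a and split_at above (even though indices 5 and 9 both attain the maximum value in the last group), as would be returned by

np.hstack([0, split_at]) + [np.argmax(subarray) for subarray in np.split(a, split_at)]

I will post a possible solution below, but I would like to see one that is vectorized without creating an index over the groups.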
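For checking any vectorized candidate, the loop-based expression from the question can be wrapped in a small reference function. This is only a sketch: the function name grouped_argmax_loop and the array values are assumptions made for illustration.

```python
import numpy as np

def grouped_argmax_loop(a, split_at):
    """Reference (non-vectorized) grouped argmax; the name is hypothetical."""
    starts = np.hstack([0, split_at])
    # np.argmax returns the first maximum within each sub-array; adding the
    # partition's start offset converts it into an index into `a`.
    local = [np.argmax(sub) for sub in np.split(a, split_at)]
    return starts + np.array(local)

a = np.array([4, 0, 6, 8, 0, 9, 8, 5, 4, 9])  # illustrative values (assumption)
print(grouped_argmax_loop(a, [4, 5]).tolist())  # [3, 4, 5]
```

Because np.argmax returns the first occurrence, the tie between indices 5 and 9 in the last group resolves to 5.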

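Solution

This solution involves building an index over the groups ([0, 0, 0, 0, 1, 2, 2, 2, 2, 2] in the example above).

# Length of each partition, from the boundaries 0, split_at..., len(a)
group_lengths = np.diff(np.hstack([0, split_at, len(a)]))
n_groups = len(group_lengths)
# Group id of every element of a
index = np.repeat(np.arange(n_groups), group_lengths)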
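To make the intermediate arrays concrete, here is a minimal sketch that prints them for a ten-element input with split_at = [4, 5]. The array values are an assumption made for illustration; only the length of a matters for this step.

```python
import numpy as np

a = np.array([4, 0, 6, 8, 0, 9, 8, 5, 4, 9])  # illustrative values (assumption)
split_at = [4, 5]

# Partition boundaries 0, 4, 5, 10 give the lengths of the three groups.
group_lengths = np.diff(np.hstack([0, split_at, len(a)]))
# Repeat each group id as many times as the group has elements.
index = np.repeat(np.arange(len(group_lengths)), group_lengths)

print(group_lengths.tolist())  # [4, 1, 5]
print(index.tolist())          # [0, 0, 0, 0, 1, 2, 2, 2, 2, 2]
```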

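We can then use:

maxima = np.maximum.reduceat(a, np.hstack([0, split_at]))
# Every position whose value equals the maximum of its own group
all_argmax = np.flatnonzero(np.repeat(maxima, group_lengths) == a)
result = np.empty(len(group_lengths), dtype='i')
# Assign in reverse so the earliest position in each group is written last
result[index[all_argmax[::-1]]] = all_argmax[::-1]

to obtain the result [3, 4, 5]. The [::-1] ensures that we get the first rather than the last argmax in each group: when a group has several maximal positions, the reversed order makes the earliest one the last to be assigned.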

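This relies on the fact that the last index in a fancy assignment determines the assigned value, which @seberg says one shouldn't rely on (a safer alternative is result = all_argmax[np.unique(index[all_argmax], return_index=True)[1]], which involves a sort over len(maxima) ~ n_groups elements).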
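The np.unique variant can be sketched as a self-contained function. The function names and the ufunc parameter are assumptions made for illustration, not part of the original answer; passing np.minimum gives the argmin case from the title. The sketch assumes strictly increasing split points (so every partition is non-empty) and data without NaN, since the equality test would not match NaN values.

```python
import numpy as np

def grouped_argext(a, split_at, ufunc=np.maximum):
    """First arg-extremum per partition (hypothetical helper name)."""
    a = np.asarray(a)
    starts = np.hstack([0, split_at])
    group_lengths = np.diff(np.hstack([starts, len(a)]))
    index = np.repeat(np.arange(len(group_lengths)), group_lengths)
    extrema = ufunc.reduceat(a, starts)
    # Positions where an element equals the extremum of its own group.
    candidates = np.flatnonzero(np.repeat(extrema, group_lengths) == a)
    # return_index gives the first occurrence of each group id, so ties
    # resolve to the earliest position without relying on assignment order.
    first = np.unique(index[candidates], return_index=True)[1]
    return candidates[first]

def grouped_argmax(a, split_at):
    return grouped_argext(a, split_at, np.maximum)

def grouped_argmin(a, split_at):
    return grouped_argext(a, split_at, np.minimum)

a = np.array([4, 0, 6, 8, 0, 9, 8, 5, 4, 9])  # illustrative values (assumption)
print(grouped_argmax(a, [4, 5]).tolist())  # [3, 4, 5]
print(grouped_argmin(a, [4, 5]).tolist())  # [1, 4, 8]
```

Since candidates comes out of np.flatnonzero in ascending order, the first occurrence of each group id is also the smallest matching position in that group.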

(Editor: 李大同)
