python - Generate a 1D NumPy array with chunks of random length
Published: 2020-12-16 23:42:18 Category: Python Source: compiled from the web
I need to generate a 1D array in which a repeated sequence of integers is separated by a random number of zeros.
So far I am using the following code:

    from random import normalvariate

    import numpy as np

    regular_sequence = np.array([1, 2, 3, 4, 5], dtype=int)
    n_iter = 10
    lag_mean = 10  # mean length of zeros sequence
    lag_sd = 1     # standard deviation of zeros sequence length

    # Sequence of lag lengths
    lag_seq = [int(round(normalvariate(lag_mean, lag_sd))) for x in range(n_iter)]

    # Generate list of concatenated zeros and regular sequences
    seq = [np.concatenate((np.zeros(x, dtype=int), regular_sequence)) for x in lag_seq]
    seq = np.concatenate(seq)

It works, but it seems slow when I need many long sequences. How can I optimize it?

Solution
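You can precompute the indices at which the repeated regular_sequence elements are to be placed, and then set those positions with regular_sequence in one vectorized step. To precompute the indices, use np.cumsum on the lag lengths and add the combined length of the blocks already placed; this gives the start of every regular_sequence block. Then add a range of consecutive integers spanning the size of regular_sequence to get all the indices to update. So the implementation would look like this: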
    # lag_seq and regular_sequence are the ones defined in the question

    # Size of regular_sequence
    N = regular_sequence.size

    # Use cumsum to pre-compute start of every occurrence of regular_sequence
    offset_arr = np.cumsum(lag_seq)
    idx = np.arange(offset_arr.size)*N + offset_arr

    # Setup output array
    out = np.zeros(idx.max() + N, dtype=regular_sequence.dtype)

    # Broadcast the start indices to include entire length of regular_sequence
    # to get all positions where regular_sequence elements are to be set
    np.put(out, idx[:,None] + np.arange(N), regular_sequence)

Runtime test:

    def original_app(lag_seq, regular_sequence):
        seq = [np.concatenate((np.zeros(x, dtype=int), regular_sequence)) for x in lag_seq]
        return np.concatenate(seq)

    def vectorized_app(lag_seq, regular_sequence):
        N = regular_sequence.size
        offset_arr = np.cumsum(lag_seq)
        idx = np.arange(offset_arr.size)*N + offset_arr
        out = np.zeros(idx.max() + N, dtype=regular_sequence.dtype)
        np.put(out, idx[:,None] + np.arange(N), regular_sequence)
        return out

    In [64]: # Setup inputs
        ...: regular_sequence = np.array([1, 2, 3, 4, 5], dtype=int)
        ...: n_iter = 1000
        ...: lag_mean = 10 # mean length of zeros sequence
        ...: lag_sd = 1 # standard deviation of zeros sequence length
        ...:
        ...: # Sequence of lag lengths
        ...: lag_seq = [int(round(normalvariate(lag_mean, lag_sd))) for x in range(n_iter)]
        ...:

    In [65]: out1 = original_app(lag_seq, regular_sequence)

    In [66]: out2 = vectorized_app(lag_seq, regular_sequence)

    In [67]: %timeit original_app(lag_seq, regular_sequence)
    100 loops, best of 3: 4.28 ms per loop

    In [68]: %timeit vectorized_app(lag_seq, regular_sequence)
    1000 loops, best of 3: 294 μs per loop

(Editor: 李大同)
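To make the index arithmetic easier to follow, here is a small sketch that runs both approaches on a tiny fixed input and checks that they agree. The function names blocks_by_concatenation and blocks_by_indexing and the sample lags [2, 1, 3] are hypothetical, chosen only for illustration. The sketch assumes a non-empty list of non-negative lags.

```python
import numpy as np


def blocks_by_concatenation(lag_seq, regular_sequence):
    """Loop-based version: build each (zeros + block) piece, then join them."""
    parts = [
        np.concatenate((np.zeros(lag, dtype=regular_sequence.dtype), regular_sequence))
        for lag in lag_seq
    ]
    return np.concatenate(parts)


def blocks_by_indexing(lag_seq, regular_sequence):
    """Vectorized version: compute every target index up front, write once."""
    n = regular_sequence.size
    # Block k starts after all lags up to and including lag k,
    # plus the k full blocks that were already placed before it.
    starts = np.cumsum(lag_seq) + np.arange(len(lag_seq)) * n
    out = np.zeros(starts.max() + n, dtype=regular_sequence.dtype)
    # (k, 1) start column + (n,) offsets broadcasts to a (k, n) index grid;
    # np.put flattens the grid and repeats regular_sequence to fill it.
    np.put(out, starts[:, None] + np.arange(n), regular_sequence)
    return out


# Hypothetical sample input (fixed lags instead of random ones)
demo_seq = np.array([1, 2, 3], dtype=int)
demo_lags = [2, 1, 3]

by_concat = blocks_by_concatenation(demo_lags, demo_seq)
by_index = blocks_by_indexing(demo_lags, demo_seq)

print(by_index.tolist())
# [0, 0, 1, 2, 3, 0, 1, 2, 3, 0, 0, 0, 1, 2, 3]
print(np.array_equal(by_concat, by_index))
# True
```

With these lags the blocks start at positions 2, 6 and 12: the running sum of the lags (2, 3, 6) plus 0, 3 and 6 for the blocks already written.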