Python Pandas: Fill a DataFrame with Missing Values
Published: 2020-12-20 12:01:10  Category: Python  Source: compiled from the web
I'm using this DataFrame as an example:
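    import pandas as pd

    # create dataframe
    df = pd.DataFrame([['DE','Table',201705,201705,1000],
                       ['DE','Table',201705,201704,1000],
                       ['DE','Table',201705,201702,1000],
                       ['DE','Table',201705,201701,1000],
                       ['AT','Table',201708,201708,1000],
                       ['AT','Table',201708,201706,1000],
                       ['AT','Table',201708,201705,1000],
                       ['AT','Table',201708,201704,1000]],
                      columns=['ISO','Product','Billed Week','Created Week','Billings'])
    print(df)

      ISO Product  Billed Week  Created Week  Billings
    0  DE   Table       201705        201705      1000
    1  DE   Table       201705        201704      1000
    2  DE   Table       201705        201702      1000
    3  DE   Table       201705        201701      1000
    4  AT   Table       201708        201708      1000
    5  AT   Table       201708        201706      1000
    6  AT   Table       201708        201705      1000
    7  AT   Table       201708        201704      1000

For each ['ISO','Product'] group, I need to fill in the missing data with 0 Billings wherever there is a break in the sequence, i.e. a week in which no billings were created, so that week's row is absent. The sequence should be based on the max of 'Billed Week' and the min of 'Created Week'; those are the combinations that should be complete, with no breaks.

So for the data above, the missing records that I need to append programmatically to the database look like this:

      ISO Product  Billed Week  Created Week  Billings
    0  DE   Table       201705        201703         0
    1  AT   Table       201708        201707         0

Solution

    def seqfix(x):
        s = x['Created Week']
        x = x.set_index('Created Week')
        # reindex over every week between the first and last created week;
        # weeks that were absent show up as rows of NaN
        x = x.reindex(range(min(s), max(s) + 1))
        # a missing week means no billings were created
        x['Billings'] = x['Billings'].fillna(0)
        # copy the remaining columns (ISO, Product, Billed Week) forward
        x = x.ffill().reset_index()
        return x

    df = df.groupby(['ISO','Billed Week']).apply(seqfix).reset_index(drop=True)
    # reindexing introduced NaN, which turned these columns into floats
    df[['Billed Week','Billings']] = df[['Billed Week','Billings']].astype(int)
    # restore the original column order
    df = df[['ISO','Product','Billed Week','Created Week','Billings']]
    print(df)

      ISO Product  Billed Week  Created Week  Billings
    0  AT   Table       201708        201704      1000
    1  AT   Table       201708        201705      1000
    2  AT   Table       201708        201706      1000
    3  AT   Table       201708        201707         0
    4  AT   Table       201708        201708      1000
    5  DE   Table       201705        201701      1000
    6  DE   Table       201705        201702      1000
    7  DE   Table       201705        201703         0
    8  DE   Table       201705        201704      1000
    9  DE   Table       201705        201705      1000

(Editor: 李大同)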
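The solution above returns the complete, gap-free table, while the question asks for just the records to append. The following is a minimal sketch of one way to get both, not the original answer's method. It loops over the groups explicitly instead of using groupby(...).apply(...), whose treatment of the grouping columns can differ between pandas versions. The helper name fill_missing_weeks and the choice to group on ISO, Product and Billed Week together are assumptions made for illustration. Like the original, it treats weeks as plain integers, so it assumes each group stays within one year and has no duplicate 'Created Week' values.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    [
        ["DE", "Table", 201705, 201705, 1000],
        ["DE", "Table", 201705, 201704, 1000],
        ["DE", "Table", 201705, 201702, 1000],
        ["DE", "Table", 201705, 201701, 1000],
        ["AT", "Table", 201708, 201708, 1000],
        ["AT", "Table", 201708, 201706, 1000],
        ["AT", "Table", 201708, 201705, 1000],
        ["AT", "Table", 201708, 201704, 1000],
    ],
    columns=["ISO", "Product", "Billed Week", "Created Week", "Billings"],
)


def fill_missing_weeks(frame):
    """Hypothetical helper: add a 0-Billings row for every absent week.

    Assumes weeks are plain integers within a single year and that
    'Created Week' is unique inside each group.
    """
    keys = ["ISO", "Product", "Billed Week"]
    pieces = []
    for key_values, group in frame.groupby(keys):
        created = group["Created Week"]
        weeks = range(created.min(), created.max() + 1)
        # Reindex Billings over the full week range; absent weeks get 0.
        filled = (
            group.set_index("Created Week")["Billings"]
            .reindex(weeks, fill_value=0)
            .rename_axis("Created Week")
            .reset_index()
        )
        # Put the group's key values back as ordinary columns.
        for name, value in zip(keys, key_values):
            filled[name] = value
        pieces.append(filled)
    result = pd.concat(pieces, ignore_index=True)
    return result[list(frame.columns)]


filled = fill_missing_weeks(df)

# Rows present in `filled` but not in the original are the ones to append.
key_cols = ["ISO", "Product", "Billed Week", "Created Week"]
marked = filled.merge(df[key_cols], on=key_cols, how="left", indicator=True)
missing = marked.loc[marked["_merge"] == "left_only", list(df.columns)]
missing = missing.reset_index(drop=True)

print(filled)
print(missing)
```

Because fill_value=0 is passed to reindex, the Billings column never holds NaN, so no cast back to int is needed afterwards.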