python: splitting the index into separate columns in pandas

Published: 2020-12-20 12:02:53 | Category: Python | Source: compiled from the web


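I have a large DataFrame from which I get the data I need with groupby. I need to turn the index of the resulting DataFrame into several separate columns.

Part of the original DataFrame looks like this: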

    code            place      vl   year  week
0   111.0002.0056   region1     1   2017    29
1   112.6500.2285   region2     1   2017    31
2   112.5600.6325   region2     1   2017    30
3   112.5600.6325   region2     1   2017    30
4   112.5600.8159   region2     1   2017    30
5   111.0002.0056   region2     1   2017    29
6   111.0002.0056   region2     1   2017    30
7   111.0002.0056   region2     1   2017    28
8   112.5600.8159   region3     1   2017    31
9   112.5600.8159   region3     1   2017    28
10  111.0002.0114   region3     1   2017    31
....

After applying groupby, it looks like this (code: df_test1 = df_test.groupby(['code','year','week','place'])['vl'].sum().unstack(fill_value=0)):

place                     region1  region2  region3  region4  index1
code            year  week
111.0002.0006   2017  29        0        3        0        0  (111.0002.0006,2017,29)
                      30        0        7        0        0  (111.0002.0006,2017,30)
111.0002.0018   2017  29        0        0        0        0  (111.0002.0018,2017,29)
111.0002.0029   2017  30        0        0        0        0  (111.0002.0029,2017,30)
111.0002.0055   2017  28        0       33        0        8  (111.0002.0055,2017,28)
                      29        1      155        2       41  (111.0002.0055,2017,29)
                      30        0      142        1       39  (111.0002.0055,2017,30)
                      31        0       31        0       13  (111.0002.0055,2017,31)
111.0002.0056   2017  28        9       36        0        4  (111.0002.0056,2017,28)
                      29       20       75        2       37  (111.0002.0056,2017,29)
                      30       17       81        2       33  (111.0002.0056,2017,30)
....

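I saved the index in a separate column, index1 (code: df_test1['index1'] = df_test1.index).
I need to extract three separate columns (code, year and week) from column index1.

The result should look like this: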

region1 region2 region3 region4       code     year  week                   
   0       3       0       0    111.0002.0006  2017   29
   0       7       0       0    111.0002.0006  2017   30
   0       0       0       0    111.0002.0018  2017   29
   0       0       0       0    111.0002.0029  2017   30
   0       33      0       8    111.0002.0055  2017   28
   1       155     2       41   111.0002.0055  2017   29
   0       142     1       39   111.0002.0055  2017   30
   0       31      0       13   111.0002.0055  2017   31
....

I would appreciate any suggestions!

Solution

Use reset_index instead of df_test1['index1'] = df_test1.index. For a clean DataFrame, also add rename_axis, which removes the name of the columns axis (place):

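# Parentheses let the method chain span several lines
df_test1 = (df_test.groupby(['code','year','week','place'])['vl'].sum()
                   .unstack(fill_value=0)       # one column per place
                   .reset_index()               # code, year, week become columns
                   .rename_axis(None, axis=1))  # drop the 'place' columns name
print(df_test1)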

            code  year  week  region1  region2  region3
0  111.0002.0056  2017    28        0        1        0
1  111.0002.0056  2017    29        1        1        0
2  111.0002.0056  2017    30        0        1        0
3  111.0002.0114  2017    31        0        0        1
4  112.5600.6325  2017    30        0        2        0
5  112.5600.8159  2017    28        0        0        1
6  112.5600.8159  2017    30        0        1        0
7  112.5600.8159  2017    31        0        0        1
8  112.6500.2285  2017    31        0        1        0
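
To make the chain above easy to try out, here is a minimal, self-contained sketch. It assumes that the eleven sample rows shown in the question are the whole input; the intermediate name wide is only illustrative and does not appear in the original answer.

```python
import pandas as pd

# The eleven sample rows from the question (assumed to be the full input here).
df_test = pd.DataFrame({
    "code": ["111.0002.0056", "112.6500.2285", "112.5600.6325", "112.5600.6325",
             "112.5600.8159", "111.0002.0056", "111.0002.0056", "111.0002.0056",
             "112.5600.8159", "112.5600.8159", "111.0002.0114"],
    "place": ["region1", "region2", "region2", "region2", "region2", "region2",
              "region2", "region2", "region3", "region3", "region3"],
    "vl": [1] * 11,
    "year": [2017] * 11,
    "week": [29, 31, 30, 30, 30, 29, 30, 28, 31, 28, 31],
})

# 'wide' is a hypothetical intermediate name used only in this sketch.
wide = (df_test.groupby(["code", "year", "week", "place"])["vl"].sum()
               .unstack(fill_value=0))
print(wide.columns.name)      # the columns axis is still named 'place'

# reset_index turns the index levels into columns;
# rename_axis(None, axis=1) clears the leftover 'place' label.
df_test1 = wide.reset_index().rename_axis(None, axis=1)
print(df_test1.columns.name)
print(df_test1)
```

With this input the printed frame should match the output shown above.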

If necessary, change the column order at the end:

# every name in cols must be a column of df_test1; these are moved to the end
cols = ['code','year','week']
df_test1 = df_test1[[x for x in df_test1.columns if x not in cols] + cols]
print(df_test1)
   region1  region2  region3           code  year  week
0        0        1        0  111.0002.0056  2017    28
1        1        1        0  111.0002.0056  2017    29
2        0        1        0  111.0002.0056  2017    30
3        0        0        1  111.0002.0114  2017    31
4        0        2        0  112.5600.6325  2017    30
5        0        0        1  112.5600.8159  2017    28
6        0        1        0  112.5600.8159  2017    30
7        0        0        1  112.5600.8159  2017    31
8        0        1        0  112.6500.2285  2017    31
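
If the index1 column of tuples from the question already exists and rebuilding the frame with reset_index is not convenient, the tuples can also be expanded directly. The following is only a sketch under assumptions: the three-row frame is a made-up stand-in for the grouped result, and the names df, parts and result are hypothetical.

```python
import pandas as pd

# Small stand-in for the grouped result, with a (code, year, week) MultiIndex.
index = pd.MultiIndex.from_tuples(
    [("111.0002.0006", 2017, 29),
     ("111.0002.0006", 2017, 30),
     ("111.0002.0018", 2017, 29)],
    names=["code", "year", "week"],
)
df = pd.DataFrame({"region1": [0, 0, 0], "region2": [3, 7, 0]}, index=index)

# A column holding the index tuples, similar to index1 in the question.
df["index1"] = pd.Series(index.tolist(), index=index)

# Expand each tuple into three columns, aligned on the same index.
parts = pd.DataFrame(df["index1"].tolist(), index=df.index,
                     columns=["code", "year", "week"])

# Replace index1 with the three new columns and drop the MultiIndex.
result = pd.concat([df.drop(columns="index1"), parts], axis=1).reset_index(drop=True)
print(result)
```

This keeps the region columns first and puts code, year and week last, which is the layout asked for in the question. The reset_index approach is usually shorter because it never needs the tuple column.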

(Editor: 李大同)
