python – Why does my PanelND factory throw a KeyError?
I am using Pandas version 0.12.0 on Ubuntu 13.04. I am trying to create a 5D panel object to hold some EEG data split by condition.
How I chose to structure my data:

First, let me show how I use pandas.core.panelnd.create_nd_panel_factory.

    Subject = panelnd.create_nd_panel_factory(
        klass_name='Subject',
        axis_orders=['setsize', 'location', 'vfield', 'channels', 'samples'],
        axis_slices={'labels': 'location',
                     'items': 'vfield',
                     'major_axis': 'major_axis',
                     'minor_axis': 'minor_axis'},
        slicer=pd.Panel4D,
        axis_aliases={'ss': 'setsize',
                      'loc': 'location',
                      'vf': 'vfield',
                      'major': 'major_axis',
                      'minor': 'minor_axis'}
        # stat_axis=2  # dafuq is this?
    )

Essentially, the organization is as follows:

- setsize: an experimental condition, can be 1 or 2

The last two axes correspond to a DataFrame's major_axis and minor_axis. They have been renamed for clarity:

- channels: columns, the EEG channels (129 of them)

What I'm doing:

Each experimental condition (subject x setsize x location x vfield) is stored in its own tab-delimited file, which I read with pandas.read_table to get a DataFrame object. I want to create one 5-dimensional panel (i.e. a Subject) per subject, which will contain all of that subject's experimental conditions (i.e. DataFrames).

First, I build a nested dictionary for each subject:

    # ... do some boring stuff to get the text files, etc...

    for _, factors in df.iterrows():
        # `factors` holds five values:
        # (subject number, setsize, location, vfield,
        # and path to the tab-delimited file).
        sn, ss, loc, vf, path = factors
        eeg = pd.read_table(path, sep='\t', names=range(1, 129) + ['ref'], header=None)

        # build nested dict
        subjects.setdefault(sn, {}).setdefault(ss, {}).setdefault(loc, {})[vf] = eeg

    # and now attempt to build `Subject`
    for sn, d in subjects.iteritems():
        subjects[sn] = Subject(d)

Full stack trace

    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    <ipython-input-2-831fa603ca8f> in <module>()
    ----> 1 import_data()

    /home/louist/Dropbox/Research/VSTM/scripts/vstmlib.py in import_data()
         64
         65     import ipdb; ipdb.set_trace()
    ---> 66     for sn,d in subjects.iteritems():
         67         subjects[sn] = Subject(d)
         68

    /usr/local/lib/python2.7/dist-packages/pandas/core/panelnd.pyc in __init__(self,*args,**kwargs)
         65         if 'dtype' not in kwargs:
         66             kwargs['dtype'] = None
    ---> 67         self._init_data(*args,**kwargs)
         68     klass.__init__ = __init__
         69

    /usr/local/lib/python2.7/dist-packages/pandas/core/panel.pyc in _init_data(self,data,copy,dtype,**kwargs)
        250             mgr = data
        251         elif isinstance(data,dict):
    --> 252             mgr = self._init_dict(data,passed_axes,dtype=dtype)
        253             copy = False
        254             dtype = None

    /usr/local/lib/python2.7/dist-packages/pandas/core/panel.pyc in _init_dict(self,axes,dtype)
        293         raxes = [self._extract_axis(self,axis=i)
        294                  if a is None else a for i,a in enumerate(axes)]
    --> 295         raxes_sm = self._extract_axes_for_slice(self,raxes)
        296
        297         # shallow copy

    /usr/local/lib/python2.7/dist-packages/pandas/core/panel.pyc in _extract_axes_for_slice(self,axes)
       1477         """ return the slice dictionary for these axes """
       1478         return dict([(self._AXIS_SLICEMAP[i],a) for i,a
    -> 1479                      in zip(self._AXIS_ORDERS[self._AXIS_LEN - len(axes):],axes)])
       1480
       1481     @staticmethod

    KeyError: 'location'

I know that panelnd is an experimental feature, but I am fairly sure I am doing something wrong. Can someone point me in the right direction? If it is a bug, is there anything that can be done about it?

As always, thank you very much!

Solution
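Here is a working example. You need to specify, via axis_slices, the mapping from your axis names to the slicer's internal axis names (not the other way around, as in the question). This is a bit confusing given the internal structure, but the fixed pandas names still exist (and are somewhat hard-coded in Panel / Panel4D), so you have to provide the mapping.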
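To see where the KeyError comes from, the dictionary lookup in the last traceback frame can be imitated without pandas. This is only a rough sketch, not the pandas implementation: `extract_axes_for_slice`, `question_slices`, `working_slices`, and the placeholder axis labels are hypothetical names, modeled on the `_extract_axes_for_slice` line shown in the traceback, and it assumes that the four trailing axes are the ones handed to the slicer.

```python
# Pure-Python imitation of the lookup in the last traceback frame.
# All names here are illustrative stand-ins, not pandas internals.

AXIS_ORDERS = ['setsize', 'location', 'vfield', 'channels', 'samples']


def extract_axes_for_slice(axis_orders, slicemap, axes):
    """Map the trailing axis names onto the slicer's axis names."""
    trailing = axis_orders[len(axis_orders) - len(axes):]
    return dict((slicemap[name], axis) for name, axis in zip(trailing, axes))


# Mapping as written in the question: slicer name -> your axis name.
question_slices = {'labels': 'location', 'items': 'vfield',
                   'major_axis': 'major_axis', 'minor_axis': 'minor_axis'}

# Mapping in the direction the lookup expects: your axis name -> slicer name.
working_slices = {'location': 'labels', 'vfield': 'items',
                  'channels': 'major_axis', 'samples': 'minor_axis'}

# Placeholder labels for the four axes passed on to the slicer (assumed).
axes = [['loc0'], ['vf0'], ['ch0', 'ch1'], ['s0', 's1']]

try:
    extract_axes_for_slice(AXIS_ORDERS, question_slices, axes)
    failed_key = None
except KeyError as exc:
    # The lookup is keyed by *your* axis name, which the question's dict lacks.
    failed_key = exc.args[0]

result = extract_axes_for_slice(AXIS_ORDERS, working_slices, axes)
print(failed_key)
print(sorted(result))
```

With the question's dictionary the first key looked up is 'location', which only appears there as a value, so the lookup fails; with the reversed dictionary every trailing axis name resolves to a Panel4D axis name.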
I would create a Panel4D first and then create your Subject from it, as in the code that follows.

If you find more bugs, please post them on github / here. This is not a heavily used feature.

Output

    <class 'pandas.core.panelnd.Subject'>
    Dimensions: 3 (setsize) x 1 (location) x 1 (vfield) x 10 (channels) x 2 (samples)
    Setsize axis: level0_0 to level0_2
    Location axis: level1_0 to level1_0
    Vfield axis: level2_0 to level2_0
    Channels axis: level3_0 to level3_9
    Samples axis: level4_1 to level4_2

Code

    import pandas as pd
    import numpy as np

    from pandas.core import panelnd

    Subject = panelnd.create_nd_panel_factory(
        klass_name='Subject',
        axis_orders=['setsize', 'location', 'vfield', 'channels', 'samples'],
        # map each of your axis names to the Panel4D axis it corresponds to
        axis_slices={'location': 'labels',
                     'vfield': 'items',
                     'channels': 'major_axis',
                     'samples': 'minor_axis'},
        slicer=pd.Panel4D,
        axis_aliases={'ss': 'setsize',
                      'loc': 'labels',
                      'vf': 'items',
                      'major': 'major_axis',
                      'minor': 'minor_axis'})

    subjects = dict()
    for i in range(3):
        # dummy EEG frame: 10 channels (rows) x 2 samples (columns)
        eeg = pd.DataFrame(np.random.randn(10, 2),
                           columns=['level4_1', 'level4_2'],
                           index=["level3_%s" % x for x in range(10)])

        loc, vf = ('level1_0', 'level2_0')
        # one Panel4D per setsize level, keyed by location and then vfield
        subjects["level0_%s" % i] = pd.Panel4D({loc: {vf: eeg}})

    print(Subject(subjects))