python - Parallel coordinates plot for continuous data in pandas
Published: 2020-12-20 11:35:19 | Category: Python | Source: compiled from the web
pandas' parallel_coordinates function is very useful:

    import pandas
    import matplotlib.pyplot as plt
    # located in pandas.tools.plotting in old pandas releases
    from pandas.plotting import parallel_coordinates

    sampdata = pandas.read_csv('/usr/local/lib/python3.3/dist-packages/pandas/tests/data/iris.csv')
    parallel_coordinates(sampdata, 'Name')

But when you have continuous data, it does not behave the way you would expect:

    import numpy as np

    mypos = np.random.randint(10, size=(100, 2))
    mydata = pandas.DataFrame(mypos, columns=['x', 'y'])
    myres = np.random.rand(100, 1)
    mydata['res'] = myres
    parallel_coordinates(mydata, 'res')

I would like the color of each line to reflect the magnitude of its 'res' value.

Solution
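A quick workaround, before writing any custom plotting code: the stock function treats every distinct value in the class column as its own class, so 100 random floats become 100 classes. If a coarse coloring is good enough, one option is to bin the continuous column first. The sketch below assumes 5 equal-width bins, a `viridis` colormap, and a helper column named `res_bin`. None of these come from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
from pandas.plotting import parallel_coordinates

rng = np.random.default_rng(0)
mydata = pd.DataFrame(rng.integers(0, 10, size=(100, 2)), columns=["x", "y"])
mydata["res"] = rng.random(100)

# Each distinct float counts as a separate class for the stock function
n_classes = mydata["res"].nunique()

# Assumption: 5 equal-width bins; labels=False yields integer bin indices 0..4
binned = mydata[["x", "y"]].copy()
binned["res_bin"] = pd.cut(mydata["res"], bins=5, labels=False)

# Colors are assigned in order of first appearance, so sort by bin
# to make the colormap run from low to high values
binned = binned.sort_values("res_bin")

ax = parallel_coordinates(binned, "res_bin", colormap="viridis")
```

This only gives as many colors as there are bins. For a truly continuous color scale, a custom function is still needed.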
I ran into the same problem today. My solution was to copy parallel_coordinates from pandas and adapt it to my own needs. I think it may be useful to others, so here is my implementation:

    def parallel_coordinates(frame, class_column, cols=None, ax=None, color=None,
                             use_columns=False, xticks=None, colormap=None, **kwds):
        import matplotlib.pyplot as plt
        import matplotlib as mpl
        import numpy as np

        n = len(frame)
        class_col = frame[class_column]
        class_min = np.amin(class_col)
        class_max = np.amax(class_col)

        if cols is None:
            df = frame.drop(class_column, axis=1)
        else:
            df = frame[cols]

        ncols = len(df.columns)

        # determine values to use for xticks
        if use_columns is True:
            if not np.all(np.isreal(list(df.columns))):
                raise ValueError('Columns must be numeric to be used as xticks')
            x = df.columns
        elif xticks is not None:
            if not np.all(np.isreal(xticks)):
                raise ValueError('xticks specified must be numeric')
            elif len(xticks) != ncols:
                raise ValueError('Length of xticks must match number of columns')
            x = xticks
        else:
            x = range(ncols)

        # draw on the axes passed in, or create a new figure
        if ax is None:
            fig = plt.figure()
            ax = plt.gca()
        else:
            fig = ax.figure

        Colorm = plt.get_cmap(colormap)

        # one line per row, colored by its value rescaled to [0, 1]
        for i in range(n):
            y = df.iloc[i].values
            kls = class_col.iat[i]
            ax.plot(x, y, color=Colorm((kls - class_min)/(class_max-class_min)), **kwds)

        for i in x:
            ax.axvline(i, linewidth=1, color='black')

        ax.set_xticks(x)
        ax.set_xticklabels(df.columns)
        ax.set_xlim(x[0], x[-1])
        ax.grid()

        # the colorbar takes the place of the per-class legend
        bounds = np.linspace(class_min, class_max, 10)
        cax, _ = mpl.colorbar.make_axes(ax)
        cb = mpl.colorbar.ColorbarBase(cax, cmap=Colorm, spacing='proportional',
                                       ticks=bounds, boundaries=bounds, format='%.2f')

        return fig

I don't know whether it works with every option that the original pandas function offers. But for your example, it gives something like this:

    parallel_coordinates(mydata, 'res', colormap="binary")

You can also add an alpha value by changing this line in the function above:

    ax.plot(x, y, color=Colorm((kls - class_min)/(class_max-class_min)),
            alpha=(kls - class_min)/(class_max-class_min), **kwds)

For the original pandas example, remove the names and use the last column as the value:

    sampdata = pandas.read_csv('iris_modified.csv')
    parallel_coordinates(sampdata, 'Value')

I hope this helps!

Christophe

(Editor: 李大同)
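For comparison, here is a more compact sketch of the same idea: rescale the value column to [0, 1], map it through a colormap, and show a colorbar instead of a legend. It is not the implementation above. It swaps the per-row `ax.plot` loop for a single matplotlib `LineCollection`, and the function name `continuous_parallel_coordinates` and the default `viridis` colormap are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.collections import LineCollection
from matplotlib.colors import Normalize


def continuous_parallel_coordinates(frame, value_column, colormap="viridis", ax=None):
    """Hypothetical helper: one polyline per row, colored by value_column."""
    if ax is None:
        _, ax = plt.subplots()

    df = frame.drop(columns=[value_column])
    values = frame[value_column].to_numpy()
    x = np.arange(df.shape[1])
    y = df.to_numpy(dtype=float)

    # (n_rows, n_cols, 2) array: each row is a polyline of (x, y) vertices
    segments = np.stack([np.broadcast_to(x, y.shape), y], axis=-1)

    # Normalize maps [min, max] of the value column onto [0, 1] for the colormap
    norm = Normalize(values.min(), values.max())
    lines = LineCollection(segments, cmap=colormap, norm=norm)
    lines.set_array(values)
    ax.add_collection(lines)

    # Collections do not reliably autoscale the axes, so set the limits explicitly
    ax.set_xticks(x)
    ax.set_xticklabels(df.columns)
    ax.set_xlim(x[0], x[-1])
    ax.set_ylim(y.min(), y.max())
    ax.figure.colorbar(lines, ax=ax, label=value_column)
    return ax


rng = np.random.default_rng(0)
mydata = pd.DataFrame(rng.integers(0, 10, size=(100, 2)), columns=["x", "y"])
mydata["res"] = rng.random(100)
ax = continuous_parallel_coordinates(mydata, "res")
```

Because the collection carries its own colormap and normalization, the colorbar can be built straight from it, with no need to construct the colorbar axes by hand.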