numpy
 
 numpy.vstack
 
  
  Stack arrays in sequence vertically (row wise). 
Adds data row by row.
 
 
 
 >>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 3, 4])
>>> np.vstack((a, b))
array([[1, 2, 3],
       [2, 3, 4]])
>>> a = np.array([[1], [2], [3]])
>>> b = np.array([[2], [3], [4]])
>>> np.vstack((a, b))
array([[1],
       [2],
       [3],
       [2],
       [3],
       [4]])
 
 numpy.ravel
 
  
  Flattens a multi-dimensional array into a one-dimensional array. 
 Returns: 
y : array_like 
 If a is a matrix, y is a 1-D ndarray; otherwise y is an array of the same subtype as a. The shape of the returned array is (a.size,). Matrices are special-cased for backward compatibility.
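
A minimal sketch of the flattening behaviour described above. The array `a` and the name `flat` are illustrative and not from the original notes.

```python
import numpy as np

# A 2 x 3 array used purely for illustration.
a = np.array([[1, 2, 3], [4, 5, 6]])

# ravel returns a 1-D array whose shape is (a.size,).
flat = np.ravel(a)

print(flat)        # [1 2 3 4 5 6]
print(flat.shape)  # (6,)
```

Elements are read in row-major (C) order by default, so the rows are laid end to end.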
 
 
 
 
matplotlib
 
 
pandas
 
 describe
 
  
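  Prints summary statistics of the dataset. By default only numeric columns are summarized; to summarize categorical (object) columns, use 
df.describe(include=['O']), where O is the capital letter O. 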
 
  include, exclude : list-like, 'all', or None (default) 
 Specify the form of the returned result. Either: 
 None for both (default). The result will include only numeric-typed columns or, if there are none, only categorical columns. 
 A list of dtypes or strings to be included/excluded. To select all numeric types, use numpy.number. To select categorical objects, use type object. See also the select_dtypes documentation. E.g. df.describe(include=['O']) 
 If include is the string 'all', the output column set will match the input one.
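
A small sketch of the three `include` settings described above. The DataFrame, its column names, and the variable names are made up for illustration. The text column is created with an explicit object dtype so the example does not depend on how a given pandas version infers string columns.

```python
import pandas as pd

# Hypothetical data: one numeric column and one object (string) column.
df = pd.DataFrame({
    "age": [22, 35, 58],
    "city": pd.Series(["Paris", "Oslo", "Paris"], dtype=object),
})

numeric_summary = df.describe()              # default: numeric columns only
object_summary = df.describe(include=["O"])  # object columns only (capital O)
all_summary = df.describe(include="all")     # every input column

print(list(numeric_summary.columns))  # ['age']
print(list(object_summary.columns))   # ['city']
print(list(all_summary.columns))      # ['age', 'city']
```

For object columns the summary reports count, unique, top, and freq instead of mean, std, and quantiles.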
 
 
 
 
sklearn
 
 sklearn.datasets.make_classification
 
  
  n_samples : int, optional (default=100) 
 The number of samples. 
 
  n_features : int, optional (default=20) 
 The total number of features. These comprise n_informative informative features, n_redundant redundant features, n_repeated duplicated features, and n_features - n_informative - n_redundant - n_repeated useless features drawn at random.
 
  n_informative : int, optional (default=2) 
 The number of informative features. Each class is composed of a number of Gaussian clusters, each located around the vertices of a hypercube in a subspace of dimension n_informative. For each cluster, informative features are drawn independently from N(0, 1) and then randomly linearly combined within each cluster in order to add covariance. The clusters are then placed on the vertices of the hypercube.
 
  n_classes : int,optional (default=2) 
 The number of classes (or labels) of the classification problem. 
 
  weights : list of floats or None (default=None) 
 The proportions of samples assigned to each class. If None,then classes are balanced. Note that if len(weights) == n_classes - 1,then the last class weight is automatically inferred. More than n_samples samples may be returned if the sum of weights exceeds 1.
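
A brief sketch of how the parameters above fit together. The specific values are arbitrary choices for illustration. `n_redundant`, `n_repeated`, and `random_state` are real parameters of make_classification but are not described in these notes; `random_state` is set only to make the run repeatable.

```python
from sklearn.datasets import make_classification

# 10 features in total: 3 informative, 2 redundant, 0 repeated,
# so the remaining 5 are useless features drawn at random.
X, y = make_classification(
    n_samples=200,
    n_features=10,
    n_informative=3,
    n_redundant=2,
    n_repeated=0,
    n_classes=2,
    weights=[0.8],   # len(weights) == n_classes - 1, last weight is inferred
    random_state=0,  # assumption: fixed seed for repeatability
)

print(X.shape)  # (200, 10)
print(y.shape)  # (200,)
```

With `weights=[0.8]`, roughly 80% of the samples belong to class 0, which is a convenient way to produce an imbalanced toy dataset.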
 
 
 
 sklearn.metrics.confusion_matrix
 
  
  y_true : array,shape = [n_samples] 
 Ground truth (correct) target values.
 
  y_pred : array,shape = [n_samples] 
 Estimated targets as returned by a classifier.
 
  labels : array,shape = [n_classes],optional 
 List of labels to index the matrix. This may be used to reorder or select a subset of labels. If none is given,those that appear at least once in y_true or y_pred are used in sorted order.
 
  sample_weight : array-like of shape = [n_samples],optional 
 Sample weights. 
 
  Returns: 
C : array, shape = [n_classes, n_classes]  Confusion matrix
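
A short sketch of the call described above, including the effect of `labels`. The label values and variable names are illustrative only.

```python
from sklearn.metrics import confusion_matrix

y_true = [2, 0, 2, 2, 0, 1]  # ground-truth labels
y_pred = [0, 0, 2, 2, 0, 2]  # labels predicted by some classifier

# Rows correspond to true labels, columns to predicted labels,
# both in sorted label order (0, 1, 2) when labels is not given.
cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[2 0 0]
#  [0 0 1]
#  [1 0 2]]

# labels reorders the matrix and restricts it to a subset of labels.
cm_subset = confusion_matrix(y_true, y_pred, labels=[2, 0])
print(cm_subset)
# [[2 1]
#  [0 2]]
```

Entry C[i, j] counts the samples whose true label is the i-th label and whose predicted label is the j-th label, so the diagonal holds the correct predictions.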