
One-hot encoding for data processing

Published: 2020-12-14 05:01:19 | Category: Big Data | Source: compiled from the web

import csv
import os
import shutil
import codecs
import pandas as pd
import numpy as np

from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import LabelBinarizer
from sklearn.preprocessing import MultiLabelBinarizer

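# NOTE: the backslashes in this Windows path were lost when the post was published.
# The separators below are a best guess; point data_dir at your own folder.
data_dir = r'C:\Users\Thuang6\Desktop\MaxWellData\OneHot'
csv_path = os.path.join(data_dir, 'csv_to_csv.csv')

os.chdir(data_dir)  # os.chdir returns None, so there is nothing to assign
df = pd.read_csv(csv_path, names=['Time', 'Process', 'Component', 'Operation', 'Action', 'Control', 'Category', 'Context'], index_col=False)
# Replace missing values with the string 'NULL' so they are encoded as their own category
df = df.fillna(value='NULL')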
# One-hot encode each categorical column separately (one output column per distinct value)
process = LabelBinarizer().fit_transform(df['Process'])
print(process)

component = LabelBinarizer().fit_transform(df['Component'])
print(component)

operation = LabelBinarizer().fit_transform(df['Operation'])
print(operation)

action = LabelBinarizer().fit_transform(df['Action'])
print(action)

control = LabelBinarizer().fit_transform(df['Control'])
print(control)

category = LabelBinarizer().fit_transform(df['Category'])
print(category)

final_output = np.hstack((process,component,operation,action,control,category))

print(final_output)

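# Split the rows into 21 equal groups; np.vsplit raises ValueError
# unless the number of rows is divisible by 21
final_split = np.vsplit(final_output,21)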

print(final_split)

print(np.shape(final_split))

print("nihao")
d = []
for i in range(21):
    a = final_split[i]
    #print(a)
    b = np.ndarray.flatten(a)
    c = b.tolist()
    d.append(c)
    #print(d)
    #print(len(d))
    #print(type(b))

b = np.ndarray.flatten(a)

print(np.ndarray.flatten(a))

print(d)
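
The script above depends on a local CSV file, so it cannot be run as posted. Below is a minimal sketch of the same steps on a small made-up DataFrame. The column values, the 6-row size, and `n_groups = 3` are assumptions chosen for illustration (the script itself uses 21 groups); they are not taken from the original data. It also shows that a single `reshape` appears to give the same result as the `vsplit` + `flatten` loop.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelBinarizer

# Hypothetical stand-in for the CSV: 6 rows, 2 categorical columns.
df = pd.DataFrame({
    "Process": ["a", "b", "c", "a", "b", "c"],
    "Action": ["x", "y", "z", "z", "y", "x"],
})

# One-hot encode each column separately, then join the blocks side by side.
blocks = [LabelBinarizer().fit_transform(df[col]) for col in df.columns]
encoded = np.hstack(blocks)  # 6 rows, 3 + 3 one-hot columns

# Split into equal row groups and flatten each group into one list,
# mirroring the vsplit + flatten loop in the script.
n_groups = 3  # assumed value for this toy data; the script uses 21
rows = [chunk.flatten().tolist() for chunk in np.vsplit(encoded, n_groups)]

# A single reshape produces the same grouping without an explicit loop.
reshaped = encoded.reshape(n_groups, -1)

# Caveat: a column with exactly two distinct values yields ONE 0/1 column.
two_class = LabelBinarizer().fit_transform(["yes", "no", "yes"])

print(rows)
print(reshaped.shape)
print(two_class.shape)
```

One thing to watch for: because `LabelBinarizer` collapses a two-valued column into a single 0/1 column, the width of the stacked matrix is not always the sum of the distinct-value counts. If a fixed one-column-per-category layout matters, `OneHotEncoder` or `pd.get_dummies` may be a better fit.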

(Editor: 李大同)
