How to print the tf-idf score matrix with sklearn in Python

Published: 2020-12-20 12:03:47 | Category: Python | Source: compiled from the web
I am using sklearn to compute tf-idf values as follows.

from sklearn.feature_extraction.text import TfidfVectorizer
myvocabulary = ['life','learning']
corpus = {1: "The game of life is a game of everlasting learning",2: "The unexamined life is not worth living",3: "Never stop learning"}
tfidf = TfidfVectorizer(vocabulary = myvocabulary,ngram_range = (1,3))
tfs = tfidf.fit_transform(corpus.values())

Now I want to view the tf-idf scores I computed as a matrix, like the one shown below.

[Image: tf-idf matrix]

I tried to do it as follows.

idf = tfidf.idf_
dic = dict(zip(tfidf.get_feature_names(),idf))
print(dic)

However, I get the following output.

{'life': 1.2876820724517808,'learning': 1.2876820724517808}

Please help me.
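
A note on the output above: idf_ holds one inverse-document-frequency weight per vocabulary term, not one score per document, so a dictionary built from it cannot show the matrix. As a sanity check, and assuming scikit-learn's default smooth_idf=True formula idf(t) = ln((1 + n) / (1 + df(t))) + 1, the printed value can be reproduced by hand. The names n_docs and doc_freq are illustrative and not part of the original question.

```python
import math

n_docs = 3  # number of documents in the corpus
# Number of documents containing each term (counted by hand from the corpus)
doc_freq = {"life": 2, "learning": 2}

# Smoothed idf, assuming TfidfVectorizer's default smooth_idf=True
idf = {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}
print(idf)
```

Both terms occur in two of the three documents, which is why they share the same weight.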

Solution

Thanks to σηγ, I was able to find the answer in this question

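import pandas as pd

# get_feature_names() was removed in scikit-learn 1.2; on 1.0+ use get_feature_names_out()
feature_names = tfidf.get_feature_names_out()
corpus_index = list(corpus)  # document ids: the keys of the corpus dict
# Transpose so that rows are terms and columns are documents
df = pd.DataFrame(tfs.T.toarray(), index=feature_names, columns=corpus_index)
print(df)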
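
For readers who want to run the whole thing in one go, here is a self-contained sketch that combines the question's setup with the answer. It is one possible arrangement, not the only one: it keeps documents as rows (no transpose), and the hasattr fallback and the doc_term name are assumptions added here so the snippet works on both older and newer scikit-learn releases.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

myvocabulary = ['life', 'learning']
corpus = {
    1: "The game of life is a game of everlasting learning",
    2: "The unexamined life is not worth living",
    3: "Never stop learning",
}

tfidf = TfidfVectorizer(vocabulary=myvocabulary, ngram_range=(1, 3))
tfs = tfidf.fit_transform(list(corpus.values()))

# get_feature_names_out() exists in scikit-learn 1.0+; older releases
# only provide get_feature_names().
if hasattr(tfidf, "get_feature_names_out"):
    feature_names = list(tfidf.get_feature_names_out())
else:
    feature_names = tfidf.get_feature_names()

# Rows are documents (keyed by the corpus dict), columns are terms.
doc_term = pd.DataFrame(tfs.toarray(), index=list(corpus), columns=feature_names)
print(doc_term.round(3))
```

With the default L2 normalization, document 1 contains both terms once and gets about 0.707 for each, while documents 2 and 3 each contain a single vocabulary term and get 1.0 for it and 0.0 for the other.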

(Editor: 李大同)
