I'm looking into TextBlob to compute sentiment scores (polarity, subjectivity) for a list of articles that I compiled in an Excel sheet.


11/03/2004 04:03 At least 60 people were killed in three bomb attacks on crowded Madrid trains in Spain's worst-ever terrorist attack, said Efe newswire and other media. Red Cross said at least 200 people were injured. ``This is a massacre,'' said Socialist party leader Jose Luis Rodriguez Zapatero, who blamed Basque terrorist group ETA.

07/07/2005 04:41 London closed its subway system and evacuated all stations after emergency services were called to explosions in and around the financial district.

01/12/2009 04:00 American International Group, Inc. (AIG) today announced that it has closed two previously announced transactions with the Federal Reserve Bank of New York (FRBNY) that have reduced the debt AIG owes the FRBNY by $25 billion in exchange for the FRBNY’s acquisition of preferred equity interests in certain newly formed subsidiaries.

22/08/2013 11:38 NASDAQ shuts down for 3 hours due to a computer problem

I've been able to use TextBlob in the simplest way, by running each line individually:

analysis = TextBlob("NASDAQ shuts down for 3 hours due to a computer problem")

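I'd like to import the Excel file, which holds the date and time and the article text in two columns, then loop over each row to compute the polarity and subjectivity scores and save them to a file.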


import pandas as pd
import numpy as np
from textblob import TextBlob

path_to_file = "C:/Users/Parvesh/Desktop/New Project/Sentiment Analysis/events.csv"
df = pd.read_csv(path_to_file, encoding='latin-1')

df['Polarity'] = np.nan
df['Subjectivity'] = np.nan

pd.options.mode.chained_assignment = None

for idx, articles in enumerate(df['articles'].values):  # for each row in our df dataframe
    sentA = TextBlob("articles")  # pass the text only article to TextBlob to analyze
    df['Polarity'].iloc[idx] = sentA.sentiment.polarity  # write sentiment polarity back to df
    df['Subjectivity'].iloc[idx] = sentA.sentiment.subjectivity  # write sentiment subjectivity score back to df

df.to_csv("out.csv", index=False)



I'm new to Python (I'm using PyCharm). I mostly write code in Stata and Matlab.



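You should move your logic into a function, then apply that function to each row of the DataFrame using .map(). Using .map() or .apply() is faster and cleaner than looping manually.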

import pandas as pd
from textblob import TextBlob

path_to_file = "C:/Users/Parvesh/Desktop/New Project/Sentiment Analysis/events.csv"
df = pd.read_csv(path_to_file, encoding='latin-1')

# function to extract polarity and subjectivity from text
def process_text(text):
    blob = TextBlob(text)
    return blob.sentiment.polarity, blob.sentiment.subjectivity

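# apply to each row of the 'articles' Series using the .map() method;
# zip(*...) splits the resulting (polarity, subjectivity) pairs into two columns
df["polarity"], df["subjectivity"] = zip(*df["articles"].map(process_text))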


df.to_csv("out.csv", index=False)
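If the zip(*...) idiom is unfamiliar, here is a minimal sketch of what it does, using only plain Python. Note that toy_score is a hypothetical stand-in for the TextBlob-based function (its numbers are made up for illustration and are not real sentiment scores); the point is only how a sequence of (polarity, subjectivity) pairs gets split into two columns.

```python
# Hypothetical stand-in for the TextBlob-based scoring function.
# It returns a (polarity, subjectivity) pair; the values are invented.
def toy_score(text):
    words = text.lower().split()
    negative = {"killed", "attack", "problem"}  # illustrative word list only
    hits = sum(w.strip(".,") in negative for w in words)
    polarity = -hits / len(words) if words else 0.0
    subjectivity = 0.5 if hits else 0.0
    return polarity, subjectivity

articles = [
    "NASDAQ shuts down for 3 hours due to a computer problem",
    "London closed its subway system",
]

# One (polarity, subjectivity) tuple per article, like Series.map would give
pairs = [toy_score(a) for a in articles]

# zip(*pairs) transposes the list of pairs into two tuples, one per column
polarity, subjectivity = zip(*pairs)
```

Each of the two resulting tuples has one entry per article, in the original order, which is why they can be assigned directly to two DataFrame columns.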


