python - How do I fix AttributeError: 'float' object has no attribute 'split' in Python?

Tags: python string pandas series attributeerror

When I run the code below, it fails with AttributeError: 'float' object has no attribute 'split'.

I would like to know why this error occurs.

def text_processing(df):

    """""=== Lower case ==="""
    '''First step is to transform comments into lower case'''
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))

    return df

df = text_processing(df)

Full traceback of the error:

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1664, in <module>
    main()
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 53, in <module>
    df = text_processing(df)
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in text_processing
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
  File "C:\Users\L31307\AppData\Roaming\Python\Python37\site-packages\pandas\core\series.py", line 3194, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/_libs/src\inference.pyx", line 1472, in pandas._libs.lib.map_infer
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in <lambda>
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
AttributeError: 'float' object has no attribute 'split'

Best answer

The error points to this line:

df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() \
                                    if x not in stop_words))

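Here split is called as a method of Python's built-in str class. The error indicates that one or more values in df['content'] are of type float. This could be because the column contains a null value, i.e. NaN (which pandas stores as a float), or a non-null float value.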
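To confirm which values are responsible, it can help to inspect the Python types stored in the column. The following is a minimal sketch on made-up sample data (the DataFrame contents are assumptions for illustration, not the asker's data):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one missing value and one numeric value among strings.
df = pd.DataFrame({"content": ["Hello World", np.nan, 3.5, "another comment"]})

# Count how many values of each Python type the column holds.
type_counts = df["content"].map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Select the rows whose value is not a string; these are the ones split() fails on.
non_string_rows = df[~df["content"].map(lambda v: isinstance(v, str))]
print(non_string_rows)
```

If only NaN shows up among the non-string rows, the problem is missing data; if real numbers show up too, the column mixes text with numeric values.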

One workaround is to convert x with str before calling split:

df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in str(x).split() \
                                    if x not in stop_words))
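One caveat with this workaround: str() turns a missing value into the literal text "nan", so NaN entries become ordinary strings. The sketch below shows the effect on made-up data and one possible way to restore the missing values afterwards; the stop_words set and the sample Series are assumptions for illustration.

```python
import numpy as np
import pandas as pd

stop_words = {"the", "a"}  # hypothetical stop-word set
s = pd.Series(["Hello the World", np.nan])

# Same pattern as the workaround: str(x) makes split() available for every value.
cleaned = s.apply(
    lambda x: " ".join(w.lower() for w in str(x).split() if w not in stop_words)
)
print(cleaned.tolist())  # the NaN entry is now the string "nan"

# Keep the cleaned text only where the original value was present.
restored = cleaned.where(s.notna())
print(restored.tolist())
```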

Alternatively, and possibly a better solution, use a named function with an explicit try/except clause:

def converter(x):
    try:
        # x.split() raises AttributeError for non-string values such as NaN
        return ' '.join([w.lower() for w in x.split() if w not in stop_words])
    except AttributeError:
        return None  # or some other value

df['content'] = df['content'].apply(converter)
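A variant of the same idea checks the type up front instead of catching the exception. This is only a sketch of an alternative, not the approach from the answer above; the function name clean_text, the stop_words set, and the sample data are all hypothetical.

```python
import numpy as np
import pandas as pd

stop_words = {"the", "a"}  # hypothetical stop-word set


def clean_text(value):
    """Lowercase a string and drop stop words; return None for non-strings."""
    if not isinstance(value, str):
        return None  # covers NaN as well as genuine floats
    return " ".join(w.lower() for w in value.split() if w not in stop_words)


# Hypothetical sample data with a missing value and a numeric value.
df = pd.DataFrame({"content": ["Hello the World", np.nan, 2.5]})
df["content"] = df["content"].apply(clean_text)
print(df)
```

The explicit isinstance check makes it obvious which inputs are skipped, whereas the try/except version would also swallow an AttributeError raised for any other reason.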

Since pd.Series.apply is just a loop with some overhead, you may find a list comprehension or map more efficient:

# option 1: list comprehension
df['content'] = [converter(x) for x in df['content']]
# option 2: map (equivalent result)
df['content'] = list(map(converter, df['content']))
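Whether the difference matters depends on the data, so it is worth measuring. Below is a rough timing sketch using the standard-library timeit module. The sample Series, its size, and the stop_words set are assumptions, and the converter shown is a version that calls split directly on the value. Timings will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

stop_words = {"the", "a"}  # hypothetical stop-word set


def converter(x):
    try:
        return " ".join(w.lower() for w in x.split() if w not in stop_words)
    except AttributeError:
        return None  # non-string values such as NaN


# Hypothetical data: 30,000 values, a third of them missing.
content = pd.Series(["Hello the World", np.nan, "Another comment"] * 10_000)

# The three approaches produce the same cleaned values.
via_apply = content.apply(converter)
via_listcomp = [converter(x) for x in content]
via_map = list(map(converter, content))

# Time each approach over a few runs and print the totals.
candidates = [
    ("apply", lambda: content.apply(converter)),
    ("list comprehension", lambda: [converter(x) for x in content]),
    ("map", lambda: list(map(converter, content))),
]
for label, func in candidates:
    seconds = timeit.timeit(func, number=5)
    print(f"{label}: {seconds:.3f} s for 5 runs")
```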

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/52736900/
