python - Looking for an efficient way to add dynamic columns to a pandas df using a dictionary

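I am trying to add two new columns to an existing pandas DataFrame. I have implemented this with a Python function containing multiple if/else statements, but I don't think that is the best approach. Could I use a dictionary or some other method to achieve the same result?

I am using the following code to add the new columns: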

import pandas as pd

df = pd.DataFrame({"col_1": [1234567, 45677890, 673214, 6709, 98765, '', 876543]})


def func(col_1):
    # Compare prefixes on the string form of the value
    col_1 = str(col_1)

    if col_1 == "":
        return "NA", ""
    elif col_1[0:3] == '123':
        return "some_text_1 ", " other_text_1"
    elif col_1[0:3] == '456':
        return "some_text_2 ", "other_text_2"
    elif col_1[0:2] == '67':
        return "some_text_3 ", "other_text_3"
    elif col_1[0:1] == '9':
        return "some_text_4 ", "other_text_4"
    else:
        return "Other", "Other"


# map() yields one (col_2, col_3) tuple per row; zip(*...) splits them into two columns
df["col_2"], df["col_3"] = zip(*df["col_1"].map(func))
print(df)


        col_1         col_2          col_3
    0   1234567  some_text_1    other_text_1
    1  45677890  some_text_2    other_text_2
    2    673214  some_text_3    other_text_3
    3      6709  some_text_3    other_text_3
    4     98765  some_text_4    other_text_4
    5                      NA               
    6    876543         Other          Other

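So my question is: given that I have multiple if/else statements, what is the best way to achieve the same result? Should I use a dictionary or some other method? Any pointers would be appreciated.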

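Best answer

Your approach may be slow because it calls a Python function once per row instead of using vectorized operations. Here is another approach: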

# Work on a string view of col_1 so prefixes can be compared
temp = df['col_1'].astype(str)
# Start from the fallback value, then overwrite from lowest to highest priority
df = df.assign(col_2='Other', col_3='Other')
df.loc[temp.str[0] == '9', ['col_2', 'col_3']] = ('some_text_4 ', 'other_text_4')
df.loc[temp.str[0:2] == '67', ['col_2', 'col_3']] = ('some_text_3 ', 'other_text_3')
df.loc[temp.str[0:3] == '456', ['col_2', 'col_3']] = ('some_text_2 ', 'other_text_2')
df.loc[temp.str[0:3] == '123', ['col_2', 'col_3']] = ('some_text_1 ', 'other_text_1')
df.loc[temp == "", ['col_2', 'col_3']] = ("NA", "")
>>> df
      col_1         col_2         col_3
0   1234567  some_text_1   other_text_1
1  45677890  some_text_2   other_text_2
2    673214  some_text_3   other_text_3
3      6709  some_text_3   other_text_3
4     98765  some_text_4   other_text_4
5                      NA              
6    876543         Other         Other

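The idea is to reverse the order of the if/else statements so that the lowest-priority rule is applied first. Later rules take precedence and can overwrite the results of the rules above them.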
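Since the question asks about a dictionary, the same overwrite-in-order idea can be driven by one. The following is only a minimal sketch, not part of the original answer: the `rules` dictionary and its layout are hypothetical, the strings are the answer's values with the stray surrounding spaces removed, and it relies on dictionaries preserving insertion order (Python 3.7+).

```python
import pandas as pd

df = pd.DataFrame({"col_1": [1234567, 45677890, 673214, 6709, 98765, "", 876543]})

# Hypothetical rule table: prefix -> (col_2 value, col_3 value).
# Entries are listed from lowest to highest priority, because each
# later rule overwrites whatever an earlier rule assigned.
rules = {
    "9": ("some_text_4", "other_text_4"),
    "67": ("some_text_3", "other_text_3"),
    "456": ("some_text_2", "other_text_2"),
    "123": ("some_text_1", "other_text_1"),
}

# String view of col_1 so that prefixes can be tested
temp = df["col_1"].astype(str)

# Fallback values for rows that match no rule
df = df.assign(col_2="Other", col_3="Other")

for prefix, (text_2, text_3) in rules.items():
    mask = temp.str.startswith(prefix)  # vectorized prefix test
    df.loc[mask, "col_2"] = text_2
    df.loc[mask, "col_3"] = text_3

# The empty-string case is applied last, so it has the highest priority
empty = temp == ""
df.loc[empty, "col_2"] = "NA"
df.loc[empty, "col_3"] = ""

print(df)
```

Adding or reordering a rule then only means editing `rules`; the loop still runs one vectorized pass per prefix rather than one Python call per row.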

Regarding "python - Looking for an efficient way to add dynamic columns to a pandas df using a dictionary", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45821664/
