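Python: pandas apply vs. map

I am having a hard time understanding how df.apply() actually works.

My problem is as follows: I have a DataFrame df. I want to search several columns for certain strings. If the string is found in any of those columns, I want to add a "label" (in a new column) to every row where it was found.

I can solve the problem with map and applymap (see below).

However, I expected the better solution to be apply, since it applies a function to an entire column.

Question: Is this not possible with `apply`? Where is my mistake?

Here are my solutions using map and applymap.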

import re
import numpy as np
import pandas as pd

df = pd.DataFrame([list("ABCDZ"), list("EAGHY"), list("IJKLA")], columns=["h1", "h2", "h3", "h4", "h5"])

Solution using `map`

def setlabel_func(column):
    return df[column].str.contains("A")

# Summing the boolean Series counts the matches per row; any count above zero is a hit.
mask = sum(map(setlabel_func, ["h1", "h5"])) > 0
df.loc[mask, "New Column"] = "Label"

Solution using `applymap`

# applymap calls the function on every single cell (deprecated in favor of DataFrame.map since pandas 2.1).
mask = df[["h1", "h5"]].applymap(lambda el: bool(re.match("A", el))).any(axis=1)
df.loc[mask, "New Column"] = "Label"

With apply, I don't know how to pass the two columns to the function, or maybe I don't understand the mechanics at all ;-)

def setlabel_func(column):
    return df[column].str.contains("A")

df.apply(setlabel_func(["h1","h5"]),axis = 1)

The above gives me this error:

'DataFrame' object has no attribute 'str'

Any suggestions? Note that the search in my real application is more complex and needs regular expressions, which is why I used .str.contains in the first place.

Best answer

Another solution is to use DataFrame.any to find the rows that contain at least one True:

print(df[['h1', 'h5']].apply(lambda x: x.str.contains('A')))
      h1     h5
0   True  False
1  False  False
2  False   True

print(df[['h1', 'h5']].apply(lambda x: x.str.contains('A')).any(axis=1))
0     True
1    False
2     True
dtype: bool
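
Since the question tried `axis = 1`, it may help to compare the two directions side by side. The following is a minimal sketch, assuming the same sample data as in the question; the names `cols`, `col_wise` and `row_wise` are illustrative only. With the default `axis=0` the function receives one column at a time as a Series, while with `axis=1` it receives one row at a time as a Series indexed by column name.

```python
import pandas as pd

df = pd.DataFrame(
    [list("ABCDZ"), list("EAGHY"), list("IJKLA")],
    columns=["h1", "h2", "h3", "h4", "h5"],
)
cols = ["h1", "h5"]

# Column-wise (default axis=0): the lambda gets each column as a Series,
# returns a boolean Series, and any(axis=1) collapses the result per row.
col_wise = df[cols].apply(lambda col: col.str.contains("A")).any(axis=1)

# Row-wise (axis=1): the lambda gets each row as a Series and must
# return a single value for that row.
row_wise = df[cols].apply(lambda row: row.str.contains("A").any(), axis=1)

print(col_wise.tolist())
print(row_wise.tolist())
```

Both variants flag the same rows here; the column-wise form is usually the cheaper one, because the function runs once per column instead of once per row.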

df['new'] = np.where(df[['h1', 'h5']].apply(lambda x: x.str.contains('A')).any(axis=1),
                     'Label', '')

print(df)
  h1 h2 h3 h4 h5    new
0  A  B  C  D  Z  Label
1  E  A  G  H  Y       
2  I  J  K  L  A  Label
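
The question mentions that the real search needs regular expressions. As a hedged sketch of how that could fit the same approach: `str.contains` interprets its pattern as a regular expression by default, so only the pattern changes. The pattern below is a hypothetical one chosen for illustration (a cell that is exactly "E" or "Z"); it is not taken from the original question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [list("ABCDZ"), list("EAGHY"), list("IJKLA")],
    columns=["h1", "h2", "h3", "h4", "h5"],
)

# Hypothetical pattern for illustration: the whole cell is "E" or "Z".
pattern = r"^[EZ]$"

# regex=True is the default; it is spelled out here to make the intent explicit.
mask = (
    df[["h1", "h5"]]
    .apply(lambda col: col.str.contains(pattern, regex=True))
    .any(axis=1)
)

# Label matching rows, leave the others as an empty string.
df["new"] = np.where(mask, "Label", "")
print(df)
```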

mask = df[['h1', 'h5']].apply(lambda x: x.str.contains('A')).any(axis=1)
df.loc[mask, 'New'] = 'Label'
print(df)
  h1 h2 h3 h4 h5    New
0  A  B  C  D  Z  Label
1  E  A  G  H  Y    NaN
2  I  J  K  L  A  Label
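
As for where the attempt in the question goes wrong, the sketch below may help. In `df.apply(setlabel_func(["h1","h5"]), axis = 1)` the call `setlabel_func(["h1","h5"])` is evaluated first, before `apply` is involved at all. Inside it, `df[["h1","h5"]]` is a DataFrame, and the `.str` accessor exists only on a Series. The names `two_cols`, `error_message` and `one_col_mask` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [list("ABCDZ"), list("EAGHY"), list("IJKLA")],
    columns=["h1", "h2", "h3", "h4", "h5"],
)

# Indexing with a list of labels returns a DataFrame, which has no .str accessor.
two_cols = df[["h1", "h5"]]
try:
    two_cols.str.contains("A")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# Indexing with a single label returns a Series, where .str is available.
one_col_mask = df["h1"].str.contains("A")
print(one_col_mask.tolist())
```

Passing a function object to `apply` (for example a lambda) instead of the result of calling it lets pandas hand over one Series at a time, which is what the answer above does.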

Regarding "Python: pandas apply vs. map", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42175526/
