python - Pandas: Insert a variable substring from column A into column B with the help of a dictionary

I have this pandas DataFrame:

import pandas as pd

df = pd.DataFrame(["LONG AAPL 2X CBZ","SHORT GOOG 10X VON"], columns=["Name"])

I want to identify "AAPL" in the Name column, pass it through a dictionary such as {"AAPL": "Apple"}, and insert the result into a string in a new Description column.

Desired output:

Name                   Description
"LONG AAPL 2X CBZ"     "Tracks Apple with 2X leverage."
"SHORT GOOG 10X VON"   "Tracks Google with -10X leverage."

The part I'm having trouble with is inserting the variable substrings into another string, as in "Tracks X with Y leverage."

If I didn't have to do that, I could simply extract from Name into Description:

df["Description"] = df["Name"].str.extract(r"\s(\S+)\s", expand=False).map({"AAPL":"Apple", "GOOG":"Google"})

Or extract the leverage:

df["Description"] = df["Name"].str.extract(r"(\d+X)")

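If possible, I'd like to use regex to extract the variables, because in practice I'll be writing more elaborate regular expressions, for example to retrieve multipliers in different formats such as X2, 2x, and so on.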

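Note: I may need to set up another column that says whether the leverage is positive or negative, and use it to decide whether to prepend "-" to the multiplier, as in -10X leverage.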

df["Direction"] = df["Name"].map(lambda x: "Long" if "LONG" in x else "Short" if "SHORT " in x else "Long")

Name                   Direction      Description
"LONG AAPL 2X CBZ"     "Long"         "Tracks Apple with 2X leverage."
"SHORT GOOG 10X VON"   "Short"        "Tracks Google with -10X leverage."

Best answer

Since we only care about the first two substrings and the second-to-last one:

import pandas as pd

df = pd.DataFrame(["LONG AAPL 2X CBZ", "SHORT GOOG 10X VON", "BULL AXP UN X3 VON","LONG AXP X3 VON"], columns=["Name"])

maps = {"AAPL": "Apple", "GOOG": "Google"}  # ticker -> company name
signs = {"SHORT": "-"}  # direction -> sign prefix

def split(i):
    spl = i.split()
    # direction, ticker, and the second-to-last token (the multiplier)
    a, b, c = spl[0], spl[1], spl[-2]
    val = maps.get(b, b)  # keep the original ticker if it has no mapping
    return "Tracks  {} with {}{} leverage".format(val, signs.get(a, ""), c)

df["Description"] = df["Name"].map(split)

Output:

                 Name                        Description
0    LONG AAPL 2X CBZ     Tracks  Apple with 2X leverage
1  SHORT GOOG 10X VON  Tracks  Google with -10X leverage
2  BULL AXP UN X3 VON       Tracks  AXP with X3 leverage
3     LONG AXP X3 VON       Tracks  AXP with X3 leverage

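Plain splitting is faster than using a regular expression (f2 in the timing below appears to be a regex-based function from elsewhere in the original thread; it is not defined here):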

In [33]: df2 = pd.concat([df]*10000)
In [34]: timeit  df2["Name"].map(split)
10 loops, best of 3: 57.5 ms per loop

In [35]: timeit f2(df2['Name'])
10 loops, best of 3: 168 ms per loop
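Because f2 is not shown, the comparison above cannot be rerun as written. The following is a minimal sketch of how a similar measurement could be reproduced with the standard timeit module. regex_describe and PATTERN are hypothetical stand-ins for a regex-based approach, not the original f2, and the timings will differ by machine and pandas version.

```python
import re
import timeit

import pandas as pd

maps = {"AAPL": "Apple", "GOOG": "Google"}
signs = {"SHORT": "-"}

def split(name):
    # Same logic as the answer: first, second and second-to-last tokens.
    parts = name.split()
    direction, ticker, leverage = parts[0], parts[1], parts[-2]
    return "Tracks  {} with {}{} leverage".format(
        maps.get(ticker, ticker), signs.get(direction, ""), leverage)

# Hypothetical regex counterpart (an assumption, not the original f2):
# captures the direction, the ticker, and the second-to-last token.
PATTERN = re.compile(r"^(\S+)\s+(\S+)\s+(?:.*\s)?(\S+)\s+\S+$")

def regex_describe(name):
    direction, ticker, leverage = PATTERN.match(name).groups()
    return "Tracks  {} with {}{} leverage".format(
        maps.get(ticker, ticker), signs.get(direction, ""), leverage)

df = pd.DataFrame(
    ["LONG AAPL 2X CBZ", "SHORT GOOG 10X VON",
     "BULL AXP UN X3 VON", "LONG AXP X3 VON"],
    columns=["Name"],
)
names = pd.concat([df] * 10000)["Name"]

# Time three passes of each approach over the repeated data.
t_split = timeit.timeit(lambda: names.map(split), number=3)
t_regex = timeit.timeit(lambda: names.map(regex_describe), number=3)
print("split: {:.3f}s  regex: {:.3f}s".format(t_split, t_regex))
```

Both functions produce the same descriptions for these four name layouts, so any difference in the printed times comes from the parsing step alone.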

If you want to replace more words, just add them to maps; the same goes for adding entries to signs.
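If the multiplier can appear in several formats (2X, 2x, X2), as the question mentions, a regex-based variant may be easier to extend than positional splitting. The sketch below is one possible approach, not the answer's method: it uses Series.str.extract with named groups, and the pattern and the group names (direction, ticker, pre, post) are assumptions chosen for illustration. It also normalizes the multiplier to the "2X" form and adds the trailing period from the desired output.

```python
import re

import pandas as pd

df = pd.DataFrame(
    ["LONG AAPL 2X CBZ", "SHORT GOOG 10X VON",
     "BULL AXP UN X3 VON", "LONG AAPL 2x CBZ"],
    columns=["Name"],
)
maps = {"AAPL": "Apple", "GOOG": "Google"}
signs = {"SHORT": "-"}

# Assumed layout: direction, ticker, then somewhere later a multiplier
# written either as X<digits> (group "pre") or <digits>X (group "post").
pattern = (
    r"^(?P<direction>\S+)\s+(?P<ticker>\S+)\b.*?"
    r"\b(?:X(?P<pre>\d+)|(?P<post>\d+)X)\b"
)

parts = df["Name"].str.extract(pattern, flags=re.IGNORECASE)

# Only one of pre/post is filled per row; merge them into a single column.
multiplier = parts["pre"].fillna(parts["post"])
# Fall back to the raw ticker when it has no entry in maps.
company = parts["ticker"].map(maps).fillna(parts["ticker"])
# Directions without an entry in signs get no prefix.
sign = parts["direction"].str.upper().map(signs).fillna("")

df["Description"] = "Tracks " + company + " with " + sign + multiplier + "X leverage."
print(df)
```

Rows that do not match the pattern end up with a missing Description, so it may be worth checking for NaN values before relying on the result.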

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/30957101/
