python - Pandas: Conditionally generating a description from column contents

Tags: python pandas

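I'm trying to fix some problems with a function that uses pandas regex (via str.extract) on each row of the "name" column to generate a "description" column. I'm using regex rather than split because the code must be able to handle a variety of formats.

The function needs to be modified to handle several conditions.

DataFrame: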

import pandas as pd
import re

df = pd.DataFrame(["LONG AXP UN X3 VON", "SHORT BIDU UN 5x VON", "SHORT GOOG VON", "LONG GOOG VON"], columns=["name"])

Input:

name
"LONG AXP UN X3 VON"
"SHORT BIDU UN 5x VON"
"SHORT GOOG VON"
"LONG GOOG VON"

Current code:

description_map = {"AXP":"American Express", "BIDU":"Baidu"}
sign_map = {"LONG": "", "SHORT": "-"}
def f(strseries):
    # expand=False makes str.extract return a Series, so .map and "+" work element-wise
    stock = strseries.str.extract(r"\s(\S+)\s", expand=False).map(description_map)
    leverage = strseries.str.extract(r"(X\d+|\d+X)\s", flags=re.IGNORECASE, expand=False)
    sign = strseries.str.extract(r"(\S+)\s", expand=False).map(sign_map)
    return "Tracks " + stock + " with " + sign + leverage + " leverage"

df["description"] = f(df["name"])

Current output:

name                        description
"LONG AXP UN X3 VON"        "Tracks American Express with X3 leverage"
"SHORT BIDU UN 5x VON"      "Tracks Baidu with -5x leverage"
"SHORT GOOG VON"            ""
"LONG GOOG VON"             ""
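A plausible reason the last two rows lose their description entirely: when a lookup or extraction finds nothing, pandas yields NaN, and concatenating a string with NaN gives NaN for the whole expression. The sketch below reproduces this on a two-row Series; the names `tickers`, `broken` and `fixed` are illustrative and not part of the question.

```python
import pandas as pd

description_map = {"AXP": "American Express", "BIDU": "Baidu"}
tickers = pd.Series(["AXP", "GOOG"])

# "GOOG" is not a key in description_map, so map() yields NaN for it.
stock = tickers.map(description_map)

# Concatenating a string with NaN gives NaN, so the whole description is lost.
broken = "Tracks " + stock

# Replacing NaN with "" first keeps the rest of the sentence.
fixed = "Tracks " + stock.fillna("")

print(broken.tolist())
print(fixed.tolist())
```

Filling the missing pieces with empty strings before concatenating is one way to keep partial descriptions such as "Tracks".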

Desired output:

name                        description
"LONG AXP UN X3 VON"        "Tracks American Express with 3x leverage"
"SHORT BIDU UN 5x VON"      "Tracks Baidu inversely with -5x leverage"
"SHORT GOOG VON"            "Tracks inversely"
"LONG GOOG VON"             "Tracks"

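Issues:

  • If sign is "-", how can I get it to add direction = "inversely" to the string?
  • If no stock in name matches the description_map dictionary: set stock = "" and return the string.
  • If no leverage is found in name: omit the "with " + sign + leverage + " leverage" part.
  • Split and reorder sign + leverage so that it always appears in the "-5x" order, even when the input is "SHORT X5".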
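The last point can be looked at in isolation. As a minimal sketch, assuming the leverage token has already been extracted into its own Series (the sample values and the names `leverage`, `sign` and `normalised` are made up for illustration), a regex replacement with a back-reference moves a leading "x" behind the digits:

```python
import pandas as pd

leverage = pd.Series(["X3", "5x", "x10", "2X"])
sign = pd.Series(["", "-", "-", ""])

# Move a leading "x"/"X" behind the digits, then lower-case the result.
normalised = leverage.str.replace(r"(?i)^x(\d+)$", r"\1x", regex=True).str.lower()

# Prefixing the sign afterwards always gives the "-5x" order.
signed = sign + normalised
print(signed.tolist())
```

Tokens that are already in the "5x" form do not match the pattern and are only lower-cased.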

Best answer

I spent some time writing this function:

import re

description_map = {"AXP": "American Express", "BIDU": "Baidu"}

# The second whitespace-delimited token is the ticker, e.g. "AXP" in "LONG AXP UN X3 VON".
stock_match = re.compile(r"\s(\S+)\s")
# Leverage is written as "X3" or "3x" in either case; capture only the digits.
leverage_match = re.compile(r"\b(?:x(\d+)|(\d+)x)\b", re.IGNORECASE)

def f(value):
    # Look up the ticker; fall back to "" when it is missing or unknown.
    m = stock_match.search(value)
    stock = description_map.get(m.group(1), "") if m else ""

    # Normalise the leverage to "<digits>x" regardless of the input order.
    m = leverage_match.search(value)
    leverage = (m.group(1) or m.group(2)) + "x" if m else ""

    sign = "-" if "SHORT" in value else ""

    parts = ["Tracks"]
    if stock:
        parts.append(stock)
    if sign:
        parts.append("inversely")
    # Without a known stock the description stops here, as in the desired output.
    if stock and leverage:
        parts.append("with {}{} leverage".format(sign, leverage))
    return " ".join(parts)

df["description"] = df["name"].map(f)

Output:

In [97]: %paste
import pandas as pd
import re

df = pd.DataFrame(["LONG AXP UN X3 VON", "SHORT BIDU UN 5x VON", "SHORT GOOG VON", "LONG GOOG VON"], columns=["name"])

## -- End pasted text --

In [98]: df
Out[98]: 
                   name
0    LONG AXP UN X3 VON
1  SHORT BIDU UN 5x VON
2        SHORT GOOG VON
3         LONG GOOG VON

In [99]: df["description"] = df["name"].map(lambda x:f(x))

In [100]: df
Out[100]: 
                   name                               description
0    LONG AXP UN X3 VON  Tracks American Express with 3x leverage
1  SHORT BIDU UN 5x VON  Tracks Baidu inversely with -5x leverage
2        SHORT GOOG VON                          Tracks inversely
3         LONG GOOG VON                                    Tracks
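Since the question asked for a str.extract-based approach, here is a vectorised sketch that produces the same four descriptions without calling a Python function per row. It is an alternative under stated assumptions, not the answer's method: the name `describe` and the intermediate variable names are hypothetical, and, like `f`, it mentions leverage only when the stock is known.

```python
import pandas as pd

description_map = {"AXP": "American Express", "BIDU": "Baidu"}

def describe(names):
    # Ticker -> long name; unknown or missing tickers become "".
    stock = names.str.extract(r"\s(\S+)\s", expand=False).map(description_map).fillna("")
    # Digits of the leverage, whichever side the "x" is on; NaN when absent.
    digits = names.str.extract(r"(?i)\b(?:x(\d+)|(\d+)x)\b")
    multiplier = digits[0].fillna(digits[1])
    short = names.str.startswith("SHORT")

    has_stock = stock != ""
    # Like the answer's f, mention leverage only when the stock is known.
    has_leverage = has_stock & multiplier.notna()

    stock_part = (" " + stock).where(has_stock, "")
    direction = short.map({True: " inversely", False: ""})
    sign = short.map({True: "-", False: ""})
    leverage_part = (" with " + sign + multiplier.fillna("") + "x leverage").where(has_leverage, "")
    return "Tracks" + stock_part + direction + leverage_part

df = pd.DataFrame(
    ["LONG AXP UN X3 VON", "SHORT BIDU UN 5x VON", "SHORT GOOG VON", "LONG GOOG VON"],
    columns=["name"],
)
df["description"] = describe(df["name"])
print(df["description"].tolist())
```

Each optional fragment is built for every row and then blanked out with `where` for rows where it does not apply, which avoids the NaN propagation seen in the question's version.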

Regarding python - Pandas: Conditionally generating a description from column contents, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31054904/
