python - Filter df values inside quotes

Tags: python python-3.x pandas dataframe lambda

I build a df from the output of a command-line call with the following code:

import os
import pandas as pd

df_output_lines = [s.split() for s in os.popen("my command linecode").read().splitlines()]
df_output_lines = list(filter(None, df_output_lines))

and convert it to a DataFrame:

df=pd.DataFrame(df_output_lines)
df

The data has the following format:

abc = pd.DataFrame([
    ['time:"08:59:38.000"', 'instance:"(null)"', 'id:"3214039276626790405"'],
    ['time:"08:59:38.000"', 'instance:"(Ops-MacBook-Pro.local)"', 'id:"3214039276626790405"'],
    ['time:"08:59:38.000"', 'instance:"(Ops-MacBook-Pro.local)"', 'id:"3214039276626790405"'],
])
abc


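I want to filter it so that the text before the colon (:) becomes the column name and the text inside the double quotes becomes the value, and the same applies to every column. The output should look like this: [image: expected output]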

So far, this is what I have tried:

abc.rename(columns={0:'time',1:'instance',2:'id'},inplace=True)

then

abc['time'] = abc['time'].map(lambda x: str(x)[:-1])
abc['time'] = abc['time'].map(lambda x: str(x)[6:])

abc['instance'] = abc['instance'].map(lambda x: str(x)[:-1])
abc['instance'] = abc['instance'].map(lambda x: str(x)[10:])

abc['id'] = abc.id.str.extract(r'(\d+)', expand=True).astype(int)

Any suggestions for a lambda expression or a one-liner that does this?

My raw log output looks like this:

    time:"11:22:20.000" instance:"(null)" id:"723927731576482920" channel:"sip:confctl.com" type:"control" elapsedtime:"0.000631" level:"info" operation:"Init" message:"Initialize (version 4.9.0002.30618) ... "

    time:"11:22:21.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl.com" type:"control" elapsedtime:"0.067122" level:"info" operation:"Connect" message:"Connecting to https://hrpd.www.vivox.com/api2/"

    time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.685700" level:"info" operation:"Connect" message:"Connected to https://hrpd.www.vivox.com/api2/"

    time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.814268" level:"info" operation:"Login" message:"Logged in .tester_food."

    time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.912255" level:"error" operation:"Call" message:".tester_food. failed to join sip:confctl-2@hrpd.vivox.com error:Access token has invalid signature(403)"

    time:"12:30:41.000" instance:"Ops-MacBook-Pro.local" id:"10316899144153251411" channel:"sip:confctl-2@hrpd.vivox.com" type:"media" sampleperiod:"0.000000" incomingpktsreceived:"0" incomingpktsexpected:"0" incomingpktsloss:"0" incomingpktssoutoftime:"0" incomingpktsdiscarded:"0" outgoingpktssent:"0" predictedmos:"3" latencypktssent:"0" latencycount:"0" latencysum:"0.000000" latencymin:"0.000000" latencymax:"0.000000" callid:"2477580077" r_factor:"0.000000"

Best answer

Feed a list of dictionaries to pd.DataFrame

The pd.DataFrame constructor accepts a list of dictionaries directly. You can use str.rstrip and str.split within a list comprehension:

res = pd.DataFrame([dict(i.rstrip('"').split(':"') for i in row) for row in abc.values])

print(res)

                    id                 instance          time
0  3214039276626790405                   (null)  08:59:38.000
1  3214039276626790405  (Ops-MacBook-Pro.local)  08:59:38.000
2  3214039276626790405  (Ops-MacBook-Pro.local)  08:59:38.000

It is not clear what logic you use to decide that only the 'null' string should stay enclosed in parentheses.
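The one-liner above works on rows that were already split on whitespace, which breaks down for the raw log lines in the question because values such as message:"Initialize (version ...) ... " contain spaces. A minimal sketch of an alternative, assuming every field has the form key:"value" and that no value contains a double quote, is to parse each raw line with a regular expression before building the DataFrame. The names raw_lines, PAIR and parse_line are hypothetical, and the sample lines are abbreviated from the log in the question:

```python
import re

import pandas as pd

# Hypothetical sample: two raw log lines, abbreviated from the question.
raw_lines = [
    'time:"11:22:20.000" instance:"(null)" id:"723927731576482920" '
    'level:"info" message:"Initialize (version 4.9.0002.30618) ... "',
    'time:"11:22:21.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" '
    'level:"info" message:"Connecting to https://hrpd.www.vivox.com/api2/"',
]

# Matches key:"value" pairs. The value may contain spaces and colons,
# but (by assumption) never a double quote.
PAIR = re.compile(r'(\w+):"([^"]*)"')


def parse_line(line):
    """Return a dict mapping each key in the line to its quoted value."""
    return dict(PAIR.findall(line))


# Skip blank lines, then let pd.DataFrame align the dicts by key.
records = [parse_line(line) for line in raw_lines if line.strip()]
res = pd.DataFrame(records)

print(res[['time', 'instance', 'level']])
```

Because the DataFrame is built from dictionaries, lines with different sets of keys (such as the type:"media" line in the question) would still line up by column name, with NaN where a key is missing. All values stay strings here, so converting id or elapsedtime to a numeric type is left as a separate step.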

Regarding python - Filter df values inside quotes, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53378687/
