I have a small program file; here is the relevant code:
import numpy as np
import pandas as pd
from docx import Document
#### Setup the file names, also make provisions for having the user select the file ####
SHRD_filename = "SHRD - SVN 12485.docx"
SHDD_filename = "SHDD - SVN 12485.doc"
#SHRD_name = PCB_utility.get_file('Select SHRD file')
#SHDD_name = PCB_utility.get_file('Select SHDD file')
data = []
keys = {}
document_SHRD = Document(SHRD_filename)
tables_SHRD = document_SHRD.tables[30]
for i, row in enumerate(tables_SHRD.rows):
    text = (cell.text for cell in row.cells)
    # The first row holds the column headers
    if i == 0:
        keys = tuple(text)
        continue
    row_data = dict(zip(keys, text))
    data.append(row_data)
df_SHRD = pd.DataFrame.from_dict(data)
#cols = df_SHRD.columns.tolist()
print(df_SHRD.tail(20))
s = df_SHRD['HLR Trace Tag'].str.split(' ').apply(pd.Series, 1).stack()
s.index = s.index.droplevel(-1)
s.name = 'HLR Tags'
del df_SHRD['HLR Trace Tag']
df_SHRD.join(s)
When I first build the DataFrame, it looks like this:
300 HLR-0000094 HLR-0000095 HLR-0000340 LRU-0000440
301 HLR-0000094 HLR-0000095 HLR-0000341 LRU-0000441
302 HLR-0000094 HLR-0000095 HLR-0000342 LRU-0000442
303 HLR-0000675 LRU-0000745
304 HLR-0000676 LRU-0000746
305 HLR-0000677 LRU-0000747
306 HLR-0000678 LRU-0000748
307 HLR-0000679 LRU-0000749
308 HLR-0000680 LRU-0000750
I need to split the HLR tags into separate rows. At the end of my program, it returns the following:
300 LRU-0000440
301 LRU-0000441
302 LRU-0000442
303 LRU-0000745
304 LRU-0000746
305 LRU-0000747
306 LRU-0000748
307 LRU-0000749
308 LRU-0000750
But when I re-enter the command:
In [25]:df_SHRD.join(s)
Out[25]:
300 LRU-0000440 HLR-0000094
300 LRU-0000440 HLR-0000095
300 LRU-0000440 HLR-0000340
301 LRU-0000441 HLR-0000094
301 LRU-0000441 HLR-0000095
301 LRU-0000441 HLR-0000341
302 LRU-0000442 HLR-0000094
302 LRU-0000442 HLR-0000095
302 LRU-0000442 HLR-0000342
303 LRU-0000745 HLR-0000675
304 LRU-0000746 HLR-0000676
305 LRU-0000747 HLR-0000677
306 LRU-0000748 HLR-0000678
307 LRU-0000749 HLR-0000679
308 LRU-0000750 HLR-0000680
[457 rows x 2 columns]
Any help explaining why the command works in the IPython window but not in the script would be greatly appreciated.
Best answer
DataFrame.join(other, ...)
Join columns with other DataFrame either on index or on a key column. Efficiently Join multiple DataFrame objects by index at once by passing a list.
Returns: joined : DataFrame
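join is not an in-place operation. It returns a new DataFrame, so to keep the result you must assign it to a variable: df = df_SHRD.join(s)
IPython displays the value of an expression even without a print call, whereas a script does not; that is simply how IPython's REPL works. In either case, you must assign the result back. Try printing df_SHRD.join(s) in IPython and then printing df_SHRD, and you will see that df_SHRD is unchanged.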
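As a minimal sketch of the point above, the following reproduces the behaviour on a tiny hand-made frame. The column name "LRU Tag" and the two sample rows are assumptions for illustration only (the question does not show the real column name), and Series.explode (pandas 0.25 or later) is swapped in for the question's apply(pd.Series).stack() chain to keep the example short.

```python
import pandas as pd

# Hypothetical stand-in for the table read from the .docx file;
# "LRU Tag" is an assumed column name, not taken from the question.
df_SHRD = pd.DataFrame(
    {
        "HLR Trace Tag": ["HLR-0000094 HLR-0000095 HLR-0000340", "HLR-0000675"],
        "LRU Tag": ["LRU-0000440", "LRU-0000745"],
    },
    index=[300, 303],
)

# One tag per row, still indexed by the original row label
s = df_SHRD["HLR Trace Tag"].str.split(" ").explode()
s.name = "HLR Tags"
del df_SHRD["HLR Trace Tag"]

# join returns a new DataFrame and leaves df_SHRD untouched
joined = df_SHRD.join(s)

print(df_SHRD)  # still only the "LRU Tag" column
print(joined)   # one row per HLR tag

# In a script, keep the result by assigning it back
df_SHRD = joined
```

In a script, only the explicit print calls produce output; a bare df_SHRD.join(s) line computes the joined frame and then discards it.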
Regarding "python - Dataframe command works in IPython but not in script", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/46990457/