python - Merge pandas DataFrame with NaN to find missing rows

Tags: python pandas dataframe join merge

I want to use my reference calendar as a scaffold to fill in the rows missing from my main data. To do that, I want to join the two DataFrames.

import pandas as pd
import numpy as np

d1 = { 'Year': [2019,2019,2019,2019,2019,2019],
        'Week': [1,2,3,5,5,6],
        'Part': ['A','A','A','A','B','B'],
        'Static': [20,20,20,20,40,40],
        'Value': [np.nan,10,np.nan,50,30,np.nan] }

d2 = { 'Year':[2019,2019,2019,2019,2019,2019,2019,2019,2019,2019],
        'Week':[1,2,3,4,5,6,7,8,9,10] }

df1 = pd.DataFrame(d1)
df2 = pd.DataFrame(d2)

The expected output is as follows:

    Year  Week Part  Static  Value
0   2019     1    A      20    NaN
1   2019     2    A      20   10.0
2   2019     3    A      20    NaN
3   2019     4    A      20    NaN
4   2019     5    A      20   50.0
5   2019     6    A      20    NaN
6   2019     7    A      20    NaN
7   2019     8    A      20    NaN
8   2019     9    A      20    NaN
9   2019    10    A      20    NaN
10  2019     1    B      40    NaN
11  2019     2    B      40    NaN
12  2019     3    B      40    NaN
13  2019     4    B      40    NaN
14  2019     5    B      40   30.0
15  2019     6    B      40    NaN
16  2019     7    B      40    NaN
17  2019     8    B      40    NaN
18  2019     9    B      40    NaN
19  2019    10    B      40    NaN

Best answer

Comments are inline.

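# First, replicate `df2` for each unique Part (a cross join on a constant key).
df3 = (df2.assign(Key=1)
          .merge(pd.DataFrame({'Part': df1.Part.unique(), 'Key': 1}), on='Key')
          .drop(columns='Key'))  # keyword form; the positional axis argument fails on pandas 2.x
df3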

    Year  Week Part
0   2019     1    A
1   2019     1    B
2   2019     2    A
3   2019     2    B
4   2019     3    A
5   2019     3    B
6   2019     4    A
7   2019     4    B
8   2019     5    A
9   2019     5    B
10  2019     6    A
11  2019     6    B
12  2019     7    A
13  2019     7    B
14  2019     8    A
15  2019     8    B
16  2019     9    A
17  2019     9    B
18  2019    10    A
19  2019    10    B
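
On pandas 1.2 or later, the same replication can be written without the temporary `Key` column by using `how='cross'`. The snippet below is a minimal sketch of that variant, not part of the original answer; the name `df3_cross` is only illustrative, and the parts are written out by hand so the block runs on its own.

```python
import pandas as pd

# Same calendar as in the question.
df2 = pd.DataFrame({'Year': [2019] * 10, 'Week': list(range(1, 11))})

# Unique parts, hard-coded here in place of df1.Part.unique()
# so that this sketch is self-contained.
parts = pd.DataFrame({'Part': ['A', 'B']})

# how='cross' pairs every calendar row with every part (requires pandas >= 1.2).
df3_cross = df2.merge(parts, how='cross')
print(df3_cross.shape)
```

This should give one row per (Week, Part) pair, the same scaffold as `df3` above.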

# Next, perform left outer merge with `df1`.     
df3.merge(df1, on=['Year', 'Week', 'Part'], how='left')

    Year  Week Part  Static  Value
0   2019     1    A    20.0    NaN
1   2019     1    B     NaN    NaN
2   2019     2    A    20.0   10.0
3   2019     2    B     NaN    NaN
4   2019     3    A    20.0    NaN
5   2019     3    B     NaN    NaN
6   2019     4    A     NaN    NaN
7   2019     4    B     NaN    NaN
8   2019     5    A    20.0   50.0
9   2019     5    B    40.0   30.0
10  2019     6    A     NaN    NaN
11  2019     6    B    40.0    NaN
12  2019     7    A     NaN    NaN
13  2019     7    B     NaN    NaN
14  2019     8    A     NaN    NaN
15  2019     8    B     NaN    NaN
16  2019     9    A     NaN    NaN
17  2019     9    B     NaN    NaN
18  2019    10    A     NaN    NaN
19  2019    10    B     NaN    NaN
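
The merged result above still differs from the expected output in two ways: `Static` is NaN on the rows that were missing from `df1`, and the rows are ordered by week rather than by part. One possible way to close that gap is sketched below. It assumes `Static` is constant within each `Part`, which holds for the sample data but is not stated in the question; the names `static_by_part`, `merged`, and `result` are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'Year': [2019] * 6,
    'Week': [1, 2, 3, 5, 5, 6],
    'Part': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Static': [20, 20, 20, 20, 40, 40],
    'Value': [np.nan, 10, np.nan, 50, 30, np.nan],
})
df2 = pd.DataFrame({'Year': [2019] * 10, 'Week': list(range(1, 11))})

# Rebuild the scaffold: one calendar row per (Week, Part) pair.
parts = pd.DataFrame({'Part': df1['Part'].unique(), 'Key': 1})
df3 = df2.assign(Key=1).merge(parts, on='Key').drop(columns='Key')

# Left merge keeps every scaffold row; unmatched rows get NaN.
merged = df3.merge(df1, on=['Year', 'Week', 'Part'], how='left')

# Assumption: Static is constant within each Part, so a per-Part
# lookup can fill the gaps left by the merge.
static_by_part = df1.drop_duplicates('Part').set_index('Part')['Static']
merged['Static'] = merged['Part'].map(static_by_part)

# Sort by Part, then Week, to match the row order in the question.
result = merged.sort_values(['Part', 'Week']).reset_index(drop=True)
print(result)
```

If `Static` can vary within a part, the lookup step would need a different rule (for example a forward fill within each group), so treat this as a starting point rather than a general solution.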

A similar question about "python - Merge pandas DataFrame with NaN to find missing rows" can be found on Stack Overflow: https://stackoverflow.com/questions/54544460/
