python - Adding a Series to a Pandas DataFrame produces a column of NaN

Using this dataset (some columns and several hundred rows omitted for brevity):

   Year       Ceremony  Award         Winner  Name
0  1927/1928  1         Best Actress  0.0     Louise Dresser
1  1927/1928  1         Best Actress  1.0     Janet Gaynor
2  1937       10        Best Actress  0.0     Janet Gaynor
3  1927/1928  1         Best Actress  0.0     Gloria Swanson
4  1929/1930  3         Best Actress  0.0     Gloria Swanson
5  1950       23        Best Actress  0.0     Gloria Swanson

I used the following command:

ba_dob.loc[ba_dob.Winner == 0.0, :].groupby('Name').Winner.count()

to create the following Series:

Name
Ali MacGraw                1
Amy Adams                  1
Angela Bassett             1
Angelina Jolie             1
Anjelica Huston            1
Ann Harding                1
Ann-Margret                1
Anna Magnani               1
Anne Bancroft              4
Anne Baxter                1
Anne Hathaway              1
Annette Bening             3
Audrey Hepburn             4

I tried to add the Series to the original DataFrame like this:

ba_dob['New_Col'] = ba_dob.loc[ba_dob.Winner == 0.0, :].groupby('Name').Winner.count()

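and got a column of NaN values.

I have read other posts suggesting that some kind of index mismatch may be at work, but I'm not sure how that would be resolved here. More specifically, why can't Pandas line up the indexes, given that the groupby and count come from the same table? Is something else going on?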

Best answer

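I think you need size rather than count, because count excludes NaN values.

Then map the Series created by the groupby onto the Name column, so that each value is looked up by name rather than aligned against the DataFrame's integer index: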

m = ba_dob.Winner == 0.0  # boolean mask: rows where the nominee did not win
ba_dob['new'] = ba_dob['Name'].map(ba_dob[m].groupby('Name').Winner.size())
print(ba_dob)
        Year  Ceremony         Award  Winner            Name  new
0  1927/1928         1  Best Actress     0.0  Louise Dresser    1
1  1927/1928         1  Best Actress     1.0    Janet Gaynor    1
2       1937        10  Best Actress     0.0    Janet Gaynor    1
3  1927/1928         1  Best Actress     0.0  Gloria Swanson    3
4  1929/1930         3  Best Actress     0.0  Gloria Swanson    3
5       1950        23  Best Actress     0.0  Gloria Swanson    3
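
To see why the direct assignment in the question yields NaN, here is a minimal sketch that rebuilds only the Winner and Name columns of the six sample rows (the variable names `losses`, `direct` and `mapped` are made up for illustration). The grouped Series is indexed by name, while the DataFrame is indexed by 0..5, so column assignment finds no matching labels:

```python
import pandas as pd

# Only the two columns needed for the illustration, taken from the sample rows.
ba_dob = pd.DataFrame({
    "Winner": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    "Name": ["Louise Dresser", "Janet Gaynor", "Janet Gaynor",
             "Gloria Swanson", "Gloria Swanson", "Gloria Swanson"],
})

# Series indexed by Name: one entry per actress with at least one loss.
losses = ba_dob[ba_dob.Winner == 0.0].groupby("Name").Winner.size()

# Direct assignment aligns on index labels. The DataFrame's labels are
# 0..5 and the Series' labels are names, so nothing matches.
direct = ba_dob.copy()
direct["New_Col"] = losses
print(direct["New_Col"].isna().all())

# map() looks each Name value up in the Series instead of aligning indexes.
mapped = ba_dob["Name"].map(losses)
print(mapped.tolist())
```

The groupby result comes from the same table, but its index is the grouping key, not the original row labels, which is why the two cannot be lined up automatically.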

Another solution:

ba_dob['new'] = ba_dob['Name'].map(ba_dob.loc[m, 'Name'].value_counts())
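
As a quick check, the sketch below runs both solutions on the six sample rows and confirms they produce the same column. It also shows a third variant based on `groupby(...).transform('sum')`, which is not part of the answer and is included only as a possible alternative; the names `via_size`, `via_counts` and `via_transform` are hypothetical.

```python
import pandas as pd

# Only the two columns needed for the illustration, taken from the sample rows.
ba_dob = pd.DataFrame({
    "Winner": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    "Name": ["Louise Dresser", "Janet Gaynor", "Janet Gaynor",
             "Gloria Swanson", "Gloria Swanson", "Gloria Swanson"],
})
m = ba_dob.Winner == 0.0

# The two solutions from the answer.
via_size = ba_dob["Name"].map(ba_dob[m].groupby("Name").Winner.size())
via_counts = ba_dob["Name"].map(ba_dob.loc[m, "Name"].value_counts())

# Alternative (an assumption, not from the answer): sum the mask per name.
# transform() returns a result aligned with the original row index, so it
# can be assigned directly, and names with no losses would get 0, not NaN.
via_transform = m.astype(int).groupby(ba_dob["Name"]).transform("sum")

print(via_size.tolist())
print(via_counts.tolist())
print(via_transform.tolist())
```

With `map`, a name that never appears among the filtered rows would come back as NaN, so the transform variant may be preferable when a zero count is wanted for those rows.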

Regarding "python - Adding a Series to a Pandas DataFrame produces a column of NaN", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45599279/
