           A          B      C
0 2002-01-16 2002-02-28   Jack
1 2002-01-16 2002-01-30  Helen
2 2002-01-16 2002-02-28  Peter
3 2002-01-16 2002-01-30    Jud
4 2002-04-27 2002-04-30   Nick
5 2002-04-27 2002-05-25  Wendy
6 2002-04-27 2002-04-30  Bryan
7 2002-04-27 2002-05-25  Sarah
I want to select, for each A group, the row(s) whose B date is closest in time to the A date.
The output should be:
           A          B      C
1 2002-01-16 2002-01-30  Helen
3 2002-01-16 2002-01-30    Jud
4 2002-04-27 2002-04-30   Nick
6 2002-04-27 2002-04-30  Bryan
Best answer
Use:
df = df[df['B'].sub(df['A']).groupby(df['A']).transform(lambda x: x == x.min())]
print(df)
           A          B      C
1 2002-01-16 2002-01-30  Helen
3 2002-01-16 2002-01-30    Jud
4 2002-04-27 2002-04-30   Nick
6 2002-04-27 2002-04-30  Bryan
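For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and applies the same expression in separate steps. The names gap, mask and closest are only illustrative, and it assumes A and B are already datetime columns (here via pd.to_datetime).

```python
import pandas as pd

# Sample data copied from the question; A and B are parsed as datetimes.
df = pd.DataFrame({
    "A": pd.to_datetime(["2002-01-16"] * 4 + ["2002-04-27"] * 4),
    "B": pd.to_datetime([
        "2002-02-28", "2002-01-30", "2002-02-28", "2002-01-30",
        "2002-04-30", "2002-05-25", "2002-04-30", "2002-05-25",
    ]),
    "C": ["Jack", "Helen", "Peter", "Jud", "Nick", "Wendy", "Bryan", "Sarah"],
})

# Gap between B and A for every row (a timedelta Series).
gap = df["B"].sub(df["A"])

# True where a row's gap equals the smallest gap within its A group.
mask = gap.groupby(df["A"]).transform(lambda x: x == x.min())

# Keep only the rows flagged True; ties within a group are all kept.
closest = df[mask]
print(closest)
```

Assigning the result to a new name instead of overwriting df keeps the original frame available for inspecting the intermediate steps.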
Details (computed on the original, unfiltered df):
print(df['B'].sub(df['A']))
0   43 days
1   14 days
2   43 days
3   14 days
4    3 days
5   28 days
6    3 days
7   28 days
dtype: timedelta64[ns]
print(df['B'].sub(df['A']).groupby(df['A']).transform(lambda x: x == x.min()))
0    False
1     True
2    False
3     True
4     True
5    False
6     True
7    False
dtype: bool
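One caveat: the expression above compares signed differences, which matches the question's data because every B date falls after its A date. If B can also fall before A, a possible variant is to take the absolute difference first. The sketch below uses hypothetical data (not from the question) and swaps the lambda for transform("min") followed by a comparison; df2, abs_gap and mask2 are illustrative names.

```python
import pandas as pd

# Hypothetical data (not from the question): some B dates fall before A.
df2 = pd.DataFrame({
    "A": pd.to_datetime(["2002-01-16", "2002-01-16", "2002-04-27", "2002-04-27"]),
    "B": pd.to_datetime(["2002-01-10", "2002-02-28", "2002-04-30", "2002-03-01"]),
    "C": ["w", "x", "y", "z"],
})

# Absolute gap, so the sign of B - A does not matter.
abs_gap = df2["B"].sub(df2["A"]).abs()

# Broadcast each group's smallest absolute gap back to its rows, then compare.
mask2 = abs_gap.eq(abs_gap.groupby(df2["A"]).transform("min"))

print(df2[mask2])
```

With the signed difference, the second group would pick row z (57 days before A) instead of row y (3 days after A), which is why the absolute value matters for data like this.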
For more on python - selecting rows by group by the minimum absolute difference between dates, see the similar question on Stack Overflow: https://stackoverflow.com/questions/49634468/