python - How to get the n rows before or after a specific column value in a Pandas DataFrame

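I have a result set, and I want to get the next n rows after (or the previous n rows before) the row that matches a specific cell value.

For example, this is my data: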

A    B   C
1   10   2018-11-01
2   20   2018-10-31
3   30   2018-10-30
4   40   2018-10-29
5   50   2018-10-28
6   60   2018-10-27

I want the 3 rows leading up to the row where C=2018-10-28 (a date type), including the C=2018-10-28 row itself, so my output should be

 A    B   C
3   30   2018-10-30
4   40   2018-10-29
5   50   2018-10-28

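I tried loc, but it expects index labels, so this raises an error: df2 = df2.loc[:C].tail(3) fails with TypeError: can't compare datetime.date to int.

Best answer

Check the dtypes of df. If df.dtypes shows that column C is not a datetime, convert it to datetime: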

df.dtypes
Out[46]:
B     int64
C    object
dtype: object

df['C'] = pd.to_datetime(df['C'])
df.dtypes
Out[48]:
B             int64
C    datetime64[ns]
dtype: object

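Now column "C" can be compared against a date-formatted string (taking the tail works here because the rows are sorted by date in descending order):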

target_date = "2018-10-28"
df[df['C'] >= target_date].tail(3)
    B          C
A
3  30 2018-10-30
4  40 2018-10-29
5  50 2018-10-28
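
If the rows are not sorted by date, one alternative is to locate the matching row by position and slice around it. The following is a sketch under that assumption, not part of the original answer; the names n, positions, pos, before and after are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "B": [10, 20, 30, 40, 50, 60],
        "C": pd.to_datetime(
            ["2018-11-01", "2018-10-31", "2018-10-30",
             "2018-10-29", "2018-10-28", "2018-10-27"]
        ),
    },
    index=pd.Index([1, 2, 3, 4, 5, 6], name="A"),
)

n = 3
# Integer positions (not index labels) of the rows whose C equals the target.
mask = (df["C"] == pd.Timestamp("2018-10-28")).to_numpy()
positions = np.flatnonzero(mask)
pos = positions[0]  # first match; assumes at least one row matches

# The n rows ending at the match (inclusive), clamped at the top of the frame.
before = df.iloc[max(pos - n + 1, 0): pos + 1]
# Up to n rows starting at the match (inclusive); fewer near the end of the frame.
after = df.iloc[pos: pos + n]
```

Because .iloc works on positions, this does not depend on what the index labels are or on how the dates are ordered.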

In the more general case (several matching rows and unordered data), you can use the following approach:

df
A    B          C

0   10 2018-09-10
1   20 2018-07-11
2   20 2018-06-12
3   30 2018-07-13
4   50 2018-10-28
5   10 2018-11-01
6   20 2018-10-31
7   30 2018-10-30
8   40 2018-10-29
9   50 2018-10-28
10  60 2018-10-27

index = df[df['C'] == '2018-10-28'].index
index
Out:
Int64Index([4, 9], dtype='int64', name=0)

Use slices and .iloc to get the target rows:

slices = [slice(i, i-3, -1) for i in index]
slices
Out: [slice(4, 1, -1), slice(9, 6, -1)]

pd.concat([df.iloc[sl] for sl in slices])
    B          C
A
4  50 2018-10-28
3  30 2018-07-13
2  20 2018-06-12
9  50 2018-10-28
8  40 2018-10-29
7  30 2018-10-30

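The resulting frame is not sorted, but that is easy to fix, for example with sort_index(). This approach only works when the index is numeric, because the index labels are passed to .iloc as positions; if you don't have such an index, you can add one with df.reset_index().

Source: a similar question on Stack Overflow, "python - How to get the n rows before or after a specific column value in a Pandas DataFrame": https://stackoverflow.com/questions/54105901/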
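
The same idea can be wrapped in a small helper that works with any index and keeps the rows in their original order. This is a sketch, not the answer's code: rows_before_each_match is a hypothetical name, and the letter index is made up to show a non-numeric case. It also clamps the window at the top of the frame, since a slice such as slice(1, -2, -1) has a negative stop that counts from the end and therefore selects nothing.

```python
import numpy as np
import pandas as pd


def rows_before_each_match(df, column, value, n):
    """For every row where df[column] == value, return that row and the
    n - 1 rows physically above it, in their original order."""
    mask = (df[column] == value).to_numpy()
    pieces = []
    for pos in np.flatnonzero(mask):
        start = max(pos - n + 1, 0)  # clamp so early matches are not lost
        pieces.append(df.iloc[start:pos + 1])
    if not pieces:
        return df.iloc[0:0]  # empty frame with the same columns
    return pd.concat(pieces)


df = pd.DataFrame(
    {
        "B": [10, 20, 20, 30, 50, 10, 20, 30, 40, 50, 60],
        "C": pd.to_datetime(
            ["2018-09-10", "2018-07-11", "2018-06-12", "2018-07-13",
             "2018-10-28", "2018-11-01", "2018-10-31", "2018-10-30",
             "2018-10-29", "2018-10-28", "2018-10-27"]
        ),
    },
    index=list("abcdefghijk"),  # deliberately non-numeric labels
)

result = rows_before_each_match(df, "C", pd.Timestamp("2018-10-28"), 3)
first_row = rows_before_each_match(df, "C", pd.Timestamp("2018-09-10"), 3)
```

If two matches are closer together than n rows, their windows overlap and the shared rows appear twice in the result; dropping duplicates afterwards is one way to handle that.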
