python - Combine DataFrame rows to eliminate NaN

Tags: python pandas dataframe datetime nan

I have a DataFrame created from a data logger, in which every data point has its own timestamp, like this:

import numpy as np
import pandas as pd

df_orig = pd.DataFrame(
    {
        "val1": [ 1, np.nan, np.nan, 11, np.nan, np.nan, 21, np.nan, np.nan, ],
        "val2": [ np.nan, 2, np.nan, np.nan, 12, np.nan, np.nan, 22, np.nan, ],
        "val3": [ np.nan, np.nan, 3, np.nan, np.nan, 13, np.nan, np.nan, 23, ],
    },
    index=pd.to_datetime( [
        "2021-01-01 00:00", "2021-01-01 00:00:01", "2021-01-01 00:00:02",
        "2021-01-01 00:01", "2021-01-01 00:01:01", "2021-01-01 00:01:02",
        "2021-01-01 00:02", "2021-01-01 00:02:01", "2021-01-01 00:02:02",
    ] )
)
print(df_orig)
                     val1  val2  val3
2021-01-01 00:00:00   1.0   NaN   NaN
2021-01-01 00:00:01   NaN   2.0   NaN
2021-01-01 00:00:02   NaN   NaN   3.0
2021-01-01 00:01:00  11.0   NaN   NaN
2021-01-01 00:01:01   NaN  12.0   NaN
2021-01-01 00:01:02   NaN   NaN  13.0
2021-01-01 00:02:00  21.0   NaN   NaN
2021-01-01 00:02:01   NaN  22.0   NaN
2021-01-01 00:02:02   NaN   NaN  23.0

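I don't actually need a precise timestamp for each individual data point. I would like to condense the DataFrame by eliminating the NaN values and merging rows whose timestamps are very close together. The result should look like this: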

                     val1  val2  val3
2021-01-01 00:00:00     1     2     3
2021-01-01 00:01:00    11    12    13
2021-01-01 00:02:00    21    22    23

Is there a way to do this?

Best answer

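If it fits your data, the simplest solution is to resample per minute and aggregate each bin with max, min, or first. All three skip NaN, so each one-minute bin collapses to a single row holding the one value recorded per column: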

df = df_orig.resample('Min').max()
print(df)
                     val1  val2  val3
2021-01-01 00:00:00   1.0   2.0   3.0
2021-01-01 00:01:00  11.0  12.0  13.0
2021-01-01 00:02:00  21.0  22.0  23.0

Regarding python - Combine DataFrame rows to eliminate NaN, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69897125/
