python-3.x - Filling values with NaN in a column-specific manner

Tags: python-3.x pandas numpy dataframe apply

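Given a DataFrame and a list of index labels (one label per column), is there an efficient pandas function that sets to NaN every value lying vertically above the row named by each list entry?

For example, suppose we have the list [4,8] and the following DataFrame: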

index     0      1
5         1      2
2         9      3 
4         3.2    3
8         9      8.7

The desired output is simply:

index     0        1
5         nan      nan
2         nan      nan 
4         3.2      nan
8         9        8.7
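
For reproducibility, the two tables above can be built as follows. This is only a sketch: the variable names df, s and expected, and the index name "index", are assumptions taken from how the tables are printed.

```python
import numpy as np
import pandas as pd

# Input frame from the question; the index labels are deliberately unsorted.
df = pd.DataFrame(
    {0: [1, 9, 3.2, 9], 1: [2, 3, 3, 8.7]},
    index=pd.Index([5, 2, 4, 8], name="index"),
)

# One index label per column: column 0 keeps values from label 4 downward,
# column 1 keeps values from label 8 downward.
s = [4, 8]

# Desired result, written out by hand from the second table.
expected = pd.DataFrame(
    {0: [np.nan, np.nan, 3.2, 9], 1: [np.nan, np.nan, np.nan, 8.7]},
    index=df.index,
)
```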

Any suggestions for a fast function that does this?

Best answer

Here is a NumPy approach based on np.searchsorted:

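import numpy as np

s = [4, 8]  # one index label per column

# Relies on df.values being a writable view of an all-float frame
# (not the case under pandas Copy-on-Write; copy and write back there).
a = df.values
idx = df.index.values
sidx = np.argsort(idx)  # sorter, because the index itself is unsorted
# Row position of each label in s (each label must exist in the index)
matching_row_indx = sidx[np.searchsorted(idx, s, sorter=sidx)]
# True where a row lies above the matching row of its column
mask = np.arange(a.shape[0])[:, None] < matching_row_indx
a[mask] = np.nan  # writes through the view, so df changes too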
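
To make the steps easier to follow, here is a small sketch that prints the intermediate arrays for the example index [5, 2, 4, 8] and the list [4, 8]. The name pos_sorted is introduced here for illustration only. The sketch assumes every label in s is present in the index; np.searchsorted returns an insertion position either way, so a missing label would silently select the wrong row or fall out of range.

```python
import numpy as np

idx = np.array([5, 2, 4, 8])   # index labels, in row order
s = [4, 8]                     # target label for each column

sidx = np.argsort(idx)         # positions that would sort idx
pos_sorted = np.searchsorted(idx, s, sorter=sidx)  # positions in sorted order
matching_row_indx = sidx[pos_sorted]               # back to original row positions

# Compare every row position (as a column vector) with the per-column cutoff.
mask = np.arange(len(idx))[:, None] < matching_row_indx

print(sidx)               # [1 2 0 3]
print(pos_sorted)         # [1 3]
print(matching_row_indx)  # [2 3]
print(mask)
```

The mask has one row per DataFrame row and one column per DataFrame column: rows 0 and 1 are True in both columns, row 2 is True only in column 1, and row 3 is all False, which matches the desired output.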

Sample run:

In [107]: df
Out[107]: 
         0    1
index          
5      1.0  2.0
2      9.0  3.0
4      3.2  3.0
8      9.0  8.7

In [108]: s = [4,8]

In [109]: a = df.values
     ...: idx = df.index.values
     ...: sidx = np.argsort(idx)
     ...: matching_row_indx = sidx[np.searchsorted(idx, s, sorter = sidx)]
     ...: mask = np.arange(a.shape[0])[:,None] < matching_row_indx
     ...: a[mask] = np.nan
     ...: 

In [110]: df
Out[110]: 
         0    1
index          
5      NaN  NaN
2      NaN  NaN
4      3.2  NaN
8      9.0  8.7
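
The run above modifies df in place through the array returned by df.values. As a possible alternative (not the answer's method), the same mask can be built with Index.get_indexer and applied with DataFrame.mask, which returns a new frame. The helper name nan_above_labels is hypothetical, and the sketch assumes a unique index and exactly one label per column.

```python
import numpy as np
import pandas as pd


def nan_above_labels(df, labels):
    """Return a copy of df with NaN above the row labelled labels[j] in column j.

    Hypothetical helper; assumes a unique index and one label per column.
    """
    if len(labels) != df.shape[1]:
        raise ValueError("need exactly one label per column")
    pos = df.index.get_indexer(labels)  # row position of each label, -1 if absent
    if (pos < 0).any():
        raise KeyError("some labels are not in the index")
    above = np.arange(len(df))[:, None] < pos  # shape (n_rows, n_cols)
    return df.mask(above)  # new frame; df itself is left untouched


df = pd.DataFrame(
    {0: [1, 9, 3.2, 9], 1: [2, 3, 3, 8.7]},
    index=pd.Index([5, 2, 4, 8], name="index"),
)
result = nan_above_labels(df, [4, 8])
print(result)
```

Unlike np.searchsorted, get_indexer reports missing labels as -1, so they can be rejected explicitly instead of producing a wrong cutoff.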

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43251514/
