I have two columns:
row1 row2
0 500
1400 -1
1330 -1
0 900
500 -1
Here, if the value in row1 is 0, the value in row2 is not -1; and if the value in row2 is -1, the value in row1 is not 0.
I want to create a new column like this:
row3
500
1400
1330
900
500
In this column, wherever row1 is 0, its value is replaced by the value from row2. How can I do this?
Best answer
You can use numpy.where (which I would prefer to be named numpy.if_then_else).
>>> df['row3'] = np.where(df['row2'] == -1, df['row1'], df['row2'])
>>> df
row1 row2 row3
0 0 500 500
1 1400 -1 1400
2 1330 -1 1330
3 0 900 900
4 500 -1 500
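For anyone who wants to reproduce the session above from scratch, here is a minimal self-contained sketch. The DataFrame construction is an assumption based on the sample data in the question; only the np.where call comes from the answer itself.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (the construction itself is assumed).
df = pd.DataFrame({
    "row1": [0, 1400, 1330, 0, 500],
    "row2": [500, -1, -1, 900, -1],
})

# Where row2 is the -1 placeholder, take row1; otherwise take row2.
df["row3"] = np.where(df["row2"] == -1, df["row1"], df["row2"])

print(df)
```

np.where returns a plain NumPy array, which pandas aligns by position when it is assigned to the new column.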
Or, more concisely, but very specific to the setup in your question:
>>> df['row3'] = np.where(df['row1'], df['row1'], df['row2'])
>>> df
row1 row2 row3
0 0 500 500
1 1400 -1 1400
2 1330 -1 1330
3 0 900 900
4 500 -1 500
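The shorter form relies on np.where treating nonzero values in the condition as true, so a 0 in row1 selects row2. The sketch below spells that condition out and, as an assumed alternative not taken from the answer, shows the same selection with the pandas Series.where method. It only gives the intended result because the question guarantees that row1 is 0 exactly when row2 holds a real value.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (the construction itself is assumed).
df = pd.DataFrame({
    "row1": [0, 1400, 1330, 0, 500],
    "row2": [500, -1, -1, 900, -1],
})

# Implicit condition: nonzero entries of row1 count as True.
implicit = np.where(df["row1"], df["row1"], df["row2"])

# The same condition written explicitly.
explicit = np.where(df["row1"] != 0, df["row1"], df["row2"])

# pandas-native variant: keep row1 where it is nonzero, else use row2.
pandas_native = df["row1"].where(df["row1"] != 0, df["row2"])

print(implicit.tolist(), explicit.tolist(), pandas_native.tolist())
```

If row1 could legitimately be 0 while row2 is -1, the explicit test on row2 == -1 from the first snippet would be the safer choice.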
Timings:
>>> df = pd.concat([df]*1000)
>>> df_c = df.copy()
>>> %timeit df.clip_lower(0).sum(1) # coldspeed 1
537 µs ± 5.17 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit df.row2.mask(df.row2.eq(-1)).combine_first(df.row1) # coldspeed 2
964 µs ± 15.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit df_c.loc[df_c.row2 == -1, 'row2'] = np.nan; df_c.row2.add(df_c.row1, fill_value=0) # coldspeed 3
2.66 ms ± 24.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit [r1 if r2 == -1 else r2 for r1, r2 in zip(df.row1, df.row2)] # Daniel Mesejo
466 µs ± 1.79 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit df.replace(-1,0).sum(1) # W-B
783 µs ± 45.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit np.where(df['row2'] == -1, df['row1'], df['row2']) # timgeb 1
173 µs ± 4.29 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit np.where(df['row1'], df['row1'], df['row2']) # timgeb 2
38.1 µs ± 3.69 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Regarding "python - how to put the value of another column when a cell value is -1 in pandas", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/53820725/