Following on from another question I resolved yesterday: Pandas set value if all columns are equal in a dataframe

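Starting from @anky_91's solution, I am working on something similar. Instead of setting 1 or -1 only when all columns are equal, I want something more flexible. Specifically, if (for example) 70% of the columns are 1, I want 1; under the same but opposite condition (70% of the columns are 0), I want -1; and 0 otherwise.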

This is what I wrote:

# Instead of using .all, use .sum to count the occurrences of 1 and 0 in each row
m1 = local_df.eq(1).sum(axis=1)
m2 = local_df.eq(0).sum(axis=1)

# Debug print; this part works
print(m1)
print(m2)

But I don't know how to change this part:

local_df['enseamble'] = np.select([m1, m2], [1, -1], 0)
m = local_df.drop(local_df.columns.difference(['enseamble']), axis=1)

Here is what I want, written in pseudocode:

tot = m1 + m2

if m1 > m2:
    if m1 / tot > 0.7:  # share of 1s in the row exceeds 70%
        df['enseamble'] = 1

elif m2 > m1:
    if m2 / tot > 0.7:  # share of 0s in the row exceeds 70%
        df['enseamble'] = -1

else:
    df['enseamble'] = 0

Thanks

Edit 1

Here is an example of the expected output:

            NET_0  NET_1  NET_2  NET_3  NET_4  NET_5  NET_6
date
2009-08-02      0      1      1      1      0      1
2009-08-03      1      0      0      0      1      0
2009-08-04      1      1      1      0      0      0


 date    enseamble
 2009-08-02     1 # because 1 is more than 70%
 2009-08-03     -1 # because 0 is more than 70%
 2009-08-04     0 # because 0 and 1 are 50-50

Accepted answer

You can get the specified output with the following conditions:

thr = 0.7
c1 = (df.eq(1).sum(1)/df.shape[1]).gt(thr)
c2 = (df.eq(0).sum(1)/df.shape[1]).gt(thr)
c2.astype(int).mul(-1).add(c1)

Output

2009-08-02    0
2009-08-03    0
2009-08-04    0
2009-08-05    0
2009-08-06   -1
2009-08-07    1
dtype: int64
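
To make the steps above easier to try, here is a minimal self-contained sketch of the same idea. The sample frame and the names `n_cols`, `share_ones`, `share_zeros` and `result` are made up for illustration and are not the data from the question. It assumes every cell is either 0 or 1, so dividing by the number of columns gives the share of each value per row.

```python
import pandas as pd

# Hypothetical sample data (not from the question): four binary columns.
df = pd.DataFrame(
    {
        "NET_0": [1, 0, 1],
        "NET_1": [1, 0, 0],
        "NET_2": [1, 0, 1],
        "NET_3": [0, 0, 0],
    },
    index=pd.to_datetime(["2009-08-02", "2009-08-03", "2009-08-04"]),
)

thr = 0.7
n_cols = df.shape[1]

# Per-row share of 1s and of 0s.
share_ones = df.eq(1).sum(axis=1) / n_cols
share_zeros = df.eq(0).sum(axis=1) / n_cols

# Boolean masks: True where the share is strictly above the threshold.
c1 = share_ones.gt(thr)
c2 = share_zeros.gt(thr)

# With thr >= 0.5 at most one mask is True per row, so the difference
# is 1, -1, or 0.
result = c1.astype(int) - c2.astype(int)
print(result)
```

Note that `.gt` is a strict comparison, so a row with exactly 70% ones stays at 0; `.ge` would include it.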

Or using np.select:

pd.DataFrame(np.select([c1,c2], [1,-1], 0), index=df.index, columns=['result'])

              result
2009-08-02       0
2009-08-03       0
2009-08-04       0
2009-08-05       0
2009-08-06      -1
2009-08-07       1
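
If the result should be stored in the `enseamble` column as in the question, one possible variant is sketched below. It is an assumption on my side, not part of the original answer: it divides by `tot = m1 + m2` (as in the question's pseudocode) rather than by `df.shape[1]`, so cells that are neither 0 nor 1, such as NaN, are ignored. The sample data is hypothetical. The masks are computed before the new column is assigned, so the new column does not end up in the counts.

```python
import numpy as np
import pandas as pd

# Hypothetical data; NaN stands for a missing value in the last row.
local_df = pd.DataFrame(
    {
        "NET_0": [1, 0, 1, 1],
        "NET_1": [1, 0, 0, np.nan],
        "NET_2": [1, 0, 1, 1],
        "NET_3": [1, 1, 0, np.nan],
        "NET_4": [0, 0, 0, 1],
    }
)

thr = 0.7

# Count 1s and 0s per row; NaN cells match neither comparison.
m1 = local_df.eq(1).sum(axis=1)
m2 = local_df.eq(0).sum(axis=1)
tot = m1 + m2

# Shares relative to the cells that actually hold 0 or 1.
# A row with tot == 0 gives NaN, which compares as False and falls to 0.
c1 = (m1 / tot).gt(thr)
c2 = (m2 / tot).gt(thr)

# Assign only after the masks are built from the original columns.
local_df["enseamble"] = np.select([c1, c2], [1, -1], default=0)
print(local_df["enseamble"])
```

In the last row the three available values are all 1, so this variant yields 1; dividing by the total column count would give 3/5 = 0.6 and therefore 0. Which behaviour is right depends on how missing values should count.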

This is based on a similar question found on Stack Overflow, "python - Pandas set value if most columns are equal in a dataframe": https://stackoverflow.com/questions/55421118/
