python - DataFrame drop_duplicates with a subset of columns

For the subset argument, I want to specify the first n-1 columns. How can I do that?

For example, given the following dataset:

   0   1  2   3   4  5   6
0  0  12  1  99  23  2  75
1  0  12  1  99  23  2  66
2  5  12  1  99  23  2  66

I want the result to contain only the first and third rows:

   0   1  2   3   4  5   6
0  0  12  1  99  23  2  75
1  5  12  1  99  23  2  66

If I run the following, I get an error:

df.drop_duplicates(subset=[0:df.shape[1]-1],keep='first',inplace=True)

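Best answer

You can use duplicated on every column except the last, then invert the resulting boolean mask to keep only the first occurrence of each row: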

df[~df.iloc[:,:-1].duplicated()]
Out[53]: 
   0   1  2   3   4  5   6
0  0  12  1  99  23  2  75
2  5  12  1  99  23  2  66
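
As an alternative sketch that stays with drop_duplicates, as in the question: the slice syntax 0:df.shape[1]-1 is not valid inside a list literal, which is why the original call fails, but the column labels can be sliced instead. The example below rebuilds the sample data from the question and assumes the default integer column labels; the reset_index call is only needed if the result should be renumbered 0, 1 as in the desired output.

```python
import pandas as pd

# Sample data from the question, with default integer column labels 0..6
df = pd.DataFrame([
    [0, 12, 1, 99, 23, 2, 75],
    [0, 12, 1, 99, 23, 2, 66],
    [5, 12, 1, 99, 23, 2, 66],
])

# All column labels except the last one
subset_cols = df.columns[:-1].tolist()

# Drop rows that repeat an earlier row on those columns, keeping the first
result = df.drop_duplicates(subset=subset_cols, keep='first')

# Optional: renumber the index so it runs 0, 1 instead of 0, 2
result = result.reset_index(drop=True)
print(result)
```

Returning a new frame instead of passing inplace=True keeps the original df available for comparison.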

A similar question about python - DataFrame drop_duplicates with a subset of columns can be found on Stack Overflow: https://stackoverflow.com/questions/49433555/
