python - Removing duplicate rows with certain column combinations in Excel using Python

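I have a Python program that reads an Excel document. I need to keep only the first occurrence of each combination of certain columns. For example: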

       A  |  B
  -------------
  1.  200 | 201
  2.  200 | 202
  3.  200 | 201
  4.  200 | 203
  5.  201 | 201
  6.  201 | 202
  .............

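I want to delete/skip the third row, where the duplicate is found, and write the remaining rows to a CSV file. This is the function I have been trying so far, but it does not work.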

def validateExcel(filename):
   xls=xlrd.open_workbook(filename)  
   setcount = 0
   column = 0
   count = 0
   # sheetcount = 0
   for sheet in xls.sheets():
       header=""
       # sheetcount = sheetcount + 1
       number_of_rows = sheet.nrows
       number_of_columns = sheet.ncols
       sheetname = sheet.name          
       mylist = []
       for row in range (1, number_of_rows):  
           mylist = []
           for col in range(0, 2):      
               mylist.append(sheet.cell_value(row, col))

           print mylist

           myset = set(mylist)

           print myset

Best answer

This works for me in Python 2.7:

import xlrd

def validateExcel(filename):
    xls = xlrd.open_workbook(filename)
    for sheet in xls.sheets():
        # Collect the (A, B) pair of every data row; row 0 is the header.
        mylist = []
        for row in range(1, sheet.nrows):
            mylist.append((sheet.cell_value(row, 0), sheet.cell_value(row, 1)))
        # set() drops the duplicates; sorting by each pair's first index in
        # mylist restores the original row order.
        myset = sorted(set(mylist), key=mylist.index)
        # Returns the de-duplicated pairs of the first sheet only.
        return myset
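
The question also asks for the result to be written to a CSV file, which the function above does not do. The following is a minimal sketch of one way to do that in Python 3, not part of the original answer. The helper names `first_occurrences` and `write_csv` and the `key_columns` parameter are hypothetical, and the sample rows stand in for values read from the sheet. Instead of `sorted(set(...), key=mylist.index)`, it tracks already-seen keys in a set, which avoids the repeated `list.index` lookups.

```python
import csv
import io


def first_occurrences(rows, key_columns=(0, 1)):
    """Yield each row whose key-column combination has not been seen before."""
    seen = set()
    for row in rows:
        key = tuple(row[i] for i in key_columns)
        if key not in seen:
            seen.add(key)
            yield row


def write_csv(rows, out_file):
    """Write rows as CSV to an already-open text file object."""
    csv.writer(out_file, lineterminator="\n").writerows(rows)


# Sample data mirroring the table in the question (assumed to arrive as floats).
rows = [
    [200.0, 201.0],
    [200.0, 202.0],
    [200.0, 201.0],  # same combination as the first row, so it is skipped
    [200.0, 203.0],
    [201.0, 201.0],
    [201.0, 202.0],
]

unique_rows = list(first_occurrences(rows))

# An in-memory buffer keeps the example self-contained; a real file opened
# with open(path, "w", newline="") could be passed instead.
buffer = io.StringIO()
write_csv(unique_rows, buffer)
print(buffer.getvalue())
```

With xlrd, the `rows` list could be built from a sheet with something like `[sheet.row_values(r) for r in range(1, sheet.nrows)]`, assuming the first row is a header as in the answer above.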

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/42759094/
