我正在研究一个宏,它将通过电子表格并根据两个列(列 Q 和 D)中分别提供的两个条件删除重复的条目(行)。
这就是我所拥有的。我在一个小数据集上对其进行了测试,速度很慢。
Sub RemoveDupesKeepLast()
dim i As Integer
dim criteria1, criteria2 As String
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Application.EnableEvents = False
'start at bottom of sheet, go up
For i = ActiveSheet.UsedRange.Rows.Count to 2 Step -1
'if there is no entry, go to next row
If Cells(i, "Q").Value = "" Then
GoTo gogo:
End If
'set criteria that we will filter for
criteria1 = Cells(i, "D").Value
criteria2 = Cells(i, "Q").Value
'filter for criteria2, then criteria1 to get duplicates
ActiveSheet.Range("A":"CI").AutoFilter field:=17, Criteria1:=criteria2, Operator:=xlFilterValues
ActiveSheet.Range("A":"CI").AutoFilter field:=4, Criteria1:=criteria1, Operator:=xlFilterValues
'if there are duplicates, keep deleting rows until only bottom-most entry is left behind
Do While Range("Q2", Cells(Rows.Count, "Q").End(xlUp)).Cells.SpecialCells(xlCellTypeVisible).Count > 1
ActiveSheet.AutoFilter.Range.Offset(1).SpecialCells(xlCellTypeVisible).Cells(1,17).EntireRow.Delete
Loop
'reset autofilter
If ActiveSheet.FilterMode Then
Cells.AutoFilter
End If
gogo:
Next i
Application.EnableEvents = True
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub
有没有不同的方法可以解决这个问题来加快速度?就像现在一样,我基本上会检查每一行,直到我到达顶部。这些工作表实际上是从 30,000 行到最大的任何地方。在我看来,应该有一种更快、更清洁的方式来实现我想要做的事情,但我似乎想不出一个。
最佳答案
此过程删除由列 D 和 Q 标识的所有重复行。
在重复项中,它将保持最接近工作表底部的行。
基本上,在右侧创建一个索引列来对底部的所有重复行进行排序和移动,以便可以在一次调用中删除它们。
请注意,它不会更改单元格公式或格式(如果有)。
Sub DeleteDuplicatedRows()
Dim rgTable As Range, rgIndex As Range, dataColD(), dataColQ()
Set rgTable = ActiveSheet.UsedRange
' load each column representing the identifier in an array
dataColD = rgTable.Columns("D").value ' load values from column D
dataColQ = rgTable.Columns("Q").value ' load values from column Q
' get each unique row number with a dictionary
Dim dict As New VBA.collection, indexes(), r&, rr
On Error Resume Next
For r = UBound(dataColD) To 1 Step -1
dict.Add r, dataColD(r, 1) & vbNullChar & dataColQ(r, 1)
Next
On Error GoTo 0
' index all the unique rows in an array
ReDim indexes(1 To UBound(dataColD), 1 To 1)
For Each rr In dict: indexes(rr, 1) = rr: Next
' insert the indexes in the last column on the right
Set rgIndex = rgTable.Columns(rgTable.Columns.count + 1)
rgIndex.value = indexes
' sort the rows on the indexes, duplicates will move at the end
Union(rgTable, rgIndex).Sort key1:=rgIndex, Orientation:=xlTopToBottom, Header:=xlYes
' delete the index column on the right and the empty rows at the bottom
rgIndex.EntireColumn.Delete
rgTable.Resize(UBound(dataColD) - dict.count + 1).offset(dict.count).EntireRow.Delete
End Sub
关于excel - 删除重复项,保留最后一个条目——优化,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/36348195/