我想不通过滤器功能如何运行得如此之快。我对各种数据都使用了 Filter,无论数据类型如何,Filter 都会抹杀我使用的任何替代方法。我经常使用二分搜索算法和 Stephen Bullen 编写的 QuickArraySort 算法(在 Professional Excel Development 中找到)。二分搜索快如闪电(与过滤函数一样快,假设数组已排序),而快速排序算法是已知最快的排序算法之一。
我在下面编写了一些测试代码,用于比较在非常大的数组(大小 = 2,000,000)中查找随机元素的速度。我故意以无序的方式填充数组(应该注意的是,我尝试了各种无序分配方法,无论分配方法如何,结果都是相似的)。
Sub SearchTest()
Dim i As Long, strMyArray() As String, lngSize As Long, strTest As String
Dim TimeBinarySearch As Long, TimeFilterSearch As Long
Dim lngResultBinary As Long, lngResultFilter As Long
Dim StartHour As Long, StartMinute As Long, StartSecond As Long
Dim StartMiliSecond As Long, StartTime As Long
Dim EndHour As Long, EndMinute As Long, EndSecond As Long
Dim EndMiliSecond As Long, EndTime As Long
lngSize = 2000000
strTest = CStr(1735674 * 987)
ReDim strMyArray(lngSize)
For i = 1 To UBound(strMyArray)
If i Mod 2 = 0 Then
strMyArray(i) = CStr((i - 1) * 987)
Else
strMyArray(i) = CStr((i + 1) * 987)
End If
Next i
''Filter Test
'*******************************************************************
StartHour = Hour(Now()) * 60 * 60 * 1000
StartMinute = Minute(Now()) * 60 * 1000
StartSecond = Second(Now()) * 1000
StartMiliSecond = Format(Now(), "ms")
StartTime = StartHour + StartMinute + StartSecond + StartMiliSecond
lngResultFilter = CLng(Filter(strMyArray, strTest)(0))
EndHour = Hour(Now()) * 60 * 60 * 1000
EndMinute = Minute(Now()) * 60 * 1000
EndSecond = Second(Now()) * 1000
EndMiliSecond = Format(Now(), "ms")
EndTime = EndHour + EndMinute + EndSecond + EndMiliSecond
TimeFilterSearch = EndTime - StartTime
'*******************************************************************
''Binary Test
'*******************************************************************
StartHour = Hour(Now()) * 60 * 60 * 1000
StartMinute = Minute(Now()) * 60 * 1000
StartSecond = Second(Now()) * 1000
StartMiliSecond = Format(Now(), "ms")
StartTime = StartHour + StartMinute + StartSecond + StartMiliSecond
QuickSortString1D strMyArray
lngResultBinary = strMyArray(CLng(BinarySearchString(strTest, strMyArray)))
EndHour = Hour(Now()) * 60 * 60 * 1000
EndMinute = Minute(Now()) * 60 * 1000
EndSecond = Second(Now()) * 1000
EndMiliSecond = Format(Now(), "ms")
EndTime = EndHour + EndMinute + EndSecond + EndMiliSecond
TimeBinarySearch = EndTime - StartTime
'*******************************************************************
MsgBox lngResultFilter & vbCr & vbCr & lngResultBinary
MsgBox TimeFilterSearch & vbCr & vbCr & TimeBinarySearch
End Sub
两种方法返回的结果相同,但Filter方法的返回时间为0毫秒,QuickSort/BinarySearch方法的返回时间接近20秒。这是一个巨大的差异!如前所述,如果数组已排序,则二分查找方法也会返回 0 毫秒(众所周知,二分查找要求数组首先排序)
那么,Filter 函数如何查看 2,000,000 个未排序的条目并立即找到正确的结果?它不能简单地遍历每个条目并将其与过滤器值进行比较(这是迄今为止最慢的方法),但是基于这些初步测试,它也不能使用二进制搜索,因为它必须排序首先是数组。即使已经编写了一个很棒的排序算法,我也很难相信它可以瞬间对大于一百万的数组进行排序。
顺带一提,下面是QuickSort算法和Binary Search算法。
Sub QuickSortString1D(ByRef saArray() As String, _
Optional ByVal bSortAscending As Boolean = True, _
Optional ByVal lLow1 As Variant, _
Optional ByVal lHigh1 As Variant)
'Dimension variables
Dim lLow2 As Long
Dim lHigh2 As Long
Dim sKey As String
Dim sSwap As String
On Error GoTo ErrorExit
'If not provided, sort the entire array
If IsMissing(lLow1) Then lLow1 = LBound(saArray)
If IsMissing(lHigh1) Then lHigh1 = UBound(saArray)
'Set new extremes to old extremes
lLow2 = lLow1
lHigh2 = lHigh1
'Get value of array item in middle of new extremes
sKey = saArray((lLow1 + lHigh1) \ 2)
'Loop for all the items in the array between the extremes
Do While lLow2 < lHigh2
If bSortAscending Then
'Find the first item that is greater than the mid-point item
Do While saArray(lLow2) < sKey And lLow2 < lHigh1
lLow2 = lLow2 + 1
Loop
'Find the last item that is less than the mid-point item
Do While saArray(lHigh2) > sKey And lHigh2 > lLow1
lHigh2 = lHigh2 - 1
Loop
Else
'Find the first item that is less than the mid-point item
Do While saArray(lLow2) > sKey And lLow2 < lHigh1
lLow2 = lLow2 + 1
Loop
'Find the last item that is greater than the mid-point item
Do While saArray(lHigh2) < sKey And lHigh2 > lLow1
lHigh2 = lHigh2 - 1
Loop
End If
'If the two items are in the wrong order, swap the rows
If lLow2 < lHigh2 Then
sSwap = saArray(lLow2)
saArray(lLow2) = saArray(lHigh2)
saArray(lHigh2) = sSwap
End If
'If the pointers are not together, advance to the next item
If lLow2 <= lHigh2 Then
lLow2 = lLow2 + 1
lHigh2 = lHigh2 - 1
End If
Loop
'Recurse to sort the lower half of the extremes
If lHigh2 > lLow1 Then
QuickSortString1D saArray, bSortAscending, lLow1, lHigh2
End If
'Recurse to sort the upper half of the extremes
If lLow2 < lHigh1 Then
QuickSortString1D saArray, bSortAscending, lLow2, lHigh1
End If
ErrorExit:
End Sub
'***********************************************************
' Comments: Uses a binary search algorithm to quickly locate
' a string within a sorted array of strings
'
' Arguments: sLookFor The string to search for in the array
' saArray An array of strings, sorted ascending
' lMethod Either vbBinaryCompare or vbTextCompare
' Defaults to vbTextCompare
' lNotFound The value to return if the text isn’t
' found. Defaults to -1
'
' Returns: Long The located position in the array,
' or lNotFound if not found
'
' Date Developer Action
' ———————————————————————————————-
' 02 Jun 04 Stephen Bullen Created
'
Function BinarySearchString(ByRef sLookFor As String, _
ByRef saArray() As String, _
Optional ByVal lMethod As VbCompareMethod = vbTextCompare, _
Optional ByVal lNotFound As Long = -1) As Long
Dim lLow As Long
Dim lMid As Long
Dim lHigh As Long
Dim lComp As Long
On Error GoTo ErrorExit
'Assume we didn’t find it
BinarySearchString = lNotFound
'Get the starting positions
lLow = LBound(saArray)
lHigh = UBound(saArray)
Do
'Find the midpoint of the array
lMid = (lLow + lHigh) \ 2
'Compare the mid-point element to the string being searched for
lComp = StrComp(saArray(lMid), sLookFor, lMethod)
If lComp = 0 Then
'We found it, so return the location and quit
BinarySearchString = lMid
Exit Do
ElseIf lComp = 1 Then
'The midpoint item is bigger than us - throw away the top half
lHigh = lMid - 1
Else
'The midpoint item is smaller than us - throw away the bottom half
lLow = lMid + 1
End If
'Continue until our pointers cross
Loop Until lLow > lHigh
ErrorExit:
End Function
编辑:看来我应该先做一些“蛮力”测试。通过按照 John Coleman 建议的 Filter 函数执行的方式简单地以线性方式遍历数组,上述相同场景的返回时间为 0 毫秒。见下文:
Sub Test3()
Dim i As Long, strMyArray() As String, lngSize As Long, strTest As String
Dim lngResultBrute As Long, TimeBruteSearch As Long
lngSize = 2000000
strTest = CStr(936740 * 97)
ReDim strMyArray(lngSize)
For i = 1 To UBound(strMyArray)
If i Mod 2 = 0 Then
strMyArray(i) = CStr((i - 1) * 97)
Else
strMyArray(i) = CStr((i + 1) * 97)
End If
Next i
StartTime = Timer
' Brute force search
For i = 1 To UBound(strMyArray)
If strMyArray(i) = strTest Then
lngResultBrute = CLng(strTest)
Exit For
End If
Next i
EndTime = Timer
TimeBruteSearch = EndTime - StartTime
MsgBox TimeBruteSearch
End Sub
最佳答案
我相信您在这里比较苹果和橘子。看起来当您测试 Filter
函数时,您将无序数组作为输入,然后使用 Filter
查找与测试值匹配的项。直觉告诉我们那是 O(N) = 200 万次操作——您对每个数组元素测试一次。然后你就完成了。
当您使用自定义 VBA 函数进行过滤时,您首先要排序,这是非常昂贵的 O(N * Log2(N)) = 2900 万。对数组进行排序后,您确实会得到搜索有序数组的好处,即 O(Log2(N)) = 14。即使您极大地加快了搜索速度,排序的代价也会让您丧命。
希望对您有所帮助。
关于arrays - 使用 VBA 过滤器函数时的性能注意事项,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/33371653/