algorithm - 寻求有效的算法来分析类似于VBA中数据透视表的数据

标签 algorithm vba excel-vba dictionary excel

介绍
我已经有了这个问题的有效解决方案。作为编程新手,我认为我的解决方案相当难看,但:1)我不确定这是真的,2)我在网上学习了整整一周后,没有看到一个更优雅的解决方案。由于我可以使用透视表轻松地临时解决问题,而且主题电子表格是常见的报表样式,因此我认为许多其他人已经解决了此问题(但我还没有找到或知道要搜索的关键字)。我将提供原始数据、所需的输出、我目前正在工作的代码,以及一些我已经看过但没有使用的替代方案以及原因我要求你提供一种更好的编程方法——如果你看到的话——或者至少验证一下我的方法是尽可能有效的。谢谢您。
原始数据
以下是包含要摘要的数据的报表示例:

Report #|    Assignee|    Type of Report|    Department|    Status
1       |     Shannon|            Ad hoc|    Accounting|    Declined
2       |     Shannon|            Ad hoc|    Accounting|    Completed
3       |     Shannon|            Change|    Accounting|    New
4       |     Shannon|            Change|      Shipping|    In Progress
5       |     Shannon|          Training|      Shipping|    Declined
6       |     Shannon|          Training|   CustService|    Completed
7       |     Shannon|          Training|   CustService|    New
8       |       Jason|            Ad hoc|   CustService|    In Progress
9       |       Jason|            Change|   CustService|    Declined
10      |       Jason|            Ad hoc|    Accounting|    Completed
11      |       Jason|          Training|    Accounting|    New
12      |      Thomas|          Training|    Accounting|    In Progress
13      |       Jason|            Change|      Shipping|    Declined
14      |       Jason|            Ad hoc|      Shipping|    Completed
15      |      Thomas|            Ad hoc|   CustService|    New
16      |       Jason|               New|   CustService|    In Progress
17      |      Thomas|               New|   CustService|    Declined
18      |      Thomas|            Change|   CustService|    Completed
19      |      Thomas|            Ad hoc|      Shipping|    New
20      |      Thomas|            Change|      Shipping|    In Progress
     -Continues in similar fashion until:-
545     |     Phyllis|               New|    Accounting|    Declined

期望输出
数据需要以类似于下面的方式进行总结也就是说,它是按Assignee名称筛选的,子类别的计数是每个父类别的输出(注意:这个输出可以使用一个特别的pivot表很容易获得,但是我想把它放到一个运行中的表中,这个表随着时间的推移以编程方式构建,以便进行趋势分析。)
Shannon:    Type of Report       Department         Status
            Ad hoc= 25           Accounting= 45     Declined = 12
            Change= 13           CustService= 2     In Progress= 24
            Training= 3          Shipping= 75       New= 56
            New= 81                                 Completed= 30

Jason:      Type of Report       Department         Status
            Ad hoc= 12           Accounting= 21     Declined = 0
            Change= 3            CustService= 23    In Progress= 12
            Training= 20         Shipping= 4        New= 12
            New= 13                                 Completed= 24

-Continues for each "Assignee"-

我所做的总结与从数据透视表中得到的非常相似,这就是当前手动过程获取数据的方式。但是,我需要获取数据,并将其放入保留历史计数的每日跟踪表中,用于对每个被分配者进行趋势分析,因此数据透视表本身不是解决方案。
当前工作代码
这是整个工作代码,包括两个附加到末尾的子代码:
Sub CollateData()

Dim HdrNm As New Collection 'Collection used to read and reference column indices.

'Variables used for referencing the "Assigned To" column
Dim Assignee As New Scripting.Dictionary
Dim nmAssignee As New Scripting.Dictionary
Dim Asgn As String
Dim a As Integer
Dim aKey As Variant

'Variables used for referencing the "Type of Report" column
Dim TypRep As New Scripting.Dictionary
Dim nmTypRep As New Scripting.Dictionary
Dim arrTypRep() As Integer
Dim Typ As String
Dim t As Integer

'Variables used for referencing the "Department" column
Dim Dept As New Scripting.Dictionary
Dim nmDept As New Scripting.Dictionary
Dim arrDept() As Integer
Dim Bus As String
Dim b As Integer

'Variables used for referencing the "Task Status" column
Dim TskStatus As New Scripting.Dictionary
Dim nmTskStatus As New Scripting.Dictionary
Dim arrTskStatus() As Integer
Dim Tsk As String
Dim s As Integer

'Other variables
Dim DataWS As Worksheet
Dim ScratchWS As Worksheet
Dim lastrow As Integer, x As Integer

Set DataWS = ThisWorkbook.Worksheets("SheetWithRawData")
lastrow = DataWS.Cells(Rows.Count, 11).End(xlUp).Row

Call ReadHeaderRow(DataWS, HdrNm) 'Fills the HdrNm collection with column index using column headers for keys

'Initialize variables for the loop that follows
a = 1
t = 1
b = 1
s = 1

'This next seciont/first loop goes through the report to identify a unique list of assignees and category lists _
' which need to be summed. These lists will be used to ReDim the 2-dimensional arrays to appropriate _
' size, as well as reference the elements of the 2D array(s).
'
' NOTE: I am using the seemingly duplicative Dictionaries (e.g. TypRep & nmTypeRep) in order to have _
' access to the category as both a string and as an integer/index.

For x = 2 To lastrow

    If Not Assignee.Exists(DataWS.Cells(x, HdrNm("Assigned to")).Value) Then
        Assignee.Add DataWS.Cells(x, HdrNm("Assigned to")).Value, a
        nmAssignee.Add a, DataWS.Cells(x, HdrNm("Assigned to")).Value
        a = a + 1
    End If

    If Not TypRep.Exists(DataWS.Cells(x, HdrNm("Type of Report")).Value) Then
        TypRep.Add DataWS.Cells(x, HdrNm("Type of Report")).Value, t
        nmTypRep.Add t, DataWS.Cells(x, HdrNm("Type of Report")).Value
        t = t + 1
    End If

    If Not Dept.Exists(DataWS.Cells(x, HdrNm("Department")).Value) Then
        Dept.Add DataWS.Cells(x, HdrNm("Department")).Value, b
        nmDept.Add b, DataWS.Cells(x, HdrNm("Department")).Value
        b = b + 1
    End If

    If Not TskStatus.Exists(DataWS.Cells(x, HdrNm("Task Status")).Value) Then
        TskStatus.Add DataWS.Cells(x, HdrNm("Task Status")).Value, s
        nmTskStatus.Add s, DataWS.Cells(x, HdrNm("Task Status")).Value
        s = s + 1
    End If

Next x

'Assign the appropriate dimensions to the following 2D arrays
ReDim arrTypRep(1 To Assignee.Count, 1 To TypRep.Count)
ReDim arrDept(1 To Assignee.Count, 1 To Dept.Count)
ReDim arrTskStatus(1 To Assignee.Count, 1 To TskStatus.Count)

'The following, second loop now goes through and sums up the count of each category element for each _
' Assignee.  Using this technique, I only go through the list/report once (or twice, if you consider _
' the previous loop to dimension the arrays) in order to tabulate the desired data.

For x = 2 To lastrow

    Asgn = DataWS.Cells(x, HdrNm("Assigned to")).Value
    Typ = DataWS.Cells(x, HdrNm("Type of Report")).Value
    Bus = DataWS.Cells(x, HdrNm("Department")).Value
    Tsk = DataWS.Cells(x, HdrNm("Task Status")).Value
    arrTypRep(Assignee.item(Asgn), TypRep.item(Typ)) = arrTypRep(Assignee.item(Asgn), TypRep.item(Typ)) + 1
    arrDept(Assignee.item(Asgn), Dept.item(Bus)) = arrDept(Assignee.item(Asgn), Dept.item(Bus)) + 1
    arrTskStatus(Assignee.item(Asgn), TskStatus.item(Tsk)) = arrTskStatus(Assignee.item(Asgn), TskStatus.item(Tsk)) + 1

Next x

'Now to generate the output of the data we collected:
On Error Resume Next
Application.DisplayAlerts = False
With ThisWorkbook
    .Worksheets("DesiredOutput").Delete
    .Worksheets.Add after:=.Worksheets(1)
End With
Application.DisplayAlerts = True
On Error GoTo 0

Set ScratchWS = ThisWorkbook.ActiveSheet
ScratchWS.Name = "DesiredOutput"

x = 1

'Loop through each Assignee and dump out the collected counts
For Each aKey In Assignee

    Call OutputData("Type of Report", Assignee, nmAssignee, aKey, TypRep, nmTypRep, arrTypRep, x)
    Call OutputData("Department", Assignee, nmAssignee, aKey, Dept, nmDept, arrDept, x)
    Call OutputData("Task Status", Assignee, nmAssignee, aKey, TskStatus, nmTskStatus, arrTskStatus, x)

Next aKey

Range("B1").ColumnWidth = 3
Range("A1, C1").EntireColumn.AutoFit

End Sub

****************************************************************************
****************************************************************************

Sub OutputData(Title As String, Assignee As Scripting.Dictionary, nmAssignee As Scripting.Dictionary, _
    aKey As Variant, ReportCategory As Scripting.Dictionary, nmReportCategory As Scripting.Dictionary, _
    arrCategory() As Integer, x As Integer)

Dim CatKey As Variant

With Cells(x, 2)
   .Value = Title
   .Font.Bold = True
End With

x = x + 1

For Each CatKey In ReportCategory
    Cells(x, 1).Value = nmAssignee.item(Assignee.item(aKey))
    Cells(x, 3).Value = nmReportCategory.item(ReportCategory.item(CatKey))
    Cells(x, 4).Value = arrCategory(Assignee.item(aKey), ReportCategory.item(CatKey))
    x = x + 1
Next CatKey

x = x + 1

End Sub

**************************************************************************
**************************************************************************

Private Sub ReadHeaderRow(TargetWS As Worksheet, HdrNm As Collection)

Dim lastcolumn As Integer
Dim x As Integer

lastcolumn = TargetWS.Cells(1, Columns.Count).End(xlToLeft).Column

For x = 1 To lastcolumn
    HdrNm.Add TargetWS.Cells(1, x).Column, TargetWS.Cells(1, x).Value
Next x

End Sub

考虑/放弃的其他可能解决方案
每个受让人的循环列表
我考虑过生成一个字典/受让人集合,然后在收集每个通行证上每个受让人的数据的报告中循环,但是受让人的数量可能会改变(增加),报告列表可能会增加,因此许多无关的通行证会通过列表。
交错数组/集合/字典
当我第一次得知我可以做一本字典(数组数组等)时,我很兴奋,但据我所知,我不能用第一本字典作为被分配者(第一维),用第二本字典作为所有类别(例如,报告类型)的第二维。实际上,我需要为每个受让人和类别创建一个单独的词典。换句话说,如果我只有一个类别(报告类型)和15个被分配者,我实际上需要创建16个字典:第一个字典将是“被分配者”--被分配者的名称作为关键字,第二个到第十六个字典(typrep1到typrep15)作为与字典被分配者中的关键字相对应的项。另外,我不能动态地创建字典,因为被分配者的数量可能会发生变化,所以这个方法对我来说是不适用的,除非我误解了一些重要的东西(总是可能的)。我对锯齿状数据类型的了解来自这里:http://bytecomb.com/collections-of-collections-in-vba/
自定义数据类型
我没有尝试过,因为我只是遇到了它,我对它了解不多,但也许这个问题可以作为自定义数据类型来解决。我会去读更多关于它们的文章,但也许这是一个更好的解决方案,我还不明白。
结论声明
我知道有很多东西要看,对不起。谢谢你坚持这么久。对于如何实现上述代码所能达到的效果,我将非常感谢您的任何建议。我相信,我没有找到更好的方法的原因是,这个问题的解决方案对除了我以外的所有人都是显而易见的,而且似乎任何用vba/excel编写代码的人都会遇到这种情况。谢谢你的帮助。

最佳答案

我们在编程中使用对象。大多数计算机都有excel,所以你可以让excel为你做这件事。
windows附带了一个对象,我认为它是一个数据类型。在内存中创建的断开连接的记录集。
所以你会的

  rs.filter = "Assignee='Shannon' AND Status='Cancelled'"

然后
 msgbox rs.recordcount

给你电话号码。
或者你可以排序和枚举。
这将从文件的顶部或底部剪切行。
cscript scriptname.vbs "" t x 5 <infile.txt >outfile.txt



Set Arg = WScript.Arguments
set WshShell = createObject("Wscript.Shell")
Set Inp = WScript.Stdin
Set Outp = Wscript.Stdout

    Set rs = CreateObject("ADODB.Recordset")
    With rs
        .Fields.Append "LineNumber", 4 

        .Fields.Append "Txt", 201, 5000 
        .Open
        LineCount = 0
        Do Until Inp.AtEndOfStream
            LineCount = LineCount + 1
            .AddNew
            .Fields("LineNumber").value = LineCount
            .Fields("Txt").value = Inp.readline
            .UpDate
        Loop

        .Sort = "LineNumber ASC"

        If LCase(Arg(1)) = "t" then
            If LCase(Arg(2)) = "i" then
                .filter = "LineNumber < " & LCase(Arg(3)) + 1
            ElseIf LCase(Arg(2)) = "x" then
                .filter = "LineNumber > " & LCase(Arg(3))
            End If
        ElseIf LCase(Arg(1)) = "b" then
            If LCase(Arg(2)) = "i" then
                .filter = "LineNumber > " & LineCount - LCase(Arg(3))
            ElseIf LCase(Arg(2)) = "x" then
                .filter = "LineNumber < " & LineCount - LCase(Arg(3)) + 1
            End If
        End If

        Do While not .EOF
            Outp.writeline .Fields("Txt").Value

            .MoveNext
        Loop
    End With

一种方法是将4个条件放入4个数组中—assignee、status、dept和otherone。
For each a in assignee()
    For each b in Status()
        For each c in Dept()
            For each d in other()
                .filter = "assinnee=" & a & "AND Status=" & b & "And dept=" & c
                msgbox .recordcount
            Next
        Next
    Next
Next

关于algorithm - 寻求有效的算法来分析类似于VBA中数据透视表的数据,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/28818913/

相关文章:

excel - 私有(private)函数可在工作表上使用

excel - 验证还是不验证?

objective-c - 使用两个 'alphabets' 的不同长度的序列组合

c++ - 查找值属于哪个 bin

vba - 根据背景颜色计算excel中文本的实例

excel - 子函数显示用户窗体

excel - Excel VBA UDF自动完成参数

arrays - 通过有限排列进行遍历

algorithm - 如何实现标签搜索?

Excel - 禁用 F11 按键