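I have a large data set of more than 1 million entries in CSV format containing my company's user information. I used Recsv Editor to remove the extra columns from the file. Now I have the following columns: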
ID  NAME   EMAIL        SUB_STATUS  SUB_DATE    SMS_RECEIVED  MEMBER
1   John   abc@abc.com  true        01.01.2018  true          true
2   David  abc@abc.com  false       01.01.2018  true          true
3   Raza   abc@abc.com  true        01.01.2018  true          false
4   Syed   abc@abc.com  false       01.01.2018  false         false
5   Eidi   abc@abc.com  true        01.01.2018  false         false
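There are more than 1 million records, and I need to extract data from them based on specific conditions. For example, here is some sample logic:
Extract all users where (SUB_STATUS=true AND SMS_RECEIVED=false AND MEMBER=true) OR
(SUB_STATUS=false AND SMS_RECEIVED=false AND MEMBER=false)
I would then like to get the rows matching the sample condition above as CSV output.
How can I achieve this? I am a Windows user and have tried PowerShell and Recsv Editor. The file is too large to open in Excel.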
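Best answer
Importing a file this large into Excel is not a problem; you just need to split the data. Once it is split, you can apply filters.
The only issue is how long it takes. I have used this macro on a CSV file with 50 million rows and it works; the copying just takes time. The separator is ",", so check which separator your file uses.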
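Sub ReadCSVFiles()
    ' Declare every variable with its own type; "Dim i, j, k As Long"
    ' would make only the last one a Long and the rest Variants.
    Dim i As Long, j As Long, k As Long, l As Long, m As Long
    Dim Length As Long, ws1Col As Long, ws1Row As Long
    Dim UserFileName As Variant          ' GetOpenFilename returns False on cancel
    Dim strTextLine As String
    Dim iFile As Integer
    Dim Word() As String
    Dim Items(1 To 16384) As Long        ' cumulative column offsets (Long avoids overflow)
    Dim ws1 As Worksheet, ws2 As Worksheet
    Const Separator As String = ","      ' Change the separator here

    UserFileName = Application.GetOpenFilename
    If VarType(UserFileName) = vbBoolean Then Exit Sub   ' user cancelled the dialog

    Set ws1 = ThisWorkbook.Worksheets(1)
    iFile = FreeFile
    Open UserFileName For Input As #iFile

    ' Step 1: copy the raw lines into ws1, one line per cell.
    ' When a column is full, continue at the top of the next column.
    i = 1
    j = 1
    Do Until EOF(iFile)
        Line Input #iFile, strTextLine
        If i > ws1.Rows.Count Then
            i = 1
            j = j + 1
        End If
        ' Write after the wrap check so that no line is dropped.
        ws1.Cells(i, j).Value2 = strTextLine
        i = i + 1
    Loop
    Close #iFile

    ' Step 2: split every raw line into fields on a second sheet.
    ' Adding the sheet after ws1 keeps ws1 pointing at the raw data.
    Set ws2 = ThisWorkbook.Worksheets.Add(After:=ws1)
    ws1Col = ws1.UsedRange.SpecialCells(xlCellTypeLastCell).Column
    ws1Row = ws1.UsedRange.SpecialCells(xlCellTypeLastCell).Row
    k = 0
    l = 0
    For i = 1 To ws1Col
        For j = 1 To ws1Row
            ' Split once per cell and loop only over the fields this row has.
            Word() = Split(ws1.Cells(j, i).Value2, Separator, , vbBinaryCompare)
            Length = UBound(Word)
            If Length > k Then
                k = Length
            End If
            For m = 0 To Length
                ws2.Cells(j, i + l + m).Value2 = Word(m)
            Next m
        Next j
        ' Items(i) is the total column offset used up by source columns 1..i,
        ' so the next source column starts to the right of the widest row so far.
        If i = 1 Then
            Items(i) = k
        Else
            Items(i) = k + Items(i - 1)
        End If
        k = 0
        l = Items(i)
    Next i
End Sub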
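
As an alternative to loading everything into Excel first, the sample condition from the question can also be applied in a single streaming pass over the file, so the full data set never has to fit into a worksheet. The following is only a minimal sketch in Python, which the original post does not mention. It assumes a comma separator, a header row with exactly the column names shown in the question, flags stored as the text true/false, and UTF-8 encoding. The function names (str_to_bool, matches, filter_csv) and the file names in the usage comment are hypothetical.

```python
import csv


def str_to_bool(value):
    """Interpret the text 'true' (any case) as True, everything else as False."""
    return (value or "").strip().lower() == "true"


def matches(row):
    """Sample condition from the question:
    (SUB_STATUS and not SMS_RECEIVED and MEMBER) or
    (not SUB_STATUS and not SMS_RECEIVED and not MEMBER)
    """
    sub = str_to_bool(row.get("SUB_STATUS"))
    sms = str_to_bool(row.get("SMS_RECEIVED"))
    member = str_to_bool(row.get("MEMBER"))
    return (sub and not sms and member) or (not sub and not sms and not member)


def filter_csv(src_path, dst_path, delimiter=","):
    """Copy the header and every matching row from src_path to dst_path.

    Rows are read and written one at a time, so memory use stays small
    regardless of file size. Returns the number of rows written.
    """
    kept = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src, delimiter=delimiter)
        if reader.fieldnames is None:  # empty input file: nothing to copy
            return 0
        writer = csv.DictWriter(
            dst,
            fieldnames=reader.fieldnames,
            delimiter=delimiter,
            extrasaction="ignore",  # skip surplus fields on malformed rows
        )
        writer.writeheader()
        for row in reader:
            if matches(row):
                writer.writerow(row)
                kept += 1
    return kept


# Hypothetical usage (file names are placeholders):
# filter_csv("users.csv", "users_filtered.csv")
```

With the five sample rows from the question, only the row with ID 4 satisfies the condition. If the file uses a different separator, pass it through the delimiter argument, in the same way the separator has to be adjusted in the macro.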
Regarding "excel - Extract specific data from a large CSV dump file", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56565778/