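I have a set of text files, each containing a few thousand numbers on separate lines. I want to use the command line or PowerShell to produce a file that summarises the data in each file, like the following, e.g. Counts.txt: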
test.txt: <--Filename
(a) total: 4325 <-- Total number of lines in the file
(b) isbn: 2 <-- Count of numbers that don't start with 3618*
(c) duplicates: 13 <-- (a-b) - (Count of unique numbers that start with 3618*)
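The above would be appended to Counts.txt for each file in the directory.
So far I have managed to add the total line count of each file to Counts.txt using the following: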
@echo off
setlocal enabledelayedexpansion
set cnt=0
set "out=Counts.txt"
if exist %out% del /q %out%
for /f %%a in ('dir /b /a-d') do (
    for /f %%b in ('type "%%a"^|find /v /c ""') do (
        set /a cnt=%%b & >>%out% echo(%%~nxa: "-total: %%b")
    )
)
Which outputs:
test.txt: -total: 9
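How can I get the count of numbers that don't start with 3618*, and the calculation for point (c)?
Some help with the formatting would also be appreciated :)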
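Best answer
It is a bit unclear what exactly you mean by the count of unique duplicate numbers that start with 3618*, so below I have added two possible options for that, one of which is commented out. You can pick the one you need.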
$result = Get-ChildItem -Path 'D:\Test' -Filter '*.txt' -File | ForEach-Object {
    $data = Get-Content -Path $_.FullName
    # counts lines starting with 97 (ISBN-13 prefixes 978/979); if (b) should instead be every
    # line that does not start with 3618, use: $_ -notlike '3618*'
    $isbn = @($data | Where-Object { $_ -like '97*' }).Count
    # if only numbers starting with 3618 that actually have exact duplicates (like 3618123 found multiple times), do this:
    $dupes = @($data | Where-Object { $_ -like '3618*' } | Group-Object | Where-Object { $_.Count -gt 1 }).Count
    # if ALL numbers starting with 3618 are to be regarded as duplicates, use this instead:
    # $dupes = @($data | Where-Object { $_ -like '3618*' }).Count
    # output the data in the format you showed in the question
    # (the closing "@ of the here-string must start at the beginning of its line)
    @"
$($_.Name)
a) total: $($data.Count)
b) isbn: $isbn
c) duplicates: $dupes
"@
}
Next, write the result to a file:
$result | Set-Content -Path '.\Counts.txt'
The result looks like this:
numbers1.txt
a) total: 10
b) isbn: 2
c) duplicates: 1
numbers2.txt
a) total: 9
b) isbn: 2
c) duplicates: 0
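If (c) is instead read literally from the formula in the question, (a - b) minus the number of unique values starting with 3618, it could be computed roughly as below. This is only a sketch: the `$data` array is made-up sample input standing in for one file's lines, the variable names `$other` and `$uniq` are illustrative, and (b) is assumed to mean "every line that does not start with 3618".

```powershell
# Hypothetical sample data standing in for the lines of one file
$data = @(
    '3618001'
    '3618001'
    '3618002'
    '3618003'
    '3618003'
    '3618003'
    '9781234567897'
    '9791234567896'
)

# (a) total number of lines
$total = $data.Count
# (b) lines that do not start with 3618 (assumed reading of the question)
$other = @($data | Where-Object { $_ -notlike '3618*' }).Count
# number of distinct values among the lines that start with 3618
$uniq  = @($data | Where-Object { $_ -like '3618*' } | Sort-Object -Unique).Count
# (c) as written in the question: (a - b) - unique count, i.e. the surplus copies
$dupes = ($total - $other) - $uniq

"a) total: $total"
"b) isbn: $other"
"c) duplicates: $dupes"
```

With this sample the sketch reports a total of 8, an isbn count of 2 and 3 duplicates. The `Group-Object` variant would give 2 for the same data (two values occur more than once), and the commented-out variant would give 6 (all lines starting with 3618), so the three readings really do differ.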
Personally, though, I would prefer the output as a CSV file:
$result = Get-ChildItem -Path 'D:\Test' -Filter '*.txt' -File | ForEach-Object {
    $data = Get-Content -Path $_.FullName
    $isbn = @($data | Where-Object { $_ -like '97*' }).Count
    # if only numbers starting with 3618 that actually have exact duplicates (like 3618123 found multiple times), do this:
    $dupes = @($data | Where-Object { $_ -like '3618*' } | Group-Object | Where-Object { $_.Count -gt 1 }).Count
    # if ALL numbers starting with 3618 are to be regarded as duplicates, use this instead:
    # $dupes = @($data | Where-Object { $_ -like '3618*' }).Count
    # output as PsObject
    [PsCustomObject]@{
        File       = $_.Name
        Total      = $data.Count
        Isbn       = $isbn
        Duplicates = $dupes
    }
}
$result | Export-Csv -Path '.\Counts.csv' -UseCulture -NoTypeInformation
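The `-UseCulture` switch makes `Export-Csv` use the list separator of the current culture as the delimiter, which is what lets the file open cleanly in Excel on the same machine. To check the file from PowerShell, it can be read back with the same switch. The sketch below assumes two hand-written rows (taken from the sample output shown earlier) in place of the real `$result`, and a temporary file name, `Counts-demo.csv`, chosen only for illustration.

```powershell
# Hypothetical rows standing in for $result from the script above
$result = @(
    [PsCustomObject]@{ File = 'numbers1.txt'; Total = 10; Isbn = 2; Duplicates = 1 }
    [PsCustomObject]@{ File = 'numbers2.txt'; Total = 9;  Isbn = 2; Duplicates = 0 }
)

# A temporary path is assumed here so the sketch does not overwrite a real Counts.csv
$csv = Join-Path ([System.IO.Path]::GetTempPath()) 'Counts-demo.csv'

# Write with the culture's list separator as delimiter
$result | Export-Csv -Path $csv -UseCulture -NoTypeInformation

# Read it back with the same delimiter; every field comes back as a string
$rows = @(Import-Csv -Path $csv -UseCulture)
$rows | Format-Table -AutoSize

# Clean up the demo file
Remove-Item -Path $csv
```

If the CSV has to be read on a machine with different regional settings, passing an explicit `-Delimiter` to both cmdlets instead of `-UseCulture` is probably the safer choice.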
Regarding "powershell - Counting and filtering text files with PowerShell", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65125084/