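I need to mask sensitive information in log files, such as first name, last name, date of birth, SSN, etc., but these fields do not occur in any fixed pattern. Fields like the ones below need to be found throughout the log and their values masked with xxxxx. Please help.
Log block with sample data: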
ghix.log.2014-07-25: INFO 07/25/2014 17:13:14 (PlanDisplayRestClient.java:272) - Fetching IndividualPlanList : {"inputData":{"household":{"CSR":"CS4","APTC":"187.0"},"issuerVerifiedFlag":true,"totalContribution":null,"enrollmentType":"I","exchangeType":"ON","showCatastrophicPlan":"false","issuerId":null,"groupId":3098,"eligLeadBenefits":",Nutritional counseling,Weight loss programs","subscriberData":null,"planIdStr":"","planType":"Both","tenant":"","providers":[{"id":"18900405","name":"Dr. Rakshit Kumar","networkId":["33602-TXN001","87006-TXN001","91986-TXN001","32600-TXN001","32600-TXN002","355678-FLN001"],"networkTier":"","spciality":"Counseling/Social Work","city":"Austin","state":"TX","providerType":"DOCTOR","networkTierList":null,"networkIdList":["33602-TXN001","87226-TXN001","91716-TXN001","3278-TXN001","32698-TXN002","3545-FLN001"]}],"planLevel":"","isSpecialEnrollment":"NO","pgrmType":"INDIVIDUAL","coverageStartDate":"01/01/2001","insuranceType":"HEALTH","preferences":{"highDrugUseVal":0.0,"lowDrugUseVal":0.0,"moderateMedicalVal":0.0,"highMedicalVal":0.0,"vHighDrugUseVal":0.0,"moderateDrugUseVal":0.0,"vHighMedicalVal":0.0,"lowMedicalVal":0.0}},"groupDataList":[{"groupId":3098,"aptc":187.0,"remainingAptc":0.0,"csr":"CS4","zipcode":"44444","countycode":"45555","personDataList":[{"personId":"1","externalPersonId":null,"existingMedicalEnrollmentID":null,"existingSADPEnrollmentID":null,"firstname":"Primary","lastname":"Tax Filer","dob":"1/1/2001","smoker":"N","dentalEligible":"NO","relationship":"Self","employerContribution":null,"gender":null},{"personId":"2","externalPersonId":null,"existingMedicalEnrollmentID":null,"existingSADPEnrollmentID":null,"firstname":"Primary","lastname":"Tax Filer","dob":"1/1/2001","smoker":"N","dentalEligible":"NO","relationship":"Child","employerContribution":null,"gender":null}]}],"pldHouseholdPersonList":null,"providersList":[{"id":"1000405","name":"Dr. 
Rakshit Kumar","networkId":["33002-TXN001","87000-TXN001","91000-TXN001","30003-TXN001","32003-TXN002","35000-FLN001"],"networkTier":"","spciality":"Counseling/Social Work","city":"Austin","state":"TX","providerType":"DOCTOR","networkTierList":null,"networkIdList":["33000-TXN001","87200-TXN001","90006-TXN001","30003-TXN001","30003-TXN002","35000-FLN001"]}],"eligLeadId":null,"ssapApplicationId":null,"consumerData":null}
Data to be masked:
"dob":"1/1/2001"
"name":"Dr. Rakshit Kumar"
"smoker":"N"
"dentalEligible":"NO"
It should look like:
"dob":"xxxxx"
"name":"xxxxx"
"smoker":"x"
"dentalEligible":"x"
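Best answer
Your log contains valid JSON strings, so you need to:
- read each line
- extract the JSON from the line
- parse the JSON into some internal data structure
- change the required fields
- export the changed data structure as JSON
- write the masked log to a new file
- done
- profit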
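Edit
Maybe this can be done in bash with some tools, but I use Perl for this kind of task. Trying to teach Perl from scratch would really be off topic.
Alternatively, try searching Google for ways to manipulate JSON from bash and the like.
To get the JSON part, you can use something like the following: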
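marker='IndividualPlanList : '
while IFS= read -r line
do
    # lines without the marker carry no JSON payload; copy them unchanged
    if [[ $line != *"$marker"* ]]; then
        printf '%s\n' "$line"
        continue
    fi
    # split at the first occurrence of the marker
    part1=${line%%"$marker"*}$marker
    json=${line#*"$marker"}
    #do something with the JSON (placeholder: passed through unchanged)
    newjson=$json
    #write out the new line
    printf '%s\n' "$part1$newjson"
done < logfile.txt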
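One possible way to fill in the "do something with the JSON" step without leaving the shell is one sed substitution per group of fields. This is only a minimal sketch and not a real JSON parser: it assumes the values to mask are plain quoted strings with no escaped double quotes, as in the sample log. The function name mask_json and the field lists are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: masks selected string fields in one line of JSON text.
# Regex-based, so it assumes the values contain no escaped double quotes.
mask_json() {
    printf '%s\n' "$1" | sed -E \
        -e 's/"(dob|name)":"[^"]*"/"\1":"xxxxx"/g' \
        -e 's/"(smoker|dentalEligible)":"[^"]*"/"\1":"x"/g'
}

# The leading quote in the pattern keeps "firstname" from matching "name".
sample='{"name":"Dr. Rakshit Kumar","firstname":"Primary","dob":"1/1/2001","smoker":"N","dentalEligible":"NO","gender":null}'
mask_json "$sample"
```

In the loop above, the newjson assignment would then become newjson=$(mask_json "$json"). Other keys, such as firstname or lastname, could be masked by adding them to the first alternation. If the values can contain escaped quotes or nested structures, parsing the JSON with a proper library, as the steps in the answer describe, is the safer route.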
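Regarding linux - need a shell/Perl script to mask sensitive information such as first name, date of birth, SSN, etc. in log files on Linux: a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25003481/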