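I recently had to copy the database on my website in an emergency.
I scraped it using some functions I had written in Python as part of my admin code. The database has the following format: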
Name:
Phone Number:
Has played the game:
Everything was copied to a .txt file, but sometimes I find errors in the file, such as:
Name: Name: Name: Bob
How can I clean up this mess with shell commands or Python while keeping the same order (I still want it to be name, phone number, and so on)?
Best answer
Suppose db.txt contains this:
Phone Number:
Phone Number: Phone Number: Phone Number: 0118521358 Name: Name: Name: Name: Bob
Has played the game:
Name: Name: Name: Name: Bob
You can try a small script like this:
import re

# Every column label that appears in the database dump
db_header = ['Name:', 'Phone Number:', 'Has played the game:']

# Read the file with the discrepancies and write the cleaned copy to new_file
with open('db.txt', 'r') as file_with_error, open('new_file', 'w') as new_file:
    # Go through the dump line by line so the original order is preserved
    for line in file_with_error:
        for col_name in db_header:
            label = re.escape(col_name)
            # Remove each copy of the label that is followed by another copy,
            # so only the last copy (and the space after it) remains
            line = re.sub(r"(?:%s[ ]*)+(?=%s)" % (label, label), "", line)
        new_file.write(line)
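To check the pattern before running it against the real file, the substitution can be wrapped in a function and tried on a few lines in memory. This is only a minimal sketch: the names collapse_repeated_labels and LABELS are made up for illustration, and it assumes the repeated labels are separated by nothing but spaces, as in the sample above. It uses a lookahead so that the space after the last label is kept, which differs slightly from substituting the bare label.

```python
import re

# Assumed column labels, taken from the format shown in the question
LABELS = ("Name:", "Phone Number:", "Has played the game:")


def collapse_repeated_labels(line, labels=LABELS):
    """Return line with each run of a repeated label reduced to one copy."""
    for label in labels:
        escaped = re.escape(label)
        # Remove copies of the label that are directly followed by another copy;
        # the final copy and the space after it are left in place.
        line = re.sub(r"(?:%s *)+(?=%s)" % (escaped, escaped), "", line)
    return line


# The sample lines from db.txt above
sample = [
    "Phone Number:",
    "Phone Number: Phone Number: Phone Number: 0118521358 Name: Name: Name: Name: Bob",
    "Has played the game:",
    "Name: Name: Name: Name: Bob",
]
cleaned = [collapse_repeated_labels(line) for line in sample]
for line in cleaned:
    print(line)
```

With the sample above, the second line comes out as "Phone Number: 0118521358 Name: Bob", while lines that hold a single label pass through unchanged.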
On the topic of Python: cleaning up a file with multiple duplicates, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37763443/