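I'm having trouble using Python 3 to read a file N lines at a time in a loop and POST each batch to a server. My file urls.txt looks like this (tens of thousands of lines):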
https://www.xxxx.com/html/1.html
https://www.xxxx.com/html/2.html
https://www.xxxx.com/html/3.html
https://www.xxxx.com/html/4.html
https://www.xxxx.com/html/5.html
https://www.xxxx.com/html/6.html
https://www.xxxx.com/html/7.html
https://www.xxxx.com/html/8.html
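I want to POST them to a server that accepts at most 2000 lines per request, so I'm looking for a way to send the file in batches. My code is as follows: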
    filename = 'urls.txt'
    max_lines = 2000
    url_list = []
    with open(filename,'r',encoding='utf-8') as f:
        while True:
            next_n_lines = list(islice(f,max_lines))
            if not next_n_lines:
                break
            url_list = [line.strip() for line in next_n_lines if line.strip() != '']
            data = '\n'.join(url_list)
            domain = 'www.xxxx.com'
            token = 'xsdssdsddsdsd'
            url = 'http://post.xxxx.com/urls?domain=%s&token=%s' % (
                domain, token)
            headers = {
                'Host': 'post.xxxx.com',
                'Content-Type': 'text/plain',
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0'
            }
            r = requests.post(url, headers=headers, data=data, timeout=5)
            data = r.json()
            print(data)
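However, when I run my code, it only posts the first 2000 lines of URLs to the server. What is wrong with my code? Could you give me some suggestions or another approach? Thanks for reading!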
Best answer
As far as I can tell, you are only pulling the first 2000 lines from the file. Consider this:
    from itertools import islice

    import requests

    filename = 'urls.txt'
    max_lines = 2000

    # These values do not change between batches, so build them once.
    domain = 'www.xxxx.com'
    token = 'xsdssdsddsdsd'
    url = 'http://post.xxxx.com/urls?domain=%s&token=%s' % (domain, token)
    headers = {
        'Host': 'post.xxxx.com',
        'Content-Type': 'text/plain',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0'
    }

    with open(filename, 'r', encoding='utf-8') as f:
        while True:
            # The file object remembers its position, so each call returns
            # the next max_lines lines. Passing a growing start offset to
            # islice here would skip lines that have not been read yet.
            next_n_lines = list(islice(f, max_lines))
            if not next_n_lines:
                break
            url_list = [line.strip() for line in next_n_lines if line.strip() != '']
            data = '\n'.join(url_list)
            # Keep the request inside the loop so every batch is posted.
            r = requests.post(url, headers=headers, data=data, timeout=5)
            print(r.json())
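The batching step can also be pulled out into a small generator so it can be checked without a file or a network connection. This is only a sketch under some assumptions: the names `iter_batches`, `post_in_batches`, and `send` are hypothetical and do not come from the question, and blank lines are dropped before batching (rather than after, as in the code above) so that every batch except possibly the last holds exactly `batch_size` URLs.

```python
import io
from itertools import islice


def iter_batches(lines, batch_size):
    """Yield lists of up to batch_size stripped, non-empty lines."""
    # Strip and filter lazily so the whole file is never held in memory.
    stripped = (line.strip() for line in lines)
    non_empty = (line for line in stripped if line)
    while True:
        # islice resumes where the previous call stopped, because the
        # underlying generator keeps its position between calls.
        batch = list(islice(non_empty, batch_size))
        if not batch:
            return
        yield batch


def post_in_batches(lines, batch_size, send):
    """Call send(body) once per batch; send is a hypothetical callback."""
    return [send("\n".join(batch)) for batch in iter_batches(lines, batch_size)]


# Stand-in for urls.txt: five URLs followed by one blank line.
sample = io.StringIO(
    "".join(f"https://www.xxxx.com/html/{i}.html\n" for i in range(1, 6)) + "\n"
)
sent = []  # collects request bodies instead of posting them
post_in_batches(sample, 2, sent.append)
print(len(sent))  # prints 3 (batches of 2, 2, and 1 URLs)
```

With a real file, `lines` would be the open file object and `send` would be a function wrapping the `requests.post(url, headers=headers, data=body, timeout=5)` call shown above.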
Regarding "python - How to use python3 to read a file N lines at a time in a loop and POST to a server", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53590280/