I wrote a Python script that parses some URLs and stores them in a DataFrame. The script runs, but the output is not what I expected.
This is what I tried:
import requests
from bs4 import BeautifulSoup
import pandas as pd

base = 'http://opml.radiotime.com/Search.ashx?query=kroq'
linklist = []

r = requests.get(base)
soup = BeautifulSoup(r.text, "xml")

# The search term is the part of the URL after the last "="
find_match = base.split("=")[-1].lower()

# Keep only audio outlines whose text mentions the search term
for item in soup.select("outline[type='audio'][URL]"):
    if find_match in item['text'].lower():
        linklist.append(item['URL'])

df = pd.DataFrame(linklist, columns=[find_match])
print(df)
Current output:
0 http://opml.radiotime.com/Tune.ashx?id=s35105
1 http://opml.radiotime.com/Tune.ashx?id=s26581
2 http://opml.radiotime.com/Tune.ashx?id=t122458...
3 http://opml.radiotime.com/Tune.ashx?id=t132149...
4 http://opml.radiotime.com/Tune.ashx?id=t131867...
5 http://opml.radiotime.com/Tune.ashx?id=t120569...
6 http://opml.radiotime.com/Tune.ashx?id=t125126...
7 http://opml.radiotime.com/Tune.ashx?id=t131068...
8 http://cdn-cms.tunein.com/service/Audio/nostre...
9 http://cdn-cms.tunein.com/service/Audio/notcom...
Expected output (if possible, I would also like to drop the index):
0 http://opml.radiotime.com/Tune.ashx?id=s35105
1 http://opml.radiotime.com/Tune.ashx?id=s26581
2 http://opml.radiotime.com/Tune.ashx?id=t122458
3 http://opml.radiotime.com/Tune.ashx?id=t132149
4 http://opml.radiotime.com/Tune.ashx?id=t131867
5 http://opml.radiotime.com/Tune.ashx?id=t120569
6 http://opml.radiotime.com/Tune.ashx?id=t125126
7 http://opml.radiotime.com/Tune.ashx?id=t131068
8 http://cdn-cms.tunein.com/service/Audio/nostre
9 http://cdn-cms.tunein.com/service/Audio/notcom
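Best answer
You can left-align the column (the Styler only affects HTML rendering, for example in a notebook, not print(df)). To get rid of the index, drop it when writing to CSV: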
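# Returns a Styler object; it does not modify df itself
df.style.set_properties(**{'text-align': 'left'})
# index=False leaves the row index out of the file
df.to_csv('Data.csv', sep=',', encoding='utf-8-sig', index=False)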
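The trailing "..." in the printed output most likely comes from pandas' display settings, not from the stored data: print(df) shortens long strings according to the display.max_colwidth option. The following is a minimal sketch of printing the frame without the index and without shortening, assuming pandas 1.0 or later (where to_string applies no column-width limit by default and display.max_colwidth accepts None). The third URL is a made-up placeholder used only to trigger truncation; it does not come from the question.

```python
import pandas as pd

# Two URLs copied from the question, plus one hypothetical long URL
# (invented for illustration) that exceeds the default display width.
links = [
    "http://opml.radiotime.com/Tune.ashx?id=s35105",
    "http://opml.radiotime.com/Tune.ashx?id=s26581",
    "http://example.com/stream?id=" + "x" * 60,  # hypothetical
]
df = pd.DataFrame(links, columns=["kroq"])

# to_string(index=False) renders the frame as text without the index column.
# Assumption: pandas >= 1.0, where to_string does not shorten long cells
# unless max_colwidth is passed explicitly.
text = df.to_string(index=False)
print(text)

# Alternative for a plain print(df): lift the column-width limit only
# inside this block (the index is still shown here).
with pd.option_context("display.max_colwidth", None):
    print(df)
```

Either approach changes only how the frame is displayed; the full URLs are already stored in the DataFrame, so writing it with to_csv(..., index=False) produces complete values as well.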
Regarding "python - Unable to store output in a dataframe in a customized way", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57221108/