python - Unable to convert scraped data from a list to regular strings

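When I run my scraper, it returns the results as lists. However, I would like them displayed as regular strings in two columns. Thanks for any suggestions.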

import requests
from lxml import html

url="http://www.wiseowl.co.uk/videos/"
def Startpoint(links):
    response = requests.get(links)
    tree = html.fromstring(response.text)
    Title= tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']/a/text()")
    Link=tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']/a/@href")
    print(Title,Link)

Startpoint(url)

I get a result like this:

However, the output I expect looks like this:

Best answer

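Your Title and Link variables do not each hold a single element. Each one holds a list, of all the titles and all the links respectively, because these XPath expressions match multiple elements.

So, to get a list of (title, link) pairs, you need to zip() the two lists together: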

pairs = zip(titles, links)
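
As a minimal sketch of what zip() does here, the snippet below pairs two small lists by position. The sample titles and links are made up for illustration and are not taken from the scraped page.

```python
# Hypothetical sample data standing in for the two XPath result lists
titles = ["Excel VBA", "SQL Server", "Python"]
links = ["/videos/vba.htm", "/videos/sql.htm", "/videos/python.htm"]

# zip() pairs items by position and returns an iterator;
# list() materialises it so the pairs can be inspected
pairs = list(zip(titles, links))

for title, link in pairs:
    print(title, link)
```

Note that zip() stops at the end of the shorter list, so if the two XPath queries ever return different numbers of items, the extra entries are silently dropped.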

Once you have the pairs, you can iterate over them with a for loop and print each title left-aligned so that the output forms columns:

print('{:<70}{}'.format(title, link))

(For details on how to print left-aligned items, see this answer.)
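
To make the format specifier concrete, here is a small sketch with a narrower width than 70 so the padding is easy to see. The title and link values are hypothetical.

```python
# Hypothetical title/link pair, used only for illustration
title = "Excel VBA"
link = "/videos/vba.htm"

# '<' left-aligns the value; 12 is the minimum field width, padded with spaces
row = '{:<12}{}'.format(title, link)
print(repr(row))

# A value longer than the width is not truncated; it just pushes the next column right
tight = '{:<5}{}'.format(title, link)
print(repr(tight))
```

Because the width is only a minimum, columns stay aligned only when it is at least as large as the longest title.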

Putting it all together:

import requests
from lxml import html

url = "http://www.wiseowl.co.uk/videos/"


def startpoint(page_url):
    response = requests.get(page_url)
    tree = html.fromstring(response.text)
    # Each XPath query returns a list with one entry per matching element
    titles = tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']/a/text()")
    links = tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']/a/@href")
    pairs = zip(titles, links)

    for title, link in pairs:
        # Replace '70' with whatever you expect the maximum title length to be
        print('{:<70}{}'.format(title, link))

startpoint(url)
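
If guessing a fixed width such as 70 is inconvenient, one possible variation is to compute the width from the longest title. The helper below is a sketch only: `format_columns` and its `gap` parameter are hypothetical names, not part of the original answer, and it works on plain lists so it can be tried without fetching the page.

```python
def format_columns(titles, links, gap=2):
    """Return one left-aligned 'title  link' line per (title, link) pair."""
    # Strip stray whitespace that text() nodes sometimes carry
    pairs = [(title.strip(), link) for title, link in zip(titles, links)]
    if not pairs:
        return []
    # Column width = longest title plus a small gap (assumed default: 2 spaces)
    width = max(len(title) for title, _ in pairs) + gap
    # The nested {w} field supplies the width at format time
    return ['{:<{w}}{}'.format(title, link, w=width) for title, link in pairs]


# Hypothetical sample data for illustration
lines = format_columns(["A", "Long title"], ["/a", "/b"])
for line in lines:
    print(line)
```

Inside the function above, this would replace the final for loop with `for line in format_columns(titles, links): print(line)`.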

Regarding python - Unable to convert scraped data from a list to regular strings, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43924526/
