python - Data is not being scraped correctly

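I am trying to scrape the following page with Scrapy: https://www2.trollandtoad.com/buylist/?_ga=2.123753418.115346513.1562026676-1813285172.1559913561#!/M/10591 . Some of the data is scraped correctly, but I cannot get the card name: its selector is identical to the one for the set name, so the Card_Name field ends up holding the set name as well.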

def parse(self, response):
    # Each matching table row is expected to describe one card
    for data in response.css("tr.ng-scope"):
        # GameItem is defined in items.py; create a fresh item for every row
        item = GameItem()
        item["Set"] = data.css("a.ng-binding.ng-scope::text").get()
        if item["Set"] is None:
            item["Set"] = data.css("span.ng-binding.ng-scope::text").get()
        # Same selector as "Set", so .get() returns the set name again (the problem described above)
        item["Card_Name"] = data.css("a.ng-binding.ng-scope::text").get()
        # Extract the condition, stock, and price from the same row
        item["Condition"] = data.css(r"td\.5557170.buylist_condition::text").get()
        item["Quantity"] = data.css("span.ng-binding::text").get()
        item["Price"] = data.css("span.ng-binding::text").get()
        yield item

Update #1

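I switched to XPath and can now get the card name instead of the set name, but it returns the same card name for every row rather than a different one per row.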

item["Card_Name"] = data.xpath("/html/body/div[2]/div[2]/div[1]/table[1]/tbody/tr[1]/td[2]/a/text()").get()

Best answer

card_names = response.xpath("//div/table/tbody/tr/td[contains(@class,'buylist_productname item')]/a/text()").getall()

This returns a list of the different card names, in the order in which they appear on the page.

This question is based on a similar one found on Stack Overflow: https://stackoverflow.com/questions/56858673/
