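I have code that scrapes data from a specific site, https://vancouver.craigslist.org/search/ela.
My problem is that when I run it, I get the error 'list' object has no attribute 'get_attribute'
on the line asdf = images.get_attribute("src").
I am using the selenium library to scrape the data. What I want is to insert the image URLs from the list named images
into my database, but I can't. What is wrong with my code? I am new to Python, which is why I am asking. Thank you very much for your consideration.
Current code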
x = driver.find_elements_by_class_name('hdrlnk')
y = driver.find_elements_by_xpath('//p[@class="result-info"]/span[@class="result-meta"]//span[@class="result-price"]')
images = driver.find_elements_by_xpath('//*[@id="sortable-results"]/ul/li/a/img')
for img in images:
    print(img.get_attribute('src'))
    for i in range(len(x)):
        asdf = images.get_attribute("src")
        prod = (x[i].text)
        price = (y[i].text)
        image = asdf
        sql = """INSERT INTO products (name,price,image) VALUES (%s,%s,%s)"""
        mycursor.execute(sql,(prod,price,image))
        mydb.commit()
When I comment out these lines
for img in images:
    print(img.get_attribute('src'))
and remove the asdf
and image
variables, I can insert the data. And when I comment out the following lines and keep the printing of the images,
#for i in range(len(x)):
#    asdf = images.get_attribute("src")
#    prod = (x[i].text)
#    price = (y[i].text)
#    image = asdf
#    sql = """INSERT INTO products (name,price,image) VALUES (%s,%s,%s)"""
#    mycursor.execute(sql,(prod,price,image))
#    mydb.commit()
I get the result I want, like this
https://images.craigslist.org/00z0z_4cqgwC5PIXs_300x300.jpg
https://images.craigslist.org/00J0J_f6AnAonGjXd_300x300.jpg
https://images.craigslist.org/00606_mtKNjKREOO_300x300.jpg
https://images.craigslist.org/00U0U_l5t0QnjZEPt_300x300.jpg
https://images.craigslist.org/00505_gIXt1C8aeqk_300x300.jpg
https://images.craigslist.org/00N0N_6P1GmSiL2vI_300x300.jpg
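Sample data for the x and y variables inside the i loop:
x = Spigen Magnetic Car Phone Mount
y = $20
What do I need to do to insert each image URL into the same row as its product name and price? TIA.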
Edit: I tried @terahertz's answer and rewrote my code like this
x = driver.find_elements_by_class_name('hdrlnk')
y = driver.find_elements_by_xpath('//p[@class="result-info"]/span[@class="result-meta"]//span[@class="result-price"]')
images = driver.find_elements_by_xpath('//*[@id="sortable-results"]/ul/li/a/img')
for img in images:
    # print(img.get_attribute('src'))
    for i in range(len(x)):
        asdf = img.get_attribute("src")
        prod = (x[i].text)
        price = (y[i].text)
        image = asdf
        sql = """INSERT INTO products (name,price,image) VALUES (%s,%s,%s)"""
        mycursor.execute(sql,(prod,price,image))
        mydb.commit()
Current database data
+----+----------------------------------------------------------+-------+-------------------------------------------------------------+
| id | name                                                     | price | image                                                       |
+----+----------------------------------------------------------+-------+-------------------------------------------------------------+
|  1 | Spigen Magnetic Car Phone Mount                          | $20   | https://images.craigslist.org/00i0i_7PvHxDMvR2o_300x300.jpg |
|  2 | Netgear Nighthawk x6 r8000 wireless router               | $120  | https://images.craigslist.org/00i0i_7PvHxDMvR2o_300x300.jpg |
|  3 | iPod Touch 8gb 2nd generation - Loaded with Classic Rock | $60   | https://images.craigslist.org/00i0i_7PvHxDMvR2o_300x300.jpg |
|  4 | 3 plug 3.1A fast USB wallplugs                           | $10   | https://images.craigslist.org/00i0i_7PvHxDMvR2o_300x300.jpg |
|  5 | Audio and Video Cables                                   | $3    | https://images.craigslist.org/00i0i_7PvHxDMvR2o_300x300.jpg |
|  6 | Like New Samsung 50" HD TV ForSale                       | $400  | https://images.craigslist.org/00i0i_7PvHxDMvR2o_300x300.jpg |
|  7 | SONY Alarm Clock                                         | $20   | https://images.craigslist.org/00i0i_7PvHxDMvR2o_300x300.jpg |
|  8 | Bowers & Wilkins P7 Wireless MINT                        | $450  | https://images.craigslist.org/00i0i_7PvHxDMvR2o_300x300.jpg |
+----+----------------------------------------------------------+-------+-------------------------------------------------------------+
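Now I can insert into my database, but the problem is that the image
column holds the same value in every row, as if only one image URL had been inserted. When I visit the link, the product name and the image do not match.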
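Best answer
Change asdf = images.get_attribute("src")
to asdf = img.get_attribute("src").
Your outer loop uses the variable img
to access each item in the images
list, but inside your inner loop you are calling get_attribute on the images
list itself, and a list has no such method.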
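
The change above removes the AttributeError, but if the loops are nested as described, every image is still combined with every product, which would explain the repeated image URL in the follow-up edit. Here is a minimal sketch of one way to pair the values by position instead, using the built-in zip. The three plain lists are stand-ins for the scraped values, and the name-to-URL pairing is illustrative only. It assumes the name, price, and image lists line up index by index, which the question does not confirm.

```python
# Stand-in data (assumption): in the question's script these would be built as
# [e.text for e in x], [e.text for e in y], and
# [img.get_attribute("src") for img in images].
names = [
    "Spigen Magnetic Car Phone Mount",
    "Netgear Nighthawk x6 r8000 wireless router",
]
prices = ["$20", "$120"]
image_urls = [
    "https://images.craigslist.org/00z0z_4cqgwC5PIXs_300x300.jpg",
    "https://images.craigslist.org/00J0J_f6AnAonGjXd_300x300.jpg",
]

# Nested loops: each image is combined with every product, so the first
# len(names) rows all carry the first image URL.
nested_rows = [
    (name, price, url)
    for url in image_urls
    for name, price in zip(names, prices)
]

# Lockstep iteration: row i gets name i, price i, and image i.
paired_rows = list(zip(names, prices, image_urls))

print(len(nested_rows), "rows from nested loops")
print(len(paired_rows), "rows from zip")
for row in paired_rows:
    print(row)
```

In the question's script, each tuple in paired_rows could be passed to mycursor.execute(sql, row), with a single mydb.commit() after the loop. If some listings have no image or no price, the lists will no longer line up, and the fields would have to be read per listing element instead.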
Regarding this Python nested-loop error when inserting into a database, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58810734/