I'm running into the following problem when trying to output the items of one list that are not in a second list.
Here is the code:
import requests
from bs4 import BeautifulSoup

def getInitialList():  # Build the initial list with requests and BS; returns a list of matching divs
    getInHtml = requests.get("http://127.0.0.1")
    parseInHtml = BeautifulSoup(getInHtml.content, "html.parser")
    processInHtml = parseInHtml.find_all("div", class_="inner-article")
    firstList = []
    for items in processInHtml:
        firstList.append(items)
    return firstList

def getSecList():  # Build the second list with requests and BS; returns a list of matching divs
    getHtml = requests.get("http://127.0.0.1")
    parseHtml = BeautifulSoup(getHtml.content, "html.parser")
    processHtml = parseHtml.find_all("div", class_="inner-article")
    secList = []
    for items in processHtml:
        secList.append(items)
    return secList

def catch_new_item():
    initList = getInitialList()
    while True:
        if initList == getSecList():
            print("No new items")
        else:
            print("New items found")
            break
    secList = getSecList()
    return set(secList) - set(initList)

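The last function (catch_new_item()) should return the items in secList that are not in initList, but when I run it, it returns an empty set.
The address 127.0.0.1 is a local web server that I run to test the difference between the two lists. All I do is edit the HTML and add an element to it.
What do you think is going wrong?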
Best answer
I modified the code like this for debugging:
def getInitialList():  # Debug stub: hard-coded list instead of the scraped page
    firstList = ['1', '2', '3']
    return firstList

def getSecList():  # Debug stub: hard-coded list instead of the scraped page
    secList = ['a', 'b', '3', '1']
    return secList

def catch_new_item():
    initList = getInitialList()
    while True:
        if initList == getSecList():
            print("No new items")
        else:
            print("New items found")
            break
    secList = getSecList()
    return set(secList) - set(initList)

print(catch_new_item())
It prints (the order of the set elements may vary):
New items found
{'a', 'b'}
So the element-detection logic itself is fine.
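One caveat worth keeping in mind: list equality is sensitive to order and duplicates, while set difference is not, so the loop can report a change even though no value is actually new. A minimal sketch with made-up string values (not taken from the page in the question):

```python
# Hypothetical data: the second list is reordered and repeats an existing item.
init_list = ["1", "2", "3"]
sec_list = ["3", "1", "2", "2"]

# The lists compare unequal because order and length differ...
lists_differ = init_list != sec_list

# ...but every value in sec_list already exists in init_list.
new_items = set(sec_list) - set(init_list)

print(lists_differ)  # True
print(new_items)     # set()
```

If the element added to the HTML happened to be an exact copy of an existing one, or if elements were only removed or reordered, this may be what produces "New items found" followed by an empty set.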
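- Have you tried printing the lists returned by getInitialList() and getSecList() to see whether they are empty?
- Do the lists really contain different items? (If they are not empty, see point 1.)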
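The two checks above could be sketched as a small helper. This is only an illustration under assumptions: `report_lists` is a hypothetical name, and the string lists stand in for the scraped elements (comparing on `str(item)` is one possible choice, not something the original code does).

```python
def report_lists(init_list, sec_list):
    """Print both list sizes and return the items found only in sec_list, in order."""
    print(f"initial: {len(init_list)} items, second: {len(sec_list)} items")
    # Compare on the string form so the check does not depend on object identity.
    seen = {str(item) for item in init_list}
    only_in_second = [item for item in sec_list if str(item) not in seen]
    print(f"only in second: {only_in_second}")
    return only_in_second


# Stand-in data; in the real script these would be getInitialList() and getSecList().
result = report_lists(["1", "2", "3"], ["a", "b", "3", "1"])
```

If both sizes print as 0, the selector is probably not matching anything; if the sizes match and nothing is reported as new, the two requests most likely returned the same page.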
This question (python - difference between 2 lists) comes from Stack Overflow: https://stackoverflow.com/questions/52581359/