python - Cannot access tweet IDs with Beautiful Soup

My goal is to retrieve the IDs of the tweets being posted in a Twitter search. So far, my code looks like this:

import requests
from bs4 import BeautifulSoup

keys = some_key_words + " -filter:retweets AND -filter:replies"
query = "https://twitter.com/search?f=tweets&vertical=default&q=" + keys + "&src=typd&lang=es"
req = requests.get(query).text
soup = BeautifulSoup(req, "lxml")

for tweets in soup.findAll("li",{"class":"js-stream-item stream-item stream-item"}):
    print(tweets)

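However, this returns nothing. Is something wrong with the code itself, or am I looking in the wrong place in the page source? I know the IDs should be stored here, in the data-item-id attribute: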

<div class="stream">
  <ol class="stream-items js-navigable-stream" id="stream-items-id">
    <li class="js-stream-item stream-item stream-item" data-item-id="1210306781806833664" id="stream-item-tweet-1210306781806833664" data-item-type="tweet">

Best answer

from bs4 import BeautifulSoup

# Sample markup copied from the question; note the line break inside the class attribute.
data = """
<div class="stream">
    <ol class="stream-items js-navigable-stream" id="stream-items-id">
        <li class="js-stream-item stream-item stream-item
" data-item-id="1210306781806833664"
id="stream-item-tweet-1210306781806833664"
data-item-type="tweet"
>
        ...
"""

soup = BeautifulSoup(data, 'html.parser')

# Match the <li> elements by class, then read the ID from the data-item-id attribute.
for item in soup.find_all("li", {'class': 'js-stream-item stream-item stream-item'}):
    print(item.get("data-item-id"))

Output:

1210306781806833664
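
As a hedged variation on the snippet above, the lookup can key on the data-item-id attribute itself instead of the exact class string, so that differences in whitespace or class order do not matter. This is only a sketch: the helper name extract_tweet_ids and the second ID in the sample are made up for illustration, and it assumes the HTML you pass in actually contains the <li> markup shown in the question. If it returns an empty list for the page fetched with requests, one possible explanation (not confirmed here) is that the response does not contain that markup at all.

```python
from bs4 import BeautifulSoup


def extract_tweet_ids(html):
    """Return the data-item-id value of every <li> that has that attribute.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Select on the presence of the attribute rather than on the class string.
    return [li["data-item-id"] for li in soup.select("li[data-item-id]")]


# Sample markup modelled on the question. The second ID is invented for the
# example and its class list deliberately differs from the first item.
sample = """
<ol class="stream-items js-navigable-stream" id="stream-items-id">
  <li class="js-stream-item stream-item stream-item
" data-item-id="1210306781806833664" data-item-type="tweet"></li>
  <li class="js-stream-item stream-item" data-item-id="1210306781806833665" data-item-type="tweet"></li>
</ol>
"""

print(extract_tweet_ids(sample))
```

Returning a list rather than printing inside the loop makes it easy to check whether anything was found before going further.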

Regarding "python - Cannot access tweet IDs with Beautiful Soup", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59502596/
