python-3.x - Cloud Natural Language API returns socket.gaierror: nodename nor servname provided after performing Sentiment Analysis every now and then

Tags: python-3.x google-cloud-platform python-requests

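I am running this code in a Jupyter notebook. I modified the code from this link so that it takes its input from the notebook instead of the console and iterates over a list of files.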

"""Demonstrates how to make a simple call to the Natural Language API."""

import argparse
import requests
import pandas as pd  # needed for the DataFrame built in print_result
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types


def print_result(annotations, movie_review_filename):
    """Write the per-sentence sentiment of one file to a CSV named after it."""
    # Use the file name without its 4-character ".txt" suffix as the status id.
    file_name = movie_review_filename.split("/")[-1][:-4]

    # One row per sentence: id, text, magnitude, score.
    sentence_rows = []
    for sentence in annotations.sentences:
        sentence_rows.append([
            file_name,
            sentence.text.content,
            sentence.sentiment.magnitude,
            sentence.sentiment.score,
        ])

    output_df = pd.DataFrame(
        sentence_rows,
        columns=['status_id', 'sentence', 'sentence_magnitude', 'sentence_sentiment'])

    output_df.to_csv("/Users/abhi/Desktop/RetrySentenceCSVs/" + file_name + ".csv", index=False)


def analyze(movie_review_filename):
    """Run a sentiment analysis request on text within a passed filename."""
    client = language.LanguageServiceClient()

    with open(movie_review_filename, 'r') as review_file:
        # Instantiates a plain text document.
        content = review_file.read()

    document = types.Document(
        content=content,
        type=enums.Document.Type.PLAIN_TEXT)
    annotations = client.analyze_sentiment(document=document)

    # Print the results
    print_result(annotations, movie_review_filename)


if __name__ == '__main__':

    import glob
    csv_file_list = glob.glob("/Users/abhi/Desktop/mytxtfilepath/*.txt")
    for file in csv_file_list: #Iterate through a list of file paths

        analyze(file)

For about 10% of my text files (I have 687), the code runs fine, but after a while it starts throwing errors:
ERROR:root:AuthMetadataPluginCallback "<google.auth.transport.grpc.AuthMetadataPlugin object at 0x113b76588>" raised exception!
Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 171, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/util/connection.py", line 56, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/anaconda3/lib/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in _make_request
    self._validate_conn(conn)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 849, in _validate_conn
    conn.connect()
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 314, in connect
    conn = self._new_conn()
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 180, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/requests/adapters.py", line 445, in send
    timeout=timeout
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/anaconda3/lib/python3.6/site-packages/urllib3/util/retry.py", line 398, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='accounts.google.com', port=443): Max retries exceeded with url: /o/oauth2/token (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/requests.py", line 120, in __call__
    **kwargs)
  File "/anaconda3/lib/python3.6/site-packages/requests/sessions.py", line 512, in request
    resp = self.send(prep, **send_kwargs)
  File "/anaconda3/lib/python3.6/site-packages/requests/sessions.py", line 622, in send
    r = adapter.send(request, **kwargs)
  File "/anaconda3/lib/python3.6/site-packages/requests/adapters.py", line 513, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='accounts.google.com', port=443): Max retries exceeded with url: /o/oauth2/token (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/grpc/_plugin_wrapping.py", line 77, in __call__
    callback_state, callback))
  File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/grpc.py", line 77, in __call__
    callback(self._get_authorization_headers(context), None)
  File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/grpc.py", line 65, in _get_authorization_headers
    headers)
  File "/anaconda3/lib/python3.6/site-packages/google/auth/credentials.py", line 122, in before_request
    self.refresh(request)
  File "/anaconda3/lib/python3.6/site-packages/google/oauth2/service_account.py", line 322, in refresh
    request, self._token_uri, assertion)
  File "/anaconda3/lib/python3.6/site-packages/google/oauth2/_client.py", line 145, in jwt_grant
    response_data = _token_endpoint_request(request, token_uri, body)
  File "/anaconda3/lib/python3.6/site-packages/google/oauth2/_client.py", line 106, in _token_endpoint_request
    method='POST', url=token_uri, headers=headers, body=body)
  File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/requests.py", line 124, in __call__
    six.raise_from(new_exc, caught_exc)
  File "<string>", line 3, in raise_from
google.auth.exceptions.TransportError: HTTPSConnectionPool(host='accounts.google.com', port=443): Max retries exceeded with url: /o/oauth2/token (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
ERROR:root:AuthMetadataPluginCallback "<google.auth.transport.grpc.AuthMetadataPlugin object at 0x113b76588>" raised exception!
Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 171, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/util/connection.py", line 56, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/anaconda3/lib/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in _make_request
    self._validate_conn(conn)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 849, in _validate_conn
    conn.connect()
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 314, in connect
    conn = self._new_conn()
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 180, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x113b84470>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:
...

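The error repeats, then sentiment analysis runs on a file, then the error appears several more times, then sentiment analysis runs on another file, and finally everything stops with a RendezVous error (I forgot to capture that message). What I want to know is: how can the code work for some set of files for a while, throw error messages, work a little longer, throw error messages again, and then stop working altogether after some time?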

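I re-ran the code and found that it returns socket.gaierror after a random number of files in the folder. So you can be fairly confident that the problem is not the file contents.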

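EDIT1: The files are just ordinary .txt files containing text.
Can someone help me solve this? I can also assure you that all the text in all 680 of my files adds up to 1400 requests in total; I calculated this very carefully according to the Cloud Natural Language API's definition of a request. So I am well within my limits.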

EDIT2: I tried sleep(10), which seems to work for a while, but then the errors start again.
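A fixed sleep(10) pauses whether or not a call failed. One alternative is to wait only after a failure and retry the same file with a growing delay. The following is a minimal sketch of that idea, not something tested against this error: `call_with_retry` and its parameters are hypothetical names, and it catches every exception for simplicity.

```python
import time


def call_with_retry(func, *args, attempts=3, base_delay=10.0, sleep=time.sleep):
    """Call func(*args); on an exception, wait and retry up to `attempts` times.

    Hypothetical helper: the name and parameters are assumptions for illustration.
    """
    for attempt in range(attempts):
        try:
            return func(*args)
        except Exception:
            # Give up and re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

In the loop from the question, this would be used as `call_with_retry(analyze, file)` in place of `analyze(file)`.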

Best answer

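I figured it out. Instead of reading all 600 files at once, try reading them in batches of 50 (create 12 folders with 50 files each) and run the code manually each time a folder has been processed. I'm not sure why this works, but it does.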
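The batching could also be done in code rather than with 12 folders. This is only a sketch under assumptions: `batches` and `process_in_batches` are hypothetical helpers, and because the answer restarted the script by hand for each folder, a loop inside a single process may not have the same effect.

```python
import time


def batches(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_in_batches(paths, handler, size=50, pause_seconds=0.0):
    """Apply `handler` to every path, one batch at a time.

    Hypothetical helper: pauses for `pause_seconds` after each batch.
    """
    for batch in batches(paths, size):
        for path in batch:
            handler(path)
        time.sleep(pause_seconds)
```

With the question's code, the call would look like `process_in_batches(glob.glob("/Users/abhi/Desktop/mytxtfilepath/*.txt"), analyze)`, where the batch size of 50 comes from the answer and the pause length is left to experiment.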

A similar question to "python-3.x - Cloud Natural Language API returns socket.gaierror: nodename nor servname provided after performing Sentiment Analysis every now and then" can be found on Stack Overflow: https://stackoverflow.com/questions/51775484/
