python - Optimizing Python that processes JSON retrieved from the fb-graph-api

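I am pulling JSON data from the facebook-graph-api about:

my relationships with my friends
the relationships between my friends.

Right now my program looks like this (in Python pseudocode; note that some variables have been changed for privacy):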

import json
import requests

# protected
_accessCode = "someAccessToken"
_accessStr = "?access_token=" + _accessCode
_myID = "myIDNumber"

r = requests.get("https://graph.facebook.com/" + _myID + "/friends/" + _accessStr)
raw = json.loads(r.text)

terminate = len(raw["data"])

# list used to store the friend/friend relationships
a = list()

for j in range(0, terminate):
    # calculate terminating displacement:
    term_displacement = terminate - (j + 1) 
    print("Currently processing: " + str(j) + " of " + str(terminate))
    for dj in range(1, term_displacement + 1):
        # construct urls based on the raw data:
        url = "https://graph.facebook.com/" + raw["data"][j]["id"] + "/friends/" + raw["data"][j + dj]["id"] + "/" + _accessStr
        # visit site *THIS IS THE BOTTLENECK*:
        reqTemp = requests.get(url)
        rawTemp = json.loads(reqTemp.text)
        if len(rawTemp["data"]) != 0:
            # data dumps to list which dumps to file
            a.append(str(raw["data"][j]["id"]) + "," + str(rawTemp["data"][0]["id"]))

outputFile = "C:/Users/franklin/Documents/gen/friendsRaw.csv"
output = open(outputFile, "w")

# write all me/friend relationship to file
for k in range(0, terminate):
    output.write(_myID + "," + raw["data"][k]["id"] + "\n")

# write all friend/friend relationships to file
for i in range(0, len(a)):
    output.write(a[i] + "\n")

output.close()

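Here is what it does: first it calls my page and gets my friend list (the Facebook API allows this with an access_token). Requesting a friend's friend list is NOT allowed, but I can work around that by requesting the relationship between one friend on my list and another friend on my list. So in the second part (the nested for loops) I make another request to see whether some friend a is also a friend of b (both of whom are on my list); if so, the response contains a JSON object of length one with friend a's name.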

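But with about 357 friends, that is literally thousands of page requests. In other words, the program spends most of its time just waiting for the JSON responses.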
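To put a number on this: checking every unordered pair among n friends costs n(n-1)/2 requests, so the count grows quadratically. A quick sketch of that arithmetic for the 357 friends mentioned above (the variable names and the four-element ID list are illustrative, not taken from the original program):

```python
import math
from itertools import combinations

n_friends = 357  # figure quoted in the question

# One request per unordered pair of friends.
pair_requests = math.comb(n_friends, 2)
print(pair_requests)

# The same kind of pairs the nested loops visit, on a tiny made-up ID list.
sample_ids = ["a", "b", "c", "d"]
sample_pairs = list(combinations(sample_ids, 2))
print(len(sample_pairs))
```

For 357 friends this comes to 63,546 pairwise requests, which is why the waiting time dominates.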

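My question is: can this be rewritten to be more efficient? Currently, because of security restrictions, requesting a friend's friend list attribute is disallowed, and the API does not appear to offer a way around that. Are there any Python tricks that would make this run faster? Parallelism, perhaps?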
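As a rough illustration of the parallel idea, the requests could be spread over a thread pool, since the work is I/O-bound. This is only a sketch under assumptions: `fetch_all` and `fake_fetch` are hypothetical helpers, the URLs are placeholders, and the fake fetcher stands in for a real HTTP call so the example runs without a network connection or token. Any rate limit the API enforces would still apply.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def fake_fetch(url):
    # Stand-in for a real HTTP call such as requests.get(url).json(),
    # so the sketch runs offline.
    return {"url": url, "data": []}


# Placeholder URLs, not real Graph API endpoints.
urls = ["https://example.invalid/pair/" + str(i) for i in range(5)]
results = fetch_all(urls, fake_fetch)
print(len(results))
```

Passing the fetch function in as a parameter keeps the pooling logic separate from the HTTP details, so the real call can be swapped in later.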

Update: the modified code is pasted in the answer section below.

Best answer

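Update: this is the solution I came up with. Thanks to @DMCS for the FQL suggestion, but I decided to stick with what I had. I will post the FQL solution once I get a chance to study the implementation. As you can see, this method simply uses more condensed API calls.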

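Incidentally, for future reference, the API call limit is 600 calls per 600 seconds, per token & per IP, so each unique IP address using a unique access token works out to 1 call per second. I am not sure what that means for asynchronous calls, @Gerrat, but there it is.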
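One way to stay under such a limit is to pace the calls. The following is a minimal sketch, assuming simple fixed spacing between calls is acceptable (it is not a true sliding-window limiter); the `Throttle` class and its parameters are hypothetical and not part of any Facebook library.

```python
import time


class Throttle:
    """Space calls at least `interval` seconds apart (simple pacing)."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # the first call never waits

    def wait(self):
        now = self._clock()
        if now < self._next_allowed:
            # Too early: sleep until the next permitted slot.
            self._sleep(self._next_allowed - now)
            now = self._clock()
        self._next_allowed = max(now, self._next_allowed) + self.interval


# 600 calls per 600 seconds works out to one call per second.
throttle = Throttle(interval=600 / 600)
```

Calling `throttle.wait()` immediately before each `requests.get(...)` would then keep the loop at roughly one request per second. The clock and sleep functions are injectable only so the pacing can be tested without real delays.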

import json
import requests

# protected
_accessCode = "someaccesscode"
_accessStr = "?access_token=" + _accessCode
_myID = "someidnumber"

r = requests.get("https://graph.facebook.com/" 
    + _myID + "/friends/" + _accessStr)
raw = json.loads(r.text)

terminate = len(raw["data"])

a = list()
for k in range(0, terminate):
    friendID = raw["data"][k]["id"]
    friendName = raw["data"][k]["name"]
    url = ("https://graph.facebook.com/me/mutualfriends/" 
        + friendID + _accessStr)
    req = requests.get(url)
    temp = json.loads(req.text)
    print("Processing: " + str(k + 1) + " of " + str(terminate))
    for j in range(0, len(temp["data"])):
        a.append(friendID + "," + temp["data"][j]["id"] + "," 
            + friendName + "," + temp["data"][j]["name"])

# dump contents to file:
outputFile = "C:/Users/franklin/Documents/gen/friendsRaw.csv"
print("Dumping to file...")
# Python 3: open the file with an explicit encoding instead of
# encoding each string by hand (str(bytes) would write "b'...'")
with open(outputFile, "w", encoding="utf-8") as output:
    # write all me/friend relationships to file
    for k in range(0, terminate):
        output.write(_myID + "," + raw["data"][k]["id"]
            + ",me," + raw["data"][k]["name"] + "\n")

    # write all friend/friend relationships to file
    for row in a:
        output.write(row + "\n")
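One caveat with hand-built, comma-joined lines is that a name containing a comma would shift the columns. A minimal sketch of the same output step using the standard-library `csv` module, which quotes such fields; the sample rows are made up, `write_rows` is a hypothetical helper, and an in-memory buffer stands in for the real output file:

```python
import csv
import io


def write_rows(rows, out):
    """Write rows with csv.writer so fields containing commas are quoted."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(rows)


# Made-up sample rows in the same column order as the answer:
# id, id, name, name
rows = [
    ["1", "2", "me", "Doe, Jane"],
    ["2", "3", "Doe, Jane", "Sam"],
]

# Stands in for open(path, "w", newline="", encoding="utf-8").
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())
```

This assumes the per-relationship data is kept as lists of fields rather than pre-joined strings.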

This is a similar question we found on Stack Overflow about python - optimizing Python that processes JSON retrieved from the fb-graph-api: https://stackoverflow.com/questions/14106255/
