python - List all Google Drive files and folders in Python and save their IDs

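I am trying to write a program that copies a folder and all of its contents (including subfolders and so on) into another folder.

I may be overcomplicating this, but I think the first step is to get all the file names and their associated IDs and save them to two lists: one for files and one for folders.

I can't get my program to walk through all the subfolders recursively. I thought a for loop could do this by using i to select an index from the list that is being populated.

As you can see from the output below, my program walks the directory passed to the function and the first subfolder, but then exits cleanly.

Apologies for the large amount of code, but I think the context is important.

Input: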

listofdictFolders = []
listofdictFiles = []


def mapFolderContents(folderid):
    # retrieves parent name from folderid
    parentfolder = service.files().get(fileId=folderid).execute()
    parentname = 'Title: %s' % parentfolder['name']
    # sets query as argument passed to function, searches for mimeType matching folders and saves to variable
    folderquery = "'" + folderid + "'" + " in parents and mimeType='application/vnd.google-apps.folder'"
    childrenFoldersDict = service.files().list(q=folderquery,
                                               spaces='drive',
                                               fields='files(id, name)').execute()
    # sets query as argument passed to function, searches for mimeType matching NOT folders and saves to variable
    notfolderquery = "'" + folderid + "'" + \
                     " in parents and not mimeType='application/vnd.google-apps.folder'"
    childrenFilesDict = service.files().list(q=notfolderquery,
                                             spaces='drive',
                                             fields='files(name, id)').execute()
    # takes value pair of 'files' which is a list of dictionaries containing ID's and names.
    childrenFolders = (childrenFoldersDict['files'])
    # takes value pair of 'files' which is a list of dictionaries containing ID's and names.
    childrenFiles = (childrenFilesDict['files'])
    # if no files found, doesn't append to list
    if len(childrenFiles) > 0:
        listofdictFiles.append(['Parent Folder ' + parentname, childrenFiles])
    # if no folders found, doesn't append to list 
    if len(childrenFolders) > 0:
        listofdictFolders.append(['Parent Folder ' + parentname, childrenFolders])
    # finds length of list for use in for loop later to avoid index out of range error
    maxIndex = len(listofdictFolders)
    # for loop to find ID's and names of folders returned above and append name and ID's to dictionary
    for i in range(0, maxIndex):
        # strip variables are to access ID values contained in dictionary
        strip1 = listofdictFolders[0]
        strip2 = strip1[1]
        print('Now indexing ' + str(strip2[i]['name']) + '...')
        # saves query to variable using strip2 variable, index and 'id' key
        loopquery = "'" + str(strip2[i]['id']) + "'" \
                    + " in parents and mimeType='application/vnd.google-apps.folder'"
        loopquery2 = "'" + str(strip2[i]['id']) + "'" \
                    + " in parents and not mimeType='application/vnd.google-apps.folder'"
        # saves return value (dictionary) to variable
        loopreturn = service.files().list(q=loopquery,
                                          spaces='drive',
                                          fields='files(id, name)').execute()
        loopreturn2 = service.files().list(q=loopquery2,
                                          spaces='drive',
                                          fields='files(id, name)').execute()
        loopappend = (loopreturn['files'])
        loopappend2 = (loopreturn2['files'])
        # appends list of dictionaries to listofdictFolders
        listofdictFolders.append(['Parent Folder Title: ' + str(strip2[i]['name']), loopappend])
        listofdictFiles.append(['Parent Folder Title: ' + str(strip2[i]['name']), loopappend2])

mapFolderContents(blankJobFolderID)
pprint.pprint(listofdictFiles)
print('')
pprint.pprint(listofdictFolders)

Output:

Now indexing subfolder 1...
[['Parent Folder Title: Root',
  [{'id': 'subfolder 1 ID', 'name': 'subfolder 1'},
   {'id': 'subfolder 2 ID', 'name': 'subfolder 2'},
   {'id': 'subfolder 3 ID', 'name': 'subfolder 3'}]],
 ['Parent Folder Title: subfolder 1',
  [{'id': 'sub-subfolder1 ID', 'name': 'sub-subfolder 1'},
   {'id': 'sub-subfolder2 ID', 'name': 'sub-subfolder 2'}]]]

[['Parent Folder Title: Venue',
  [{'id': 'sub-file 1 ID',
    'name': 'sub-file 1'}]]]

Process finished with exit code 0

Best answer

You can use a recursive BFS to retrieve all the files and folders.

Here is my approach:

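import pprint

# assumes `service` is the authenticated Drive API client from the question, built elsewhere
def getChildrenFoldersByFolderId(folderid):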
  folderquery = "'" + folderid + "'" + " in parents and mimeType='application/vnd.google-apps.folder'"
  childrenFoldersDict = service.files().list(q=folderquery,
                                              spaces='drive',
                                              fields='files(id, name)').execute()

  return childrenFoldersDict['files']

def getChildrenFilesById(folderid):
  notfolderquery = "'" + folderid + "'" + \
                    " in parents and not mimeType='application/vnd.google-apps.folder'"
  childrenFilesDict = service.files().list(q=notfolderquery,
                                            spaces='drive',
                                            fields='files(name, id)').execute()

  return childrenFilesDict['files']

def getParentName(folderid):
  # retrieves parent name from folderid
  parentfolder = service.files().get(fileId=folderid).execute()
  parentname = 'Title: %s' % parentfolder['name']

  return parentname

def bfsFolders(queue=None):
  # avoid a mutable default argument, which would be shared between calls
  if queue is None:
    queue = []
  listFilesFolders = {}
  while len(queue) > 0:
    # list.pop() removes the most recently appended folder
    currentFolder = queue.pop()
    childrenFolders = getChildrenFoldersByFolderId(currentFolder['id'])
    childrenFiles = getChildrenFilesById(currentFolder['id'])
    parentName = getParentName(currentFolder['id'])
    listFilesFolders['folderName'] = currentFolder['name']
    listFilesFolders['folderId'] = currentFolder['id']
    listFilesFolders['parentName'] = parentName
    if len(childrenFiles) > 0:
      listFilesFolders['childrenFiles'] = childrenFiles

    # leaf folder: nothing left to expand, so return this node
    if len(childrenFolders) <= 0:
      return listFilesFolders

    # expand each subfolder and nest its result under this node
    listFilesFolders['childrenFolders'] = []
    for child in childrenFolders:
      queue.append(child)
      listFilesFolders['childrenFolders'].append(bfsFolders(queue))

  return listFilesFolders



filesAndFolders = bfsFolders([{'id': "ASDASDASDASDVppeC1zVVlWdDhkASDASDQ", 'name': 'folderRoot'}])

pprint.pprint(filesAndFolders)

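First, split the work into separate functions to simplify the script. Once that is done, run the BFS with the root node, a dictionary containing the folder's ID and name, as the argument.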
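One caveat with helper functions like these: files().list returns results one page at a time, so a folder with many children may come back incomplete unless nextPageToken is followed. A minimal sketch of a paginated helper, assuming the same authenticated service object as in the question; the name list_children and the folders_only parameter are hypothetical and not part of the original answer:

```python
FOLDER_MIME_TEST = "mimeType='application/vnd.google-apps.folder'"


def list_children(service, folder_id, folders_only):
    """Return every direct child of folder_id, following nextPageToken.

    `service` is assumed to be an authenticated Drive API client.
    folders_only=True lists subfolders; False lists everything else.
    """
    mime_test = FOLDER_MIME_TEST if folders_only else "not " + FOLDER_MIME_TEST
    query = "'%s' in parents and %s" % (folder_id, mime_test)

    children = []
    page_token = None
    while True:
        response = service.files().list(
            q=query,
            spaces='drive',
            # nextPageToken must be requested explicitly when `fields` is set
            fields='nextPageToken, files(id, name)',
            pageToken=page_token,
        ).execute()
        children.extend(response.get('files', []))
        page_token = response.get('nextPageToken')
        if page_token is None:
            break
    return children
```

Both query helpers could then be thin wrappers around this one function, which keeps the paging logic in a single place.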

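The search works recursively through the list named queue, building a dictionary named listFilesFolders for each folder. Note that list.pop() removes the most recently added item, so the list behaves as a stack (LIFO) rather than a FIFO queue, and combined with the recursion the traversal is effectively depth-first. Once the dictionary is filled in, the function returns that node (the dictionary itself) unless there are more folders to "expand", in which case each child folder is processed the same way and stored under childrenFolders.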
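If two flat lists are what you need (one of folders, one of files, as the question describes), an iterative traversal with collections.deque gives a true breadth-first order. This is a sketch rather than the method above; walk_drive_tree, list_folders, list_files and the parentId key are names made up for illustration, and the two callables are assumed to take a folder ID and return a list of {'id', 'name'} dictionaries:

```python
from collections import deque


def walk_drive_tree(root, list_folders, list_files):
    """Breadth-first walk starting at root; returns (all_folders, all_files).

    root is a dict with 'id' and 'name'. list_folders and list_files are
    callables that take a folder ID and return a list of child dicts.
    """
    all_folders = []
    all_files = []
    queue = deque([root])
    while queue:
        current = queue.popleft()  # FIFO: the oldest queued folder comes first
        for item in list_files(current['id']):
            all_files.append({'id': item['id'],
                              'name': item['name'],
                              'parentId': current['id']})
        for folder in list_folders(current['id']):
            all_folders.append({'id': folder['id'],
                                'name': folder['name'],
                                'parentId': current['id']})
            queue.append(folder)  # visit this subfolder later
    return all_folders, all_files
```

With the functions from the answer, this could be called as walk_drive_tree({'id': blankJobFolderID, 'name': 'Root'}, getChildrenFoldersByFolderId, getChildrenFilesById). Storing parentId with each entry keeps enough information to recreate the hierarchy when copying.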

This answer comes from a similar question found on Stack Overflow about listing all Google Drive files and folders in Python and saving their IDs: https://stackoverflow.com/questions/66111744/
