python - Folder search using os.scandir

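I wrote a small program that searches a network drive for some .mat files. I am using Python 3.6, so I have access to os.scandir(), which is said to be better than os.walk().

However, I am facing a strange problem: the first time I run the program, it takes a very long time to fetch the data, but when I run the same program again a few hours later, it runs very fast.

Can anyone explain why this is? My code is below.

Note: My network connection is very fast, so access to the mapped network drive is seamless.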

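import os
import time
from os import scandir

# Assumption: PyQt5 (the original post does not show its imports); `ui` is the
# application's main-window object, defined elsewhere.
from PyQt5 import QtGui
from PyQt5.QtCore import QObject
from PyQt5.QtWidgets import QApplication

class WorkThread(QObject):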
    def scantree(self,path):
        try:
            for entry in scandir(path):
                if entry.is_dir(follow_symlinks=False):
                    yield from self.scantree(entry.path)  # Recurse into subdirectories
                else:
                    yield entry
        except FileNotFoundError:
            print("Excluded file path")

    def searchFiles(self):
        start=time.time()
        ui.progressBar.setValue(0)
        usePATH = r'V:\Messdatenbank_Powertrain'  # Location of the network drive (raw string keeps the backslash literal)
        os.chdir(usePATH)
        fileLevels = 0
        i=0
        k=0
        tableSize = ui.tableView.width()
        ui.tableView.setColumnWidth(4, int(tableSize/4) + 30 )
        ui.tableView.setColumnWidth(3, int(tableSize/4) + 300 )
        for entry in self.scantree(usePATH):
            excluded = ('MATLAB_NVH_TOOL', 'old', 'MESSDATENBANK')  # 'old' also covers 'old_'
            if entry.name.endswith('COMPARE.mat') and not any(s in entry.path for s in excluded):
                ui.progressBar.setValue(0)
                i=i+1
                fileLevels = entry.path.split('\\')  # Split the path at every backslash
                t_row = [QtGui.QStandardItem(str(text)) for text in
                         (fileLevels[2], fileLevels[3], fileLevels[4], fileLevels[-1], entry.path)]
                ui.tableView.model().appendRow(t_row)
                ui.tableView.model().layoutChanged.emit()
                fileLevels.pop()  # Drop the file name; remove() would delete the first matching value instead
                tmp_file_levels='\\'.join(fileLevels)
                ui.files.append(tmp_file_levels) # All files path stored here
                ui.file_loc_name.append(entry.path)
                ui.progressBar.setValue(50)
                # Implement try catch blocks
                if str(fileLevels[2]) not in ui.clusterlist:
                    ui.clusterlist.append(str(fileLevels[2]))
                if str(fileLevels[2]) not in ui.enginedict:
                    ui.enginedict[str(fileLevels[2])]=[str(fileLevels[3])]
                else:
                    if str(fileLevels[3]) not in ui.enginedict[str(fileLevels[2])]:
                        ui.enginedict[str(fileLevels[2])].append(str(fileLevels[3]))
                if str(fileLevels[3]) not in ui.measurementdict:
                    ui.measurementdict[str(fileLevels[3])]=[str(fileLevels[4])]
                else:
                    if str(fileLevels[4]) not in ui.measurementdict[str(fileLevels[3])]:
                        ui.measurementdict[str(fileLevels[3])].append(str(fileLevels[4]))                               
                ui.progressBar.setValue(100)
                QApplication.processEvents() 
            else:
                ui.label_7.setText(str(i))
                ui.tableView.model().layoutChanged.emit()
                ui.progressBar.setValue(0)
        end=time.time()
        print(end-start)
        ui.label_2.setText('Update Complete')
        ui.pushButton.setEnabled(False)
        print(str(len(ui.files)))
        ui.tableView.resizeColumnToContents (2)
        ui.comboBox.setEnabled(True)
        ui.label_7.setText(str(len(ui.files)))
        ui.comboBox.clear()
        ui.comboBox.addItems(["--Select Cluster--"])
        ui.comboBox.addItems(ui.clusterlist)
        ui.progressBar.setValue(100)
        QApplication.processEvents()
        ui.pushButton_2.setEnabled(True)
        ui.pushButton_24.setEnabled(True)

Best answer

PEP 471 -- os.scandir() on python.org describes the implementation of os.scandir:

os.scandir - This new function adds useful functionality and increases the speed of os.walk() by 2-20 times
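
To make the comparison concrete, here is a minimal sketch of the same suffix search written both ways. The helper names `find_with_walk` and `find_with_scandir` are illustrative and do not come from the original post. Note that since Python 3.5, os.walk() is itself implemented on top of os.scandir(), so on Python 3.6 the speed difference between the two is likely to be small.

```python
import os


def find_with_walk(root, suffix):
    """Collect paths ending in `suffix` using os.walk()."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def find_with_scandir(root, suffix):
    """Collect paths ending in `suffix` using a recursive os.scandir() walk."""
    matches = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # Descend into real subdirectories only (symlinks are not followed).
                matches.extend(find_with_scandir(entry.path, suffix))
            elif entry.name.endswith(suffix):
                matches.append(entry.path)
    return sorted(matches)
```

Both functions should return the same list for a tree without symbolic links, for example `find_with_scandir(r'V:\Messdatenbank_Powertrain', 'COMPARE.mat')`.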

The difference between the first run and later runs is due to data being cached during the first run.
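
One way to check this on your own drive is to time the same scan twice in a row. The following is a rough sketch only: `count_files` and `time_scans` are hypothetical helper names, and the scan is stripped of all GUI work so that only directory traversal is measured. Whether the second pass is faster depends on caching done by the operating system or the network file system, which this sketch assumes but cannot confirm.

```python
import os
import time


def count_files(path):
    """Count non-directory entries below `path` with an iterative scandir walk."""
    total = 0
    stack = [path]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    else:
                        total += 1
        except OSError:
            # Skip directories that vanish or cannot be read during the scan.
            continue
    return total


def time_scans(path, runs=2):
    """Scan `path` several times; return a list of (file_count, seconds) tuples."""
    results = []
    for _ in range(runs):
        start = time.perf_counter()
        count = count_files(path)
        results.append((count, time.perf_counter() - start))
    return results
```

Calling `time_scans(r'V:\Messdatenbank_Powertrain')` and comparing the two elapsed times gives a first indication of how much of the slow first run is due to a cold cache.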

Notes on caching

The DirEntry objects are relatively dumb -- the name and path attributes are obviously always cached, and the is_X and stat methods cache their values (immediately on Windows via FindNextFile, and on first use on POSIX systems via a stat system call) and never refetch from the system.

For this reason, DirEntry objects are intended to be used and thrown away after iteration, not stored in long-lived data structures and the methods called again and again.
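
The caching described above can be observed directly. In this small sketch (the function name `cached_vs_fresh_size` is made up for illustration), a file is enlarged after its DirEntry has already been asked for its stat result; the entry keeps reporting the old size, while a fresh os.stat() call sees the new one. Keep in mind that this cache belongs to each DirEntry object and lives only inside the running process.

```python
import os


def cached_vs_fresh_size(directory, filename):
    """Return (size cached on the DirEntry, size from a fresh os.stat call)."""
    path = os.path.join(directory, filename)
    with open(path, "w") as handle:
        handle.write("12345")  # 5 bytes on disk

    with os.scandir(directory) as entries:
        entry = next(e for e in entries if e.name == filename)
        entry.stat()  # First call: the result is now cached on the entry

    with open(path, "a") as handle:
        handle.write("67890")  # The file grows to 10 bytes

    # entry.stat() returns the cached result; os.stat() asks the system again.
    return entry.stat().st_size, os.stat(path).st_size
```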

If developers want "refresh" behaviour (for example, for watching a file's size change), they can simply use pathlib.Path objects, or call the regular os.stat() or os.path.getsize() functions which get fresh data from the operating system every call.
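
As a brief illustration of that "refresh" behaviour, the sketch below appends to a file and reads its size through pathlib.Path after every write. The helper name `sizes_while_growing` is an assumption for this example; the point is only that each `Path.stat()` call returns current data.

```python
from pathlib import Path


def sizes_while_growing(path, chunks):
    """Append each chunk to the file and record its fresh size after every write."""
    target = Path(path)
    sizes = []
    for chunk in chunks:
        with target.open("ab") as handle:
            handle.write(chunk)
        # Path.stat() queries the operating system on every call.
        sizes.append(target.stat().st_size)
    return sizes
```

os.path.getsize(path) would report the same final value, since it also queries the system on each call.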

This question, python - Folder search using os.scandir, is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/44157216/
