python - Why do I get a StaleElementReferenceException when I "read" the text content of a column?

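The problem only appears when I try to collect ("scrape") the contents of the table a second time. The first read of the table succeeds, but the second one always fails.

This only happens in Chrome (version 74 with the matching chromedriver). I tried the same thing in Firefox and it never happened. I found a workaround of sorts in Chrome; it makes no sense to me, but it gets the job done.

When I "go" to a screen other than the one containing the table and then come back, scraping the table succeeds.

Here is the function I use to collect the table: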

def Get_Faults_List(self, Port_Number=None, PSU=None, Retries=5):
    for attempt in range(Retries):
        try:
            if Port_Number:
                # Show the Faults view in the context of "Port_Number"
                Device_Panel_Frame.Click_Port(self, Port_Number)
            elif PSU:
                if not Device_Panel_Frame.Click_PSU(self, PSU):
                    return None
            Left_Panel_Frame.Click_Fault(self)

            self.driver.switch_to.default_content()
            Main_Body = self.driver.find_element_by_name('main_page')
            self.driver.switch_to.frame(Main_Body)
            alarms_tab = self.driver.find_element_by_id('tab_alarms')
            alarms_tab.click()
            Fault_Screen = self.driver.find_element_by_name('faults')
            self.driver.switch_to.frame(Fault_Screen)
            # The rows collected below are only the relevant fault lines;
            # the XPath used here filters out the two irrelevant rows
            faultTable_rows = WebDriverWait(self.driver, timeout=3, poll_frequency=0.5).until(
                EC.presence_of_all_elements_located((By.XPATH, "//table[@id='faultTab']//tr[not(@id or @style)]")))

            current_faults = []
            row_index = 0
            for row in faultTable_rows:  # Go through each of the rows
                current_faults.append([])
                # Collect all the column elements of a certain row into a list
                faultTable_row_cols = row.find_elements_by_tag_name("td")
                for col in faultTable_row_cols:
                    # Each row of the Faults table has 5 columns, and each column holds a string
                    current_faults[row_index].append(col.text)
                row_index += 1

            break
        except Exception as exc:  # report the real error instead of hiding it
            print(attempt + 1, 'attempt failed', Retries - (attempt + 1), 'to go:', repr(exc))
            self.Refresh_Screen()
            sleep(5)
            continue

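If I open a new browser, I can also collect the table contents successfully. By the way, the failure always happens on the first row of the table below (the one after the header). The failing line is current_faults[row_index].append(col.text), and I don't understand why. The exception makes no sense to me.

Is there another way to scrape the contents of a table efficiently?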

Table:

Best answer

See this answer for why you are getting a StaleElementReferenceException.

A Stale Element Reference Exception occurs when an element:

Has been deleted

Is no longer attached to the DOM (as in your case)

Has changed

From the docs:

You should discard the current reference you hold and replace it, possibly by locating the element again once it is attached to the DOM.

In other words: "find" the element again.

My suggestion is to capture the HTML and loop over it:
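As a rough sketch of what "finding the element again" can look like for the fault table: locate the rows and read their text in a single pass, and if a reference goes stale midway, throw the whole batch away and locate again. The function name read_fault_rows and the retry_on parameter are hypothetical; the XPath is the one from the question. The exception class is passed in rather than imported so that the sketch has no dependency of its own.

```python
# XPath taken from the question: fault rows without an id or style attribute.
ROWS_XPATH = "//table[@id='faultTab']//tr[not(@id or @style)]"


def read_fault_rows(driver, retry_on, retries=3):
    """Return the table as a list of rows, each row a list of cell strings.

    driver   -- a WebDriver already switched into the frame holding the table
    retry_on -- exception class that signals a stale reference, e.g.
                selenium.common.exceptions.StaleElementReferenceException
    retries  -- how many times to locate and read before giving up (assumed value)
    """
    last_error = None
    for _ in range(retries):
        try:
            # "xpath" and "tag name" are the string values of By.XPATH and By.TAG_NAME.
            rows = driver.find_elements("xpath", ROWS_XPATH)
            return [
                [cell.text for cell in row.find_elements("tag name", "td")]
                for row in rows
            ]
        except retry_on as exc:
            # The DOM changed under us: drop every reference and locate again.
            last_error = exc
    raise last_error
```

With Selenium this would be called as read_fault_rows(self.driver, StaleElementReferenceException) right after switching into the 'faults' frame. It does not explain why Chrome rebuilds the table, it only avoids holding on to old references.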

You can use driver.page_source and then BeautifulSoup, like this:
from bs4 import BeautifulSoup
html = driver.page_source
soup = BeautifulSoup(html, "lxml")
This should be done after switching frames.
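A minimal sketch of the "loop over the HTML" step, assuming the table still has id faultTab and that the rows to skip are the ones carrying an id or style attribute (mirroring the XPath in the question). The function name parse_fault_rows is hypothetical, and the built-in "html.parser" backend is used here instead of "lxml" only to avoid an extra dependency. Because the HTML is a plain string snapshot, nothing in it can go stale.

```python
from bs4 import BeautifulSoup


def parse_fault_rows(html):
    """Extract the fault table from a page_source snapshot as lists of strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="faultTab")
    if table is None:
        return []  # table not present in this frame's HTML

    rows = []
    for tr in table.find_all("tr"):
        # Same filter as the XPath tr[not(@id or @style)].
        if tr.has_attr("id") or tr.has_attr("style"):
            continue
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # skip rows that contain no <td> cells
            rows.append(cells)
    return rows
```

Usage would be current_faults = parse_fault_rows(self.driver.page_source) after switching into the 'faults' frame. One difference to keep in mind: get_text returns all text in the markup, while Selenium's .text returns only what is rendered as visible, so hidden cells may show up here.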

Hope this helps!

Regarding "python - Why do I get a StaleElementReferenceException when I "read" the text content of a column?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56206359/
