VBA Web Scrape (getelementsbyclassname)

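Tags: vba excel web-scraping

I am trying to scrape the list of VBA topics shown in the side pane of the following page: www.tutorialspoint.com/vba/index.htm

However, the code below fails with an error and I cannot scrape the list: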

Sub tutorailpointsscrap()
      Dim ie As InternetExplorer

      Set ie = New InternetExplorer

      With ie
      .navigate "https://www.tutorialspoint.com//vba/index.htm"
      .Visible = True
      Do While ie.readyState <> READYSTATE_COMPLETE
      DoEvents
      Loop
      End With

      Dim html As HTMLDocument
      Set html = ie.document


      Dim ele As IHTMLElement

      Dim lists As IHTMLElementCollection
      Dim row As Long

      Set ele = html.getElementsByClassName("nav nav-list primary left-menu")

      Set lists = ele.getElementsByTagName("a")
      row = 1


      For Each li In lists
      Cells(row, 1) = li.innerText
      row = row + 1
      Next

      ie.Quit

  End Sub

The HTML containing the data is:

<ul class="nav nav-list primary left-menu">
<li class="heading">VBA Tutorial</li>
<li><a href="/vba/index.htm" style="background-color: rgb(214, 214, 214);">VBA - Home</a></li>
<li><a href="/vba/vba_overview.htm">VBA - Overview</a></li>
<li><a href="/vba/vba_excel_macros.htm">VBA - Excel Macros</a></li>
<li><a href="/vba/vba_excel_terms.htm">VBA - Excel Terms</a></li>
<li><a href="/vba/vba_macro_comments.htm">VBA - Macro Comments</a></li>
<li><a href="/vba/vba_message_box.htm">VBA - Message Box</a></li>
<li><a href="/vba/vba_input_box.htm">VBA - Input Box</a></li>
<li><a href="/vba/vba_variables.htm">VBA - Variables</a></li>
<li><a href="/vba/vba_constants.htm">VBA - Constants</a></li>
<li><a href="/vba/vba_operators.htm">VBA - Operators</a></li>
<li><a href="/vba/vba_decisions.htm">VBA - Decisions</a></li>
<li><a href="/vba/vba_loops.htm">VBA - Loops</a></li>
<li><a href="/vba/vba_strings.htm">VBA - Strings</a></li>
<li><a href="/vba/vba_date_time.htm">VBA - Date and Time</a></li>
<li><a href="/vba/vba_arrays.htm">VBA - Arrays</a></li>
<li><a href="/vba/vba_functions.htm">VBA - Functions</a></li>
<li><a href="/vba/vba_sub_procedure.htm">VBA - SubProcedure</a></li>
<li><a href="/vba/vba_events.htm">VBA - Events</a></li>
<li><a href="/vba/vba_error_handling.htm">VBA - Error Handling</a></li>
<li><a href="/vba/vba_excel_objects.htm">VBA - Excel Objects</a></li>
<li><a href="/vba/vba_text_files.htm">VBA - Text Files</a></li>
<li><a href="/vba/vba_programming_charts.htm">VBA - Programming Charts</a></li>
<li><a href="/vba/vba_userforms.htm">VBA - Userforms</a></li>
</ul>

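Best answer

If I understand your question correctly, the problem is that getElementsByClassName returns a collection of elements rather than a single element, so its result cannot be assigned to a variable declared As IHTMLElement. Loop over the collection instead: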

' html is the HTMLDocument already loaded in the question's code
Dim lists As IHTMLElementCollection
Dim anchorElements As IHTMLElementCollection
Dim ulElement As HTMLUListElement
Dim liElement As HTMLLIElement
Dim row As Long

' Every element whose class attribute contains all four class names
Set lists = html.getElementsByClassName("nav nav-list primary left-menu")
row = 1

For Each ulElement In lists
    For Each liElement In ulElement.getElementsByTagName("li")
        Set anchorElements = liElement.getElementsByTagName("a")
        ' Skip items without a link, such as the "heading" item
        If anchorElements.Length > 0 Then
            Cells(row, 1) = anchorElements.Item(0).innerText
            row = row + 1
        End If
    Next liElement
Next ulElement

This produces the following output (for all lists):
VBA - Home
VBA - Overview
VBA - Excel Macros
VBA - Excel Terms
VBA - Macro Comments
VBA - Message Box
VBA - Input Box
VBA - Variables
VBA - Constants
VBA - Operators
VBA - Decisions
VBA - Loops
VBA - Strings
VBA - Date and Time
VBA - Arrays
VBA - Functions
VBA - SubProcedure
VBA - Events
VBA - Error Handling
VBA - Excel Objects
VBA - Text Files
VBA - Programming Charts
VBA - Userforms
VBA - Quick Guide
VBA - Useful Resources
VBA - Discussion
Developer's Best Practices
Questions and Answers
Effective Resume Writing
HR Interview Questions
Computer Glossary
Who is Who
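
For reference, the loop above can be wrapped into a complete macro together with the page-loading code from the question. This is only a sketch under a few assumptions: Internet Explorer automation is still available on the machine, the VBA project references Microsoft Internet Controls and Microsoft HTML Object Library, and the results go to column A of the active sheet. The procedure name ScrapeAllLeftMenus is hypothetical.

```vba
Option Explicit

' Assumed references: Microsoft Internet Controls, Microsoft HTML Object Library
Sub ScrapeAllLeftMenus()   ' hypothetical name
    Dim ie As InternetExplorer
    Dim html As HTMLDocument
    Dim lists As IHTMLElementCollection
    Dim anchorElements As IHTMLElementCollection
    Dim ulElement As HTMLUListElement
    Dim liElement As HTMLLIElement
    Dim row As Long

    Set ie = New InternetExplorer
    With ie
        .Visible = True
        .navigate "https://www.tutorialspoint.com/vba/index.htm"
        ' Wait until the browser has finished loading the page
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE
            DoEvents
        Loop
    End With

    Set html = ie.document

    ' A collection of matching <ul> elements, not a single element
    Set lists = html.getElementsByClassName("nav nav-list primary left-menu")
    row = 1

    For Each ulElement In lists
        For Each liElement In ulElement.getElementsByTagName("li")
            Set anchorElements = liElement.getElementsByTagName("a")
            ' Only list items that contain a link are written out
            If anchorElements.Length > 0 Then
                ActiveSheet.Cells(row, 1).Value = anchorElements.Item(0).innerText
                row = row + 1
            End If
        Next liElement
    Next ulElement

    ie.Quit
End Sub
```

If the page contains no matching list, the outer loop simply does not run and the sheet is left unchanged.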

If you only want the anchor contents of the first list, do it like this:
For Each liElement In lists.Item(0).getElementsByTagName("li")
    Set anchorElements = liElement.getElementsByTagName("a")
    If anchorElements.Length > 0 Then
        Cells(row, 1) = anchorElements.Item(0).innerText
        row = row + 1
    End If
Next liElement
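
As a possible variation, the anchors of the first list can also be collected directly, without visiting each li element first. The sketch below assumes the document has already been loaded as in the question and that the Microsoft HTML Object Library is referenced; WriteFirstListAnchors is a hypothetical helper name, and the late-bound Object variables are a choice made here to keep the member calls simple. For the HTML shown in the question, where every anchor sits inside an li, it should yield the same rows as the loop above.

```vba
Option Explicit

' Hypothetical helper: expects an already loaded document
Sub WriteFirstListAnchors(ByVal html As HTMLDocument)
    Dim lists As IHTMLElementCollection
    Dim anchors As Object   ' late bound collection of <a> elements
    Dim anchor As Object
    Dim row As Long

    Set lists = html.getElementsByClassName("nav nav-list primary left-menu")

    ' Nothing to do if the page has no matching list
    If lists.Length = 0 Then Exit Sub

    ' Item(0) picks the first matching <ul>; all <a> elements below it are returned
    Set anchors = lists.Item(0).getElementsByTagName("a")

    row = 1
    For Each anchor In anchors
        ActiveSheet.Cells(row, 1).Value = anchor.innerText
        row = row + 1
    Next anchor
End Sub
```

It could be called as WriteFirstListAnchors ie.document once the page has finished loading and before ie.Quit.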

Regarding VBA Web Scrape (getelementsbyclassname), a similar question was found on Stack Overflow: https://stackoverflow.com/questions/41851152/
