I am trying to scrape the list of items under the VBA class shown in the right-hand pane of "www.tutorialspoint.com/vba/index.htm",
but I cannot get the list because of some errors:
Sub tutorailpointsscrap()
    Dim ie As InternetExplorer
    Set ie = New InternetExplorer
    With ie
        .navigate "https://www.tutorialspoint.com//vba/index.htm"
        .Visible = True
        Do While ie.readyState <> READYSTATE_COMPLETE
            DoEvents
        Loop
    End With
    Dim html As HTMLDocument
    Set html = ie.document
    Dim ele As IHTMLElement
    Dim lists As IHTMLElementCollection
    Dim row As Long
    Set ele = html.getElementsByClassName("nav nav-list primary left-menu")
    Set lists = ele.getElementsByTagName("a")
    row = 1
    For Each li In lists
        Cells(row, 1) = li.innerText
        row = row + 1
    Next
    ie.Quit
End Sub
The HTML containing the data is:
<ul class="nav nav-list primary left-menu">
<li class="heading">VBA Tutorial</li>
<li><a href="/vba/index.htm" style="background-color: rgb(214, 214, 214);">VBA - Home</a></li>
<li><a href="/vba/vba_overview.htm">VBA - Overview</a></li>
<li><a href="/vba/vba_excel_macros.htm">VBA - Excel Macros</a></li>
<li><a href="/vba/vba_excel_terms.htm">VBA - Excel Terms</a></li>
<li><a href="/vba/vba_macro_comments.htm">VBA - Macro Comments</a></li>
<li><a href="/vba/vba_message_box.htm">VBA - Message Box</a></li>
<li><a href="/vba/vba_input_box.htm">VBA - Input Box</a></li>
<li><a href="/vba/vba_variables.htm">VBA - Variables</a></li>
<li><a href="/vba/vba_constants.htm">VBA - Constants</a></li>
<li><a href="/vba/vba_operators.htm">VBA - Operators</a></li>
<li><a href="/vba/vba_decisions.htm">VBA - Decisions</a></li>
<li><a href="/vba/vba_loops.htm">VBA - Loops</a></li>
<li><a href="/vba/vba_strings.htm">VBA - Strings</a></li>
<li><a href="/vba/vba_date_time.htm">VBA - Date and Time</a></li>
<li><a href="/vba/vba_arrays.htm">VBA - Arrays</a></li>
<li><a href="/vba/vba_functions.htm">VBA - Functions</a></li>
<li><a href="/vba/vba_sub_procedure.htm">VBA - SubProcedure</a></li>
<li><a href="/vba/vba_events.htm">VBA - Events</a></li>
<li><a href="/vba/vba_error_handling.htm">VBA - Error Handling</a></li>
<li><a href="/vba/vba_excel_objects.htm">VBA - Excel Objects</a></li>
<li><a href="/vba/vba_text_files.htm">VBA - Text Files</a></li>
<li><a href="/vba/vba_programming_charts.htm">VBA - Programming Charts</a></li>
<li><a href="/vba/vba_userforms.htm">VBA - Userforms</a></li>
</ul>
Best answer
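If I understand your question correctly, the problem is that getElementsByClassName returns a collection of elements, not a single element, so its result cannot be assigned to an IHTMLElement variable. Loop over the collection instead: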
Dim lists As IHTMLElementCollection
Dim anchorElements As IHTMLElementCollection
Dim ulElement As HTMLUListElement
Dim liElement As HTMLLIElement
Dim row As Long
Set lists = html.getElementsByClassName("nav nav-list primary left-menu")
row = 1
For Each ulElement In lists
    For Each liElement In ulElement.getElementsByTagName("li")
        Set anchorElements = liElement.getElementsByTagName("a")
        If anchorElements.Length > 0 Then
            Cells(row, 1) = anchorElements.Item(0).innerText
            row = row + 1
        End If
    Next liElement
Next ulElement
This produces the following (for all lists with that class):
VBA - Home
VBA - Overview
VBA - Excel Macros
VBA - Excel Terms
VBA - Macro Comments
VBA - Message Box
VBA - Input Box
VBA - Variables
VBA - Constants
VBA - Operators
VBA - Decisions
VBA - Loops
VBA - Strings
VBA - Date and Time
VBA - Arrays
VBA - Functions
VBA - SubProcedure
VBA - Events
VBA - Error Handling
VBA - Excel Objects
VBA - Text Files
VBA - Programming Charts
VBA - Userforms
VBA - Quick Guide
VBA - Useful Resources
VBA - Discussion
Developer's Best Practices
Questions and Answers
Effective Resume Writing
HR Interview Questions
Computer Glossary
Who is Who
If you only want the anchor text from the first list, do it like this:
For Each liElement In lists.Item(0).getElementsByTagName("li")
    Set anchorElements = liElement.getElementsByTagName("a")
    If anchorElements.Length > 0 Then
        Cells(row, 1) = anchorElements.Item(0).innerText
        row = row + 1
    End If
Next liElement
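For reference, here is a minimal sketch that combines the navigation code from the question with the first-list loop above into one procedure. It assumes the project references "Microsoft Internet Controls" and "Microsoft HTML Object Library", and that Internet Explorer automation is available on the machine. The procedure name ScrapeFirstMenu, the extra .Busy check, and the empty-collection guard are my own additions, not part of the original answer.

```vba
Option Explicit

' Assumed references: Microsoft Internet Controls, Microsoft HTML Object Library.
' ScrapeFirstMenu is a hypothetical name chosen for this sketch.
Sub ScrapeFirstMenu()
    Dim ie As InternetExplorer
    Dim html As HTMLDocument
    Dim lists As IHTMLElementCollection
    Dim anchorElements As IHTMLElementCollection
    Dim liElement As HTMLLIElement
    Dim row As Long

    Set ie = New InternetExplorer
    With ie
        .Visible = True
        .navigate "https://www.tutorialspoint.com/vba/index.htm"
        ' Wait until the browser reports that the page has finished loading.
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE
            DoEvents
        Loop
        Set html = .document
    End With

    ' getElementsByClassName returns a collection, even for a single match.
    Set lists = html.getElementsByClassName("nav nav-list primary left-menu")

    ' Guard (an addition): skip the loop if the page has no matching list.
    If lists.Length > 0 Then
        row = 1
        For Each liElement In lists.Item(0).getElementsByTagName("li")
            Set anchorElements = liElement.getElementsByTagName("a")
            ' The heading <li> has no anchor, so it is skipped here.
            If anchorElements.Length > 0 Then
                ActiveSheet.Cells(row, 1).Value = anchorElements.Item(0).innerText
                row = row + 1
            End If
        Next liElement
    End If

    ie.Quit
    Set ie = Nothing
End Sub
```

Run it from a standard module with the target worksheet active. The link texts are written to column A starting at row 1.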
For more on VBA Web Scrape (getElementsByClassName), see the similar question on Stack Overflow: https://stackoverflow.com/questions/41851152/