I'm trying to use Excel VBA to copy the URLs found between the CDATA markers from a list of websites that all share the same HTML format. Sample HTML:
<script>
//<![CDATA[
Wistia.iframeInit({"assets":[{"type":"original","slug":"original","display_name":
"Original file","ext":"mp4","size":2,"bitrate":2677,"public":true,
"url":"https://embed-ssl.wistia.com/deliveries/1.bin"},
{"type":"original","slug":"original","display_name":"Original file",
"ext":"mp4","size":1,"bitrate":2677,"public":true,
"url":"https://embed-ssl.wistia.com/deliveries/2.bin"},
//]]>
</script>
I can't seem to extract the CDATA content using Excel VBA alone. Every time I run the script below, I get either a blank or "[object HTMLScriptElement]".
Sub test()
    Dim ie As Object
    Dim html As Object
    Dim mylinks As Object
    Dim link As Object
    Dim lastRow As Integer
    Dim myURL As String
    Dim erow As Long
    Set ie = CreateObject("InternetExplorer.Application")
    lastRow = Sheet1.Cells(Rows.Count, "A").End(xlUp).Row
    For i = 2 To lastRow
        myURL = Sheet1.Cells(i, "A").Value
        ie.navigate myURL
        ie.Visible = False
        While ie.readyState <> 4
            DoEvents
        Wend
        Set html = ie.document
        Set mylinks = html.getElementsByName("script")(1).innerText
        For Each link In mylinks
            erow = Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
            Cells(erow, 1).Value = link
            Cells(erow, 1).Columns.AutoFit
        Next
    Next i
End Sub
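Best answer
In my experience, automating Internet Explorer is quite unreliable, so I stick with XMLHTTP for as long as possible. Of course, your HTML tag soup is not XML, so it cannot be parsed as XML. But we can at least fetch the responseText via XMLHTTP and then process it further with string methods.
Example: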
Sub test()
    Dim sURL As String, sResponseText As String, sScriptPart As String
    Dim aScriptParts() As String
    Dim oXMLHTTP As Object
    Dim i As Long

    sURL = "https://fast.wistia.net/embed/iframe/vud7ff4i6w"

    'fetch the raw page source synchronously
    Set oXMLHTTP = CreateObject("MSXML2.XMLHTTP")
    oXMLHTTP.Open "GET", sURL, False
    oXMLHTTP.Send
    sResponseText = oXMLHTTP.responseText

    'separate into parts delimited by <script
    aScriptParts = Split(sResponseText, "<script", , vbTextCompare)
    'LBound + 1 because the first part is not script; it is the HTML before the first <script tag
    For i = LBound(aScriptParts) + 1 To UBound(aScriptParts)
        'only the part before </script belongs to the script
        sScriptPart = Split(aScriptParts(i), "</script", , vbTextCompare)(0)
        MsgBox sScriptPart
    Next
End Sub
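To get from the script text to the URLs themselves, the same string-method approach can be taken one step further. What follows is only a sketch, assuming every address of interest appears as a "url":"..." pair with no escaped quotes inside the value, as in the sample HTML from the question. ExtractUrls and DemoExtractUrls are hypothetical names introduced for illustration; they are not part of the original answer.

```vba
' Returns every value that follows "url":" in sText, up to the next double quote.
' Assumption: the URL values contain no escaped double quotes.
Function ExtractUrls(ByVal sText As String) As Collection
    Const MARKER As String = """url"":"""          'the literal text "url":"
    Dim oResult As New Collection
    Dim lStart As Long, lEnd As Long

    lStart = InStr(1, sText, MARKER, vbTextCompare)
    Do While lStart > 0
        lStart = lStart + Len(MARKER)              'first character of the URL
        lEnd = InStr(lStart, sText, """")          'closing quote of the value
        If lEnd = 0 Then Exit Do                   'text is cut off: stop searching
        oResult.Add Mid$(sText, lStart, lEnd - lStart)
        lStart = InStr(lEnd, sText, MARKER, vbTextCompare)
    Loop

    Set ExtractUrls = oResult
End Function

' Small demonstration on a hard-coded fragment modelled on the question's sample.
Sub DemoExtractUrls()
    Dim sSample As String, vUrl As Variant
    sSample = "{""type"":""original"",""url"":""https://embed-ssl.wistia.com/deliveries/1.bin""}," & _
              "{""type"":""original"",""url"":""https://embed-ssl.wistia.com/deliveries/2.bin""},"
    For Each vUrl In ExtractUrls(sSample)
        Debug.Print vUrl                           'one URL per line in the Immediate window
    Next vUrl
End Sub
```

Inside the loop of the example, ExtractUrls(sScriptPart) could take the place of MsgBox sScriptPart, with each returned item written to a worksheet cell instead of being printed.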
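You could also use regular expressions instead of the Split method to separate the script parts from the full text. For that, however, you should ask a separate question aimed at the RegEx experts. I am not a RegEx expert.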
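For readers who want to try the regular-expression route anyway, here is a rough sketch using the late-bound VBScript.RegExp object that ships with Windows. It is not from the original answer: RegexCaptures and DemoRegexCaptures are hypothetical names, both patterns are assumptions that fit simple pages like the sample in the question, and the example.com addresses are placeholders. Regular expressions are not a full HTML parser, so unusual markup may defeat them.

```vba
' Returns the first capture group of every match of sPattern in sText.
Function RegexCaptures(ByVal sText As String, ByVal sPattern As String) As Collection
    Dim oRegEx As Object, oMatch As Object
    Dim oResult As New Collection

    Set oRegEx = CreateObject("VBScript.RegExp")   'late binding, no library reference needed
    oRegEx.Global = True                           'find all matches, not just the first
    oRegEx.IgnoreCase = True                       'match <SCRIPT> as well as <script>
    oRegEx.Pattern = sPattern

    For Each oMatch In oRegEx.Execute(sText)
        oResult.Add oMatch.SubMatches(0)
    Next oMatch

    Set RegexCaptures = oResult
End Function

' Splits a hard-coded page into script bodies, then pulls the url values out of each.
Sub DemoRegexCaptures()
    'body of each script element; [\s\S] also matches line breaks, *? keeps the match short
    Const SCRIPT_PATTERN As String = "<script[^>]*>([\s\S]*?)</script>"
    'value of each "url":"..." pair
    Const URL_PATTERN As String = """url"":""([^""]+)"""
    Dim sHtml As String, vScript As Variant, vUrl As Variant

    sHtml = "<p>intro</p><script>var a = {""url"":""https://example.com/1.bin""};</script>" & _
            "<SCRIPT type=""text/javascript"">var b = {""url"":""https://example.com/2.bin""};</SCRIPT>"

    For Each vScript In RegexCaptures(sHtml, SCRIPT_PATTERN)
        For Each vUrl In RegexCaptures(CStr(vScript), URL_PATTERN)
            Debug.Print vUrl
        Next vUrl
    Next vScript
End Sub
```

Compared with Split, the pattern version returns only the script body, without the attributes that follow the opening <script tag.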
Regarding javascript - VBA extracting HTML CDATA, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/36097431/