我有翻译语言的功能:
Public Function Translate(rng As Range, Optional translateFrom As String = "nl", Optional translateTo As String = "en")
Dim getParam As String, Trans As String, objHTTP As Object, URL As String
Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP")
getParam = ConvertToGet(rng.Value)
URL = "https://translate.google.com/m?hl=" & translateFrom & "&sl=" & translateFrom & "&tl=" & translateTo & "&ie=UTF-8&prev=_m&q=" & getParam
objHTTP.Open "GET", URL, False
objHTTP.setRequestHeader "User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
objHTTP.send ("")
If InStr(objHTTP.responseText, "div dir=""ltr""") > 0 Then
Trans = RegexExecute(objHTTP.responseText, "div[^""]*?""ltr"".*?>(.+?)</div>")
Translate = CleanA(Trans)
Else
Translate = CVErr(xlErrValue)
End If
End Function
中文翻译:
A1 = Hello
B1 = Translate(A1,"en","zh-cn")
结果是“Nǐ hǎo”,正确结果是“你好”
谷歌链接:https://translate.google.com/m?hl=en&sl=en&tl=zh-CN&ie=UTF-8&prev=_m&q=hello
我想要结果:
B1 = 你好
C1 = Nǐ hǎo
我想我需要修复这段代码:
Trans = RegexExecute(objHTTP.responseText, "div[^""]*?""ltr"".*?>(.+?)</div>")
请帮帮我,谢谢!
最佳答案
页面返回两个<div>
就像:
<div dir="ltr" class="o1">Nǐ hǎo</div>
<div dir="ltr" class="t0">你好</div>
最好不要尝试使用正则表达式解析 HTML,因为解析 HTML 很容易遇到困难 - 您可以使用 VBA 中的 Microsoft HTML 对象库来获得类似的结果。
获取那两个<div>
的内容您可以根据以下示例使用此代码的标签:
' o1 has Anglicised translation, t0 as tranlsation in target language
Set objDivs = objHTML.getElementsByTagName("div")
For Each objDiv In objDivs
If objDiv.className = "o1" Then
strTranslatedO1 = objDiv.innerText
End If
If objDiv.className = "t0" Then
strTranslatedT0 = objDiv.innerText
End If
Next objDiv
这基本上是遍历所有 <div>
返回的 HTML 中的标签并检查类名 o1
和 t0
然后得到 innerText
属性(property)。使用这种技术,您可以获得翻译后的值并将它们写回工作表,例如:
完整代码:
Option Explicit
Public Sub Test()
Dim ws As Worksheet
' testing worksheet
Set ws = ThisWorkbook.Worksheets("Sheet1")
ws.Cells.Delete
' test inputs
ws.Range("A1:E1") = Array("Input", "From", "To", "T0", "O1")
ws.Range("A2:A4") = "hello"
ws.Range("B2:B4") = "English"
ws.Range("C2:C4") = Application.Transpose(Array("Chinese", "Spanish", "Russian"))
' test
ws.Range("D2") = Translate("hello", "en", "zh-cn", True)
ws.Range("E2") = Translate("hello", "en", "zh-cn", False)
ws.Range("D3") = Translate("hello", "en", "es", True)
ws.Range("E3") = Translate("hello", "en", "es", False) 'Spanish uses latin alphabet
ws.Range("D4") = Translate("hello", "en", "ru", True)
ws.Range("E4") = Translate("hello", "en", "ru", False)
End Sub
Public Function Translate(strInput As String, strFromLanguageCode As String, strToLanguageCode As String, blnTargetAlphabet As Boolean) As String
Dim strURL As String
Dim objHTTP As Object
Dim objHTML As Object
Dim objDivs As Object, objDiv
Dim strTranslatedT0 As String
Dim strTranslatedO1 As String
' send query to web page
strURL = "https://translate.google.com/m?hl=" & strFromLanguageCode & _
"&sl=" & strFromLanguageCode & _
"&tl=" & strToLanguageCode & _
"&ie=UTF-8&prev=_m&q=" & strInput
Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP")
objHTTP.Open "GET", strURL, False
objHTTP.setRequestHeader "User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
objHTTP.send ""
' create a html document
Set objHTML = CreateObject("htmlfile")
With objHTML
.Open
.Write objHTTP.responseText
.Close
End With
' o1 has Anglicised translation, t0 as tranlsation in target language
Set objDivs = objHTML.getElementsByTagName("div")
For Each objDiv In objDivs
If objDiv.className = "o1" Then
strTranslatedO1 = objDiv.innerText
End If
If objDiv.className = "t0" Then
strTranslatedT0 = objDiv.innerText
End If
Next objDiv
' choose which to return
If blnTargetAlphabet Then
Translate = strTranslatedT0
Else
Translate = strTranslatedO1
End If
CleanUp:
Set objHTML = Nothing
Set objHTTP = Nothing
End Function
2020 年 12 月更新
看起来这种方法可能在 11 月中旬就不再有效了。
查看响应
div
类名已更改为更晦涩的名称- 有一些深奥的
c-wiz
元素做一些美妙的事情... - 另外,我怀疑某些客户端脚本在检索文档后调用实际翻译
选项:Selenium、Microsoft Translate、Google 翻译 API 的免费和付费层级;)
关于html - 使用 VBA 从谷歌翻译中提取 div 的内容,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/47248969/