I want to download the inner HTML of an online page, but when I do, characters such as šđčćž are replaced with sequences like ć¡.
The code I am using:
    Dim sourceString As String = New System.Net.WebClient().DownloadString("SomeWebPage")
    TextBox1.Text = sourceString
Best answer
You probably have to download the raw bytes and then use the Encoding class to decode them as UTF-8:
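    ' Requires Imports System.Net and Imports System.Text.
    Async Function GetHtmlString(address As String) As Task(Of String)
        Using client As New WebClient
            ' Download the response body as raw bytes, then decode it explicitly as UTF-8.
            Dim bytes = Await client.DownloadDataTaskAsync(address)
            Dim s = Encoding.UTF8.GetString(bytes)
            Return s
        End Using
    End Function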
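To see why decoding the bytes explicitly helps, here is a minimal sketch that reproduces the garbling without any network access. It assumes the page is served as UTF-8 and was decoded with a single-byte encoding; ISO-8859-1 is only a stand-in for whatever default the client actually used. The module name EncodingDemo and the helper Decode are hypothetical, and the source file is assumed to be saved as UTF-8 so the string literal survives compilation.

```vb
Imports System
Imports System.Text

Module EncodingDemo
    ' Hypothetical helper: decode raw bytes with the named encoding.
    Function Decode(bytes As Byte(), encodingName As String) As String
        Return Encoding.GetEncoding(encodingName).GetString(bytes)
    End Function

    Sub Main()
        Dim original As String = "šđčćž"

        ' The bytes a server would send for this text if the page is UTF-8.
        Dim bytes As Byte() = Encoding.UTF8.GetBytes(original)

        ' Wrong: a single-byte encoding turns each UTF-8 byte into its own character.
        Dim garbled As String = Decode(bytes, "iso-8859-1")

        ' Right: decode with the encoding the bytes were written in.
        Dim restored As String = Decode(bytes, "utf-8")

        Console.WriteLine(garbled = original)   ' False
        Console.WriteLine(restored = original)  ' True
        Console.WriteLine(garbled.Length)       ' 10 (each character was 2 bytes in UTF-8)
        Console.WriteLine(restored.Length)      ' 5
    End Sub
End Module
```

The comparisons are printed instead of the strings themselves because the console may apply its own encoding to the output.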
Thanks to @dave's comment, here is a simpler approach:
    Async Function GetHtmlString(address As String) As Task(Of String)
        Using client As New WebClient
            ' Set the encoding on the client before downloading the string.
            client.Encoding = Encoding.UTF8
            Dim s = Await client.DownloadStringTaskAsync(address)
            Return s
        End Using
    End Function
Usage example:
    Imports System.Net
    Imports System.Text

    Public Class Form1
        Private Async Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            ' s holds the HTML of the page, decoded as UTF-8.
            Dim s = Await GetHtmlString("http://www.radiomerkury.pl/")
        End Sub

        Async Function GetHtmlString(address As String) As Task(Of String)
            Using client As New WebClient
                client.Encoding = Encoding.UTF8
                Dim s = Await client.DownloadStringTaskAsync(address)
                Return s
            End Using
        End Function
    End Class
Regarding "html - Download web page HTML as a UTF-8 string", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39133766/