html - Download a web page's HTML as a UTF-8 string

Tags: html vb.net utf-8

I want to download the inner HTML of an online page, but when I do, characters like šđčćž get replaced with things like ć¡.

The code I'm using:

Dim sourceString As String = New System.Net.WebClient().DownloadString("SomeWebPage")
TextBox1.Text = sourceString

Best answer

You probably have to download the raw bytes and then use the Encoding class to decode them as UTF-8:

Async Function GetHtmlString(address As String) As Task(Of String)
    Using client As New WebClient
        ' Download the raw response bytes, then decode them explicitly as UTF-8.
        Dim bytes = Await client.DownloadDataTaskAsync(address)
        Return Encoding.UTF8.GetString(bytes)
    End Using
End Function
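
To see why decoding the bytes yourself helps, here is a minimal sketch of the likely cause: the page is served as UTF-8, but the text is decoded with a single-byte encoding. The module name EncodingDemo and the helpers DecodeAsLatin1 and DecodeAsUtf8 are made up for illustration, and ISO-8859-1 is only an assumed stand-in for whatever default encoding your machine actually uses.

```vbnet
Imports System
Imports System.Text

Module EncodingDemo
    ' Hypothetical helper: decodes bytes with a single-byte encoding.
    ' ISO-8859-1 is an assumption standing in for a mismatched default.
    Function DecodeAsLatin1(bytes As Byte()) As String
        Return Encoding.GetEncoding("iso-8859-1").GetString(bytes)
    End Function

    ' Hypothetical helper: decodes the same bytes as UTF-8.
    Function DecodeAsUtf8(bytes As Byte()) As String
        Return Encoding.UTF8.GetString(bytes)
    End Function

    Sub Main()
        Dim original As String = "šđčćž"

        ' Each of these five characters takes two bytes in UTF-8.
        Dim bytes As Byte() = Encoding.UTF8.GetBytes(original)

        Dim wrong As String = DecodeAsLatin1(bytes)
        Dim right As String = DecodeAsUtf8(bytes)

        Console.WriteLine(bytes.Length)      ' 10
        Console.WriteLine(wrong.Length)      ' 10: every byte became its own character
        Console.WriteLine(right.Length)      ' 5
        Console.WriteLine(wrong = original)  ' False
        Console.WriteLine(right = original)  ' True
    End Sub
End Module
```

If the mismatched decoding gives you the same kind of garbage you see in your text box, the encoding is the culprit, and forcing UTF-8 as in GetHtmlString should fix it.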

Thanks to @dave's comment, here is a simpler way:

Async Function GetHtmlString(address As String) As Task(Of String)
    Using client As New WebClient
        ' Set the encoding WebClient uses when converting the response to a string.
        client.Encoding = Encoding.UTF8
        Return Await client.DownloadStringTaskAsync(address)
    End Using
End Function

Usage example:

Imports System.Net
Imports System.Text

Public Class Form1
    Private Async Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        Dim s = Await GetHtmlString("http://www.radiomerkury.pl/")
    End Sub

    Async Function GetHtmlString(address As String) As Task(Of String)
        Using client As New WebClient
            client.Encoding = Encoding.UTF8
            Dim s = Await client.DownloadStringTaskAsync(address)
            Return s
        End Using
    End Function
End Class

This question, "html - Download a web page's HTML as a UTF-8 string", comes from Stack Overflow: https://stackoverflow.com/questions/39133766/
