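I am experimenting with the DownloadData method from WebClient. My current problem is that I cannot figure out how to convert the escaped characters in the ASCII result (&lt; to <, \n, &gt; to >) produced by Encoding.ASCII.GetString(myDataBuffer); from this page.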
/// <summary>
/// Curl data from the PMID
/// </summary>
private void ClientPMID(int pmid)
{
    //generate the URL for the client
    StringBuilder pmid_url_string = new StringBuilder();
    pmid_url_string.Append("http://www.ncbi.nlm.nih.gov/pubmed/").Append(pmid.ToString()).Append("?report=xml");
    Uri PMIDUri = new Uri(pmid_url_string.ToString());
    //declare and initialize the client
    WebClient client = new WebClient();
    // Download the Web resource and save it into a data buffer.
    byte[] myDataBuffer = client.DownloadData(PMIDUri);
    this.DownloadCompleted(myDataBuffer);
}

/// <summary>
/// Crawl over the binary from myDataBuffer
/// </summary>
/// <param name="myDataBuffer">Binary Buffer</param>
private void DownloadCompleted(byte[] myDataBuffer)
{
    string download = Encoding.ASCII.GetString(myDataBuffer);
    PMIDCrawler pmc = new PMIDCrawler(download, "/pre/PubmedArticle/MedlineCitation/Article");
    //iterate over each node in the file
    foreach (XmlNode xmlNode in pmc.crawl)
    {
        string AbstractTitle = xmlNode["ArticleTitle"].InnerText;
        string AbstractText = xmlNode["Abstract"]["AbstractText"].InnerText;
    }
}
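The code for PMIDCrawler is available in my other question about DownloadStringCompletedEventHandler. However, after decoding the content returned by Encoding.ASCII.GetString, the output of string html = HttpUtility.HtmlDecode(nHtml); is not valid HTML (or XML), since the server does not respond with an XML HTTP header.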
Best answer
Unfortunately, that server does not respond properly to Accept: text/xml or Accept: application/xml, so you will have to do this the hard way (HttpUtility):
string download = HttpUtility.HtmlDecode(Encoding.ASCII.GetString(myDataBuffer));
(or WebUtility.HtmlDecode on .NET Fx 4.5+)
or
string download = Encoding.ASCII.GetString(myDataBuffer);
if (download != null) { // this won't get all HTML escaped characters...
    download = download.Replace("&lt;", "<").Replace("&gt;", ">");
}
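To make the difference between the two approaches concrete, here is a minimal sketch that works on an in-memory buffer instead of a live download. The class name EntityDecodeDemo, its method names, and the sample markup used to try it out are assumptions made for illustration and do not come from the question. The XPath also leaves out the /pre prefix used in the question, because the sample is not wrapped in a pre element.

```csharp
using System;
using System.Net;
using System.Text;
using System.Xml;

// Hypothetical helper class, for illustration only.
public static class EntityDecodeDemo
{
    // Decodes every HTML entity in the buffer (&lt;, &gt;, &amp;, &quot;, ...).
    public static string DecodeAll(byte[] buffer)
    {
        string escaped = Encoding.ASCII.GetString(buffer);
        return WebUtility.HtmlDecode(escaped);
    }

    // Manual fallback: only the two entities named here are replaced,
    // so anything else (for example &quot;) is left untouched.
    public static string DecodeAngleBracketsOnly(byte[] buffer)
    {
        string escaped = Encoding.ASCII.GetString(buffer);
        return escaped.Replace("&lt;", "<").Replace("&gt;", ">");
    }

    // Loads already-decoded text as XML and returns the ArticleTitle text,
    // or null when the node is missing. LoadXml throws XmlException if the
    // decoded text is not well-formed XML.
    public static string ReadTitle(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode node = doc.SelectSingleNode(
            "/PubmedArticle/MedlineCitation/Article/ArticleTitle");
        return node == null ? null : node.InnerText;
    }
}
```

Passing the bytes of an escaped fragment such as &lt;a&gt;x &quot;y&quot;&lt;/a&gt; to both methods shows the gap: DecodeAll restores the quotes as well, while DecodeAngleBracketsOnly leaves &quot; in place. Note that fully decoded text is not guaranteed to be well-formed XML (a decoded bare & would break it), so wrapping ReadTitle in a try/catch for XmlException may be worthwhile.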
See also this question for more information.
Source: "c# - Encoding ASCII to HTML" on Stack Overflow: https://stackoverflow.com/questions/15376101/