I have a URL like this:
http://www.matweb.com/search/DataSheet.aspx?MatGUID=849e2916ab1541be9ff6a17b78f95c82
I want to download the source of that page using this code:
// Requires: using System; using System.IO; using System.Net;
private static string urlTemplate = @"http://www.matweb.com/search/DataSheet.aspx?MatGUID=";

static string GetSource(string guid)
{
    try
    {
        Uri url = new Uri(urlTemplate + guid);
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
        webRequest.Method = "GET";
        // Dispose the response, stream, and reader once the body has been read.
        using (HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse())
        using (Stream responseStream = webResponse.GetResponseStream())
        using (StreamReader responseStreamReader = new StreamReader(responseStream))
        {
            return responseStreamReader.ReadToEnd();
        }
    }
    catch (Exception)
    {
        // Any failure (network error, HTTP error status) is reported as null.
        return null;
    }
}
When I do that, I get:
You do not seem to have cookies enabled. MatWeb Requires cookies to be enabled.
OK, I understand that, so I added a couple of lines:
CookieContainer cc = new CookieContainer();
webRequest.CookieContainer = cc;
Then I got:
Your IP Address has been restricted due to excessive use. The problem may be compounded when an IP address may be shared by many people in a company or through an internet service provider. We apologize for any inconvenience.
I can understand that, but when I open the same page in a web browser I don't get this message. What do I need to do to get the source? Set some cookie or HTTP header?
Best answer
The site probably doesn't like your UserAgent (HttpWebRequest sends no User-Agent header unless you set one). Try this:
webRequest.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)"; //maybe substitute your own in here
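Putting the pieces from the question and this answer together, a minimal sketch might look like the following. The class name MatWebClient and the CreateRequest/GetSource split are hypothetical, chosen only for illustration; the User-Agent string is the example from above and can be replaced with your own browser's. Whether the site accepts the request still depends on its own checks, so treat this as a starting point rather than a guaranteed fix.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper class; the name and method split are illustrative only.
static class MatWebClient
{
    private const string UrlTemplate =
        "http://www.matweb.com/search/DataSheet.aspx?MatGUID=";

    // Example browser User-Agent from the answer; substitute your own.
    private const string BrowserUserAgent =
        "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) " +
        "Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)";

    // Builds the GET request with a cookie container and a User-Agent header,
    // without sending it.
    public static HttpWebRequest CreateRequest(string guid, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(new Uri(UrlTemplate + guid));
        request.Method = "GET";
        request.CookieContainer = cookies;    // lets the site set and read cookies
        request.UserAgent = BrowserUserAgent; // no User-Agent is sent otherwise
        return request;
    }

    // Sends the request and returns the page source, or null on a web error.
    public static string GetSource(string guid, CookieContainer cookies)
    {
        try
        {
            using (var response = (HttpWebResponse)CreateRequest(guid, cookies).GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                return reader.ReadToEnd();
            }
        }
        catch (WebException)
        {
            return null;
        }
    }
}
```

Passing the same CookieContainer to every call, for example `var cc = new CookieContainer(); MatWebClient.GetSource(guid, cc);`, keeps any cookies the site sets available to later requests.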
Source: "c# - matweb.com: How to get source of page?" on Stack Overflow: https://stackoverflow.com/questions/4493454/