c# - Screen scraping: unable to authenticate into a site utilizing ASP.NET Forms Authentication

Tags: c# forms-authentication screen-scraping webrequest

Using C# WebRequest, I am trying to screen-scrape a website that uses ASP.NET Forms Authentication.

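First, the application performs a GET on the login page, extracts the __VIEWSTATE and __EVENTVALIDATION values from the hidden input fields, and extracts the .NET SessionId from the response cookie. Next, the application performs a POST to the form action with the username, password, the other required form fields, and the three .NET values above.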

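Based on a Fiddler session captured while authenticating into the site with Chrome, I expect a 302 with a token stored in a cookie, which allows navigating the secured areas of the site. I can't figure out why I keep getting a 302 without the token, which redirects me to the site's unauthenticated home page. In Fiddler, my application's request looks exactly the same as the one made from Chrome or Firefox.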

        // Create a GET request for the login page.
        var request = (HttpWebRequest)WebRequest.Create(LoginUrl);
        // Attach a cookie container; without one, response.Cookies is not populated.
        _container = new CookieContainer();
        request.CookieContainer = _container;

        request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.17 Safari/537.36";
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        request.Headers["Accept-Encoding"] = "gzip,deflate,sdch";
        request.Headers["Accept-Language"] = "en-US,en;q=0.8";

        var response = (HttpWebResponse)request.GetResponse();
        _container.Add(response.Cookies);

        string responseFromServer;

        using (var decompress = new GZipStream(response.GetResponseStream(), CompressionMode.Decompress))
        {
            using (var reader = new StreamReader(decompress))
            {
                // Read the content.
                responseFromServer = reader.ReadToEnd();
            }
        }

        var doc = new HtmlDocument();
        doc.LoadHtml(responseFromServer);

        var hiddenFields = doc.DocumentNode.SelectNodes("//input[@type='hidden']").ToDictionary(input => input.GetAttributeValue("name", ""), input => input.GetAttributeValue("value", ""));

        request = (HttpWebRequest)WebRequest.Create(LoginUrl);

        request.Method = "POST";
        request.CookieContainer = _container;

        // Create POST data and convert it to a byte array.  Modify this line accordingly
        var postData = String.Format("ddlsubsciribers={0}&memberfname={1}&memberpwd={2}&chkRemberMe=true&Imgbtn=LOGIN&__EVENTTARGET&__EVENTARGUMENT&__LASTFOCUS", Agency, Username, Password);
        postData = hiddenFields.Aggregate(postData, (current, field) => current + ("&" + field.Key + "=" + field.Value));

        ServicePointManager.ServerCertificateValidationCallback = AcceptAllCertifications;

        var byteArray = Encoding.UTF8.GetBytes(postData);
        //request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.17 Safari/537.36";
        // Set the ContentType property of the WebRequest.
        request.ContentType = "application/x-www-form-urlencoded";
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        request.Headers["Accept-Encoding"] = "gzip,deflate,sdch";
        request.Headers["Accept-Language"] = "en-US,en;q=0.8";
        // Set the ContentLength property of the WebRequest.
        request.ContentLength = byteArray.Length;
        // Get the request stream.
        var dataStream = request.GetRequestStream();
        // Write the data to the request stream.
        dataStream.Write(byteArray, 0, byteArray.Length);
        // Close the Stream object.
        dataStream.Close();
        // Get the response.
        response = (HttpWebResponse)request.GetResponse();
        _container.Add(response.Cookies);

        // Release the response (the request stream was already closed above).
        response.Close();

Best answer

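It turns out that some funky characters in the __EVENTVALIDATION variable were being encoded as line breaks, and ASP.NET then threw out the session, assuming it had become corrupted. The solution was to escape the ASP.NET variables using Uri.EscapeDataString.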

postData = hiddenFields.Aggregate(postData, (current, field) => current + ("&" + field.Key + "=" + Uri.EscapeDataString(field.Value)));
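To illustrate why escaping matters, here is a minimal sketch, not the asker's actual code. ASP.NET hidden-field values are Base64 text, which can contain `+`, `/` and `=`; in an `application/x-www-form-urlencoded` body an unescaped `+` is decoded as a space, so the server receives a different value than the one it issued. The `FormEncoding.BuildFormBody` helper and the sample values below are hypothetical and were made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FormEncoding
{
    // Hypothetical helper: builds an application/x-www-form-urlencoded body,
    // escaping both the key and the value of every field.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }
}

static class Demo
{
    static void Main()
    {
        // Made-up sample values, not real view state from any site.
        var fields = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("__VIEWSTATE", "/wEPDw+abc=="),
            new KeyValuePair<string, string>("memberfname", "jane doe")
        };

        // Prints: __VIEWSTATE=%2FwEPDw%2Babc%3D%3D&memberfname=jane%20doe
        Console.WriteLine(FormEncoding.BuildFormBody(fields));
    }
}
```

The same escaping probably should be applied to the values formatted into the initial postData string (Agency, Username, Password), since any of them may contain reserved characters such as `&` or `+`.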

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/18821670/
