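I am trying to log in to an ets.org/toefl account using PHP cURL, but the login fails. I usually get an error message saying the server is busy, yet logging in through a browser works. My code is below. Can anyone see what is wrong?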
<?php
include('simple_html_dom.php');
$login_url = 'https://toefl-registration.ets.org/TOEFLWeb/logon.do';
$username='****';
$password='***';
$ck = 'cookie.txt';
$agent = 'Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0';
// extra headers
$headers[] = "Connection: keep-alive";
//$headers[]= "Accept-Encoding: gzip, deflate";
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, $ck);
curl_setopt ($ch, CURLOPT_COOKIEFILE, $ck);
//curl_setopt($ch, CURLOPT_URL, 'https://toefl-registration.ets.org/TOEFLWeb/extISERLogonPrompt.do');
$output = curl_exec($ch);
//echo $output;
$html = new simple_html_dom();
$html = str_get_html($output);
$e = $html->find(".loginform");
$a = $e[0]->find('input');
$str = $a[0]->outertext;
preg_match("/value=\"(.*)\"/",$str,$match);
$h_attr = $match[1];
$fields['org.apache.struts.taglib.html.TOKEN'] = $h_attr;
$fields['currentLocale']= 'en_US';
$fields['username'] = $username;
$fields['password'] = $password;
$fields['x'] = 11;
$fields['y'] = 4;
//print_r($fields);
//echo "\r\n";
$POSTFIELDS = http_build_query($fields);
//echo $POSTFIELDS;
$headers[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$headers[] = "Accept-Language: en-US,en;q=0.5";
$headers[]="Referer: https://toefl-registration.ets.org/TOEFLWeb/extISERLogonPrompt.do";
curl_setopt($ch, CURLOPT_URL, $login_url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $POSTFIELDS);
$result = curl_exec($ch);
print $result;
(Updated based on comments)
POST data sent by the browser:
org.apache.struts.taglib.html.TOKEN=c1b88957e9914492fe8cc20b33ef1cdd&currentLocale=en_US&username=name&password=pass&x=23&y=3
POST data sent by my script:
org.apache.struts.taglib.html.TOKEN=345a9f935b2db8a69f55c5b4d3372190&currentLocale=en_US&username=name&password=pass&x=11&y=4
Request produced by PHP cURL (verbose output):
POST /TOEFLWeb/logon.do HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0
Host: toefl-registration.ets.org
Cookie: au=MTM3Mjc4ODQwMg%3d%3d; server=3; JSESSIONID=23C39022E2641B8F5AC944295837315E
Connection: keep-alive
Accept: */*
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Referer: https://toefl-registration.ets.org/TOEFLWeb/extISERLogonPrompt.do
Content-Length: 134
Content-Type: application/x-www-form-urlencoded
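Best answer
Try comparing the HTTP headers sent by your cURL script with the headers your browser sends (Chrome DevTools shows them in the Network tab). The remote server may be rejecting the request because some header is missing.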
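The header comparison can be made mechanical with a small diff helper. The following is only a sketch: `parse_header_block` and `diff_headers` are hypothetical helper names, and the two sample header blocks are invented for illustration rather than captured from the real site.

```php
<?php
// Parse a raw header block into a lowercase-name => value map.
// The request line and blank lines are skipped; repeated headers are joined.
function parse_header_block(string $raw): array
{
    $headers = [];
    foreach (preg_split('/\r\n|\n|\r/', trim($raw)) as $line) {
        if (strpos($line, ':') === false || preg_match('#^[A-Z]+ \S+ HTTP/#', $line)) {
            continue;
        }
        [$name, $value] = explode(':', $line, 2);
        $name = strtolower(trim($name));
        $value = trim($value);
        $headers[$name] = isset($headers[$name]) ? $headers[$name] . ', ' . $value : $value;
    }
    return $headers;
}

// List headers the browser sent that the script did not,
// plus headers present in both whose values differ.
function diff_headers(array $browser, array $script): array
{
    $different = [];
    foreach (array_intersect_key($browser, $script) as $name => $value) {
        if ($script[$name] !== $value) {
            $different[$name] = ['browser' => $value, 'script' => $script[$name]];
        }
    }
    return ['missing' => array_diff_key($browser, $script), 'different' => $different];
}

// Sample input (made up): paste the real blocks here instead.
$browserRaw = "POST /TOEFLWeb/logon.do HTTP/1.1\r\nHost: toefl-registration.ets.org\r\nAccept-Encoding: gzip, deflate\r\nConnection: keep-alive";
$scriptRaw  = "POST /TOEFLWeb/logon.do HTTP/1.1\r\nHost: toefl-registration.ets.org\r\nConnection: close";

$report = diff_headers(parse_header_block($browserRaw), parse_header_block($scriptRaw));
print_r($report);
```

To obtain the script's side, one option is to set `CURLINFO_HEADER_OUT` to `true` with `curl_setopt` before calling `curl_exec`, then read the sent request back with `curl_getinfo($ch, CURLINFO_HEADER_OUT)`. The browser's side can be copied from the request headers shown in DevTools.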
Make sure the cookie file has full permissions. From php.net:
When specifing CURLOPT_COOKIEFILE or CURLOPT_COOKIEJAR options, don't forget to "chmod 777" that directory where cookie-file must be created.
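A quick way to rule out the cookie file as the cause is to check writability before handing the path to cURL. This is a minimal sketch, assuming the temp directory is an acceptable location: `writable_cookie_jar` is a hypothetical helper, and its `$dir` and `$name` parameters are not part of the original script. It uses an absolute path because a relative name such as `cookie.txt` is resolved against the current working directory, which may not be the script's directory when PHP runs under a web server. What matters is that the user PHP runs as can write there, which does not necessarily require mode 777.

```php
<?php
// Return an absolute cookie-jar path after confirming PHP can write to it.
// $dir and $name are illustrative parameters, not taken from the question.
function writable_cookie_jar(string $dir, string $name = 'cookie.txt'): string
{
    $realDir = realpath($dir);
    if ($realDir === false || !is_dir($realDir)) {
        throw new RuntimeException("Cookie directory does not exist: $dir");
    }
    $path = $realDir . DIRECTORY_SEPARATOR . $name;
    // An existing file must itself be writable; otherwise the directory must be.
    $ok = file_exists($path) ? is_writable($path) : is_writable($realDir);
    if (!$ok) {
        throw new RuntimeException("Cookie jar is not writable: $path");
    }
    return $path;
}

// Assumption: the system temp directory is used here purely as an example.
$ck = writable_cookie_jar(sys_get_temp_dir());

// The checked path would then replace the relative 'cookie.txt':
// curl_setopt($ch, CURLOPT_COOKIEJAR, $ck);
// curl_setopt($ch, CURLOPT_COOKIEFILE, $ck);
```

If the helper throws, the session cookie from the first request cannot be saved, which is worth ruling out before looking at anything else.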
Regarding PHP cURL login over HTTPS, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/17432508/