c - Libcurl does not send cookies on redirect

Tags: c http redirect cookies libcurl

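I wrote some C++ code (using libcurl) to log in to a WordPress blog, and I have run into an interesting problem: after a successful login, libcurl does not send the appropriate cookies. First, I fetch the login page (to parse the POST data and pick up a cookie that WP requires). This does what it should, and you can see that libcurl has appropriately "Added cookie wordpress_test_cookie". Next, I POST the login data. The WordPress login implements a redirect after a successful login, as you can see from the 302 response and the Location field in the header. I also receive the cookies that should now let me navigate to the wp-admin panel. Next, I take the redirect location and try to GET it. This is where it fails.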

Compiled with:

g++-5 -o loginTest loginTest.cpp -lcurl -std=c++17

G++ version:

g++ (Ubuntu 5.1.0-0ubuntu11~14.04.1) 5.1.0

curl version:

curl 7.35.0 (x86_64-pc-linux-gnu) libcurl/7.35.0 OpenSSL/1.0.1f zlib/1.2.8 libidn/1.28 librtmp/2.3

Full code:

#include <curl/curl.h>
#include <iostream>
#include <regex>
#include <string>
#include <vector>

using namespace std;

string ParseLoginPageForPOSTParameters(string loginPageBody);
size_t MyWriteCallback(char *ptr, size_t size, size_t nmemb, void *userdata);

int main() {
    CURL *curl;

    if (curl_global_init(CURL_GLOBAL_NOTHING) != 0) {
        cerr << "Something catastrophic happened during global init" << endl;
        return 1;
    }
    curl = curl_easy_init();

    if (curl) {
        //First get the login page (to parse post data and get a cookie that WP requires)
        curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);
        curl_easy_setopt(curl, CURLOPT_HTTPGET, 1L);
        curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/wp-login.php");
        //Empty string for encoding means accept all that are supported
        curl_easy_setopt(curl, CURLOPT_ACCEPT_ENCODING, "");
        curl_easy_setopt(curl, CURLOPT_USERAGENT, "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0");
        //An empty file name enables libcurl's in-memory cookie engine without reading any file
        curl_easy_setopt(curl, CURLOPT_COOKIEFILE, "");
        string loginPageBody = "";
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &loginPageBody);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, &MyWriteCallback);
        curl_easy_perform(curl);

        //We now have the login page stored in loginPageBody
        string parsedPostParameters = ParseLoginPageForPOSTParameters(loginPageBody);
        string completePostParameters = "log=<myUsername>&pwd=<myPassword>" + parsedPostParameters;
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, completePostParameters.c_str());
        curl_easy_perform(curl);

        //We now sent the login info
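        char *newUrl = nullptr;
        curl_easy_getinfo(curl, CURLINFO_REDIRECT_URL, &newUrl);
        if (newUrl == nullptr) {
            //No redirect was offered (e.g. the login failed); streaming a null char* is undefined
            cerr << "No redirect URL returned" << endl;
            curl_easy_cleanup(curl);
            return 1;
        }
        cout << flush;
        cout << "Redirecting to " << newUrl << endl;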
        curl_easy_setopt(curl, CURLOPT_HTTPGET, 1L);
        curl_easy_setopt(curl, CURLOPT_URL, newUrl);
        string afterLoginBody = "";
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &afterLoginBody);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, &MyWriteCallback);
        curl_easy_perform(curl);
        cout << flush;
        cout << afterLoginBody;
    } else {
        cerr << "Couldnt easy_init" << endl;
    }

    curl_easy_cleanup(curl);

    return 0;
}

string ParseLoginPageForPOSTParameters(string loginPageBody) {
    string postParameters = "";
    //find all input tags, they're formatted like this
    //  <input type="hidden" name="testcookie" value="1" />

    regex inputRegex(R"V.S.(<input.*?/>)V.S.", regex::icase|regex::optimize);
    auto firstInputMatch = sregex_iterator(loginPageBody.begin(), loginPageBody.end(), inputRegex);
    auto lastInputMatch = sregex_iterator();

    for (sregex_iterator i = firstInputMatch; i != lastInputMatch; ++i) {
        //for each input field in the form
        smatch inputMatch = *i;
        string inputMatchString = inputMatch.str();
        regex nameRegex(R"V.S.(name="(.*?)")V.S.", regex::icase|regex::optimize);
        regex valueRegex(R"V.S.(value="(.*?)")V.S.", regex::icase|regex::optimize);
        smatch nameMatch, valueMatch;
        if (regex_search(inputMatchString,nameMatch,nameRegex) && regex_search(inputMatchString,valueMatch,valueRegex)) {
            //Found a name and value pair inside the input field
            string value = valueMatch[1].str();
            string name = nameMatch[1].str();
            if (name != "log" && name != "pwd" && name != "" && value != "") {
                postParameters += "&"+name+"="+value;
            }
        }
    }
    return postParameters;
}

size_t MyWriteCallback(char *ptr, size_t size, size_t nmemb, void *userdata) {
    //userdata is a std::string; append every byte libcurl delivered (size * nmemb, not just nmemb)
    size_t dataAmnt = size * nmemb;
    string *userdataString = static_cast<string*>(userdata);
    userdataString->append(ptr, dataAmnt);
    return dataAmnt;
}

Full output:

* Hostname was NOT found in DNS cache
*   Trying <IP of example.com>...
* Connected to example.com (<IP of example.com>) port 80 (#0)
> GET /wp-login.php HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0
Host: example.com
Accept: */*
Accept-Encoding: deflate, gzip

< HTTP/1.1 200 OK
< Date: Sat, 17 Oct 2015 15:56:43 GMT
< Content-Type: text/html; charset=UTF-8
< Content-Length: 4759
< Connection: keep-alive
< Keep-Alive: timeout=30
* Server Apache/2 is not blacklisted
< Server: Apache/2
< X-Powered-By: PHP/5.3.29
< Expires: Wed, 11 Jan 1984 05:00:00 GMT
< Cache-Control: no-cache, must-revalidate, max-age=0
< Pragma: no-cache
* Added cookie wordpress_test_cookie="WP+Cookie+check" for domain example.com, path /, expire 0
< Set-Cookie: wordpress_test_cookie=WP+Cookie+check; path=/
< X-Frame-Options: SAMEORIGIN
< 
* Connection #0 to host example.com left intact
* Found bundle for host example.com: 0x9a0c10
* Re-using existing connection! (#0) with host example.com
* Connected to example.com (<IP of example.com>) port 80 (#0)
> POST /wp-login.php HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0
Host: example.com
Accept: */*
Accept-Encoding: deflate, gzip
Cookie: wordpress_test_cookie=WP+Cookie+check
Content-Length: 115
Content-Type: application/x-www-form-urlencoded

* upload completely sent off: 115 out of 115 bytes
< HTTP/1.1 302 Found
< Date: Sat, 17 Oct 2015 15:56:44 GMT
< Content-Type: text/html; charset=iso-8859-1
< Content-Length: 215
< Connection: keep-alive
< Keep-Alive: timeout=30
* Server Apache/2 is not blacklisted
< Server: Apache/2
< X-Powered-By: PHP/5.3.29
< Expires: Wed, 11 Jan 1984 05:00:00 GMT
< Cache-Control: no-cache, must-revalidate, max-age=0
< Pragma: no-cache
* Replaced cookie wordpress_test_cookie="WP+Cookie+check" for domain example.com, path /, expire 0
< Set-Cookie: wordpress_test_cookie=WP+Cookie+check; path=/
< X-Frame-Options: SAMEORIGIN
* Added cookie wordpress_0c9410477a030adc6c64b9a4a70917f3="admin%7C1446307004%7CF38piRdUOaqnMHaJnETwYyMAdIU5Wsjq4quj9bDloZr%7C2bd7e640536c134e3426ef89f1a62b44bd011a4d93d5ffa88f336167d1144fb2" for domain example.com, path /wp-content/plugins, expire 1446350204
< Set-Cookie: wordpress_0c9410477a030adc6c64b9a4a70917f3=admin%7C1446307004%7CF38piRdUOaqnMHaJnETwYyMAdIU5Wsjq4quj9bDloZr%7C2bd7e640536c134e3426ef89f1a62b44bd011a4d93d5ffa88f336167d1144fb2; expires=Sun, 01-Nov-2015 03:56:44 GMT; path=/wp-content/plugins; httponly
* Added cookie wordpress_0c9410477a030adc6c64b9a4a70917f3="admin%7C1446307004%7CF38piRdUOaqnMHaJnETwYyMAdIU5Wsjq4quj9bDloZr%7C2bd7e640536c134e3426ef89f1a62b44bd011a4d93d5ffa88f336167d1144fb2" for domain example.com, path /wp-admin, expire 1446350204
< Set-Cookie: wordpress_0c9410477a030adc6c64b9a4a70917f3=admin%7C1446307004%7CF38piRdUOaqnMHaJnETwYyMAdIU5Wsjq4quj9bDloZr%7C2bd7e640536c134e3426ef89f1a62b44bd011a4d93d5ffa88f336167d1144fb2; expires=Sun, 01-Nov-2015 03:56:44 GMT; path=/wp-admin; httponly
* Added cookie wordpress_logged_in_0c9410477a030adc6c64b9a4a70917f3="admin%7C1446307004%7CF38piRdUOaqnMHaJnETwYyMAdIU5Wsjq4quj9bDloZr%7Ca5b38cc8bb0d656573d147fcfc00e54016271723bee9bf3613c5c45f504b2aca" for domain example.com, path /, expire 1446351204
< Set-Cookie: wordpress_logged_in_0c9410477a030adc6c64b9a4a70917f3=admin%7C1446307004%7CF38piRdUOaqnMHaJnETwYyMAdIU5Wsjq4quj9bDloZr%7Ca5b38cc8bb0d656573d147fcfc00e54016271723bee9bf3613c5c45f514b2aca; expires=Sun, 01-Nov-2015 03:56:44 GMT; path=/; httponly
< Location: http://www.example.com/wp-admin/
< 
* Connection #0 to host example.com left intact
Redirecting to http://www.example.com/wp-admin/
* Hostname was NOT found in DNS cache
*   Trying <IP of example.com>...
* Connected to www.example.com (<IP of example.com>) port 80 (#1)
> GET /wp-admin/ HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0
Host: www.example.com
Accept: */*
Accept-Encoding: deflate, gzip

< HTTP/1.1 302 Found
< Date: Sat, 17 Oct 2015 15:56:45 GMT
< Content-Type: text/html; charset=iso-8859-1
< Content-Length: 285
< Connection: keep-alive
< Keep-Alive: timeout=30
* Server Apache/2 is not blacklisted
< Server: Apache/2
< X-Powered-By: PHP/5.3.29
< Expires: Wed, 11 Jan 1984 05:00:00 GMT
< Cache-Control: no-cache, must-revalidate, max-age=0
< Pragma: no-cache
< Location: http://www.example.com/wp-login.php?redirect_to=http%3A%2F%2Fwww.example.com%2Fwp-admin%2F&reauth=1
< Accept-Ranges: bytes
< Age: 0
< 
* Connection #1 to host www.example.com left intact
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="http://www.example.com/wp-login.php?redirect_to=http%3A%2F%2Fwww.example.com%2Fwp-admin%2F&amp;reauth=1">here</a>.</p>
</body></html>

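As you can see, libcurl sends the GET without any cookies, so WordPress redirects me back to the login page and asks for reauthorization. It also seems odd that, when sending the GET request, it does not say "* Found bundle for host example.com: ..." the way it did for the POST.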

Any ideas why this happens?

Best answer

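The answer is a bit odd, but I think it is valid.

My initial request for the page went to the "http://example.com" address.

But the redirect goes to the "http://www.example.com" address.

Because of the "www", libcurl treats the new address as an entirely different host, so the cookies it stored for example.com are not sent to www.example.com.

I think this is technically valid behavior.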

Source: the Stack Overflow question "c - Libcurl does not send cookies on redirect": https://stackoverflow.com/questions/33182925/
