java - How to bypass Cloudflare DDoS protection or the 5-second redirect using JSOUP?

Tags: java https jsoup

I'm trying to fetch the anime list from this site: https://ww1.gogoanime.io

Here is the code:

org.jsoup.Connection.Response usage = Jsoup.connect("https://ww1.gogoanime.io/anime-list-A")
            .header("accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8")
            .header("accept-encoding", "gzip, deflate, sdch, br")
            .header("accept-language", "en-US,en;q=0.8")
            .header("cache-control", "max-age=0")
            .header("user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36")
            .header("upgrade-insecure-requests", "1")
            .ignoreHttpErrors(true)
            .followRedirects(true)
            .method(Connection.Method.GET)
            .timeout(30000)
            .execute();

System.out.println(usage.parse());

This code works for other sites, but for this one the result is the Cloudflare DDoS protection page. I have added all the headers, yet Chrome opens this URL without any problem.

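By the way, if I don't set

ignoreHttpErrors(true)

the request throws an HTTP 503 exception, and nothing I tried makes it go away until I set it to true. So I'm stuck on the DDoS protection page, which says it will redirect to the site in 5 seconds.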
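
For readers following along, a minimal sketch of how the 503 response could be recognised as a challenge page rather than a real server error. It assumes the challenge HTML contains the hidden `jschl_vc` field that the answer below extracts; the class and method names are hypothetical, and the check is only a heuristic.

```java
final class ChallengeCheck {

    /**
     * Heuristic only: treats a response as a challenge page when the status
     * is 503 and the body contains the hidden "jschl_vc" form field.
     */
    static boolean looksLikeChallenge(int statusCode, String body) {
        if (statusCode != 503 || body == null) {
            return false;
        }
        return body.contains("name=\"jschl_vc\"");
    }

    public static void main(String[] args) {
        String challenge = "<form><input type=\"hidden\" name=\"jschl_vc\" value=\"abc\"/></form>";
        System.out.println(looksLikeChallenge(503, challenge));        // true
        System.out.println(looksLikeChallenge(200, "<html></html>"));  // false
    }
}
```

With jsoup, the two arguments would come from `usage.statusCode()` and `usage.body()` on the response obtained with `ignoreHttpErrors(true)`.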

I also tried the code below:

org.jsoup.Connection.Response usage = Jsoup.connect("https://ww1.gogoanime.io/anime-list-A")
        .header("accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8")
        .header("accept-encoding", "gzip, deflate, sdch, br")
        .header("accept-language", "en-US,en;q=0.8")
        .header("cache-control", "max-age=0")
        .header("user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36")
        .header("upgrade-insecure-requests", "1")
        .ignoreHttpErrors(true)
        .followRedirects(true)
        .method(Connection.Method.GET)
        .timeout(30000)
        .execute();

Thread.sleep(5000);

org.jsoup.Connection.Response usg = Jsoup.connect("https://ww1.gogoanime.io/anime-list-A")
            .header("accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8")
            .header("accept-encoding", "gzip, deflate, sdch, br")
            .header("accept-language", "en-US,en;q=0.8")
            .header("cache-control", "max-age=0")
            .header("user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36")
            .header("upgrade-insecure-requests", "1")
            .ignoreHttpErrors(true)
            .followRedirects(true)
            .cookies(usage.cookies())
            .method(Connection.Method.GET)
            .timeout(30000)
            .execute();

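That didn't work either. My browser opens this URL without any problem, so I think it has something to do with jsoup?

I also thought it might be a certificate issue, so I tried the following as well, but it didn't help either.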

TrustManager[] trustAllCerts = new TrustManager[] { new X509TrustManager() {
        public java.security.cert.X509Certificate[] getAcceptedIssuers() {
            return null;
        }

        public void checkClientTrusted(java.security.cert.X509Certificate[] certs, String authType) {
        }

        public void checkServerTrusted(java.security.cert.X509Certificate[] certs, String authType) {
        }
    } };

    // Install the all-trusting trust manager
    try {
        SSLContext sc = SSLContext.getInstance("SSL");
        sc.init(null, trustAllCerts, new java.security.SecureRandom());
        HttpsURLConnection.setDefaultSSLSocketFactory(sc.getSocketFactory());
    } catch (Exception e) {
        throw new RuntimeException(e);
    }

Best answer

Here is how I did it. In your project, create a Cloudflare.java class like this:

import android.os.Looper;
import android.text.TextUtils;
import android.util.Log;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpCookie;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import com.eclipsesource.v8.V8;

public class Cloudflare {

    private String mUrl;
    private String mUser_agent;
    private cfCallback mCallback;
    private int mRetry_count;
    private URL ConnUrl;
    private List<HttpCookie> mCookieList;
    private CookieManager mCookieManager;
    private HttpURLConnection mCheckConn;
    private HttpURLConnection mGetMainConn;
    private HttpURLConnection mGetRedirectionConn;

    private static final int MAX_COUNT = 5;
    private static final int CONN_TIMEOUT = 60000;
    private static final String ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;";

    private boolean canVisit = false;

    public Cloudflare(String url) {
        mUrl = url;
    }

    public Cloudflare(String url, String user_agent) {
        mUrl = url;
        mUser_agent = user_agent;
    }

    public String getUser_agent() {
        return mUser_agent;
    }

    public void setUser_agent(String user_agent) {
        mUser_agent = user_agent;
    }

    public void getCookies(final cfCallback callback){
        new Thread(new Runnable() {
            @Override
            public void run() {
                urlThread(callback);
            }
        }).start();
    }

    private void urlThread(cfCallback callback){
        mCookieManager = new CookieManager();
        mCookieManager.setCookiePolicy(CookiePolicy.ACCEPT_ALL); // accept all cookies
        CookieHandler.setDefault(mCookieManager);
        HttpURLConnection.setFollowRedirects(false);

        while (!canVisit){
            if (mRetry_count>MAX_COUNT){
                break;
            }
            try {

                int responseCode = checkUrl();
                if (responseCode==200){
                    canVisit=true;
                    break;
                }else {
                    getVisiteCookie();
                }
            } catch (IOException | InterruptedException e) {
                if (mCookieList!=null){
                    mCookieList.clear();
                }
                e.printStackTrace();
            } finally {
                closeAllConn();
            }
            mRetry_count++;
        }
        if (callback!=null){
            Looper.prepare();
            if (canVisit){
                callback.onSuccess(mCookieList);
            }else {
                e("Get Cookie Failed");
                callback.onFail();
            }


        }
    }



    private void getVisiteCookie() throws IOException, InterruptedException {
        ConnUrl = new URL(mUrl);
        mGetMainConn = (HttpURLConnection) ConnUrl.openConnection();
        mGetMainConn.setRequestMethod("GET");
        mGetMainConn.setConnectTimeout(CONN_TIMEOUT);
        mGetMainConn.setReadTimeout(CONN_TIMEOUT);
        if (!TextUtils.isEmpty(mUser_agent)){
            mGetMainConn.setRequestProperty("user-agent",mUser_agent);
        }
        mGetMainConn.setRequestProperty("accept",ACCEPT);
        mGetMainConn.setRequestProperty("referer", mUrl);
        if (mCookieList!=null&&mCookieList.size()>0){
            mGetMainConn.setRequestProperty("cookie",listToString(mCookieList));
        }
        mGetMainConn.setUseCaches(false);
        mGetMainConn.connect();
        switch (mGetMainConn.getResponseCode()){
            case HttpURLConnection.HTTP_OK:
                e("MainUrl","visit website success");
                return;
            case HttpURLConnection.HTTP_FORBIDDEN:
                e("MainUrl","IP block or cookie err");
                return;
            case HttpURLConnection.HTTP_UNAVAILABLE:
                InputStream mInputStream = mCheckConn.getErrorStream();
                BufferedReader mBufferedReader = new BufferedReader(new InputStreamReader(mInputStream));
                StringBuilder sb = new StringBuilder();
                String str;
                while ((str = mBufferedReader.readLine()) != null){
                    sb.append(str);
                }
                mInputStream.close();
                mBufferedReader.close();
                mCookieList = mCookieManager.getCookieStore().getCookies();
                str = sb.toString();
                getCheckAnswer(str);
                break;
            default:

                break;
        }
    }

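    /**
     * Extracts the challenge values and follows the redirect to obtain the cookies.
     * @param str HTML of the challenge page
     */
    private void getCheckAnswer(String str) throws InterruptedException, IOException {
        String jschl_vc = regex(str,"name=\"jschl_vc\" value=\"(.+?)\"").get(0);    // extract the value with a regex
        String pass = regex(str,"name=\"pass\" value=\"(.+?)\"").get(0);            // same for the pass field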
        double jschl_answer = get_answer(str);
        e(String.valueOf(jschl_answer));
        Thread.sleep(3000);
        String req = String.valueOf("https://"+ConnUrl.getHost())+"/cdn-cgi/l/chk_jschl?"
                +"jschl_vc="+jschl_vc+"&pass="+pass+"&jschl_answer="+jschl_answer;
        e("RedirectUrl",req);
        getRedirectResponse(req);
    }

    private void getRedirectResponse(String req) throws IOException {
        HttpURLConnection.setFollowRedirects(false);
        mGetRedirectionConn = (HttpURLConnection) new URL(req).openConnection();
        mGetRedirectionConn.setRequestMethod("GET");
        mGetRedirectionConn.setConnectTimeout(CONN_TIMEOUT);
        mGetRedirectionConn.setReadTimeout(CONN_TIMEOUT);
        if (!TextUtils.isEmpty(mUser_agent)){
            mGetRedirectionConn.setRequestProperty("user-agent",mUser_agent);
        }
        mGetRedirectionConn.setRequestProperty("accept",ACCEPT);
        mGetRedirectionConn.setRequestProperty("referer", req);
        if (mCookieList!=null&&mCookieList.size()>0){
            mGetRedirectionConn.setRequestProperty("cookie",listToString(mCookieList));
        }
        mGetRedirectionConn.setUseCaches(false);
        mGetRedirectionConn.connect();
        switch (mGetRedirectionConn.getResponseCode()){
            case HttpURLConnection.HTTP_OK:
                mCookieList = mCookieManager.getCookieStore().getCookies();
                break;
            case HttpURLConnection.HTTP_MOVED_TEMP:
                mCookieList = mCookieManager.getCookieStore().getCookies();
                break;
            default:throw new IOException("getOtherResponse Code: "+
                    mGetRedirectionConn.getResponseCode());
        }
    }


    private int checkUrl()throws IOException {
        URL ConnUrl = new URL(mUrl);
        mCheckConn = (HttpURLConnection) ConnUrl.openConnection();
        mCheckConn.setRequestMethod("GET");
        mCheckConn.setConnectTimeout(CONN_TIMEOUT);
        mCheckConn.setReadTimeout(CONN_TIMEOUT);
        if (!TextUtils.isEmpty(mUser_agent)){
            mCheckConn.setRequestProperty("user-agent",mUser_agent);
        }
        mCheckConn.setRequestProperty("accept",ACCEPT);
        mCheckConn.setRequestProperty("referer",mUrl);
        if (mCookieList!=null&&mCookieList.size()>0){
            mCheckConn.setRequestProperty("cookie",listToString(mCookieList));
        }
        mCheckConn.setUseCaches(false);
        mCheckConn.connect();
        return mCheckConn.getResponseCode();
    }

    private void closeAllConn(){
        if (mCheckConn!=null){
            mCheckConn.disconnect();
        }
        if (mGetMainConn!=null){
            mGetMainConn.disconnect();
        }
        if (mGetRedirectionConn!=null){
            mGetRedirectionConn.disconnect();
        }
    }


    public interface cfCallback{
        void onSuccess(List<HttpCookie> cookieList);
        void onFail();
    }

    private double get_answer(String str) {  // compute the challenge answer
        double a = 0;

        try {
            List<String> s = regex(str,"var s,t,o,p,b,r,e,a,k,i,n,g,f, " +
                    "(.+?)=\\{\"(.+?)\"");
            String varA = s.get(0);
            String varB = s.get(1);
            StringBuilder sb = new StringBuilder();
            sb.append("var a=");
            sb.append(regex(str,varA+"=\\{\""+varB+"\":(.+?)\\}").get(0));
            sb.append(";");
            List<String> b = regex(str,varA+"\\."+varB+"(.+?)\\;");
            for (int i =0;i<b.size()-1;i++){
                sb.append("a");
                sb.append(b.get(i));
                sb.append(";");
            }

            e("add",sb.toString());
            V8 v8 = V8.createV8Runtime();
            a = v8.executeDoubleScript(sb.toString());
            List<String> fixNum = regex(str,"toFixed\\((.+?)\\)");
            if (fixNum!=null){
                a = Double.parseDouble(v8.executeStringScript("String("+String.valueOf(a)+".toFixed("+fixNum.get(0)+"));"));
            }
            a += new URL(mUrl).getHost().length();
            v8.release();
        }catch (IndexOutOfBoundsException e){
            e("answerErr","get answer error");
            e.printStackTrace();
        }
        catch (MalformedURLException e) {
            e.printStackTrace();
        }
        return a;
    }

    /**
     * Regex helper.
     * @param text the text to search
     * @param pattern the regular expression
     * @return List<String> of captured groups
     */
    private List<String> regex(String text, String pattern){
        try {
            Pattern pt = Pattern.compile(pattern);
            Matcher mt = pt.matcher(text);
            List<String> group = new ArrayList<>();

            while (mt.find()) {
                if (mt.groupCount() >= 1) {
                    if (mt.groupCount()>1){
                        group.add(mt.group(1));
                        group.add(mt.group(2));
                    }else group.add(mt.group(1));
                }
            }
            return group;
        }catch (NullPointerException e){
            Log.i("MATCH","null");
        }
        return null;
    }

    /**
     * Converts a list into a string joined with ';'.
     * @param list the list to join
     * @return the joined string
     */
    public static String listToString(List list ) {
        char separator = ";".charAt(0);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < list.size(); i++) {
            sb.append(list.get(i)).append(separator);
        }
        return sb.toString().substring(0, sb.toString().length() - 1);
    }


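    /**
     * Converts the cookie list into a Map that jsoup can use.
     * @param list  list of HttpCookie
     * @return Map of cookie name to cookie value
     */
    public static Map<String,String> List2Map(List<HttpCookie> list){
        Map<String, String> map = new HashMap<>();
        try {
            if (list != null) {
                for (int i = 0; i < list.size(); i++) {
                    // Use the accessors rather than splitting toString() on "=",
                    // which would truncate values that themselves contain "=".
                    HttpCookie cookie = list.get(i);
                    map.put(cookie.getName(), cookie.getValue());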
                }
                Log.i("List2Map", map.toString());
            } else {
                return map;
            }

        } catch (IndexOutOfBoundsException e) {
            e.printStackTrace();
        }

        return map;
    }

    private void e(String tag,String content){
        Log.e(tag,content);
    }

    private void e(String content){
        Log.e("cloudflare",content);
    }
}
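
The getCheckAnswer step above concatenates the extracted values into the query string as-is. A minimal sketch of the same step with URL-encoding applied, in case the `pass` value contains characters such as `+` or `/`. The class and method names are hypothetical; the field names and the `/cdn-cgi/l/chk_jschl` path are taken from the class above, and the sample HTML is made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ChallengeForm {

    /** Returns the value of the hidden input with the given name, or null if absent. */
    static String hiddenField(String html, String fieldName) {
        // Assumes the attribute order name="..." value="..." used by the class above.
        Pattern p = Pattern.compile("name=\"" + Pattern.quote(fieldName) + "\" value=\"(.+?)\"");
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    /** Builds the answer URL with every query parameter URL-encoded. */
    static String buildAnswerUrl(String host, String jschlVc, String pass, String answer) {
        return "https://" + host + "/cdn-cgi/l/chk_jschl"
                + "?jschl_vc=" + encode(jschlVc)
                + "&pass=" + encode(pass)
                + "&jschl_answer=" + encode(answer);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<input type=\"hidden\" name=\"jschl_vc\" value=\"abc123\"/>"
                + "<input type=\"hidden\" name=\"pass\" value=\"1492.5-x+y/z\"/>";
        String vc = hiddenField(html, "jschl_vc");
        String pass = hiddenField(html, "pass");
        System.out.println(buildAnswerUrl("example.com", vc, pass, "42.5"));
    }
}
```

Returning null for a missing field lets the caller treat "no challenge form found" as a normal case instead of catching an IndexOutOfBoundsException.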

To use it, call the class above and convert the cookies to a Map for use with jsoup, like this:

Cloudflare cf = new Cloudflare("YOUR URL HERE");
cf.setUser_agent("YOUR USER AGENT HERE");
cf.getCookies(new Cloudflare.cfCallback() {
    @Override
    public void onSuccess(List< HttpCookie > cookieList) {
        //convert the cookielist to a map
        Map<String, String> cookies = Cloudflare.List2Map(cookieList);
        Log.d("COOKIES : ", cookies.toString());

    }

    @Override
    public void onFail() {
        Log.d("Cloudflare", "OMG IT FAILED!!!");
    }
});
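
The cookie conversion can also be tried outside Android. The sketch below is a standalone illustration, not the answer's own helper: the class name, method names, and cookie names are hypothetical. It reads each cookie through `getName()` and `getValue()`, so values containing `=` survive intact, and it also shows the `name=value; name=value` form for a raw `Cookie` header.

```java
import java.net.HttpCookie;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CookieUtil {

    /** Copies cookie names and values into a Map, the shape jsoup's cookies(...) accepts. */
    static Map<String, String> toMap(List<HttpCookie> cookies) {
        Map<String, String> map = new LinkedHashMap<>();
        if (cookies == null) {
            return map;
        }
        for (HttpCookie cookie : cookies) {
            map.put(cookie.getName(), cookie.getValue());
        }
        return map;
    }

    /** Joins cookies as "name=value; name=value" for a raw Cookie request header. */
    static String toHeader(List<HttpCookie> cookies) {
        StringBuilder sb = new StringBuilder();
        for (HttpCookie cookie : cookies) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(cookie.getName()).append('=').append(cookie.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<HttpCookie> cookies = Arrays.asList(
                new HttpCookie("session", "d41d8"),
                new HttpCookie("token", "abc=def=="));
        System.out.println(toMap(cookies));    // {session=d41d8, token=abc=def==}
        System.out.println(toHeader(cookies)); // session=d41d8; token=abc=def==
    }
}
```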

Next, start your AsyncTask in onSuccess and use the cookies and the same user agent in the jsoup request inside doInBackground, like this:

try {
    Connection.Response response = Jsoup.connect("YOUR URL HERE").userAgent("YOUR USER AGENT HERE").cookies(cookies).execute();
    Document doc = response.parse();
    Log.d("THE DOCUMENT : ", doc.toString());
} catch (Exception M){
    M.printStackTrace();
}

Add this to the dependencies in your app-level build.gradle:

implementation 'com.eclipsesource.j2v8:j2v8_android:3.0.5@aar'

Regarding "java - How to bypass Cloudflare DDoS protection or the 5-second redirect using JSOUP?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43453491/
