java - Apache HttpClient 4.3.3 - speed tuning

Tags: java apache-httpclient-4.x

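I am trying to fetch one million URLs over a gigabit connection. Throughput varies between 5 MB/s and 12 MB/s (megabytes per second), which is far below the available bandwidth. The code I am using: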

    DnsResolver dnsResolver = new SystemDefaultDnsResolver();
    X509HostnameVerifier hostnameVerifier = new AllowAllHostnameVerifier();
    SSLContext sslcontext = SSLContexts.createSystemDefault();
    RedirectStrategy redirectStrategy = new LaxRedirectStrategy();

    HttpConnectionFactory<HttpRoute, ManagedHttpClientConnection> connFactory = new ManagedHttpClientConnectionFactory(
                    new DefaultHttpRequestWriterFactory(),
                    new DefaultHttpResponseParserFactory());

    Registry<ConnectionSocketFactory> socketFactoryRegistry = RegistryBuilder
                        .<ConnectionSocketFactory> create()
                        .register(
                                "https",
                                new SSLConnectionSocketFactory(sslcontext,
                                        hostnameVerifier))
                        .register("http", new PlainConnectionSocketFactory())
                        .build();
    SocketConfig socketConfig = SocketConfig.custom().setSoKeepAlive(false)
                    .setSoReuseAddress(false)
                    .setSoTimeout(15000).build();
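    PoolingHttpClientConnectionManager manager = new PoolingHttpClientConnectionManager(socketFactoryRegistry, connFactory, dnsResolver);
    manager.setDefaultSocketConfig(socketConfig);
    manager.setMaxTotal(1000);
    // Note: per the HttpClientBuilder Javadoc, setMaxConnPerRoute can be overridden
    // when a connection manager is supplied via setConnectionManager; per-route
    // limits are then configured on the manager itself (setDefaultMaxPerRoute).
    CloseableHttpClient httpClient = HttpClientBuilder.create().setUserAgent("Mozilla")
                    .setConnectionManager(manager)
                    .setRedirectStrategy(redirectStrategy)
                    .setMaxConnPerRoute(-1).build();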

    RequestConfig defaultConfig = RequestConfig.custom()
                    .setCookieSpec(CookieSpecs.IGNORE_COOKIES)
                    .setExpectContinueEnabled(false)
                    .setStaleConnectionCheckEnabled(false)
                    .setRedirectsEnabled(true)
                    .setMaxRedirects(5).build();

    RequestConfig rConfig= RequestConfig.copy(defaultConfig)
                    .setSocketTimeout(15000)
                    .setConnectionRequestTimeout(-1)
                    .setConnectTimeout(15000).build();

    ExecutorService executorService = Executors.newFixedThreadPool(640);

    FutureRequestExecutionService service = new FutureRequestExecutionService(httpClient, executorService);

Each request is configured as follows:

    HttpGet httpget = new HttpGet("some url");
    httpget.setConfig(rConfig);
    httpget.setHeader("Connection", "close");

In the ResponseHandler, I consume the content with the following code:

    stream = response.getEntity().getContent();
    final byte[] content = IOUtils.toByteArray(stream);
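A side note on the snippet above: `IOUtils.toByteArray` buffers the whole body in memory. With hundreds of concurrent requests against arbitrary sites, one option is to cap how many bytes are read per response. The following is only a minimal sketch using the plain JDK; `BoundedReader`, `readAtMost`, and the 16384-byte limit are hypothetical names and values chosen for illustration, not part of HttpClient or Commons IO.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class BoundedReader {
    // Reads up to maxBytes from the stream and returns them. Anything beyond
    // the limit is left unread. The caller is still responsible for closing
    // the stream.
    public static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int remaining = maxBytes;
        while (remaining > 0) {
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n == -1) {
                break; // end of stream reached before the limit
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] body = new byte[20000]; // stand-in for a response body
        try (InputStream in = new ByteArrayInputStream(body)) {
            System.out.println(readAtMost(in, 16384).length);
        }
    }
}
```

In the ResponseHandler this would take the place of the `IOUtils.toByteArray(stream)` call. Whether truncated bodies are acceptable depends on what the crawl needs.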

Every URL is on a different domain. The machine has 8 cores and 8 GB of RAM and runs 64-bit Debian Linux. How can I speed this up?
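To put the reported figures in perspective, here is a rough calculation of how much of the link is actually in use. It assumes that "gigabit" means 1000 Mbit/s in decimal units and ignores protocol overhead; the class and method names are illustrative only.

```java
import java.util.Locale;

final class LinkUtilization {
    // Converts a link speed in Mbit/s to MB/s (8 bits per byte, decimal units).
    public static double megabitsToMegabytes(double megabitsPerSecond) {
        return megabitsPerSecond / 8.0;
    }

    // Percentage of the link capacity used by the observed throughput.
    public static double utilizationPercent(double observedMBps, double linkMbps) {
        return 100.0 * observedMBps / megabitsToMegabytes(linkMbps);
    }

    public static void main(String[] args) {
        double linkMbps = 1000.0; // gigabit link, assumed to mean 1000 Mbit/s
        System.out.printf(Locale.ROOT, "capacity: %.0f MB/s%n", megabitsToMegabytes(linkMbps));
        System.out.printf(Locale.ROOT, "at 5 MB/s: %.1f%% used%n", utilizationPercent(5.0, linkMbps));
        System.out.printf(Locale.ROOT, "at 12 MB/s: %.1f%% used%n", utilizationPercent(12.0, linkMbps));
    }
}
```

Under these assumptions the link tops out at 125 MB/s, so the observed 5 to 12 MB/s corresponds to roughly 4 to 10 percent of its capacity.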

Best answer

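If you do not need automatic authentication, retries, or cookie management, and do not mind handling redirects manually, consider using the minimal HttpClient implementation. The minimal client is built with a bare execution pipeline consisting only of the mandatory protocol interceptors, and it should have the best performance characteristics for the same concurrency parameters (connection pool settings).

    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    CloseableHttpClient hc = HttpClients.createMinimal(cm);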

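Naturally, you should want to reuse connections for the best performance. The following line seems to run contrary to what I believe to be best practice:

    httpget.setHeader("Connection", "close"); // Huh?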

Source: a similar question on Stack Overflow, "Apache HttpClient 4.3.3 - speed tuning": https://stackoverflow.com/questions/23030882/
