r - 使用 conda 安装的 R 时,install.packages 不适用于代理

标签 r linux conda rpy2 install.packages

我在 Linux 服务器 RHEL6 上工作,我安装了 anaconda。 我有以下设置

 conda-env version : 4.3.13
 conda-build version : 2.1.4
 python version : 2.7.13.final.0
 rpy2 : 2.8.5

我安装了 rpy2 以在 python 中使用 R

> R.home()
[1] "/anaconda2/envs/py27CCA/lib/R"
> R.version 
version.string R version 3.3.2 (2016-10-31) 

我按以下方式设置我的代理:

> Sys.getenv("https_proxy")
[1] "https://login:pwd@xxx.net:8080/"

但是下载R包不行

> options(internet.info = 0)
> install.packages("httr")

* error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
....
Warning: unable to access index for repository https://stat.ethz.ch/CRAN/src/contrib:
  cannot download all files
Warning message:
package 'httr' is not available (for R version 3.3.2)

但是如果我使用完全相同的代理设置安装相同的独立 R 版本,它可以正常工作

> R.version 
version.string R version 3.3.2 (2016-10-31) 
> install.packages("httr")
...
** testing if installed package can be loaded
* DONE (httr)
Making 'packages.html' ... done
...

是什么造成了这个问题?我检查了openssl版本,我在2个环境中都有相同的版本! 此链接解释了此类代理问题的可能原因 link stackoverflow discussion .

如果我在 python 中执行,我会遇到同样的问题和错误消息

>>> from rpy2.robjects.packages import importr
>>> utils = importr('utils')
>>> utils.install_packages('httr')

最佳答案

长话短说:

而不是将 https_proxy 设置为...:

https://login:pwd@xxx.net:8080/

...尝试将其设置为:

http://login:pwd@xxx.net:8080/

此外,这样做,如果有人嗅探您与代理服务器建立的初始连接的数据包,您将泄露您的凭据。进一步阅读以了解更多信息。


IMO,这个问题与 Conda 无关。这是一个非常常见的错误,我发现它在互联网上非常普遍。

发生这种情况的原因是“HTTPS 代理”一词存在混淆。

IIUC,这里是两个环境变量的意思:

http_proxy|HTTP_PROXY: The proxy server that you wish to use, for all your HTTP requests to the outside world.

https_proxy|HTTPS_PROXY: The proxy server that you wish to use, for all your HTTPS requests to the outside world.

http(s?)://proxy.mydomain.com:3128
 ^^^^^          ^^^^^         ^^^^   
   |              |             |    
scheme    proxy domain/IP   proxy port

现在,理想情况下,这些环境变量的值中指定的方案决定了客户端连接到代理服务器所应使用的协议(protocol)。


让我们看一下 HTTPS 代理的定义。从 curl >= v7.53 的手册页中窃取:

An HTTPS proxy receives all transactions over an SSL/TLS connection.
Once a secure connection with the proxy is established, the user agent
uses the proxy as usual, including sending CONNECT requests to instruct
the proxy to establish a [usually secure] TCP tunnel with an origin
server. HTTPS proxies protect nearly all aspects of user-proxy
communications as opposed to HTTP proxies that receive all requests
(including CONNECT requests) in vulnerable clear text.

With HTTPS proxies, it is possible to have two concurrent _nested_
SSL/TLS sessions: the "outer" one between the user agent and the proxy
and the "inner" one between the user agent and the origin server
(through the proxy). This change adds supports for such nested sessions
as well.

让我们通过示例(curl >= v7.53)来尝试看看:

在这里,我将使用不支持通过 SSL/TLS 进行客户端-代理连接的代理。

确保没有事先设置代理环境变量:

((curl-7_53_1))$ env | grep -i proxy
((curl-7_53_1))$ 

环境:http_proxy,outer_scheme:http,inner_scheme:http

((curl-7_53_1))$ http_proxy="http://proxy.mydomain.com:3128" ./src/curl -s -vvv http://stackoverflow.com -o /dev/null
* Rebuilt URL to: http://stackoverflow.com/
*   Trying 10.1.1.7...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (10.1.1.7) port 3128 (#0)
> GET http://stackoverflow.com/ HTTP/1.1
> Host: stackoverflow.com
> User-Agent: curl/7.53.1-DEV
> Accept: */*
> Proxy-Connection: Keep-Alive
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Cache-Control: private
< Content-Type: text/html; charset=utf-8
< X-Frame-Options: SAMEORIGIN
< X-Request-Guid: 539728ee-a91d-4964-bc7e-1d21d91a6f1d
< Content-Length: 228257
< Accept-Ranges: bytes
< Date: Thu, 16 Mar 2017 05:19:31 GMT
< X-Served-By: cache-jfk8137-JFK
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1489641571.098286,VS0,VE7
< Vary: Fastly-SSL
< X-DNS-Prefetch-Control: off
< Set-Cookie: prov=b2e2dcb8-c5ff-21d9-5712-a0e012573aa6; domain=.stackoverflow.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
< X-Cache: MISS from proxy.mydomain.com
< X-Cache-Lookup: MISS from proxy.mydomain.com:3128
< Via: 1.1 varnish, 1.0 proxy.mydomain.com (squid)
* HTTP/1.0 connection set to keep alive!
< Connection: keep-alive
<
{ [2816 bytes data]
* Connection #0 to host proxy.mydomain.com left intact

环境:http_proxy,outer_scheme:https,inner_scheme:http

((curl-7_53_1))$ http_proxy="https://proxy.mydomain.com:3128" ./src/curl -s -vvv http://stackoverflow.com -o /dev/null
* Rebuilt URL to: http://stackoverflow.com/
*   Trying 10.1.1.7...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (10.1.1.7) port 3128 (#0)
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
* Closing connection 0

环境:https_proxy,outer_scheme:http,inner_scheme:https

((curl-7_53_1))$ https_proxy="http://proxy.mydomain.com:3128" ./src/curl -s -vvv https://stackoverflow.com -o /dev/null
* Rebuilt URL to: https://stackoverflow.com/
*   Trying 10.1.1.7...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (10.1.1.7) port 3128 (#0)
* Establish HTTP proxy tunnel to stackoverflow.com:443
> CONNECT stackoverflow.com:443 HTTP/1.1
> Host: stackoverflow.com:443
> User-Agent: curl/7.53.1-DEV
> Proxy-Connection: Keep-Alive
>
< HTTP/1.0 200 Connection established
<
* Proxy replied OK to CONNECT request
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [108 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [3044 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: C=US; ST=NY; L=New York; O=Stack Exchange, Inc.; CN=*.stackexchange.com
*  start date: May 21 00:00:00 2016 GMT
*  expire date: Aug 14 12:00:00 2019 GMT
*  subjectAltName: host "stackoverflow.com" matched cert's "stackoverflow.com"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 High Assurance Server CA
*  SSL certificate verify ok.
} [5 bytes data]
> GET / HTTP/1.1
> Host: stackoverflow.com
> User-Agent: curl/7.53.1-DEV
> Accept: */*
>
{ [5 bytes data]
< HTTP/1.1 200 OK
< Cache-Control: private
< Content-Type: text/html; charset=utf-8
< X-Frame-Options: SAMEORIGIN
< X-Request-Guid: 96f8fe3c-058b-479e-8ef2-db6d09f485d3
< Content-Length: 226580
< Accept-Ranges: bytes
< Date: Thu, 16 Mar 2017 05:20:39 GMT
< Via: 1.1 varnish
< Connection: keep-alive
< X-Served-By: cache-jfk8135-JFK
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1489641639.425108,VS0,VE9
< Vary: Fastly-SSL
< X-DNS-Prefetch-Control: off
< Set-Cookie: prov=f1a401f1-f1a0-5f09-66ca-9a792543ee82; domain=.stackoverflow.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
<
{ [2181 bytes data]
* Connection #0 to host proxy.mydomain.com left intact

环境:https_proxy,outer_scheme:https,inner_scheme:https

((curl-7_53_1))$ https_proxy="https://proxy.mydomain.com:3128" ./src/curl -s -vvv https://stackoverflow.com -o /dev/null
* Rebuilt URL to: https://stackoverflow.com/
*   Trying 10.1.1.7...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (10.1.1.7) port 3128 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
* Closing connection 0

现在,我将展示支持通过 SSL/TLS 连接的代理的相同输出。为了运行本地 https 代理,我安装了 squid 4.0.17 版。我通过在 /etc/hosts 中覆盖它,将 proxy.mydomain.com 指向本地主机。相关的鱿鱼配置行是:

https_port 3127 cert=/etc/squid/ssl_cert/myCA.pem

请注意,我现在没有使用任何明确指定的(复杂的?)模式(sslbump/intercept/accel/tproxy)

我也已将证书添加到信任库:

sudo cp /etc/squid/ssl_cert/myCA.pem /etc/pki/ca-trust/source/anchors/mySquidCA.pem
sudo update-ca-trust

现在,进行真正的测试:

环境:http_proxy,outer_scheme:https,inner_scheme:http

/t/curl-curl-7_53_1 ❯❯❯ http_proxy=https://proxy.mydomain.com:3127 ./src/curl -s -vvv http://google.com -o /dev/null
* Rebuilt URL to: http://google.com/
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (127.0.0.1) port 3127 (#0)
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [86 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [1027 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [262 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / AES256-GCM-SHA384
* Proxy certificate:
*  subject: C=IN; ST=SomeState; L=SomeLocation; O=Default Company Ltd; CN=proxy.mydomain.com; emailAddress=no-reply@gmail.com
*  start date: Mar 16 06:43:35 2017 GMT
*  expire date: Mar 16 06:43:35 2018 GMT
*  common name: proxy.mydomain.com (matched)
*  issuer: C=IN; ST=SomeState; L=SomeLocation; O=Default Company Ltd; CN=proxy.mydomain.com; emailAddress=no-reply@gmail.com
*  SSL certificate verify ok.
} [5 bytes data]
> GET http://google.com/ HTTP/1.1
> Host: google.com
> User-Agent: curl/7.53.1-DEV
> Accept: */*
> Proxy-Connection: Keep-Alive
> 
{ [5 bytes data]
< HTTP/1.1 302 Found
< Cache-Control: private
< Content-Type: text/html; charset=UTF-8
< Location: http://www.google.co.in/?gfe_rd=cr&ei=ejTKWLGzM-Ts8AepwJyQCg
< Content-Length: 261
< Date: Thu, 16 Mar 2017 06:45:14 GMT
< X-Cache: MISS from lenovo
< X-Cache-Lookup: MISS from lenovo:3128
< Via: 1.1 lenovo (squid/4.0.17)
< Connection: keep-alive
< 
{ [5 bytes data]
* Connection #0 to host proxy.mydomain.com left intact

环境:https_proxy,outer_scheme:https,inner_scheme:https

/t/curl-curl-7_53_1 ❯❯❯ https_proxy=https://proxy.mydomain.com:3127 ./src/curl -s -vvv https://google.com -o /dev/null
* Rebuilt URL to: https://google.com/
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (127.0.0.1) port 3127 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [86 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [1027 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [262 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / AES256-GCM-SHA384
* ALPN, server did not agree to a protocol
* Proxy certificate:
*  subject: C=IN; ST=SomeState; L=SomeLocation; O=Default Company Ltd; CN=proxy.mydomain.com; emailAddress=no-reply@gmail.com
*  start date: Mar 16 06:43:35 2017 GMT
*  expire date: Mar 16 06:43:35 2018 GMT
*  common name: proxy.mydomain.com (matched)
*  issuer: C=IN; ST=SomeState; L=SomeLocation; O=Default Company Ltd; CN=proxy.mydomain.com; emailAddress=no-reply@gmail.com
*  SSL certificate verify ok.
* Establish HTTP proxy tunnel to google.com:443
} [5 bytes data]
> CONNECT google.com:443 HTTP/1.1
> Host: google.com:443
> User-Agent: curl/7.53.1-DEV
> Proxy-Connection: Keep-Alive
> 
{ [5 bytes data]
< HTTP/1.1 200 Connection established
< 
* Proxy replied OK to CONNECT request
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
} [5 bytes data]
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [102 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [3757 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [148 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-ECDSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: C=US; ST=California; L=Mountain View; O=Google Inc; CN=*.google.com
*  start date: Mar  9 02:43:31 2017 GMT
*  expire date: Jun  1 02:20:00 2017 GMT
*  subjectAltName: host "google.com" matched cert's "google.com"
*  issuer: C=US; O=Google Inc; CN=Google Internet Authority G2
*  SSL certificate verify ok.
} [5 bytes data]
> GET / HTTP/1.1
> Host: google.com
> User-Agent: curl/7.53.1-DEV
> Accept: */*
> 
{ [5 bytes data]
< HTTP/1.1 302 Found
< Cache-Control: private
< Content-Type: text/html; charset=UTF-8
< Location: https://www.google.co.in/?gfe_rd=cr&ei=hDTKWJXlMubs8Aek-6WQAg
< Content-Length: 262
< Date: Thu, 16 Mar 2017 06:45:24 GMT
< Alt-Svc: quic=":443"; ma=2592000; v="36,35,34"
< 
{ [262 bytes data]
* Connection #0 to host proxy.mydomain.com left intact

从输出中可以明显看出,在这两种情况下都首先与代理服务器进行了 SSL 握手。


现在,我要吐槽一下。

许多客户端(例如:curl = 7.51.0)不支持与代理本身的 SSL/TLS 连接并抛出此类错误:

$ https_proxy=https://proxy.mydomain.com:3128 curl -vvvv https://google.com
* Rebuilt URL to: https://google.com/
* Unsupported proxy scheme for 'https://proxy.mydomain.com:3128'
* Closing connection -1
curl: (7) Unsupported proxy scheme for 'https://proxy.mydomain.com:3128'

然后,有些客户端(例如 curl=7.47.0)会忽略代理 URL 中不受支持的方案,这会误导人们相信他们所完成的事情。通常,他们永远不会通过 SSL/TLS 连接到代理服务器,即使变量明确指定方案为“https”并回退到使用与代理服务器的未加密连接。

然后还有其他客户端(例如 wget v1.18),这会让我们更加困惑:

  • 在以下情况下,错误消息具有误导性,因为 即使是对 HTTP 请求,scheme 也可以保存值 https:// 外部世界(如上例所示,使用 squid),因为我们希望通过 SSL/TLS 连接到代理服务器。

    http_proxy=https://proxy.mydomain.com:3128 wget http://google.com
    Error in proxy URL https://proxy.mydomain.com:3128: Must be HTTP.
    
  • 不仅如此,困惑增加了,当它回落时,让 我们认为它可能正在连接到代理服务器 SSL/TLS,但实际上并非如此,这也让我们认为 方案中的 https://应该仅在内部协议(protocol)为 还有 https://

    https_proxy=https://proxy.mydomain-research.com:3128 wget https://google.com
    --2017-03-16 11:21:06--  https://google.com/
    Resolving proxy.mydomain-research.com (proxy.mydomain-research.com)... 10.1.1.7
    Connecting to proxy.mydomain-research.com (proxy.mydomain-research.com)|10.1.1.7|:3128... connected.
    Proxy request sent, awaiting response... 301 Moved Permanently
    Location: https://www.google.com/ [following]
    --2017-03-16 11:21:07--  https://www.google.com/
    Connecting to proxy.mydomain-research.com (proxy.mydomain-research.com)|10.1.1.7|:3128... connected.
    Proxy request sent, awaiting response... 200 OK
    Length: unspecified [text/html]
    Saving to: ‘index.html’
    

要阅读有关通过 TLS/SSL 连接(和不连接)代理服务器的安全方面的更多信息,请访问:https://security.stackexchange.com/a/61336/114965

关于r - 使用 conda 安装的 R 时,install.packages 不适用于代理,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/42806270/

相关文章:

根据另一个因子重新排序因子水平

r - R 中另一个数据帧的字符串匹配和替换的快速方法

c - 第一次尝试从套接字读取失败,然后工作正常

python - conda create --prefix over --name 有什么好处吗?

anaconda - 离线创建conda环境

python - 如何创建一个全新的(未安装任何软件包)conda 环境?

R:如何在两个列表上运行函数?

r - 用于选择一个或多个变量范围内的观测值的函数 (dplyr)

Linux 错误 - 排序 : both SI and IEC prefixes present on units

c - 关于 putenv() 和 setenv() 的问题