Problem description
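I have a web crawler that uses a third-party proxy provider (luminati.io). It has been working for several websites without any problems. Today, however, I built a crawler for a new site and got javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake when trying to connect to the host endpoint. I am running JDK version 1.8.0_151. Here is my proxy client code: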
import java.io.IOException;
import java.util.Random;

import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;

import com.gargoylesoftware.htmlunit.ProxyConfig;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class ProxyClient implements Client
{
    private static final String username = "my-luminati-username";
    private static final String password = "my-luminati-pw";
    private static final String theHostname = "zproxy.lum-superproxy.io";
    private static final int port = 22225;
    public String session_id = Integer.toString(new Random().nextInt(Integer.MAX_VALUE));
    private WebClient theWebClient;

    public ProxyClient(String country){
        // The proxy login carries the optional country and the session id.
        String myLogin = username+(country!=null ? "-country-"+country : "")
                +"-session-" + session_id;
        CredentialsProvider myCredentialsProvider = new BasicCredentialsProvider();
        myCredentialsProvider.setCredentials(new AuthScope(new HttpHost(theHostname,port)),new UsernamePasswordCredentials(myLogin,password));
        theWebClient = new WebClient();
        theWebClient.getOptions().setCssEnabled(false);
        theWebClient.getOptions().setJavaScriptEnabled(false);
        theWebClient.getOptions().setProxyConfig(new ProxyConfig(theHostname,port));
        theWebClient.setCredentialsProvider(myCredentialsProvider);
    }

    public HtmlPage request(String aUrl) throws IOException
    {
        return theWebClient.getPage(aUrl);
    }

    public void close() throws IOException { theWebClient.close(); }
}
Here is a simplified version of the crawler I am running, where the Client is passed in as a ProxyClient:
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class BusinessSearchTaxCrawler
{
    private String theBaseUrl = "https://apps.ilsos.gov/corporatellc/CorporateLlcController";
    private HtmlPage thePage;

    public BusinessSearchTaxCrawler()
    {
        thePage = null;
    }

    public boolean getBusinessMailingAddress(Client aClient,PropertyInfo aPropertyInfo)
    {
        try
        {
            // This is the call that fails with the SSLHandshakeException.
            thePage = aClient.request(theBaseUrl);
        } catch (Exception aE)
        {
            aE.printStackTrace();
        }
        return false;
    }
}
Here is the full stack trace of the error:
javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1002)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:436)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.upgrade(DefaultHttpClientConnectionOperator.java:191)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.upgrade(PoolingHttpClientConnectionManager.java:392)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:428)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:177)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1324)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1241)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:348)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:417)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:402)
at ProxyClient.request(ProxyClient.java:39)
at BusinessSearchTaxCrawler.getBusinessMailingAddress(BusinessSearchTaxCrawler.java:24)
at Main.main(Main.java:27)
Caused by: java.io.EOFException: SSL peer shut down incorrectly
at sun.security.ssl.InputRecord.read(InputRecord.java:505)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
... 22 more
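These are the steps I have taken to debug the problem:
- Running the code without the proxy works fine. I can connect to the host endpoint without any problems using the following code: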
WebClient theWebClient = new WebClient();
theWebClient.getOptions().setCssEnabled(false);
theWebClient.getOptions().setJavaScriptEnabled(false);
thePage = theWebClient.getPage(theBaseUrl);
- I tried adding -Dhttps.protocols=TLSv1.1,TLSv1.2 to the VM options. This did not change the result.
- I ran the application with -Djavax.net.debug=all and observed the following in the debug output:
Ignoring unsupported cipher suite: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_RSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_DHE_RSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_DHE_DSS_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_RSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_DHE_RSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_DHE_DSS_WITH_AES_128_CBC_SHA256 for TLSv1.1
%% No cached client session
*** ClientHello,TLSv1.2
RandomCookie: GMT: 1591992531 bytes = { 169,86,174,70,252,104,167,236,15,50,36,85,3,119,151,231,179,110,140,53,169,249,35,95,76,189,130 }
Session ID: {}
Cipher Suites: [TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256,TLS_DHE_RSA_WITH_AES_128_CBC_SHA256,TLS_DHE_DSS_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDH_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_DSS_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDH_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDH_RSA_WITH_AES_128_GCM_SHA256,TLS_DHE_RSA_WITH_AES_128_GCM_SHA256,TLS_DHE_DSS_WITH_AES_128_GCM_SHA256,TLS_EMPTY_RENEGOTIATION_INFO_SCSV]
Compression Methods: { 0 }
Extension elliptic_curves,curve names: {secp256r1,secp384r1,secp521r1,sect283k1,sect283r1,sect409k1,sect409r1,sect571k1,sect571r1,secp256k1}
Extension ec_point_formats,formats: [uncompressed]
Extension signature_algorithms,signature_algorithms: SHA512withECDSA,SHA512withRSA,SHA384withECDSA,SHA384withRSA,SHA256withECDSA,SHA256withRSA,SHA256withDSA,SHA224withECDSA,SHA224withRSA,SHA224withDSA,SHA1withECDSA,SHA1withRSA,SHA1withDSA
Extension server_name,server_name: [type=host_name (0),value=apps.ilsos.gov]
***
[write] MD5 and SHA1 hashes: len = 176
0000: 01 00 00 AC 03 03 5F E4 E1 D3 A9 56 AE 46 FC 68 ......_....V.F.h
0010: A7 EC 0F 32 24 55 03 77 97 E7 B3 6E 8C 35 68 A9 ...2$U.w...n.5h.
0020: F9 23 5F 4C BD 82 00 00 2C C0 23 C0 27 00 3C C0 .#_L....,.#.'.<.
0030: 25 C0 29 00 67 00 40 C0 09 C0 13 00 2F C0 04 C0 %.).g.@...../...
0040: 0E 00 33 00 32 C0 2B C0 2F 00 9C C0 2D C0 31 00 ..3.2.+./...-.1.
0050: 9E 00 A2 00 FF 01 00 00 57 00 0A 00 16 00 14 00 ........W.......
0060: 17 00 18 00 19 00 09 00 0A 00 0B 00 0C 00 0D 00 ................
0070: 0E 00 16 00 0B 00 02 01 00 00 0D 00 1C 00 1A 06 ................
0080: 03 06 01 05 03 05 01 04 03 04 01 04 02 03 03 03 ................
0090: 01 03 02 02 03 02 01 02 02 00 00 00 13 00 11 00 ................
00A0: 00 0E 61 70 70 73 2E 69 6C 73 6F 73 2E 67 6F 76 ..apps.ilsos.gov
main,WRITE: TLSv1.2 Handshake,length = 176
[Raw write]: length = 181
0000: 16 03 03 00 B0 01 00 00 AC 03 03 5F E4 E1 D3 A9 ..........._....
0010: 56 AE 46 FC 68 A7 EC 0F 32 24 55 03 77 97 E7 B3 V.F.h...2$U.w...
0020: 6E 8C 35 68 A9 F9 23 5F 4C BD 82 00 00 2C C0 23 n.5h..#_L....,.#
0030: C0 27 00 3C C0 25 C0 29 00 67 00 40 C0 09 C0 13 .'.<.%.).g.@....
0040: 00 2F C0 04 C0 0E 00 33 00 32 C0 2B C0 2F 00 9C ./.....3.2.+./..
0050: C0 2D C0 31 00 9E 00 A2 00 FF 01 00 00 57 00 0A .-.1.........W..
0060: 00 16 00 14 00 17 00 18 00 19 00 09 00 0A 00 0B ................
0070: 00 0C 00 0D 00 0E 00 16 00 0B 00 02 01 00 00 0D ................
0080: 00 1C 00 1A 06 03 06 01 05 03 05 01 04 03 04 01 ................
0090: 04 02 03 03 03 01 03 02 02 03 02 01 02 02 00 00 ................
00A0: 00 13 00 11 00 00 0E 61 70 70 73 2E 69 6C 73 6F .......apps.ilso
00B0: 73 2E 67 6F 76 s.gov
main,received EOFException: error
main,handling exception: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
main,SEND TLSv1.2 ALERT: fatal,description = handshake_failure
main,WRITE: TLSv1.2 Alert,length = 2
[Raw write]: length = 7
0000: 15 03 03 00 02 02 28 ......(
main,called closeSocket()
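For reading the two raw records in the debug output above, a minimal sketch may help. It decodes the 5-byte TLS record header (content type, version, length) and the first bytes of the body. The class and method names are hypothetical, and only the few codes relevant to this log are mapped. Decoded this way, the log shows the client writing a ClientHello and then a fatal handshake_failure alert. No read record appears in the excerpt before the EOFException, which suggests the peer closed the connection without sending any TLS data.

```java
/** Hypothetical helper: decodes a plaintext TLS record header from a hex dump. */
class TlsRecordReader {

    /** Parses a hex fragment such as "15 03 03 00 02 02 28" into byte values. */
    static int[] parseHex(String hex) {
        String[] parts = hex.trim().split("\\s+");
        int[] out = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    static String contentType(int b) {
        switch (b) {
            case 0x14: return "change_cipher_spec";
            case 0x15: return "alert";
            case 0x16: return "handshake";
            case 0x17: return "application_data";
            default:   return "unknown(" + b + ")";
        }
    }

    static String version(int major, int minor) {
        if (major != 3) return "unknown(" + major + "." + minor + ")";
        switch (minor) {
            case 0:  return "SSLv3";
            case 1:  return "TLSv1";
            case 2:  return "TLSv1.1";
            case 3:  return "TLSv1.2";
            default: return "unknown(3." + minor + ")";
        }
    }

    /** Describes the record header and, where present, the first body bytes. */
    static String describe(int[] rec) {
        if (rec.length < 5) throw new IllegalArgumentException("need at least 5 bytes");
        int length = (rec[3] << 8) | rec[4];   // big-endian 16-bit record length
        String s = contentType(rec[0]) + ", " + version(rec[1], rec[2]) + ", length=" + length;
        if (rec[0] == 0x16 && rec.length >= 6) {
            // First body byte of a handshake record is the handshake message type.
            String msg = rec[5] == 1 ? "client_hello" : rec[5] == 2 ? "server_hello" : "type " + rec[5];
            s += ", message=" + msg;
        } else if (rec[0] == 0x15 && rec.length >= 7) {
            // An alert body is two bytes: level, then description.
            String level = rec[5] == 1 ? "warning" : rec[5] == 2 ? "fatal" : "level " + rec[5];
            String desc = rec[6] == 0 ? "close_notify"
                        : rec[6] == 40 ? "handshake_failure"
                        : rec[6] == 70 ? "protocol_version"
                        : "code " + rec[6];
            s += ", level=" + level + ", description=" + desc;
        }
        return s;
    }

    public static void main(String[] args) {
        // First bytes of the 181-byte raw write in the log.
        System.out.println(describe(parseHex("16 03 03 00 B0 01")));
        // prints: handshake, TLSv1.2, length=176, message=client_hello
        // The 7-byte raw write in the log.
        System.out.println(describe(parseHex("15 03 03 00 02 02 28")));
        // prints: alert, TLSv1.2, length=2, level=fatal, description=handshake_failure
    }
}
```

Note that the alert here is one the client sends after the connection has already been closed, so it reports the failure rather than explaining its cause.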
I ran openssl s_client -connect ilsos.gov:443 and observed the following:
SSL-Session:
Protocol : TLSv1.2
Cipher : ECDHE-RSA-AES128-GCM-SHA256
Session-ID: 66D1C471C9CA0DA2BCE6DA7675DF099D134BB0495C69D05B52AE0A5F4CF7976F
Session-ID-ctx:
Master-Key: A7B388126D92E03C1314EDE2815E9E8A38CF10FD745CB13C2F6163E0FBB05F35CF17CAF18128F072FCF1D1B03A4C3A11
Start Time: 1608833542
Timeout : 7200 (sec)
Verify return code: 0 (ok)
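The openssl check above connects directly, and to ilsos.gov rather than the apps.ilsos.gov host the crawler requests, so it does not exercise the proxy path. The stack trace fails in the upgrade step after establishRoute, which suggests the proxy tunnel was already open when the TLS handshake was cut off. A rough sketch for reproducing that path with plain JDK sockets follows. It assumes the proxy accepts preemptive Basic authentication on CONNECT, which is not confirmed here, and the class name, method names, and argument order are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

/** Hypothetical probe: CONNECT through an HTTP proxy, then try the TLS handshake. */
class ProxyTunnelProbe {

    /** Builds the CONNECT request (assumes Basic proxy authentication). */
    static String buildConnectRequest(String host, int port, String user, String password) {
        String token = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.ISO_8859_1));
        return "CONNECT " + host + ":" + port + " HTTP/1.1\r\n"
                + "Host: " + host + ":" + port + "\r\n"
                + "Proxy-Authorization: Basic " + token + "\r\n"
                + "\r\n";
    }

    /** Reads one header line byte by byte so no TLS bytes are buffered away. */
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            if (c != '\r') sb.append((char) c);
        }
        return sb.toString();
    }

    static void probe(String proxyHost, int proxyPort, String user, String password,
                      String targetHost, int targetPort) throws IOException {
        try (Socket raw = new Socket(proxyHost, proxyPort)) {
            OutputStream out = raw.getOutputStream();
            out.write(buildConnectRequest(targetHost, targetPort, user, password)
                    .getBytes(StandardCharsets.ISO_8859_1));
            out.flush();

            InputStream in = raw.getInputStream();
            String status = readLine(in);
            System.out.println("Proxy answered: " + status);
            while (!readLine(in).isEmpty()) {
                // Skip the remaining response headers up to the blank line.
            }
            if (!status.contains(" 200")) {
                return; // The proxy refused the tunnel; no handshake to attempt.
            }

            // Layer TLS over the tunnel, as the HTTP client does in its upgrade step.
            SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
            try (SSLSocket tls = (SSLSocket) factory.createSocket(raw, targetHost, targetPort, true)) {
                tls.startHandshake();
                System.out.println("Handshake ok: " + tls.getSession().getProtocol()
                        + " / " + tls.getSession().getCipherSuite());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 5) {
            System.out.println("usage: ProxyTunnelProbe proxyHost proxyPort user password targetHost");
            return;
        }
        probe(args[0], Integer.parseInt(args[1]), args[2], args[3], args[4], 443);
    }
}
```

If the status line is not a 200, the proxy itself is refusing the tunnel. If the status is 200 and startHandshake still throws, the connection is being dropped after the tunnel is granted, which would match the stack trace.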
Finally, I read somewhere about adding System.setProperty("https.protocols","TLSv1,TLSv1.1,TLSv1.2"); so I added it to the constructor of the ProxyClient class. This did not solve the problem either.
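One possible reason the https.protocols setting had no visible effect: that property is read by HttpsURLConnection, whereas the stack trace shows the connection being made through Apache HttpClient classes, which may not consult it. That is an assumption to verify, not a confirmed cause. The debug log also already shows a TLSv1.2 ClientHello, so the client side does offer TLS 1.2. A small sketch, with hypothetical class and method names, for printing what the JVM enables by default:

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;

/** Hypothetical helper: lists the TLS protocol versions of the default SSLContext. */
class TlsDefaults {

    private static SSLContext defaultContext() {
        try {
            return SSLContext.getDefault();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no default SSLContext available", e);
        }
    }

    /** Protocols a socket from the default context has enabled out of the box. */
    static List<String> enabledProtocols() {
        return Arrays.asList(defaultContext().getDefaultSSLParameters().getProtocols());
    }

    /** Protocols the default context could enable if asked to. */
    static List<String> supportedProtocols() {
        return Arrays.asList(defaultContext().getSupportedSSLParameters().getProtocols());
    }

    public static void main(String[] args) {
        System.out.println("https.protocols property: " + System.getProperty("https.protocols"));
        System.out.println("Enabled by default:       " + enabledProtocols());
        System.out.println("Supported:                " + supportedProtocols());
    }
}
```

The output depends on the JDK version and its security configuration, so it is worth running on the same JVM as the crawler.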
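I am still a beginner and do not know how to read the debug output above. But I suspect that the proxy is somehow using an older TLS protocol than my machine. Thanks for your help.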