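Problem description
I am trying to parse data produced with chunked transfer encoding from a REST API. When I print the values from the string I can see that the data is there, so I thought it should work, but when I write those values to a file, the file is completely unreadable. The code below uses the Boost library, and I explain my thinking in the code comments. We start from the response-handling part of the code. I don't know what I did wrong.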
// Send the request.
boost::asio::write(socket,request);
// Read the response status line. The response streambuf will automatically
// grow to accommodate the entire line. The growth may be limited by passing
// a maximum size to the streambuf constructor.
boost::asio::streambuf response;
boost::asio::read_until(socket,response,"\r\n");
// Check that response is OK.
std::istream response_stream(&response);
std::string http_version;
response_stream >> http_version;
unsigned int status_code;
response_stream >> status_code;
std::string status_message;
std::getline(response_stream,status_message);
if (!response_stream || http_version.substr(0,5) != "HTTP/")
{
//std::cout << "Invalid response\n";
return 9002;
}
if (status_code != 200)
{
//std::cout << "Response returned with status code " << status_code << "\n";
return 9003;
}
// Read the response headers,which are terminated by a blank line.
boost::asio::read_until(socket,response,"\r\n\r\n");
// Process the response headers.
// This part tries to extract the file name from the response headers;
// the file name is carried in the Content-Disposition header.
std::string header;
std::string fullHeader = "";
string zipfilename="",txtfilename="";
bool foundfilename = false;
while (std::getline(response_stream,header) && header != "\r")
{
fullHeader.append(header).append("\n");
std::transform(header.begin(),header.end(),header.begin(),[](unsigned char c){ return std::tolower(c); });
string containstr = "content-disposition";
string containstr2 = "filename";
string quotestr = "\"";
if (header.find(containstr) != std::string::npos && header.find(containstr2) != std::string::npos)
{
int countquotes = 0;
bool foundquote = true;
std::size_t startpos = 0,beginpos,endpos;
while (foundquote)
{
std::size_t myfound = header.find(quotestr,startpos);
if (myfound != std::string::npos)
{
if (countquotes % 2 == 0)
beginpos = myfound;
else
{
endpos = myfound;
foundfilename = true;
}
startpos = myfound + 1;
}
else
foundquote = false;
countquotes++;
}
if (endpos > beginpos && foundfilename)
{
size_t zipfileleng = endpos - beginpos;
zipfilename = header.substr(beginpos+1,zipfileleng-1);
txtfilename = header.substr(beginpos+1,zipfileleng-5);
}
else
return 9004;
}
}
if (foundfilename == false || zipfilename.length() == 0 || txtfilename.length() == 0)
return 9005;
// Once the zip file name has been found, get the data from the response body.
// The response uses chunked transfer encoding, which I tried to parse myself.
// From what I read on Wikipedia it did not look complicated: one line holds the
// length of the data, the next line holds the data, and this repeats.
// So I split the whole response body on "\r\n" into a vector<string>
// and read the data from that vector.
// Write whatever content we already have to output.
std::string fullResponse = "";
if (response.size() > 0)
{
std::stringstream ss;
ss << &response;
fullResponse = ss.str();
}
// Try to split the entire response body into a vector<string>.
vector<string> allresponsedata;
split_regex(allresponsedata,fullResponse,boost::regex("(\r\n)+"));
// Try to merge the data parts of the response.
string zipfiledata;
int myindex = 0;
for (auto &x : allresponsedata) {
std::cout << "Split: " << x << std::endl; // Printing here does show a value in x.
if (myindex % 2 != 0)
{
zipfiledata = zipfiledata + x; // Accumulate the data.
}
myindex++;
}
// Try to write the data to a file.
std::ofstream zipfilestream(zipfilename,ios::out | ios::binary);
zipfilestream.write(zipfiledata.c_str(),zipfiledata.length());
zipfilestream.close();
// Afterwards the zip file exists, but it is unreadable: it cannot be opened,
// and the zip utility reports it as a damaged zip file.
I even tried other approaches, such as the one in "slow http client based on boost::asio - (Chunked Transfer)", but that did not work either. Visual Studio says:
1 IntelliSense: no instance of overloaded function "boost::asio::read" matches the argument list
argument types are: (boost::asio::ip::tcp::socket,boost::asio::streambuf,boost::asio::detail::transfer_exactly_t,std::error_code)
This line simply does not compile:
size_t n = asio::read(socket,response,asio::transfer_exactly(chunk_bytes_to_read),error);
I read the documentation for asio::transfer_exactly, but it has no example like this: https://www.boost.org/doc/libs/1_57_0/doc/html/boost_asio/reference/transfer_exactly.html
Any ideas?
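Solution
It looks like you did not read the format correctly: https://en.wikipedia.org/wiki/Chunked_transfer_encoding#Format
You need to read the chunk length (in hexadecimal) and any optional chunk extensions before you accumulate the full response body.
This has to happen first, because the sequence \r\n that you split on can easily appear inside the chunk data, which is especially likely for binary content such as a zip file.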
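To make the order of operations concrete, here is a minimal sketch of a decoder that follows the format described above. It is not the Beast implementation and not the asker's code. The function name decode_chunked is hypothetical. It assumes C++17 and that the whole raw body is already in memory. It skips chunk extensions, ignores trailers, and does not guard against absurdly large size values.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: decodes a complete chunked-encoded body held in memory.
// Returns std::nullopt if the input is malformed or truncated.
std::optional<std::string> decode_chunked(std::string_view in) {
    std::string body;
    std::size_t pos = 0;
    for (;;) {
        // 1. The chunk-size line ends at the first CRLF after `pos`.
        std::size_t eol = in.find("\r\n", pos);
        if (eol == std::string_view::npos) return std::nullopt;
        std::string_view size_line = in.substr(pos, eol - pos);
        // Drop optional chunk extensions (";name=value").
        size_line = size_line.substr(0, size_line.find(';'));
        if (size_line.empty()) return std::nullopt;

        // 2. Parse the chunk size as hexadecimal.
        std::size_t size = 0;
        for (char c : size_line) {
            std::size_t digit;
            if (c >= '0' && c <= '9') digit = c - '0';
            else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
            else return std::nullopt;
            size = size * 16 + digit;
        }
        pos = eol + 2;
        if (size == 0) break;  // last chunk; any trailers are ignored here

        // 3. Copy exactly `size` bytes, whatever they contain (CRLF included).
        std::size_t remaining = in.size() - pos;
        if (size > remaining || remaining - size < 2) return std::nullopt;
        body.append(in.substr(pos, size));
        pos += size;

        // 4. Each chunk must be followed by a CRLF.
        if (in.substr(pos, 2) != "\r\n") return std::nullopt;
        pos += 2;
    }
    return body;
}
```

The key difference from splitting on \r\n is step 3: the data is taken by length, so a CRLF inside a chunk stays part of the body. For example, decoding "4\r\nWiki\r\n0\r\n\r\n" yields "Wiki". With a socket, the same idea would mean reading the size line first and then reading exactly that many bytes.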
Again, I suggest simply using Beast's support, which makes all of this easy:
http::response<http::string_body> response;
boost::asio::streambuf buf;
http::read(socket,buf,response);
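You get the headers fully parsed and interpreted (including Trailer headers!), and the content is available from response.body() as a std::string.
It will do the right thing even if the server does not use chunked encoding, or combines different encoding options.
There is simply no reason to reinvent the wheel.
Full demo
This demonstrates it against the chunked-encoding test URL from https://jigsaw.w3.org/HTTP/: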
#include <boost/asio.hpp>
#include <boost/beast.hpp>
#include <iostream>
namespace http = boost::beast::http;
using boost::asio::ip::tcp;
int main() {
http::response<http::string_body> response;
boost::asio::io_context ctx;
tcp::socket socket(ctx);
connect(socket,tcp::resolver{ctx}.resolve("jigsaw.w3.org","http"));
http::write(
socket,http::request<http::empty_body>(
http::verb::get,"/HTTP/ChunkedScript",11));
boost::asio::streambuf buf;
http::read(socket,buf,response);
std::cout << response.body() << "\n";
std::cout << "Effective headers are:" << response.base() << "\n";
}
Prints
This output will be chunked encoded by the server,if your client is HTTP/1.1
Below this line,is 1000 repeated lines of 0-9.
-------------------------------------------------------------------------
01234567890123456789012345678901234567890123456789012345678901234567890
01234567890123456789012345678901234567890123456789012345678901234567890
...996 lines removed ...
01234567890123456789012345678901234567890123456789012345678901234567890
01234567890123456789012345678901234567890123456789012345678901234567890
Effective headers are:HTTP/1.1 200 OK
cache-control: max-age=0
date: Wed,31 Mar 2021 20:09:50 GMT
transfer-encoding: chunked
content-type: text/plain
etag: "1j3k6u8:tikt981g"
expires: Wed,31 Mar 2021 20:09:49 GMT
last-modified: Mon,18 Mar 2002 14:28:02 GMT
server: Jigsaw/2.3.0-beta3