Problem description
I am trying to read a branch of TH1D objects with uproot4. An example ROOT file can be created with the following:
TFile * f = new TFile("new.root","RECREATE");
TTree * t = new TTree("mytree","mytree");
t->SetMakeClass(1); //See note later
TH1D * histo = 0; //Initialize the pointer before handing its address to Branch()
t->Branch("myhisto","TH1D",&histo);
for(int i=0;i<100;i++){
t->GetEntry(i);
histo = new TH1D(Form("histo_%d",i),Form("histo_%d",i),100,0,100);
histo->Fill(i);
t->Fill();
}
t->Print();
t->Write();
In uproot:
Python 3.8.6 (default,Jan 27 2021,15:42:20)
Type 'copyright','credits' or 'license' for more information
IPython 7.17.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import uproot
In [2]: uproot.__version__
Out[2]: '4.0.5'
In [3]: uproot.open("new.root:mytree/myhisto")
Out[3]: <TBranchElement 'myhisto' at 0x7f91da583e50>
In [4]: uproot.open("new.root:mytree/myhisto").interpretation
Out[4]: AsObjects(Model_TH1D)
However, when I try to read the array, it fails with a long traceback. The last calls are:
~/.local/lib/python3.8/site-packages/uproot/model.py in read(cls,chunk,cursor,context,file,selffile,parent,concrete)
798 )
799
--> 800 self.read_members(chunk,file)
801
802 self.hook_after_read_members(
~/.local/lib/python3.8/site-packages/uproot/models/TArray.py in read_members(self,file)
41 )
42 self._members["fN"] = cursor.field(chunk,_tarray_format1,context)
---> 43 self._data = cursor.array(chunk,self._members["fN"],self.dtype,context)
44
45 def __array__(self,*args,**kwargs):
~/.local/lib/python3.8/site-packages/uproot/source/cursor.py in array(self,length,dtype,move)
308 if move:
309 self._index = stop
--> 310 return numpy.frombuffer(chunk.get(start,stop,self,context),dtype=dtype)
311
312 _u1 = numpy.dtype("u1")
~/.local/lib/python3.8/site-packages/uproot/source/chunk.py in get(self,start,context)
366
367 else:
--> 368 raise uproot.deserialization.DeserializationError(
369 """attempting to get bytes {0}:{1}
370 outside expected range {2}:{3} for this Chunk""".format(
DeserializationError: while reading
TH1D version 8 as uproot.dynamic.Model_TH1D_v3 (514 bytes)
TH1 version 1 as uproot.dynamic.Model_TH1_v8 (18 bytes)
(base): <TNamed '' at 0x7f91da38d430>
(base): <TAttLine (version 2) at 0x7f91da38d700>
(base): <TAttFill (version 2) at 0x7f91da38da30>
(base): <TAttMarker (version 2) at 0x7f91da38dd90>
fNcells: 0
TAxis version 2 as uproot.dynamic.Model_TAxis_v10 (12 bytes)
(base): <TNamed '' title='\x00\x00' at 0x7f91da398910>
(base): <TAttAxis (version 4) at 0x7f91da398bb0>
fNbins: 81920
fXmin: 8.34406940932277e-309
fXmax: 2.0000190735445362
TArrayD version None as uproot.models.TArray.Model_TArrayD (? bytes)
fN: 81792
TH1D version 8 as uproot.dynamic.Model_TH1D_v3 (514 bytes)
TH1 version 1 as uproot.dynamic.Model_TH1_v8 (18 bytes)
(base): <TNamed '' at 0x7f91da495850>
(base): <TAttLine (version 2) at 0x7f91da398970>
(base): <TAttFill (version 2) at 0x7f91da48cdc0>
(base): <TAttMarker (version 2) at 0x7f91da3773d0>
fNcells: 0
TAxis version 2 as uproot.dynamic.Model_TAxis_v10 (12 bytes)
(base): <TNamed '' title='\x00\x00' at 0x7f91da3779d0>
(base): <TAttAxis (version 4) at 0x7f91da377d30>
fNbins: 81920
fXmin: 8.34406940932277e-309
fXmax: 2.0000190735445362
TArrayD version None as uproot.models.TArray.Model_TArrayD (? bytes)
fN: 81792
attempting to get bytes 58:654394
outside expected range 0:542 for this Chunk
in file new.root
in object /mytree;1
If I set SetMakeClass(0) when creating the file, the read fails with a different error instead:
~/.local/lib/python3.8/site-packages/uproot/model.py in read(cls,file)
801
802 self.hook_after_read_members(
<dynamic> in read_members(self,file)
NotImplementedError: memberwise serialization of Model_TAxis_v10
in file new.root
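Tested with ROOT 6.22/06 and 5.34/21, uproot 4.0.5 and 4.0.6, using the Python 2.7.18 and 3.8.6 interpreters. Am I doing something wrong?
Solution
Please see Jim's answer and follow the link there to check whether uproot Issue #38 has been fixed.
The following is not a solution but a workaround. If you have access to ROOT, you can retrieve the bin edges and contents from the TH1D branch and save them as two separate TArrayD branches, which uproot can read. An example macro that does this (using the variable names from the original question):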
void dumpTH1Array(){
//See the original question for the content of new.root
TFile * f = new TFile("new.root","UPDATE");
TTree * t = (TTree*)f->Get("mytree");
//Branch to read TH1D
TH1D * histo = 0;
TBranch * b_histo = 0;
t->SetBranchAddress("myhisto",&histo,&b_histo);
//Create new branches of TArrayD objects.
TArrayD * hx = new TArrayD();
TArrayD * hy = new TArrayD();
TBranch * b_hx = t->Branch("myhisto_x","TArrayD",&hx);
TBranch * b_hy = t->Branch("myhisto_y",&hy);
UInt_t nentries = t->GetEntries();
for(UInt_t i = 0; i<nentries; ++i){
//Get the stuff
Long64_t localEntry = t->LoadTree(i);
b_histo->GetEntry(localEntry);
b_hx->GetEntry(localEntry);
b_hy->GetEntry(localEntry);
//nbins includes the under- and overflow bins,
//so it is actually the user defined nbins+2.
UInt_t nbins = histo->GetSize();
//We suppose that the TH1D has fixed binning
//so histo->GetXaxis()->GetXbins() would just
//return a null pointer. We rebuild the edges
//array.
Double_t * binedges = new Double_t[nbins-1];
TAxis * xaxis = histo->GetXaxis();
xaxis->GetLowEdge(binedges);
binedges[nbins-2]=xaxis->GetBinUpEdge(nbins-2);
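//Set them. Reuse the TArrayD objects the branches already point to,
//rather than allocating new ones for every entry.
hx->Set(nbins-1,binedges);
hy->Set(nbins,histo->GetArray());
delete [] binedges; //TArrayD::Set copies the data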
//Fill them back.
b_hx->Fill();
b_hy->Fill();
}
t->Write();
f->Close();
//Goodbye
}
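You will end up with two new branches, "myhisto_x" and "myhisto_y", containing the bin edges in ascending order (size: user-defined bins + 1) and the bin contents (size: user-defined bins + 2, including the underflow and overflow bins). These can then be read easily in uproot.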
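Once the two arrays for one entry are available on the Python side, the size mismatch between them (N+1 edges versus N+2 contents) has to be handled before use. Here is a minimal sketch of one way to do that, assuming the arrays have already been loaded as plain sequences. The function name pair_edges_with_contents and the sample values are hypothetical and only serve as illustration; reading the branches themselves is not shown.

```python
def pair_edges_with_contents(edges, contents):
    """Split one histogram entry into underflow, in-range bins, and overflow.

    edges    : N+1 ascending bin edges (one entry of "myhisto_x")
    contents : N+2 bin contents, underflow first and overflow last
               (one entry of "myhisto_y")
    """
    n_bins = len(edges) - 1
    if len(contents) != n_bins + 2:
        raise ValueError(
            "expected %d contents for %d edges, got %d"
            % (n_bins + 2, len(edges), len(contents))
        )
    underflow = contents[0]
    overflow = contents[-1]
    in_range = contents[1:-1]
    # One (low edge, high edge, content) tuple per user-defined bin.
    bins = [(edges[i], edges[i + 1], in_range[i]) for i in range(n_bins)]
    return underflow, bins, overflow


# Hypothetical entry: 4 user-defined bins on [0, 4), filled once at x = 2.
edges = [0.0, 1.0, 2.0, 3.0, 4.0]
contents = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
underflow, bins, overflow = pair_edges_with_contents(edges, contents)
for low, high, value in bins:
    print("[%g, %g): %g" % (low, high, value))
```

The length check is there because the N+1 / N+2 relationship is the only thing tying the two branches together; if it fails, the two arrays most likely belong to different entries or histograms.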
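No, you are not doing anything wrong: this is a NotImplementedError because memberwise serialization has not been implemented in Uproot yet. This is Issue #38, which has received a lot of attention recently.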
For anyone who finds this question years later: check whether Issue #38 has been resolved.