Problem description
First, the problem is:
Thread 2 "lexe.exe" received signal SIGSEGV,Segmentation fault.
[Switching to Thread 0x7ffff2c89700 (LWP 17367)]
do_lookup_x (undef_name=undef_name@entry=0x7fffeb893ad9 "algoLenI",new_hash=new_hash@entry=1920857680,old_hash=old_hash@entry=0x7ffff2c88460,ref=0x0,result=result@entry=0x7ffff2c88470,scope=0x1800029030008080,i=0,version=0x0,flags=2,skip=0x0,type_class=0,undef_map=0x7ffff2c8a000) at dl-lookup.c:339
339 dl-lookup.c: No such file or directory.
(gdb) bt
#0 do_lookup_x (undef_name=undef_name@entry=0x7fffeb893ad9 "algoLenI",undef_map=0x7ffff2c8a000) at dl-lookup.c:339
#1 0x00007ffff7de023f in _dl_lookup_symbol_x (undef_name=0x7fffeb893ad9 "algoLenI",undef_map=0x7ffff2c8a000,ref=0x7ffff2c88528,symbol_scope=0x7ffff2c8a388,skip_map=<optimized out>) at dl-lookup.c:813
#2 0x00007ffff698bfe6 in do_sym (flags=<optimized out>,vers=0x0,who=0x7ffff2db36e9 <loadLibFunc2+248>,name=0x7fffeb893ad9 "algoLenI",handle=0x7ffff2c8a000) at dl-sym.c:151
#3 _dl_sym (handle=0x7ffff2c8a000,who=0x7ffff2db36e9 <loadLibFunc2+248>) at dl-sym.c:254
#4 0x00007ffff6c170e4 in dlsym_doit (a=a@entry=0x7ffff2c88770) at dlsym.c:50
#5 0x00007ffff698c51f in __GI__dl_catch_exception (exception=exception@entry=0x7ffff2c88700,operate=0x7ffff6c170d0 <dlsym_doit>,args=0x7ffff2c88770) at dl-error-skeleton.c:196
#6 0x00007ffff698c5af in __GI__dl_catch_error (objname=0x603000000020,errstring=0x603000000028,mallocedp=0x603000000018,operate=<optimized out>,args=<optimized out>) at dl-error-skeleton.c:215
#7 0x00007ffff6c17745 in _dlerror_run (operate=operate@entry=0x7ffff6c170d0 <dlsym_doit>,args=args@entry=0x7ffff2c88770) at dlerror.c:162
#8 0x00007ffff6c17166 in __dlsym (handle=<optimized out>,name=0x7fffeb893ad9 "algoLenI") at dlsym.c:70
#9 0x00007ffff2db36e9 in loadLibFunc2 (T=0x7ffff30ff064,PVLib=0x7ffff2c8a000,FuncName=0x7fffeb893ad9 "algoLenI",Check=0 '\000') at ../loadLib/loadLib.c:206
void* loadLibFunc2(void* T,void* PVLib,char* FuncName,bool Check) {
// ... checked PVLib not null
void* Res = dlsym(PVLib,FuncName);
// ... further code that is not reached
}
A clean recompile of both the loader and the library with -fsanitize=address -fno-omit-frame-pointer did not report any errors.
Valgrind:
==10456== Invalid read of size 4
==10456== at 0x400A2A1: do_lookup_x (dl-lookup.c:339)
==10456== by 0x400B23E: _dl_lookup_symbol_x (dl-lookup.c:813)
==10456== by 0x51A6FE5: do_sym (dl-sym.c:151)
==10456== by 0x51A6FE5: _dl_sym (dl-sym.c:254)
==10456== by 0x4E3D0E3: dlsym_doit (dlsym.c:50)
==10456== by 0x51A751E: _dl_catch_exception (dl-error-skeleton.c:196)
==10456== by 0x51A75AE: _dl_catch_error (dl-error-skeleton.c:215)
==10456== by 0x4E3D744: _dlerror_run (dlerror.c:162)
==10456== by 0x4E3D165: dlsym (dlsym.c:70)
==10456== by 0x5CFE821: loadLibFunc2 (loadLib.c:206)
It looks like dlsym receives the correct input:
(gdb) x/8xb PVLib
0x7ffff6e04000: 0x7f 0x45 0x4c 0x46 0x02 0x01 0x01 0x00
This looks like the beginning of the library (the ELF magic bytes 0x7f 'E' 'L' 'F').
(gdb) x/8xb (char*)PVLib+0x919a7
0x7ffff6e959a7 <algoLenI>: 0x55 0x48 0x89 0xe5 0x48 0x83 0xec 0x10
This looks like the beginning of algoLenI, and GDB even recognizes it.
Since the main program also has an algoLenI, could dlsym be getting confused?
(gdb) p algoLenI
$4 = {int32 (void *,charC *)} 0x555555566e81 <algoLenI>
(gdb) x/8xb 0x555555566e81
0x555555566e81 <algoLenI>: 0x55 0x48 0x89 0xe5 0x48 0x83 0xec 0x10
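One cause might be that the caller of dlsym is itself code from the same .so: the loader executable loads the .so, and then some code inside the .so also indirectly calls dlsym. However, that should not be a problem, since dlsym appears to have the correct arguments (see above).
Solution
The problem was that, under certain conditions, the pointer to the library had already been dereferenced elsewhere.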
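The library pointer should not point to something like this (which corresponds to the beginning of the file):
0x7f 0x45 0x4c 0x46 0x02 0x01 0x01 0x00
but to the opaque handle returned by dlopen. That the extra indirection led to something matching the file contents was just "luck": on that OS, the opaque handle begins with a pointer to the former.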
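A guard of the following kind might have caught the mistake before it reached dlsym. This is only a minimal sketch: points_at_elf_image and checked_dlsym are hypothetical names that do not appear in the original loader, and the check assumes the pointer is at least readable for four bytes, which holds for a glibc handle and for a mapped library image but not for an arbitrary garbage pointer.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true when p points at the start of a mapped ELF
 * image (magic bytes 0x7f 'E' 'L' 'F'), i.e. at file contents rather than
 * at a dlopen() handle. Assumes p is readable for at least 4 bytes. */
static bool points_at_elf_image(const void *p) {
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    return p != NULL && memcmp(p, magic, sizeof magic) == 0;
}

/* Hypothetical wrapper around dlsym() that rejects a pointer which has
 * been dereferenced once too often and now points at the library image. */
static void *checked_dlsym(void *handle, const char *name) {
    assert(name != NULL);
    if (handle == NULL || points_at_elf_image(handle)) {
        fprintf(stderr, "checked_dlsym: %p is not a dlopen() handle\n", handle);
        return NULL;
    }
    dlerror(); /* clear any stale error state */
    void *sym = dlsym(handle, name);
    const char *err = dlerror();
    if (err != NULL) { /* a NULL result alone is not an error indicator */
        fprintf(stderr, "checked_dlsym: %s\n", err);
        return NULL;
    }
    return sym;
}
```

Inside loadLibFunc2, the call dlsym(PVLib,FuncName) could be replaced by checked_dlsym(PVLib,FuncName). The check is a debugging aid, not a substitute for passing the handle returned by dlopen through unchanged. On older glibc versions the program may need to be linked with -ldl.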