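Abstract: This article describes the principles of name mangling in C++ and its implementation in gcc, and verifies them with sample programs and tools such as nm and c++filt. It should help readers who want a closer look at how programs are linked.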
Name Mangling Overview
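C++ has far richer language features than C, and function overloading is the most direct example of a feature that requires name mangling. Overloaded functions cannot be told apart by name alone, because C++ distinguishes them by their parameter lists (the number, types, and order of the parameters), so that information has to be encoded into the symbol name.
Of course, C++ needs name mangling in many other places as well, such as namespaces, classes, and templates.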
- /*
- * simple_test.c
- * a demo to show that different name mangling technology in C++ and C
- * Author: Chaos Lee
- */
- #include<stdio.h>
- int rect_area(int x1,int x2,int y1,int y2)
- {
- return (x2-x1) * (y2-y1);
- }
- int elipse_area(int a,int b)
- {
- return 3.14 * a * b;
- }
- int main(int argc,char *argv[])
- {
- int x1 = 10, x2 = 20, y1 = 30, y2 = 40;
- int a = 3,b=4;
- int result1 = rect_area(x1,x2,y1,y2);
- int result2 = elipse_area(a,b);
- return 0;
- }
- [lichao@sg01 name_mangling]$ gcc -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- 0000000000000027 T elipse_area
- 0000000000000051 T main
- 0000000000000000 T rect_area
- [lichao@sg01 name_mangling]$ g++ -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- 0000000000000028 T _Z11elipse_areaii
- 0000000000000000 T _Z9rect_areaiiii
- U __gxx_personality_v0
- 0000000000000052 T main
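The C++ standard reserves identifiers that begin with an underscore followed by an uppercase letter, as well as identifiers containing a double underscore, for the implementation (C99 has a corresponding rule). _Z9rect_areaiiii is therefore a reserved identifier, and every mangled symbol that g++ emits begins with _Z.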
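The symbols in the nm output can be reproduced by hand: the prefix _Z, the length of the function name, the name itself, then one code per parameter type. The following is a minimal sketch of that rule for free functions with builtin parameter types only. The names toy_mangle and builtin_code are hypothetical, and real mangling also covers namespaces, classes, templates and substitutions, which this sketch ignores.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Single-letter codes for a few builtin types (Itanium C++ ABI).
std::string builtin_code(const std::string& type) {
    static const std::map<std::string, std::string> codes = {
        {"void", "v"}, {"bool", "b"}, {"char", "c"}, {"short", "s"},
        {"int", "i"}, {"unsigned int", "j"}, {"long", "l"},
        {"unsigned long", "m"}, {"float", "f"}, {"double", "d"}};
    auto it = codes.find(type);
    if (it == codes.end())
        throw std::invalid_argument("unsupported type: " + type);
    return it->second;
}

// Builds "_Z" + <name length> + <name> + <parameter codes>.
std::string toy_mangle(const std::string& name,
                       const std::vector<std::string>& params) {
    std::string out = "_Z" + std::to_string(name.size()) + name;
    if (params.empty())
        return out + "v";  // an empty parameter list is encoded as void
    for (const auto& p : params)
        out += builtin_code(p);
    return out;
}
```

For example, toy_mangle("rect_area", {"int", "int", "int", "int"}) yields _Z9rect_areaiiii, the symbol that nm reported for the g++ build.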
- /*
- * simple_test.c
- * a demo to show that different name mangling technology in C++ and C
- * Author: Chaos Lee
- */
- #include<stdio.h>
- #ifdef __cplusplus
- extern "C" {
- #endif
- int rect_area(int x1,int x2,int y1,int y2)
- {
- return (x2-x1) * (y2-y1);
- }
- int elipse_area(int a,int b)
- {
- return (int)(3.14 * a * b);
- }
- #ifdef __cplusplus
- }
- #endif
- int main(int argc,char *argv[])
- {
- int x1 = 10, x2 = 20, y1 = 30, y2 = 40;
- int a = 3,b=4;
- int result1 = rect_area(x1,x2,y1,y2);
- int result2 = elipse_area(a,b);
- return 0;
- }
- [lichao@sg01 name_mangling]$ gcc -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- 0000000000000027 T elipse_area
- 0000000000000051 T main
- 0000000000000000 T rect_area
- [lichao@sg01 name_mangling]$ g++ -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- U __gxx_personality_v0
- 0000000000000028 T elipse_area
- 0000000000000052 T main
- 0000000000000000 T rect_area
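In fact, the C standard library headers make heavy use of extern "C" linkage specifications. These headers can also be compiled by a C++ compiler, but the functions they declare must keep their C interface (unmangled names) rather than a C++ one, because they belong to the C library, so the declarations are given extern "C" linkage.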
- /*
- * libc_test.c
- * a demo program to show that how the standard C
- * library are compiled when encountering a C++ compiler
- */
- #include<stdio.h>
- int main(int argc,char * argv[])
- {
- puts("hello world.\n");
- return 0;
- }
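Searching the preprocessed output for puts shows no extern "C" on the declaration itself, which may seem odd. A second search explains it: the headers wrap whole groups of declarations in extern "C" { ... } blocks, so an individual declaration does not need its own.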
- [lichao@sg01 name_mangling]$ g++ -E libc_test.c | grep 'puts'
- extern int fputs (__const char *__restrict __s, FILE *__restrict __stream);
- extern int puts (__const char *__s);
- extern int fputs_unlocked (__const char *__restrict __s,
- puts("hello world.\n");
- [lichao@sg01 name_mangling]$ g++ -E libc_test.c | grep 'extern "C"'
- extern "C" {
- extern "C" {
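The extern "C" { lines found by grep come from headers that wrap their declarations in one block. A minimal sketch of that header pattern follows. The macro names DEMO_BEGIN_DECLS and DEMO_END_DECLS and the function demo_rect_area are hypothetical, chosen for this example only.

```cpp
// When compiled as C++, the macros open and close an extern "C" block;
// when compiled as C, they expand to nothing.
#ifdef __cplusplus
#  define DEMO_BEGIN_DECLS extern "C" {
#  define DEMO_END_DECLS   }
#else
#  define DEMO_BEGIN_DECLS
#  define DEMO_END_DECLS
#endif

DEMO_BEGIN_DECLS
/* Declared with C linkage: the symbol stays "demo_rect_area" under g++ too. */
int demo_rect_area(int x1, int x2, int y1, int y2);
DEMO_END_DECLS

/* The definition inherits C linkage from the declaration above. */
int demo_rect_area(int x1, int x2, int y1, int y2)
{
    return (x2 - x1) * (y2 - y1);
}
```

Because the name is not mangled, a function with C linkage cannot be overloaded; a second demo_rect_area with different parameters inside the same block would be rejected by the compiler.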
Different compilers mangle names in different ways. You might ask why C++ name mangling is not standardized, so that code from different compilers could interoperate. The C++ FAQ answers this question:
"Compilers differ as to how objects are laid out, how multiple inheritance is implemented, how virtual function calls are handled, and so on, so if the name mangling were made the same, your programs would link against libraries provided from other compilers but then crash when run. For this reason, the ARM (Annotated C++ Reference Manual) encourages compiler writers to make their name mangling different from that of other compilers for the same platform. Incompatible libraries are then detected at link time, rather than at run time."
GCC uses the name mangling scheme defined by the Itanium (IA-64) C++ ABI. The g++ FAQ contains the following passage:
"GNU C++ does not do name mangling in the same way as other C++ compilers.
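This means that object files compiled with one compiler cannot be used with another."
That is, because g++ mangles names differently from other C++ compilers, object files produced by one compiler cannot be linked with object files produced by another.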
Builtin types encoding
- <builtin-type> ::= v # void
- ::= w # wchar_t
- ::= b # bool
- ::= c # char
- ::= a # signed char
- ::= h # unsigned char
- ::= s # short
- ::= t # unsigned short
- ::= i # int
- ::= j # unsigned int
- ::= l # long
- ::= m # unsigned long
- ::= x # long long, __int64
- ::= y # unsigned long long, __int64
- ::= n # __int128
- ::= o # unsigned __int128
- ::= f # float
- ::= d # double
- ::= e # long double, __float80
- ::= g # __float128
- ::= z # ellipsis
- ::= u <source-name> # vendor extended type
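The builtin codes can be observed directly from a program. With g++ and other compilers that follow the Itanium C++ ABI, typeid(T).name() returns the mangled encoding of the type. This is implementation behavior rather than something the C++ standard guarantees, and the helper name type_code is made up for this sketch.

```cpp
#include <string>
#include <typeinfo>

// Returns the implementation's name for T; under the Itanium C++ ABI
// this is the mangled type encoding from the table above.
template <typename T>
std::string type_code() {
    return typeid(T).name();
}
```

Under that assumption, type_code<int>() gives "i", type_code<unsigned long>() gives "m", type_code<long long>() gives "x" and type_code<wchar_t>() gives "w".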
Operator encoding
- <operator-name> ::= nw # new
- ::= na # new[]
- ::= dl # delete
- ::= da # delete[]
- ::= ps # + (unary)
- ::= ng # - (unary)
- ::= ad # & (unary)
- ::= de # * (unary)
- ::= co # ~
- ::= pl # +
- ::= mi # -
- ::= ml # *
- ::= dv # /
- ::= rm # %
- ::= an # &
- ::= or # |
- ::= eo # ^
- ::= aS # =
- ::= pL # +=
- ::= mI # -=
- ::= mL # *=
- ::= dV # /=
- ::= rM # %=
- ::= aN # &=
- ::= oR # |=
- ::= eO # ^=
- ::= ls # <<
- ::= rs # >>
- ::= lS # <<=
- ::= rS # >>=
- ::= eq # ==
- ::= ne # !=
- ::= lt # <
- ::= gt # >
- ::= le # <=
- ::= ge # >=
- ::= nt # !
- ::= aa # &&
- ::= oo # ||
- ::= pp # ++
- ::= mm # --
- ::= cm # ,
- ::= pm # ->*
- ::= pt # ->
- ::= cl # ()
- ::= ix # []
- ::= qu # ?
- ::= st # sizeof (a type)
- ::= sz # sizeof (an expression)
- ::= cv <type> # (cast)
- ::= v <digit> <source-name> # vendor extended operator
Type encoding
- <type> ::= <CV-qualifiers> <type>
- ::= P <type> # pointer-to
- ::= R <type> # reference-to
- ::= O <type> # rvalue reference-to (C++0x)
- ::= C <type> # complex pair (C 2000)
- ::= G <type> # imaginary (C 2000)
- ::= U <source-name> <type> # vendor extended type qualifier
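The type grammar is recursive: a pointer, a reference or a qualifier adds one prefix to the encoding of the underlying type. The sketch below mirrors that recursion with template specializations. Encode and encode are hypothetical names, only a handful of builtin types are covered, and the K prefix for const comes from the ABI's <CV-qualifiers> production, which the excerpt above references but does not list.

```cpp
#include <string>

// Primary template left undefined so that unsupported types fail to compile.
template <typename T> struct Encode;

// Builtin types: one letter each.
template <> struct Encode<void>   { static std::string str() { return "v"; } };
template <> struct Encode<char>   { static std::string str() { return "c"; } };
template <> struct Encode<int>    { static std::string str() { return "i"; } };
template <> struct Encode<float>  { static std::string str() { return "f"; } };
template <> struct Encode<double> { static std::string str() { return "d"; } };

// Compound types: one prefix, then the encoding of the inner type.
template <typename T> struct Encode<T*> {
    static std::string str() { return "P" + Encode<T>::str(); }  // pointer-to
};
template <typename T> struct Encode<T&> {
    static std::string str() { return "R" + Encode<T>::str(); }  // reference-to
};
template <typename T> struct Encode<const T> {
    static std::string str() { return "K" + Encode<T>::str(); }  // const-qualified
};

// Convenience wrapper.
template <typename T>
std::string encode() { return Encode<T>::str(); }
```

With these definitions, encode<int&>() produces "Ri" and encode<const char*>() produces "PKc", the same fragments that appear in mangled function names.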
- /*
- * Author: Chaos Lee
- * Description: A simple demo to show how the rules used to mangle functions' names work
- * Date:2012/05/06
- */
- #include<iostream>
- #include<string>
- using namespace std;
- int test_func(int & tmpInt,const char * ptr,double dou,string str,float f)
- {
- return 0;
- }
- int main(int argc,char * argv[])
- {
- const char * test="test";
- int intNum = 10;
- double dou = 10.012;
- string str="str";
- float f = 1.2;
- test_func(intNum,test,dou,str,f);
- return 0;
- }
- [lichao@sg01 name_mangling]$ g++ -c func.cpp
- [lichao@sg01 name_mangling]$ nm func.cpp
- nm: func.cpp: File format not recognized
- [lichao@sg01 name_mangling]$ nm func.o
- 0000000000000060 t _GLOBAL__I__Z9test_funcRiPKcdSsf
- U _Unwind_Resume
- 0000000000000022 t _Z41__static_initialization_and_destruction_0ii
- 0000000000000000 T _Z9test_funcRiPKcdSsf
- U _ZNSaIcEC1Ev
- U _ZNSaIcED1Ev
- U _ZNSsC1EPKcRKSaIcE
- U _ZNSsC1ERKSs
- U _ZNSsD1Ev
- U _ZNSt8ios_base4InitC1Ev
- U _ZNSt8ios_base4InitD1Ev
- 0000000000000000 b _ZSt8__ioinit
- U __cxa_atexit
- U __dso_handle
- U __gxx_personality_v0
- 0000000000000076 t __tcf_0
- 000000000000008e T main
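The symbol _Z9test_funcRiPKcdSsf in the listing above is the mangled name of test_func. It reads as follows: _Z is the prefix of every mangled name; 9test_func is the function name preceded by its length; Ri is a reference to int; PKc is a pointer to const char; d is double; Ss is the abbreviation for std::string, that is std::basic_string<char, std::char_traits<char>, std::allocator<char> >; f is float.
Name mangling usually makes function names unrecognizable. In many cases, when we inspect symbols, we do not care about the mangled form; we want to know whether a particular function is defined or referenced, which is very helpful when debugging.
We therefore need a way to turn a mangled symbol back into its original form, a process called name demangling. Many tools can do this. The most common is the c++filt command, which takes a mangled symbol as input and prints the demangled name. For example: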
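The way _Z9test_funcRiPKcdSsf breaks down can be reproduced with a toy decoder. This is only a sketch under narrow assumptions: it handles free functions, a few builtin codes, the P, R and K prefixes and the Ss abbreviation, and nothing else. The names decode_type and toy_demangle are hypothetical; real demanglers also deal with nested names, templates and substitutions.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Decodes one <type> starting at s[pos] and advances pos past it.
std::string decode_type(const std::string& s, std::size_t& pos) {
    if (pos >= s.size()) throw std::invalid_argument("truncated type");
    char c = s[pos++];
    switch (c) {
        case 'P': return decode_type(s, pos) + "*";       // pointer-to
        case 'R': return decode_type(s, pos) + "&";       // reference-to
        case 'K': return decode_type(s, pos) + " const";  // const-qualified
        case 'S':                                         // "Ss" stands for std::string
            if (pos < s.size() && s[pos] == 's') { ++pos; return "std::string"; }
            throw std::invalid_argument("unsupported substitution");
        case 'v': return "void";
        case 'b': return "bool";
        case 'c': return "char";
        case 'i': return "int";
        case 'j': return "unsigned int";
        case 'l': return "long";
        case 'f': return "float";
        case 'd': return "double";
        default:  throw std::invalid_argument("unsupported type code");
    }
}

// Decodes "_Z<length><name><parameter types>" for a free function.
std::string toy_demangle(const std::string& mangled) {
    if (mangled.compare(0, 2, "_Z") != 0)
        throw std::invalid_argument("missing _Z prefix");
    std::size_t pos = 2;
    std::size_t len = 0;
    while (pos < mangled.size() &&
           std::isdigit(static_cast<unsigned char>(mangled[pos])))
        len = len * 10 + static_cast<std::size_t>(mangled[pos++] - '0');
    if (len == 0 || len > mangled.size() - pos)
        throw std::invalid_argument("bad name length");
    std::string result = mangled.substr(pos, len) + "(";
    pos += len;
    if (mangled.compare(pos, std::string::npos, "v") == 0)
        return result + ")";  // a lone "v" means an empty parameter list
    bool first = true;
    while (pos < mangled.size()) {
        if (!first) result += ", ";
        result += decode_type(mangled, pos);
        first = false;
    }
    return result + ")";
}
```

Applied to _Z9test_funcRiPKcdSsf, the function returns test_func(int&, char const*, double, std::string, float).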
- [lichao@sg01 name_mangling]$ c++filt _Z9test_funcRiPKcdSsf
- test_func(int&, char const*, double, std::basic_string<char, std::char_traits<char>, std::allocator<char> >, float)
- [lichao@sg01 name_mangling]$ nm func.o | c++filt
- 0000000000000060 t global constructors keyed to _Z9test_funcRiPKcdSsf
- U _Unwind_Resume
- 0000000000000022 t __static_initialization_and_destruction_0(int, int)
- 0000000000000000 T test_func(int&, char const*, double, std::basic_string<char, std::char_traits<char>, std::allocator<char> >, float)
- U std::allocator<char>::allocator()
- U std::allocator<char>::~allocator()
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
- U std::ios_base::Init::Init()
- U std::ios_base::Init::~Init()
- 0000000000000000 b std::__ioinit
- U __cxa_atexit
- U __dso_handle
- U __gxx_personality_v0
- 0000000000000076 t __tcf_0
- 000000000000008e T main
- [lichao@sg01 name_mangling]$ nm -C func.o
- 0000000000000060 t global constructors keyed to _Z9test_funcRiPKcdSsf
- U _Unwind_Resume
- 0000000000000022 t __static_initialization_and_destruction_0(int, int)
- 0000000000000000 T test_func(int&, char const*, double, std::string, float)
- U std::allocator<char>::allocator()
- U std::allocator<char>::~allocator()
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
- U std::ios_base::Init::Init()
- U std::ios_base::Init::~Init()
- 0000000000000000 b std::__ioinit
- U __cxa_atexit
- U __dso_handle
- U __gxx_personality_v0
- 0000000000000076 t __tcf_0
- 000000000000008e T main
Last but not least, there is one more important interface function, __cxa_demangle(), whose prototype is:
- namespace abi {
- extern "C" char* __cxa_demangle (const char* mangled_name,
- char* buf,
- size_t* n,
- int* status);
- }
- /*
- * Author: Chaos Lee
- * Description: Employ __cxa_demangle to demangle a mangling function name.
- * Date:2012/05/06
- *
- */
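- #include<iostream>
- #include<cstdlib>
- #include<cxxabi.h>
- using namespace std;
- using namespace abi;
- int main(int argc,char *argv[])
- {
- const char * mangled_string = "_Z9test_funcRiPKcdSsf";
- int status = 0;
- /* pass NULL so that __cxa_demangle allocates the result with malloc; */
- /* a caller-supplied buffer must also come from malloc, not from the stack */
- char * demangled = __cxa_demangle(mangled_string,NULL,NULL,&status);
- if(status == 0)
- cout<<demangled<<endl;
- cout<<status<<endl;
- free(demangled);
- return 0;
- }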
- [lichao@sg01 name_mangling]$ g++ cxa_demangle.cpp -o cxa_demangle
- [lichao@sg01 name_mangling]$ ./cxa_demangle
- test_func(int&, char const*, double, std::string, float)
- 0
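For repeated use, the call can be wrapped in a small helper that owns the returned memory and falls back to the input when demangling fails. This is a sketch for g++ or another compiler that provides <cxxabi.h>; the helper name demangle is made up here.

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of a symbol, or the symbol unchanged
// if __cxa_demangle reports an error through status.
std::string demangle(const char* mangled) {
    int status = 0;
    // NULL buffer and length: the function allocates the result with malloc.
    char* raw = abi::__cxa_demangle(mangled, NULL, NULL, &status);
    if (status != 0 || raw == NULL)
        return mangled;
    std::string result(raw);
    std::free(raw);  // the caller is responsible for freeing the buffer
    return result;
}
```

For instance, demangle("_Z9rect_areaiiii") returns rect_area(int, int, int, int), while a string that is not a valid mangled name is returned as is.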
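By defining a function whose symbol name is the mangled name of an existing function, and enabling the linker option that allows duplicate symbols, one can change which function the original code links against and thereby change the program's behavior.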