Talk:Name mangling

Resources

To keep everyone in sync, here are the programs we use to generate the examples:

General C++ mangling example

void h (int a) {}
void h (int a, char b) {}
void h () {}

You can get the mangled names from the object file via nm, and demangle them by piping the output through c++filt (at least on Solaris and Linux):

ejb/severance [~] % nm sym2.o
sym2.o:
[Index]   Value      Size    Type  Bind  Other Shndx   Name 
[2]     |        32|       5|FUNC |GLOB |0    |2      |__1cBh6F_v_
[4]     |         0|       5|FUNC |GLOB |0    |2      |__1cBh6Fi_v_
[3]     |        16|       5|FUNC |GLOB |0    |2      |__1cBh6Fic_v_
[1]     |         0|       0|FILE |LOCL |0    |ABS    |sym2.cc
ejb/severance [~] % nm sym2.o|c++filt
sym2.o:
[Index]   Value      Size    Type  Bind  Other Shndx   Name
[2]     |        32|       5|FUNC |GLOB |0    |2      |void h()
[4]     |         0|       5|FUNC |GLOB |0    |2      |void h(int)
[3]     |        16|       5|FUNC |GLOB |0    |2      |void h(int,char)
[1]     |         0|       0|FILE |LOCL |0    |ABS    |sym2.cc

And for VMS (if anyone ever needs it ;-):

$ cxxdem/man "void h(int)", "void h(int, char)", "void h()"
H__XI
H__XIC
H__XV
$ cxxdem H__XI, H__XIC, H__XV
void h(int)
void h(int, char)
void h()

And for MSVC++6, use "dumpbin" (although cygwin's nm will also read MSVC++'s COFF files)

dumpbin sym2.obj > log
then search log for lines resembling:
  Communal; sym= "void __cdecl h(int)" (?h@@YAXH@Z)

Borland's TDUMP.EXE program "demangles" symbols as it reads a C++ OMF file, so I had to run cygwin's "strings" program to get the mangled names.

Discussion

int not void?

The C++ seems to be wrong - those are void functions returning ints. I figured you meant their return types to be int? -- Finlay McWalter | Talk 21:18, 15 Jun 2004 (UTC)

Oops. They were int originally and I changed them halfway through. Lady Lysine Ikinsile 21:22, Jun 15, 2004 (UTC)

Oh, and the "mangled name" table is wrong too - the examples are f, f, g, but the table is f, f, f -- Finlay McWalter | Talk 21:20, 15 Jun 2004 (UTC)

They aren't intended to correspond to the previous example. Lady Lysine Ikinsile 21:22, Jun 15, 2004 (UTC)
Yep - I was confused because there were three of each, and f() was used - I realise we'll need quite a number of the latter kind, and maybe we should rename them h() or something, just to make sure? -- Finlay McWalter | Talk 21:29, 15 Jun 2004 (UTC)

Version numbers

I think we should say which version of each compiler we're using. I think GCC changed from one scheme to another when they changed from the old codebase to the EGCS version. (I get the results you do with gcc v3.3.1). -- Finlay McWalter | Talk 22:09, 15 Jun 2004 (UTC)

Yes, they did change at some point, I think with gcc 3.0, but it may have been 3.1 (?). I'm pretty certain 2.95 uses the old scheme. Pre-egcs (2.7? 2.8 was egcs, wasn't it, or was that 2.9x?) may have used something different still. The new scheme is some kind of standard, at least on Linux (and Intel ICC uses the same scheme on Linux) ... GCC also uses the same scheme on Solaris (at least, that's all I have to test with here), even though it differs from Sun's native C++ compiler. Not sure whether GCC inspired the standard or the other way around. (Tested with: Solaris 9, "gcc version 3.3.2" and "CC: Sun C++ 5.5 2003/03/12"; SuSE Linux 9.1, "gcc version 3.3.3 (SuSE Linux)") Lady Lysine Ikinsile 22:21, Jun 15, 2004 (UTC)
Yeah, I think that 2.95 -> 3.0 is the transition (I seem to remember lots of things sticking on 2.95 for ages). This [1] says gcc's scheme "seems to be based on the one in the C++ Reference Manual" -- Finlay McWalter | Talk 22:44, 15 Jun 2004 (UTC)

These are the results from gcc 2.95:

h(int)       -> h__Fi
h(int, char) -> h__Fic
h(void)      -> h__Fv

not sure if it's worth putting those in as well as 3.x. Lady Lysine Ikinsile 22:48, Jun 15, 2004 (UTC)

Yeah, I think that's good information. It shows that even different versions of the same compiler can produce different output, and thus compounds the DLL hell. -- Finlay McWalter | Talk 22:55, 15 Jun 2004 (UTC)

Great progress

Wow, it's so much better already - thanks for your (disproportionately diligent) contributions. I've dropped User:Alfio a note about name decoration, asking if he'd like to help. I think I'll temporarily recuse myself from the C++ section (as I'm nearing the limit of my lame C++ skills) and I'll go finish the Java section (ah, at least Java has a standard mangling scheme and a standard ABI). Btw, my MSVC6 compiler is too old to compile the larger example (surprisingly it likes "namespace" but chokes on "std::") - I think it's fine if we just cite one or two compilers for the larger example (the trivial case amply makes our point that things are totally different). Thanks again. -- Finlay McWalter | Talk 00:29, 16 Jun 2004 (UTC)

I think the larger example could be better: I was getting a headache from deciphering the ABI documentation [2] and wasn't really capable of producing wonderful prose at the same time :) I'll try to clean it up a little.
Yes, I think it's fine to only consider one compiler here: the general concept of mangling is the same between all of them.
The larger example should work on MSVC6 (with #include <string> and <iostream>), even if it is a horribly broken compiler. Incidentally, someone on IRC mentioned they'd add ObjC examples later :) I'll also install Intel C++ at some point to add that to the table ... what other compilers are missing? (Comeau comes to mind) Lady Lysine Ikinsile 00:47, Jun 16, 2004 (UTC)
Silly me, didn't #include anything :) Compilers: does Cfront still exist? There's an SGI mips compiler (which was kick ass a decade ago, last time I touched MIPS). I hope the ObjC person has access to a different compiler than the gcc ObjC frontend - I do know that gcj (the java frontend to gcc) just uses the g++ mangling scheme (makes sense) so you'd think the gcc-objc frontend does too (and thus wouldn't be such an interesting example). I saw your edit on Alfio's page - duhh, I'd forgotten about decoration for calling convention. And it seems we don't have a page on calling conventions either (stdcall, fastcall, cdecl, pascal,...). -- Finlay McWalter | Talk 01:05, 16 Jun 2004 (UTC)
No, ObjC doesn't; there's a regular scheme to how method names are mangled in ObjC. However, I do think there are different ways of implementing selectors. I'll have to look around and get back to you on that. Dysprosia 05:49, 16 Jun 2004 (UTC)

IA64 ABI

"GNU GCC 3.x and all the IA-64 compilers use the same IA-64 ABI, which defines (among other things) a standard name-mangling scheme."

Is this ABI x86/IA64 only, or does it use the same ABI even on SPARC, PPC, etc.? -- Finlay McWalter | Talk 17:37, 16 Jun 2004 (UTC)
I'm not entirely sure. I've just tested GCC on Linux i386, Solaris 9 i386, HP-UX 11.11 PA-RISC, Linux Alpha, Tru64 5.1b Alpha, Linux SPARC (32-bit and 64-bit) and Linux AMD64 (64-bit), and it used identical mangling schemes on all of them (_Z1h etc). I don't know, though, if the rest of the ABI is the same on those platforms. The standard itself is published as "Itanium ABI" (see the link in ==External links==). I should probably reword the paragraph to say that on IA-64 the entire ABI is standardised, and that GCC uses the IA-64 name mangling scheme (at least) on all other systems, too. Lady Lysine Ikinsile 17:52, Jun 16, 2004 (UTC)

Status?

At what point is this article going to move to the main namespace? I'd think a decent overview of C++ and Java is probably enough to start with, though I'm not sure how detailed 'decent' is :) Lady Lysine Ikinsile 03:58, Jun 17, 2004 (UTC)

Mangling and Decoration

The article Name decoration should be merged with this article. This will help avoid redundant information as well as confusion: at present the reader cannot tell which of the two is the more up to date. Moreover, the two terms mean the same thing, and it is pointless to have two pages for synonyms.

Agreed, but arguably the "proper" name is "decoration" so it should be merged the other way. With a redirect left behind, it doesn't really matter though. Eric 04:38, 10 January 2006 (UTC)

Ok, I've finished merging that article into this one. -- Bovineone 05:52, 10 March 2006 (UTC)

Type-Casting of Function Argument In Cross Library Function Calls

Could the problem described below be a name mangling problem?


Today, I faced a strange C problem that forced me to go through the C fundamentals again.

STRANGE PROBLEM: Suppose you are working in LIB_A and calling a function foo from LIB_B.

LIB_A

void xyz()
{
    unsigned short usVar = 12;
    /* Call function foo from LIB_B */
    foo(usVar);
    ……..
}

LIB_B

void foo(unsigned int uiVar)
{
    /* guess what the value of uiVar should be when called from xyz */
}

I found that the value of uiVar was 0x0038000C (hex).

Comparing that value, 0000 0000 0011 1000 0000 0000 0000 1100 (0x0038000C), with the correct value, 0000 0000 0000 0000 0000 0000 0000 1100 (0x0000000C), only the upper 16 bits differ.


I was pretty sure that this had to do with type-casting. When I changed function xyz to use an explicit cast

void xyz()
{
    unsigned short usVar = 12;
    /* Call function foo */
    foo((unsigned int) usVar);
    ……..
}

I got the right value. But why would it need an explicit cast? According to the usual C rules for implicit conversion (coercion), unsigned short should automatically be converted to unsigned int without an explicit cast.

Shouldn't it?

NOTE: LIB_A and LIB_B play an important role here, because within a single library it works as it should.


I was struggling for an answer, so I tried playing with the MSVC settings and found the option "Enable Run-Time Type Information (RTTI)" under VC++ -> Project Settings -> C/C++ (tab) -> Category = C++ Language.

Strangely, when I enabled this, I started getting the correct values.

I then relied on the following two links to conclude that we should enable the RTTI option in all our libraries to avoid this kind of error.

http://www.duckware.com/bugfreec/chapter7.html#avoidtypecasts

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_vccelng4/html/elgrfRundshTimeTypeInformation.asp

I don't think Wikipedia is the proper site for this, but anyway...
I think your problem is related neither to name mangling nor to RTTI (which are also unrelated to one another), but to automatic function prototyping (implicit function declaration): if you call a function (i.e. foo) without a prototype in scope, the compiler will guess the number and type of parameters from the actual arguments it gets (in your case a short).
But note that automatic function prototyping is a deprecated feature of C, and in C++ it is simply not valid. Just #include the appropriate .h files and all will be OK.
--Pinzo 00:19, 15 November 2006 (UTC)

Too technical?

Does this article strike anyone as being overly technical, especially with such a vague introduction? The preceding unsigned comment was added by 68.252.72.221 (talk • contribs) .

I've added in more lead section text from the Name decoration article that I just merged into here. This article should be a little more accessible to a wider audience now. -- Bovineone 05:57, 10 March 2006 (UTC)

Cases of mangling in C++

AFAIK, default arguments don't affect name mangling, because they are not part of the function signature...

BTW, what is "variable ownership"???

--Pinzo 00:11, 15 November 2006 (UTC)