Author Topic: strange "undefined reference" only with 32-bit ASM symbols  (Read 11433 times)

Offline bootstrap

strange "undefined reference" only with 32-bit ASM symbols
« on: June 28, 2012, 06:21:58 am »
First I want to say that the following problem is quite possibly not a problem (or misunderstanding) in/with codeblocks.  But I'm not certain, so there's a risk I am posting a question here that is a problem with the mingw64 toolset (or about my own oversight or stupidity with either package).  So, please ignore this message if you're clear this is not a codeblocks issue.  I posted this question at the mingw64 forum on sourceforge first, but that forum has very little traffic.

Note that I fresh installed the latest nightly build of codeblocks (and mingw64/wxwidgets packages) just yesterday, so everything is up-to-date.  Even the windoze 7 system is a fresh install on a new disk drive from only a week or two ago.

I have been developing a portable 3D engine for some time.  Until recently I was developing with codeblocks on 64-bit ubuntu 12.04 on one computer, and with visual studio2005 pro on 64-bit windoze xp on another computer (no dual boot).  My winxp64 drive crashed and burned, so I bought a new drive and installed windoze 7.  Not surprisingly, they made sure vs2005 doesn't work properly on win7, so I decided it was time to do what I should have done long ago (get out from beneath their evil thumb), and develop the windoze version with codeblocks too.

So I removed vs2005pro and installed mingw64, wxwidgets and codeblocks on my windoze 7 computer, and am now trying to run the windoze version of one of my smaller applications in codeblocks (before trying to make my 3D engine work, which has many more dependencies).  The takeaway from the above is... the code was working previously, and the linux targets are still working.

This application has the following 8 targets:
  linux32_debug
  linux32_release
  linux64_debug
  linux64_release
  windoze32_debug
  windoze32_release
  windoze64_debug
  windoze64_release

All of the linux targets build and execute properly on my 64-bit ubuntu 12.04 system.

I went through and tried to make sure all the codeblocks settings and project settings were correct, but I'm having the following problem and I don't understand what to try next.  Here are the basic facts.

#1:  The .h and .cpp files are identical for both projects, with #ifdef LINUX and #ifdef WINDOZE where necessary.

#2:  I can build and execute the windoze64_debug target, but trying to build the windoze32_debug target generates <undefined reference to 'math_sin'> errors (and the same for a number of other symbols too, all of which are in an assembly-language file called <icemathasm32.s>).

#3:  The only difference in project files is the following.  The <icemathasm32.s> file is part of the linux32_xxxx and windoze32_xxxxx targets, and the <icemathasm64.s> file is part of the linux64_xxxx and windoze64_xxxx targets.  This is necessary because 32-bit and 64-bit assembly language is different.  Both files are identical in the sense they have the same function-name symbols and such, but the assembly-language code for each function is different.  This works fine when building all 4 linux targets with codeblocks on the 64-bit ubuntu 12.04 system.

#4:  I have checked many, many times, and the compiler and linker are both getting an -m32 option for the windoze32_xxxxx targets, and an -m64 option for the windoze64_xxxxx targets.

#5:  I have looked inside the assembly-language files repeatedly, verified they look fine, and all the function symbols that are producing "undefined reference" errors are declared in .globl lines as appropriate.  Furthermore, the exact same assembly-language files build and execute correctly on my linux system.

#6:  I have performed "build => clean" many times, and verified the .o files are gone, then performed "build => build" again and noted all the appropriate .o files are created, including <icemathasm32.o> in the windoze32_xxxxx target directories and <icemathasm64.o> in the windoze64_xxxxx target directories.

#7:  The size of the <icemathasm32.o> file produced on 64-bit ubuntu 12.04 is 6408 bytes, while the size of the <icemathasm32.o> file produced on 64-bit windoze 7 is 6231 bytes.  I'm not sure why they are different sizes, but that small difference does not worry me.  Maybe it should?

#8:  In an attempt to narrow down the problem, I moved the <icemathasm32.o> file from the linux32_debug directory into the windoze32_debug directory, and tried to build.  This produces the same "undefined reference" errors.

#9:  The following is the final build line for windoze64_debug (displayed in build-log tab in codeblocks):
g++.exe  -o windoze64_debug\bin\ice.exe windoze64_debug\icenoise.o windoze64_debug\iceoptics.o windoze64_debug\iceparse.o windoze64_debug\icetext.o windoze64_debug\icetime.o windoze64_debug\icetimer.o windoze64_debug\icefile.o windoze64_debug\ice.o windoze64_debug\icemath.o windoze64_debug\iceapp.o windoze64_debug\icemathasm64.o windoze64_debug\icememory.o   -m64

#10:  The following is the final build line for windoze32_debug (displayed in build-log by codeblocks):
g++.exe  -o windoze32_debug\bin\ice.exe windoze32_debug\icenoise.o windoze32_debug\iceoptics.o windoze32_debug\iceparse.o windoze32_debug\icetext.o windoze32_debug\icetime.o windoze32_debug\icetimer.o windoze32_debug\icefile.o windoze32_debug\ice.o windoze32_debug\icemath.o windoze32_debug\iceapp.o windoze32_debug\icemathasm32.o windoze32_debug\icememory.o   -m32

I am lost.  I'll bet there is some silly setting somewhere or other that will make me look stupid when we figure this out, but I've looked through the settings and haven't noticed anything obvious.  Except probably in retrospect when someone points it out to me.

What am I missing?  Is it possible that the mingw64 assembler fails to make the .globl symbols visible, but only when assembling 32-bit assembly language?  I'm not sure what else could be happening, but that seems too fundamental to be possible (too many people would be having this problem).

What am I doing wrong?

PS:  I just executed "nm icemathasm32.o" in both windoze32_debug and linux32_debug directories, and the output was identical.  The symbols that are generating "undefined reference" all have a capital T next to their names (I assume indicating they are global symbols in the text section).

PS:  I just realized the problem might somehow be related to decoration on the symbol names.  Does that make any sense?  The output of nm shows the names in <icemathasm32.o> without any leading decoration (not even an underline character), and no decoration following the name either.  When I run nm on the other .o files (generated from my .cpp files), the symbols all have a leading underline and no suffix decoration.  Hmmmm.  Why does this mismatch not happen on linux, or on windoze in the 64-bit variant (where the symbols have no decoration whatsoever in either the assembly-language file or my other .cpp files)?  Do I need some kind of _cdecl or something when I declare functions on windoze?
« Last Edit: June 28, 2012, 07:11:25 am by bootstrap »

Offline MortenMacFly

Re: strange "undefined reference" only with 32-bit ASM symbols
« Reply #1 on: June 28, 2012, 07:05:24 am »
Well I would try to make a minimal sample to reproduce the problem... this you could post here. From what you describe it looks obvious, but there is a lot of information missing, like the whole build process, compiler versions and so on.

Did you try (a minimal sample) on the command line? This could look like an *.s file with one function and a *.c file calling this function.
Compiler logging: Settings->Compiler & Debugger->tab "Other"->Compiler logging="Full command line"
C::B Manual: https://www.codeblocks.org/docs/main_codeblocks_en.html
C::B FAQ: https://wiki.codeblocks.org/index.php?title=FAQ

Offline bootstrap

Re: strange "undefined reference" only with 32-bit ASM symbols
« Reply #2 on: June 28, 2012, 07:12:58 am »
I think I isolated the problem (see last PS above).  However, I'm not entirely sure how I need to change my function declarations or definitions to eliminate those leading underlines in the object file symbols.

Later:  Here is more information.  When the gcc/g++/as toolset compiles .c or .cpp or .s files on linux, it never generates underline prefixes (or suffixes either) --- at least for normal C functions.  However, when the mingw64 version of gcc/g++ compiles .c or .cpp files for the 32-bit windoze targets, it DOES put an underline prefix onto function symbols in the .o object files it generates.  Yet it does NOT put underline prefixes onto function symbols in the .o object files it generates from (assembly language) .s files!

When the gcc/g++ compiler compiles a function call, it has no way to know whether that function was written in C or assembly-language.  Since most called functions were written in C/C++ and compiled by gcc/g++, it generates code to call symbols with underline prefixes (ONLY on windoze).  But if the function is inside a .s assembly-language file, no underline prefix is added to symbols, and therefore these function calls cannot be resolved at link time (we get the "undefined reference" described in the original message).

However, this seems like a completely CRAZY, insane situation, unless there is some magic switch or something that I don't know about to tell the assembler to prefix every global function name in assembly-language files with an underline.

If that cannot be done, then it is IMPOSSIBLE to make the same .s assembly-language file a part of both a linux and windoze program --- even though the file will assemble and would link if not for the stupid underline prefix mismatch!  There must be a switch/option or something, because I can't believe I'm the first to discover this insane problem!  But I don't see one off hand.  Maybe the "as.exe" program has a switch/option?
« Last Edit: June 28, 2012, 09:26:35 am by bootstrap »


Offline Radek

Re: strange "undefined reference" only with 32-bit ASM symbols
« Reply #4 on: June 28, 2012, 11:31:02 am »
Oh, well, name mangling ... I am sure that the assembler will not do any name mangling. You have two options. First, your C compiler may have a #pragma directive that lets you specify how externs are compiled (Watcom C/C++, for example, allows that and uses it itself to define C calls, PASCAL calls and similar specifications). If such a directive is available, define an "ASMCALL" convention (no underscores added, no uppercasing and so on) and declare all your calls to assembler code as ASMCALL. Otherwise, you will need to define aliases in the assembler code:

Code
        ; export the same entry point under all three spellings
        ; (with GNU as, the equivalent directive is .globl)
        public   myproc
        public   _myproc
        public   myproc_


myproc:
_myproc:
myproc_:
        push      eax
        ...

This should suffice. The underscore is either prefixed or postfixed, provided your externs are declared as "C". Try one such alias and see whether the corresponding error vanishes.

Offline bootstrap

Re: strange "undefined reference" only with 32-bit ASM symbols
« Reply #5 on: June 29, 2012, 06:26:01 am »
Thanks for the ideas and link.

The way this is handled is VERY BAD.  It is clear to me that the assembler, not the compiler, should prefix underscores or not based upon a switch, unless the compiler generates the object file directly, bypassing any assembler.  Then every symbol would be generated correctly for the target ABI (which is OS dependent).  Since the assembler appears not to have any such ability to prefix symbols, there is a serious and inherent problem for anyone writing assembly language code to build and run on both linux and windoze.

One thing to remember is this.  The functions within OS libraries (*.so on linux and *.dll on windoze) either have or do not have leading underscores.  When an application is built, that is a fixed fact that cannot be changed.  Therefore, some potential solutions are problematic.  For example, it doesn't help much to add or strip underscore prefixes from the symbols in your own application functions, because then, when functions are called, some need to have underscores prepended, and others don't - a seriously bad situation.

So it is a fixed fact of life that build tools need to generate prefixes on win32 --- but not win64 or linux32 or linux64.  The problem at hand is... the mingw tools only generate underscore prefixes for win32 C files, but not win32 assembly-language files.  Again, I claim this is a serious mistake in the design of the 32-bit assembler, for the same reason it would be a serious mistake for the C/C++ compilers to not generate underscore prefixes when generating win32 object files.

I eventually created a FUNC(x) macro in my .s file, which I had to rename as a .S file to enable preprocessor support within the assembly-language file.  This macro prepends __USER_LABEL_PREFIX__ to the symbol inside the macro parentheses.  __USER_LABEL_PREFIX__ is "_" on 32-bit windoze and empty on linux and 64-bit windoze.
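
For anyone following along, here is a rough sketch of how such a FUNC(x) macro can be built. It is written as a plain C file so the expansion can be inspected; the same #define lines would sit at the top of the .S file, where the labels would then be written as `.globl FUNC(math_sin)` and `FUNC(math_sin):`. The helper names (CONCAT, STR, decorated_name, label_prefix, symbols_are_prefixed) are made up for illustration and are not from the original code. It assumes gcc or clang, which predefine __USER_LABEL_PREFIX__.

```c
#include <string.h>

/* Two-level expansion so __USER_LABEL_PREFIX__ is expanded before pasting. */
#define CONCAT2(a, b) a##b
#define CONCAT(a, b)  CONCAT2(a, b)

/* FUNC(math_sin) becomes _math_sin where the target prefixes C symbols
   and stays math_sin where it does not. */
#define FUNC(x) CONCAT(__USER_LABEL_PREFIX__, x)

/* Stringify after expansion, only so the result can be looked at from C. */
#define STR2(x) #x
#define STR(x)  STR2(x)

/* The prefix the compiler puts on C symbols for this target ("" or "_"). */
const char *label_prefix(void)
{
    return STR(__USER_LABEL_PREFIX__);
}

/* The name the assembly file has to export so that C callers can link. */
const char *decorated_name(void)
{
    return STR(FUNC(math_sin));
}

/* Nonzero when this target decorates C symbols with a prefix. */
int symbols_are_prefixed(void)
{
    return strcmp(label_prefix(), "") != 0;
}
```

Printing decorated_name() in each target is a quick way to confirm which spelling the linker expects there.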

However, I think the solution Radek proposes is cleaner.  Unfortunately, I don't think this will let me rename my assembly-language file back to .s again though, because I probably need to keep the preprocessor enabled on both my 32-bit and 64-bit assembly-language files to overcome yet another stupid decision on the windoze side --- their stupid decision on what registers need to be preserved across function calls, especially SIMD registers --- where this becomes revoltingly gross when you realize they only require the lower 128-bits of the preserved 256-bit ymm registers be preserved.  Sheesh, what a kludge!  Typical macroshaft design.  Linux is so clean and efficient in comparison.

So the Radek solution works for me.  I'm not 100% clear whether problems might occur on a system where multiple symbols like "malloc", "_malloc" and "__malloc" exist (at different addresses in the code).  I'm also not entirely clear what happens with assembly language code that must call functions outside itself in a portable way.

I guess one other solution isn't too horribly bad, though it does render the code forever not capable of compilation with macroshaft tools --- which will very soon be quite fine with me!  And that is appending an "asm" suffix on function declarations in the .h files for assembly-language files.  I haven't tried this yet, but it appears changing:

int funcname();
... to ...
int funcname() asm ("funcname");

might do the trick --- by telling the compiler to NOT prepend the underline in code that calls those functions.
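
A minimal sketch of that asm-label idea, assuming gcc or clang (this is a GNU extension, not standard C). The names math_answer and call_through_header are hypothetical. The C definition is only a stand-in for the routine that would really live in the .s file, so that the sketch links on its own; in a real project the .s file would export the plain label math_answer and only the prototype would appear in the header. The label goes on a prototype ahead of any definition.

```c
/* Header side: pin the linker-level name to exactly "math_answer",
   so the compiler adds no target-specific prefix when emitting calls. */
int math_answer(void) __asm__("math_answer");

/* Stand-in for the assembly routine (hypothetical).  Because the prototype
   above carries the asm label, this definition is also emitted under the
   undecorated symbol name. */
int math_answer(void)
{
    return 42;
}

/* An ordinary caller: it references the symbol "math_answer" as written
   in the label, whatever the target's usual decoration is. */
int call_through_header(void)
{
    return math_answer() + 1;
}
```

For .cpp callers the same label should also override C++ name mangling, but I would still check the nm output in each target rather than take that on faith.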