Cross compiler
From Wikipedia, the free encyclopedia
A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is run. Cross compilers are generally used to build software for embedded systems or for multiple platforms. They are needed for platforms on which it is inconvenient or impossible to compile natively, such as microcontrollers that have only enough memory for their own purpose. Cross compilers have also become more common in paravirtualization, where a system may have one or more platforms in use.
Uses of cross compilers
The fundamental use of a cross compiler is to separate the build environment from the target environment. This is useful in a number of situations:
- Embedded computers where a device has extremely limited resources. For example, a microwave oven will have an extremely small computer to read its touchpad and door sensor, provide output to a digital display and speaker, and control the machinery for cooking food. This computer will not be powerful enough to run a compiler, a file system, or a development environment. Since debugging and testing may also require more resources than are available on an embedded system, cross-compilation can be less involved and less prone to errors than native compilation.
- Compiling for multiple machines. For example, a company may wish to support several different versions of an operating system or to support several different operating systems. By using a cross compiler, a single build environment can be set up to compile for each of these targets.
- Compiling on a server farm. As with compiling for multiple machines, a complicated build that involves many compile operations can be executed on any machine that is free, regardless of its brand or the version of the operating system it currently runs.
- Bootstrapping to a new platform. When developing software for a new platform, or the emulator of a future platform, one uses a cross compiler to compile necessary tools such as the operating system and a native compiler.
- Compiling native code for emulators of older, now-obsolete platforms such as the Commodore 64 or Apple II. Enthusiasts use cross compilers that run on a current platform (such as Aztec C's MS-DOS 6502 cross compilers running under Windows XP).
Use of virtual machines (such as Java's JVM) resolves some of the reasons for which cross compilers were developed. The virtual machine paradigm allows the same compiler output to be used across multiple target systems.
Typically the hardware architecture differs (e.g. compiling a program destined for the MIPS architecture on an x86 computer) but cross-compilation is also applicable when only the operating system environment differs, as when compiling a FreeBSD program under Linux, or even just the system library, as when compiling programs with uClibc on a glibc host.
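The differences described above (processor, operating system, C library) are conventionally encoded in a GNU-style platform name, like the some-target value passed to --target later in this article. The following is a minimal bash sketch that splits such a name into its parts, assuming names of the form cpu-vendor-os. The function name describe_target and the sample names are illustrative only and are not part of any tool.

```shell
#!/usr/bin/env bash
# Split a GNU-style platform name of the assumed form cpu-vendor-os.
# describe_target is a hypothetical helper written for this illustration.
describe_target() {
    local name=$1
    local cpu=${name%%-*}      # text before the first dash
    local rest=${name#*-}      # text after the first dash
    local vendor=${rest%%-*}   # second field
    local os=${rest#*-}        # everything after the second dash
    echo "cpu=$cpu vendor=$vendor os=$os"
}

# Sample names chosen to mirror the cases in the text:
describe_target mips-unknown-linux-gnu      # different hardware architecture
describe_target x86_64-unknown-freebsd      # different operating system
describe_target arm-unknown-linux-uclibc    # different system library
```

Names with fewer than three fields are not handled by this sketch.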
Canadian Cross
The Canadian Cross is a technique for building cross compilers for other machines. Given three machines A, B, and C, one uses machine A to build a cross compiler that runs on machine B to create executables for machine C. When using the Canadian Cross with GCC, there may be four compilers involved:
- The proprietary native compiler for machine A (1) is used to build the GCC native compiler for machine A (2).
- The GCC native compiler for machine A (2) is used to build the GCC cross compiler from machine A to machine B (3).
- The GCC cross compiler from machine A to machine B (3) is used to build the GCC cross compiler from machine B to machine C (4).
The end-result cross compiler (4) cannot run on the build machine A; instead, it is used on machine B to compile an application into executable code that is then copied to machine C and executed there.
For instance, NetBSD provides a POSIX Unix shell script named build.sh which will first build its own toolchain with the host's compiler; this, in turn, will be used to build the cross-compiler which will be used to build the whole system.
The term Canadian Cross came about because at the time that these issues were all being hashed out, Canada had three national political parties.[1]
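The chain of compilers described in this section can be summarised by recording, for each compiler, the machine it runs on and the machine it generates code for. The following is a minimal bash sketch in which A, B, and C stand for the three machines in the text. The stage helper is a hypothetical name used only for this illustration.

```shell
#!/usr/bin/env bash
# Print one line per compiler in the Canadian Cross chain.
# Arguments: compiler number, machine it runs on, machine it targets.
# stage is a hypothetical helper, not part of GCC.
stage() {
    local n=$1 runs_on=$2 targets=$3
    local kind=cross
    if [ "$runs_on" = "$targets" ]; then
        kind=native
    fi
    echo "compiler $n: runs on $runs_on, generates code for $targets ($kind)"
}

stage 1 A A   # proprietary native compiler
stage 2 A A   # GCC native compiler, built by (1)
stage 3 A B   # GCC cross compiler, built by (2)
stage 4 B C   # GCC cross compiler, built by (3); it cannot run on A
```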
GCC and cross compilation
GCC, a free software collection of compilers, can be set up to cross compile. It supports many platforms and languages. However, due to limited volunteer time and the huge amount of work it takes to maintain working cross compilers, in many releases some of the cross compilers are broken.
GCC requires that a compiled copy of binutils be available for each targeted platform. Especially important is the GNU Assembler. Therefore, binutils first has to be compiled correctly with the switch --target=some-target passed to the configure script. GCC also has to be configured with the same --target option. GCC can then be built normally, provided that the tools which binutils creates are available in the path, which can be done using the following (on UNIX-like operating systems with bash):
PATH=/path/to/binutils/bin:$PATH; make
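The sequence described above can be sketched as a bash script. This is a dry run that only prints the commands it would execute. The source and build directory names, the install prefix, and the run helper are assumptions made for illustration, and some-target is the placeholder used in the text. --target and --prefix are standard options of both configure scripts.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the commands for building binutils, then GCC,
# for the same target. Nothing is actually configured or compiled.
TARGET=some-target          # placeholder target name from the text
PREFIX=/path/to/binutils    # assumed install prefix; tools land in $PREFIX/bin

run() { echo "+ $*"; }      # hypothetical helper: print instead of execute

build_cross_toolchain() {
    run mkdir -p build-binutils build-gcc
    # 1. binutils (assembler, linker) configured for the target
    run cd build-binutils
    run ../binutils-src/configure --target="$TARGET" --prefix="$PREFIX"
    run make
    run make install
    # 2. GCC configured with the same --target, with the new tools on PATH
    run cd ../build-gcc
    run export PATH="$PREFIX/bin:\$PATH"
    run ../gcc-src/configure --target="$TARGET" --prefix="$PREFIX"
    run make
}

build_cross_toolchain
```

Replacing the body of run with "$@" would execute the commands instead of printing them, given real source trees in the assumed directories.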
Cross compiling GCC requires that a portion of the target platform's C standard library be available on the host platform. At least the crt0, ... components of the library must be available. One may choose to compile the full C library, but that can be too large for many platforms. The alternative is to use newlib, which is a small C library containing only the most essential components required to compile C source code. To configure GCC with newlib, use the switch --with-newlib.
The GNU autotools packages (i.e. autoconf, automake, and libtool) use the notion of a build platform, a host platform, and a target platform. The build platform is where the code is actually compiled. The host platform is where the compiled code will execute. The target platform usually applies only to compilers: it names the platform for which the package itself will produce object code (as when cross-compiling a cross-compiler); otherwise the target platform setting is irrelevant. For example, consider cross-compiling a video game that will run on a Dreamcast. The machine where the game is compiled is the build platform, while the Dreamcast is the host platform.
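The three platform names can be compared to tell what kind of build is being configured. The following is a minimal bash sketch of that comparison. The classify function and the platform names passed to it (pc, workstation, dreamcast) are hypothetical and serve only to illustrate the rules in the paragraph above.

```shell
#!/usr/bin/env bash
# Classify a configuration from its build, host, and target platforms.
# An omitted target is treated as equal to the host, reflecting that the
# target setting is irrelevant for packages that are not compilers.
classify() {
    local build=$1 host=$2 target=${3:-$2}
    if [ "$build" = "$host" ] && [ "$host" = "$target" ]; then
        echo "native build"
    elif [ "$build" = "$host" ]; then
        echo "building a cross compiler"
    elif [ "$host" = "$target" ]; then
        echo "cross-compiling"
    else
        echo "cross-compiling a cross compiler"
    fi
}

classify pc pc                      # program built and run on the same machine
classify pc dreamcast               # the video game example from the text
classify pc pc dreamcast            # compiler that runs on the PC, targets the Dreamcast
classify pc workstation dreamcast   # all three platforms differ
```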
Manx Aztec C cross compilers
Manx Software Systems, of Shrewsbury, New Jersey, produced C compilers beginning in the 1980s, targeted at professional developers, for a variety of platforms up to and including PCs and Macs.
Manx's Aztec C programming language was available for a variety of older, now-obsolete platforms including MS-DOS, Apple II DOS 3.3 and ProDOS, Commodore 64, Macintosh 68XXX [2] and Amiga.
From the 1980s and throughout the 1990s, until Manx Software Systems disappeared, the MS-DOS version of Aztec C [3] was offered both as a native mode compiler and as a cross compiler for other platforms with different processors, including the Commodore 64 [4] and Apple II [5]. Internet distributions of Aztec C still exist, including the MS-DOS based cross compilers, and they are still in use today.
Manx's Aztec C86, their native mode 8086 MS-DOS compiler, was also a cross compiler. Although it did not compile code for a different processor, as their Aztec C65 6502 cross compilers for the Commodore 64 and Apple II did, it created binary executables for then-legacy operating systems for the 16-bit 8086 family of processors.
When the IBM PC was first introduced, it was available with a choice of operating systems, CP/M-86 and PC DOS being two of them. Aztec C86 was provided with link libraries for generating code for both IBM PC operating systems. Throughout the 1980s, later versions of Aztec C86 (3.xx, 4.xx and 5.xx) added support for the MS-DOS "transitory" versions 1 and 2 [6], which were less robust than the "baseline" MS-DOS version 3 and later, which Aztec C86 targeted until its demise.
Finally, Aztec C86 provided C language developers with the ability to produce ROM-able "HEX" code, which could then be transferred using a ROM burner directly to an 8086-based system. Paravirtualization may be more common today, but the practice of creating low-level ROM code was more common per capita during those years, when device driver development was often done by application programmers for individual applications and new devices amounted to a cottage industry. It was not uncommon for application programmers to interface directly with hardware without support from the manufacturer. This practice was similar to embedded systems development today.
Thomas Fenwick and James Goodnow II were the two principal developers of Aztec C. Fenwick later became notable as the author of the Microsoft Windows CE kernel, or NK ("New Kernel") as it was then called.[7]
Microsoft C cross compilers
Early History - 1980s
Microsoft C (MSC) has a long history[8] dating back to the 1980s. The first Microsoft C compilers were made by the same company that made Lattice C and were rebranded by Microsoft as their own, until MSC 4, which was the first version that Microsoft produced themselves.[9]
In 1987 many developers started switching to Microsoft C, and many more would follow throughout the development of Microsoft Windows to its present state. Products like Clipper and later Clarion emerged that offered easy database application development by using cross-language techniques, allowing part of their programs to be written in Microsoft C.
1987
C programs had long been linked with modules written in assembly language. C compilers themselves were usually written in assembly language, and most C compilers (even current ones) can emit assembly language output, which can be "tweaked" for efficiency and then assembled and linked with the rest of the program.
Compilers like Aztec C converted everything to assembly language in one distinct pass and then assembled the code in another, and were noted for their very efficient and small code. By 1987, however, the optimizer built into Microsoft C was very good, and usually only "mission critical" parts of a program were considered for rewriting. In fact, C had taken over as the "lowest-level" language: programming was becoming a multi-disciplinary growth industry, projects were becoming larger, and programmers were writing user interfaces and database interfaces in higher-level languages. This created a need for cross-language development that continues to this day.
By 1987, with the release of MSC 5.1, Microsoft offered a cross-language development environment for MS-DOS. 16-bit binary object code written in assembly language (MASM) and in Microsoft's other languages, including QuickBASIC, Pascal, and Fortran, could be linked together into one program, in a process then called "Mixed Language Programming" and now "InterLanguage Calling".[10] If BASIC was used in this mix, the main program needed to be in BASIC to support the internal runtime that compiled BASIC required for garbage collection and for its other managed operations, which simulated a BASIC interpreter like QBasic in MS-DOS.
The C code in particular needed to be written to pass its variables in "reverse order" on the stack (the opposite of the order a C compiler normally uses) and return its values on the stack rather than in a processor register. There were other programming rules to make all the languages work together, but this particular rule persisted through the cross-language development that continued throughout the 16-bit and 32-bit versions of Windows and in the development of programs for OS/2, and it persists to this day. It is known as the "Pascal calling convention", but it is so common that it is taken for granted and the term is rarely used.
Another type of cross compilation that Microsoft C was used for during this time was in retail applications that required handheld devices like the Symbol Technologies PDT3100 (used to take inventory), which provided a link library targeted at an 8088-based bar code scanner. The application was built on the host computer and then transferred (via a serial cable) to the handheld device, where it was run. This is comparable to what is done today for the same market using Windows Mobile by companies like Motorola, which bought Symbol.
Early 1990s
Throughout the 1990s, beginning with MSC 6 (their first ANSI C compliant compiler), Microsoft re-focused their C compilers on the emerging Windows market, and also on OS/2 and the development of GUI programs. Mixed-language compatibility remained through MSC 6 on the MS-DOS side, but the API for Microsoft Windows 3.0 and 3.1 was written in MSC 6. MSC 6 was also extended to provide support for 32-bit assemblies and for the emerging Windows for Workgroups (WFW) and Windows NT, which would form the foundation for Windows XP. A programming practice called a thunk was even introduced to allow calls to pass between 16-bit and 32-bit programs that took advantage of runtime binding rather than the static binding that was favoured in monolithic 16-bit MS-DOS applications. Static binding is still favoured by some native code developers, but does not generally provide the degree of re-use required by newer best practices like CMM.
MS-DOS support was still provided with the release of Microsoft's first C++ compiler, MSC 7, which was backward compatible with the C programming language and MS-DOS and supported both 16-bit and 32-bit code generation.
It is fair to say that at this point MSC took over where Aztec C86 left off. The market for C compilers had turned to cross compilers that took advantage of the latest Windows features, offered C and C++ in a single bundle, and still supported MS-DOS systems that were already a decade old. The smaller companies that produced compilers like Aztec C could no longer compete, and either turned to niche markets like embedded systems or disappeared.
MS-DOS and 16-bit code generation support continued until MSC 8.00c, which was bundled with Microsoft C++ and Microsoft Application Studio 1.5, the forerunner of Microsoft Visual Studio, which is the cross development environment that Microsoft provides today.
Late 1990s
MSC 12 was released with Microsoft Visual Studio 6. It no longer provided support for MS-DOS 16-bit binaries, providing support for 32-bit console applications instead, and it supported code generation for Windows 95 and Windows 98 as well as for Windows NT. Link libraries were available for other processors that ran Microsoft Windows, a practice that Microsoft continues to this day.
MSC 13 was released with Visual Studio 2003, and MSC 14 was released with Visual Studio 2005. Both can still produce code for older systems like Windows 95, and can also produce code for several other target platforms, including the mobile market and the ARM processor.
DotNET and Beyond
In 2001 Microsoft developed the Common Language Runtime (CLR), which formed the core of their DotNET (.NET) programming environment. This layer on top of the operating system (which in Windows was by now integrated with the GUI) freely allowed the mixing of development languages, but C itself was dropped from the higher-level mix in favour of C++ and the new C# language, which provides the "unsafe" keyword[11]; both support the use of C. This practice is known by some as "Managed Code". While this is more efficient than runtimes like Java, the high-level code produced by Microsoft can at this point no longer really be considered cross compiled, since the DotNET runtime and CLR are for most practical purposes required to reach the core routines for the processor and the devices on the target computer.
Although only the command line C compiler in Visual Studio 2005 can be considered a cross compiler, other technologies have emerged that provide runtimes for other platforms including Linux, such as Mono. Third-party add-ons like Qt and its predecessors, including XVT Design, provide source-code-level cross development capability with other platforms, while still using Microsoft C to build the Windows versions. Other compilers like MinGW have also become popular in this area, since they are more directly compatible with the Unixes that comprise the non-Windows side of software development, allowing those developers to target all platforms using a familiar build environment.
Since the operating system is integrated with the GUI in Windows, unlike in Unix, terms like "cross platform development environment" are more often correct in Windows than "cross compiler" when describing the IDEs that are used to compile programs. In Unixes like Linux it is arguably more correct today to use the term "cross compiler", since native code is still commonly generated, by some, on a build machine to run on other processors and other distros.
However, other Microsoft development environments for platforms like Windows Mobile offer not only cross compilers but also emulators and remote deployment environments that require very little configuration, unlike the cross compilers of days gone by or on other platforms.
References
- [1] 4.9 Canadian Crosses. CrossGCC. Retrieved on 2007-10-11. “This is called a `Canadian Cross' because at the time a name was needed, Canada had three national parties.”
- [2] Obsolete Macintosh Computers
- [3] Aztec C
- [4] Commodore 64
- [5] Apple II
- [6] MS DOS Timeline
- [7] Inside Windows CE (search for Fenwick)
- [8] Microsoft Language Utility Version History
- [9] History of PC based C-compilers
- [10] Which Basic Versions Can CALL C, FORTRAN, Pascal, MASM
- [11] Unsafe Programming In C#
External links
- http://www.airs.com/ian/configure/configure_5.html is a book reference for configuring GNU cross compilation tools
- Building Cross Toolchains with gcc is a wiki of other GCC cross-compilation references
- http://www.scratchbox.org/ Scratchbox is a toolkit for Linux cross-compilation to ARM and x86 targets
- Crosstool is a set of scripts that create a Linux cross-compile environment for the desired architecture, including embedded systems
- buildroot is another set of scripts for building a uClibc-based toolchain, usually for embedded systems
- T2 SDE is another set of scripts for building whole Linux Systems based on either GNU libC, uClibc or dietlibc for a variety of architectures
- Cross Linux from Scratch Project
- http://www.rubygarden.org/ruby?RubyOnUCLinux Entry on cross compiling Ruby to uClinux
- IBM has a clearly structured tutorial about cross-building a GCC toolchain. The tutorial appears to be available as a PDF file, although IBM requests that readers register to access it