LLVM
Original author(s) | Vikram Adve, Chris Lattner |
---|---|
Developer(s) | LLVM Developer Group |
Initial release | 2003 |
Stable release | 4.0.1 / 4 July 2017[1] |
Repository | llvm |
Written in | C++ |
Operating system | Cross-platform |
Type | Compiler |
License | University of Illinois/NCSA Open Source License[2] |
Website | llvm |
The LLVM compiler infrastructure project (formerly Low Level Virtual Machine) is a "collection of modular and reusable compiler and toolchain technologies"[3] used to develop compiler front ends and back ends.
LLVM is written in C++ and is designed for compile-time, link-time, run-time, and "idle-time" optimization of programs written in arbitrary programming languages. Although LLVM was originally implemented for C and C++, its language-agnostic design has since spawned a wide variety of front ends: languages with compilers that use LLVM include ActionScript, Ada, C#,[4][5][6] Common Lisp, Crystal, D, Delphi, Fortran, OpenGL Shading Language, Halide, Haskell, Java bytecode, Julia, Lua, Objective-C, Pony,[7] Python, R, Ruby,[8] Rust, CUDA, Scala,[9] Swift, and Xojo.
The LLVM project started in 2000 at the University of Illinois at Urbana–Champaign, under the direction of Vikram Adve and Chris Lattner. LLVM was originally developed as a research infrastructure to investigate dynamic compilation techniques for static and dynamic programming languages. LLVM was released under the University of Illinois/NCSA Open Source License,[2] a permissive free software license. In 2005, Apple Inc. hired Lattner and formed a team to work on the LLVM system for various uses within Apple's development systems.[10] LLVM is an integral part of Apple's latest development tools for macOS and iOS.[11] Since 2013, Sony has used Clang, LLVM's primary front end, in the software development kit (SDK) of its PS4 console.[12]
The name LLVM was originally an initialism for Low Level Virtual Machine, but this became less apt as LLVM grew into an "umbrella project" that included a variety of other compiler and low-level tool technologies, so the project abandoned the initialism.[13] Now, LLVM is a brand that applies to the LLVM umbrella project, the LLVM intermediate representation (IR), the LLVM debugger, the LLVM C++ Standard Library (with full support of C++11 and C++14[14]), etc. LLVM is administered by the LLVM Foundation. Its president is compiler engineer Tanya Lattner.[15]
The Association for Computing Machinery presented Adve, Lattner, and Evan Cheng with the 2012 ACM Software System Award for LLVM.[16]
Features
LLVM can provide the middle layers of a complete compiler system, taking intermediate representation (IR) code from a compiler and emitting an optimized IR. This new IR can then be converted and linked into machine-dependent assembly language code for a target platform. LLVM can accept the IR from the GNU Compiler Collection (GCC) toolchain, allowing it to be used with a wide array of extant compilers written for that project.
LLVM can also generate relocatable machine code at compile-time or link-time or even binary machine code at run-time.
LLVM supports a language-independent instruction set and type system.[17] Each instruction is in static single assignment form (SSA), meaning that each variable (called a typed register) is assigned once and then frozen. This helps simplify the analysis of dependencies among variables. LLVM allows code to be compiled statically, as it is under the traditional GCC system, or left for late-compiling from the IR to machine code via just-in-time compilation (JIT), similar to Java. The type system consists of basic types such as integer or floating point numbers and five derived types: pointers, arrays, vectors, structures, and functions. A type construct in a concrete language can be represented by combining these basic types in LLVM. For example, a class in C++ can be represented by a mix of structures, functions and arrays of function pointers.
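The single-assignment property described above can be illustrated with a minimal sketch. The following Python code is not part of LLVM; the function name `to_ssa` and the tuple-based statement format are assumptions made only for this illustration. It renames straight-line three-address code so that every name is assigned exactly once. Control flow, and the phi nodes it would require, is deliberately left out.

```python
def to_ssa(statements):
    """Rename straight-line three-address code so each name is assigned once.

    `statements` is a list of (target, operator, operands) tuples (a
    hypothetical format for this sketch); operands are variable names or
    integer literals. Branches and phi nodes are out of scope.
    """
    version = {}   # variable name -> number of assignments seen so far
    result = []
    for target, op, operands in statements:
        # Each use refers to the most recent definition of the variable.
        renamed = []
        for operand in operands:
            if isinstance(operand, str):
                if operand not in version:
                    raise ValueError(f"{operand!r} used before assignment")
                renamed.append(f"{operand}.{version[operand]}")
            else:
                renamed.append(operand)
        # Each assignment creates a fresh name that is never written again.
        version[target] = version.get(target, 0) + 1
        result.append((f"{target}.{version[target]}", op, tuple(renamed)))
    return result


program = [
    ("x", "const", (1,)),
    ("x", "add", ("x", 2)),     # x = x + 2
    ("y", "mul", ("x", "x")),   # y = x * x
    ("x", "sub", ("y", 1)),     # x = y - 1
]
ssa = to_ssa(program)
for target, op, operands in ssa:
    print(target, "=", op, *operands)
```

After renaming, the three assignments to `x` become `x.1`, `x.2` and `x.3`, so the definition that reaches any use can be read directly from the name, which is what simplifies dependency analysis.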
The LLVM JIT compiler can optimize unneeded static branches out of a program at runtime, and thus is useful for partial evaluation in cases where a program has many options, most of which can easily be determined to be unneeded in a specific environment. This feature is used in the OpenGL pipeline of Mac OS X Leopard (v10.5) to provide support for missing hardware features.[18]
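The idea of removing static branches once the environment is known can be sketched on a toy program representation. The Python code below does not use LLVM and does not reflect how its JIT is implemented; the node kinds (`option`, `if`, `seq`, `op`), the option names and the function `specialize` are all hypothetical and exist only for this illustration.

```python
def specialize(node, known):
    """Return `node` with branches on known options resolved.

    Nodes are tuples in a hypothetical toy IR:
      ("option", name)            read a configuration option
      ("if", cond, then, else_)   conditional
      ("seq", a, b, ...)          run children in order
      ("op", name)                opaque run-time work
    `known` maps option names to booleans fixed for this environment.
    """
    kind = node[0]
    if kind == "option":
        name = node[1]
        return ("const", known[name]) if name in known else node
    if kind == "if":
        cond = specialize(node[1], known)
        if cond[0] == "const":
            # The branch is static here: keep only the side that can run.
            return specialize(node[2] if cond[1] else node[3], known)
        return ("if", cond, specialize(node[2], known), specialize(node[3], known))
    if kind == "seq":
        return ("seq",) + tuple(specialize(child, known) for child in node[1:])
    return node  # "op" and "const" nodes are left untouched


pipeline = (
    "seq",
    ("if", ("option", "has_hw_shader"), ("op", "gpu_shade"), ("op", "cpu_emulate_shade")),
    ("if", ("option", "debug"), ("op", "log"), ("op", "nop")),
)
# Only one option is known on this (hypothetical) machine.
specialized = specialize(pipeline, {"has_hw_shader": False})
print(specialized)
```

In this sketch the first conditional disappears because its option is known, while the second is kept because `debug` is still undetermined.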
Graphics code within the OpenGL stack can be left in intermediate representation, and then compiled when run on the target machine. On systems with high-end graphics processing units (GPUs), the resulting code remains quite thin, passing the instructions on to the GPU with minimal changes. On systems with low-end GPUs, LLVM will compile optional procedures that run on the local central processing unit (CPU) and emulate instructions that the GPU cannot run internally. LLVM improved performance on low-end machines using Intel GMA chipsets. A similar system was developed under the Gallium3D LLVMpipe, and incorporated into the GNOME shell to allow it to run without a proper 3D hardware driver loaded.[19]
For run-time performance of the compiled programs, GCC formerly outperformed LLVM by 10% on average.[20][21] Newer results indicate that LLVM has now caught up with GCC in this area, and is now compiling binaries of approximately equal performance.[22]
Components
LLVM has become an umbrella project containing multiple components.
Front ends
LLVM was originally written to be a replacement for the existing code generator in the GCC stack,[23] and many of the GCC front ends have been modified to work with it. LLVM currently supports compiling of Ada, C, C++, D, Delphi, Fortran, Haskell, Objective-C and Swift using various front ends, some derived from versions 4.0.1 and 4.2 of the GNU Compiler Collection (GCC).
Widespread interest in LLVM has led to several efforts to develop new front ends for a variety of languages. The one that has received the most attention is Clang, a new compiler supporting C, C++, and Objective-C. Primarily supported by Apple, Clang is aimed at replacing the C/Objective-C compiler in the GCC system with a system that integrates more easily with integrated development environments (IDEs) and has wider support for multithreading. Support for OpenMP directives has been included in Clang since release 3.8.[24]
The Utrecht Haskell compiler can generate code for LLVM. Though the generator is in the early stages of development, in many cases it has been more efficient than the C code generator.[25] The Glasgow Haskell Compiler (GHC) has a working LLVM back end that achieves a 30% speed-up of the compiled code relative to GHC's native code generator or to C code generation followed by compilation, while missing only one of the many optimizing techniques implemented by GHC.[26]
Many other components are in various stages of development, including, but not limited to, the Rust compiler, a Java bytecode front end, a Common Intermediate Language (CIL) front end, the MacRuby implementation of Ruby 1.9, various front ends for Standard ML, and a new graph coloring register allocator.
Intermediate representation
The core of LLVM is the intermediate representation (IR), a low-level programming language similar to assembly. IR is a strongly typed reduced instruction set computing (RISC) instruction set which abstracts away details of the target. For example, the calling convention is abstracted through call and ret instructions with explicit arguments. Also, instead of a fixed set of registers, IR uses an infinite set of temporaries of the form %0, %1, etc. LLVM supports three isomorphic (i.e., functionally equivalent) forms of IR: a human-readable assembly format, a C++ object format suitable for frontends, and a dense bitcode format for serializing. A simple "Hello, world!" program in the assembly format:
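; The string constant, including the newline and the terminating NUL byte
@.str = internal constant [14 x i8] c"hello, world\0A\00"

; External declaration of the C library's printf
declare i32 @printf(i8*, ...)

define i32 @main(i32 %argc, i8** %argv) nounwind {
entry:
    ; Get a pointer to the first character of the string
    %tmp1 = getelementptr [14 x i8], [14 x i8]* @.str, i32 0, i32 0
    %tmp2 = call i32 (i8*, ...) @printf(i8* %tmp1) nounwind
    ret i32 0
}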
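How a front end might map nested expressions onto numbered temporaries can be shown with a small sketch. The Python code below does not use the LLVM libraries; it only builds text that is meant to resemble the assembly format, and its output has not been checked with LLVM's own tools. The function name `emit_function` and the nested-tuple expression format are assumptions made for this illustration.

```python
import itertools


def emit_function(name, params, expr):
    """Emit IR-like text for `expr`, a nested tuple such as ("add", "a", 1).

    Leaves are parameter names (str) or integer constants; inner nodes are
    (opcode, lhs, rhs) with opcode in {"add", "sub", "mul"}. All values are
    treated as i32 in this sketch.
    """
    counter = itertools.count()   # source of fresh temporaries %0, %1, ...
    lines = []

    def lower(node):
        if isinstance(node, int):
            return str(node)
        if isinstance(node, str):
            if node not in params:
                raise ValueError(f"unknown parameter {node!r}")
            return f"%{node}"
        opcode, lhs, rhs = node
        if opcode not in ("add", "sub", "mul"):
            raise ValueError(f"unsupported opcode {opcode!r}")
        # Operands are lowered first, so each temporary is defined before use.
        left, right = lower(lhs), lower(rhs)
        temp = f"%{next(counter)}"
        lines.append(f"  {temp} = {opcode} i32 {left}, {right}")
        return temp

    result = lower(expr)
    signature = ", ".join(f"i32 %{p}" for p in params)
    return "\n".join(
        [f"define i32 @{name}({signature}) {{", "entry:"]
        + lines
        + [f"  ret i32 {result}", "}"]
    )


# (a + b) * (a - 1)
ir_text = emit_function("f", ["a", "b"], ("mul", ("add", "a", "b"), ("sub", "a", 1)))
print(ir_text)
```

Each intermediate result receives its own temporary and no temporary is ever reassigned, so the generator never has to decide which machine register holds a value; that decision is left to the back end.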
Back ends
As of version 3.4, LLVM supports many instruction sets, including ARM, Qualcomm Hexagon, MIPS, Nvidia Parallel Thread Execution (PTX; called NVPTX in LLVM documentation), PowerPC, AMD TeraScale,[28] AMD Graphics Core Next (GCN), SPARC, z/Architecture (called SystemZ in LLVM documentation), x86/x86-64, and XCore. Some features are not available on some platforms. Most features are present for x86/x86-64, z/Architecture, ARM, and PowerPC.[29]
The LLVM machine code (MC) subproject is LLVM's framework for translating machine instructions between textual forms and machine code. Formerly, LLVM relied on the system assembler, or one provided by a toolchain, to translate assembly into machine code. LLVM MC's integrated assembler supports most LLVM targets, including x86, x86-64, ARM, and ARM64. For some targets, including the various MIPS instruction sets, integrated assembly support is usable but still in the beta stage.
Linker
The lld subproject is an attempt to develop a built-in, platform-independent linker for LLVM.[30] lld aims to remove dependence on a third-party linker. As of May 2017, lld supports ELF, PE/COFF, and Mach-O in descending order of completeness.[30] In cases where lld is insufficient, another linker such as GNU ld can be used.
Using lld allows link-time optimization. When link-time optimization is enabled, the compiler generates LLVM bitcode instead of native code, and native code generation is done by the linker.
C++ Standard Library
The LLVM project includes an implementation of the C++ Standard Library, dual-licensed under the MIT License and the UIUC license.[31]
Debugger
History
The LLVM project started in 2000 at the University of Illinois at Urbana–Champaign, under the direction of Vikram Adve and Chris Lattner. LLVM was originally developed as a research infrastructure to investigate dynamic compilation techniques for static and dynamic programming languages. LLVM was released under the University of Illinois/NCSA Open Source License,[2] a permissive free software license. In 2005, Apple Inc. hired Lattner and formed a team to work on the LLVM system for various uses within Apple's development systems.[10] LLVM is an integral part of Apple's latest development tools for macOS and iOS.[11] Since 2013, Sony has used Clang, LLVM's primary front end, in the software development kit (SDK) of its PS4 console.[32]
Version | Release date |
---|---|
4.0.1 | 4 July 2017 |
4.0 | 13 March 2017 |
3.9.1 | 23 December 2016 |
3.9.0 | 2 September 2016 |
3.8.1 | 11 July 2016 |
3.8.0 | 8 March 2016 |
3.7.1 | 5 January 2016 |
3.7.0 | 1 September 2015 |
3.6.2 | 16 July 2015 |
3.6.1 | 26 May 2015 |
3.6.0 | 27 February 2015 |
3.5.2 | 2 April 2015 |
3.5.1 | 20 January 2015 |
3.5.0 | 3 September 2014 |
3.4.2 | 19 June 2014 |
3.4.1 | 7 May 2014 |
3.4.0 | 2 January 2014 |
3.3 | 17 June 2013 |
3.2 | 20 December 2012 |
3.1 | 22 May 2012 |
3.0 | 1 December 2011 |
2.9 | 6 April 2011 |
2.8 | 5 October 2010 |
2.7 | 27 April 2010 |
2.6 | 23 October 2009 |
2.5 | 2 March 2009 |
2.4 | 9 November 2008 |
2.3 | 9 June 2008 |
2.2 | 11 February 2008 |
2.1 | 26 September 2007 |
2.0 | 23 May 2007 |
1.9 | 19 November 2006 |
1.8 | 9 August 2006 |
1.7 | 20 April 2006 |
1.6 | 8 November 2005 |
1.5 | 18 May 2005 |
1.4 | 9 December 2004 |
1.3 | 13 August 2004 |
1.2 | 19 March 2004 |
1.1 | 17 December 2003 |
1.0 | 24 October 2003 |
See also
- AMD Optimizing C/C++ Compiler
- C--
- Amsterdam Compiler Kit (ACK)
- LLDB (debugger)
- GNU lightning
- GNU Compiler Collection (GCC)
- Pure
- OpenCL
- Emscripten
- TenDRA Distribution Format
- Architecture Neutral Distribution Format (ANDF)
- Comparison of application virtual machines
- SPIR-V
- University of Illinois at Urbana Champaign discoveries & innovations
Literature
- Chris Lattner - The Architecture of Open Source Applications - Chapter 11 LLVM, ISBN 978-1257638017, released 2012 under CC BY 3.0 (Open Access).[34]
- LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation, a published paper by Chris Lattner, Vikram Adve
References
- Stellard, Tom (4 July 2017). "LLVM 4.0.1 Release". llvm-announce (Mailing list). Retrieved 15 July 2017.
- "License", LLVM: Frequently Asked Questions, llvm.org, retrieved 27 January 2012
- "The LLVM Compiler Infrastructure Project". Retrieved 11 March 2016.
- Announcing LLILC - A new LLVM-based Compiler for .NET, retrieved 17 April 2015
- Mono LLVM, retrieved 10 March 2013
- LLVM, Chris Lattner, in The Architecture of Open Source Applications, edited by Amy Brown, Greg Wilson, 2011
- "The LLVM Compiler Infrastructure Project". llvm.org. Retrieved 25 May 2016.
- "Features". RubyMotion. Scratchwork Development LLC. Retrieved 17 June 2017. "RubyMotion transforms the Ruby source code of your project into ... machine code using a[n] ... ahead-of-time (AOT) compiler, based on LLVM."
- Reedy, Geoff (24 September 2012). "Compiling Scala to LLVM". St. Louis, Missouri, United States. Retrieved 19 February 2013.
- Adam Treat (19 February 2005), mkspecs and patches for LLVM compile of Qt4, archived from the original on 4 October 2011, retrieved 27 January 2012
- "Apple LLVM Compiler", Developer Tools, Apple, retrieved 27 January 2012
- Developer Toolchain for ps4 (PDF), retrieved 24 February 2015
- Lattner, Chris (21 December 2011). "The name of LLVM". llvm-dev (Mailing list). Retrieved 2 March 2016.
- "'libc++' C++ Standard Library".
- Chris Lattner (3 April 2014). "The LLVM Foundation". LLVM Project Blog.
- "ACM Awards". ACM.
- "LLVM Language Reference Manual". Retrieved 16 April 2012.
- Chris Lattner (15 August 2006). "A cool use of LLVM at Apple: the OpenGL stack". llvm-dev (Mailing list). Retrieved 1 March 2016.
- Michael Larabel, "GNOME Shell Works Without GPU Driver Support", phoronix, 6 November 2011
- V. Makarov. "SPEC2000: Comparison of LLVM-2.9 and GCC4.6.1 on x86". Retrieved 3 October 2011.
- V. Makarov. "SPEC2000: Comparison of LLVM-2.9 and GCC4.6.1 on x86_64". Retrieved 3 October 2011.
- Michael Larabel (27 December 2012). "LLVM/Clang 3.2 Compiler Competing With GCC". Retrieved 31 March 2013.
- Lattner, Chris; Vikram Adve (May 2003). Architecture For a Next-Generation GCC. First Annual GCC Developers' Summit. Retrieved 6 September 2009.
- "Clang 3.8 Release Notes". Retrieved 24 August 2016.
- "Compiling Haskell To LLVM". Retrieved 22 February 2009.
- "LLVM Project Blog: The Glasgow Haskell Compiler and LLVM". Retrieved 13 August 2010.
- For the full documentation, refer to llvm.org/docs/LangRef.html
- Stellard, Tom (26 March 2012). "[LLVMdev] RFC: R600, a new backend for AMD GPUs". llvm-dev (Mailing list).
- Target-specific Implementation Notes: Target Feature Matrix // The LLVM Target-Independent Code Generator, LLVM site.
- "lld - The LLVM Linker". The LLVM Project. Retrieved 10 May 2017.
- "'libc++' C++ Standard Library".
- Developer Toolchain for ps4 (PDF), retrieved 24 February 2015
- http://llvm.org/releases/
- Chris Lattner (15 March 2012). "Chapter 11". The Architecture of Open Source Applications. Amy Brown, Greg Wilson. ISBN 978-1257638017.
External links
- Official website
- LLVM Language Reference Manual, describes the LLVM intermediate representation
- LLVM - 2.0 and beyond! on YouTube
- Discussion of LLVM by John Siracusa at Ars Technica
- The Design of LLVM by Chris Lattner, Dr. Dobb's Journal, May 2012