Chapter 2

Link-Editor

The link-editing process creates an output file from one or more input files. The creation of the output file is directed by the options supplied to the link-editor together with the input sections provided by the input files.

All files are represented in the executable and linking format (ELF). For a complete description of the ELF format see Chapter 7, Object File Format. For this introduction, however, it is first necessary to introduce two ELF structures, sections and segments.

Sections are the smallest indivisible units that can be processed within an ELF file. Segments are a collection of sections that represent the smallest individual units that can be mapped to a memory image by exec(2) or by the runtime linker ld.so.1(1).

Although there are many types of ELF sections, they all fall into two categories with respect to the link-editing phase:

Sections that contain program data, whose interpretation is meaningful only to the application itself, such as the program instructions .text and the associated data .data and .bss.
Sections that contain link-editing information, such as the symbol table information found from .symtab and .strtab, and relocation information such as .rela.text.

Basically, the link-editor concatenates the program data sections into the output file. The link-editing information sections are interpreted by the link-editor to modify other sections or to generate new output information sections used in later processing of the output file.

The following simple breakdown of link-editor functionality introduces the topics covered in this chapter:

It verifies and checks for consistency all the options passed to it.
It concatenates sections of the same characteristics (for example, type, attributes, and name) from the input relocatable objects to form new sections within the output file. These concatenated sections can in turn be associated to output segments.
It reads symbol table information from both relocatable objects and shared objects to verify and unite references with definitions, and usually generates a new symbol table, or tables, within the output file.
It reads relocation information from the input relocatable objects and applies this information to the output file by updating other input sections. In addition, output relocation sections might be generated for use by the runtime linker.
It generates program headers that describe all segments created.
It generates dynamic linking information sections if necessary, which provide information such as shared object dependencies and symbol bindings to the runtime linker.

The process of concatenating like sections and associating sections to segments is carried out using default information within the link-editor. The default section and segment handling provided by the link-editor is usually sufficient for most link-edits. However, these defaults can be manipulated using the -M option with an associated mapfile. See Chapter 8, Mapfile Option.

Invoking the Link-Editor

You can either run the link-editor directly from the command line or have a compiler driver invoke it for you. In the following two sections the description of both methods are expanded. However, using the compiler driver is the preferred choice. The compilation environment is often the consequence of a complex and occasionally changing series of operations known only to compiler drivers.

Direct Invocation

When you invoke the link-editor directly, you have to supply every object file and library required to create the intended output. The link-editor makes no assumptions about the object modules or libraries that you meant to use in creating the output. For example, when you issue the command:

$ ld test.o

the link-editor creates a dynamic executable named a.out using only the input file test.o. For the a.out to be a useful executable, it should include startup and exit processing code. This code can be language or operating system specific, and is usually provided through files supplied by the compiler drivers.

Additionally, you can also supply your own initialization and termination code. This code must be encapsulated and labeled correctly for it to be correctly recognized and made available to the runtime linker. This encapsulation and labeling can also be provided through files supplied by the compiler drivers.

When creating runtime objects such as executables and shared objects, you should use a compiler driver to invoke the link-editor. Invoking the link-editor directly is recommended only when creating intermediate relocatable objects using the -r option.

Using a Compiler Driver

The conventional way to use the link-editor is through a language-specific compiler driver. You supply the compiler driver, cc(1), CC(1), and so forth, with the input files that make up your application, and the compiler driver adds additional files and default libraries to complete the link-edit. These additional files can be seen by expanding the compilation invocation, for example:

$ cc -# -o prog main.o
/usr/ccs/bin/ld -dy /opt/COMPILER/crti.o /opt/COMPILER/crt1.o \
/usr/ccs/lib/values-Xt.o -o prog main.o \
-YP,/opt/COMPILER/lib:/usr/ccs/lib:/usr/lib -Qy -lc \
/opt/COMPILER/crtn.o

Note - The actual files included by your compiler driver and the mechanism used to display the link-editor invocation might differ.

Specifying the Link-Editor Options

Most options to the link-editor can be passed through the compiler driver command line. For the most part the compiler and the link-editor options do not conflict. Where a conflict arises, the compiler drivers usually provide a command-line syntax you can use to pass specific options to the link-editor. You can also provide options to the link-editor by setting the LD_OPTIONS environment variable. For example:

$ LD_OPTIONS="-R /home/me/libs -L /home/me/libs" cc -o prog main.c -lfoo

The -R and -L options will be interpreted by the link-editor and prepended to any command-line options received from the compiler driver.

The link-editor parses the entire option list for any invalid options or any options with invalid associated arguments. When either of these cases is found, a suitable error message is generated. If the error is deemed fatal, the link-edit terminates. In the following example, the illegal option -X is identified, and the illegal argument to the -z option is caught by the link-editor's checking.

$ ld -X -z sillydefs main.o
ld: illegal option -- X
ld: fatal: option -z has illegal argument `sillydefs'

If an option requiring an associated argument is mistakenly specified twice, the link-editor will provide a suitable warning but will continue with the link-edit. For example:

$ ld -e foo ...... -e bar main.o
ld: warning: option -e appears more than once, first setting taken

The link-editor also checks the option list for any fatal inconsistencies. For example:

$ ld -dy -a main.o
ld: fatal: option -dy and -a are incompatible

After processing all options, if no fatal error conditions have been detected, the link-editor proceeds to process the input files.

See Appendix A, Link-Editor Quick Reference for the most commonly used link-editor options, and the ld(1) man page for a complete description of all link-editor options.

Input File Processing

The link-editor reads input files in the order in which they appear on the command line. Each file is opened and inspected to determine its ELF file type and therefore determine how it must be processed. The file types that apply as input for the link-edit are determined by the binding mode of the link-edit, either static or dynamic.

Under static mode, the link-editor accepts only relocatable objects or archive libraries as input files. Under dynamic mode, the link-editor also accepts shared objects.

Relocatable objects represent the most basic input file type to the link-editing process. The program data sections within these files are concatenated into the output file image being generated. The link-edit information sections are organized for later use, but do not become part of the output file image, as new sections are generated to take their places. Symbols are gathered into an internal symbol table for verification and resolution. This table is then used to create one or more symbol tables in the output image.

Although any input file can be specified directly on the link-edit command-line, archive libraries and shared objects are commonly specified using the -l option. See "Linking With Additional Libraries" for coverage of this mechanism and how it relates to the two different linking modes. However, even though shared objects are often referred to as shared libraries, and both of these objects can be specified using the same option, the interpretation of shared objects and archive libraries is quite different. The next two sections expand upon these differences.

Archive Processing

Archives are built using ar(1), and usually consist of a collection of relocatable objects with an archive symbol table. This symbol table provides an association of symbol definitions with the objects that supply these definitions. By default, the link-editor provides selective extraction of archive members. When the link-editor reads an archive, it uses information within the internal symbol table it is creating to select only the objects from the archive it requires to complete the binding process. You can also explicitly extract all members of an archive.

The link-editor will extract a relocatable object from an archive if:

The archive member contains a symbol definition that satisfies a symbol reference, sometimes referred to as an undefined symbol, presently held in the link-editor's internal symbol table.
The archive member contains a data symbol definition that satisfies a tentative symbol definition presently held in the link-editor's internal symbol table. An example of this is a FORTRAN COMMON block definition, which will cause the extraction of a relocatable object that defines the same DATA symbol.
The link-editors -z allextract is in effect. This option suspends selective archive extraction and causes all archive members to be extracted from the archive being processed.

Under selective archive extraction, a weak symbol reference will not cause the extraction of an object from an archive unless the -z weakextract option is in effect. See "Simple Resolutions" for more information.

Note - The options -z weakextract, -z allextract, and -z defaultextract enable you to toggle the archive extraction mechanism among multiple archives.

With selective archive extraction, the link-editor makes multiple passes through an archive to extract relocatable objects as needed to satisfy the symbol information being accumulated in the link-editor internal symbol table. After the link-editor has made a complete pass through the archive without extracting any relocatable objects, it moves on to process the next input file.

By extracting from the archive only the relocatable objects needed at the time the archive was encountered, the position of the archive within the input file list can be significant. See "Position of an Archive on the Command Line".

Note - Although the link-editor makes multiple passes through an archive to resolve symbols, this mechanism can be quite costly for large archives containing random organizations of relocatable objects. In these cases, you should use tools like lorder(1) and tsort(1) to order the relocatable objects within the archive and so reduce the number of passes the link-editor must carry out.