Sun Microsystems, Inc.
spacerspacer
spacer www.sun.com docs.sun.com |
spacer
black dot
 
 
4.  Shared Objects Performance Considerations Relocations Copy Relocations  Previous   Contents   Next 
   
 

The link-editor has allocated space in the dynamic executable's .bss to receive the data represented by sys_errlist and sys_nerr. These data will be copied from the C library by the runtime linker at process initialization. Thus, each application that uses these data gets a private copy of the data in its own data segment.

There are two drawbacks to this technique. First, each application pays a performance penalty for the overhead of copying the data at runtime. Second, the size of the data array sys_errlist has now become part of the C library's interface. Suppose the size of this array were to change, prehaps as new error messages are added. Any dynamic executables that reference this array have to undergo a new link-edit to be able to access any of the new error messages. Without this new link-edit, the allocated space within the dynamic executable is insufficient to hold the new data.

These drawbacks can be eliminated if the data required by a dynamic executable are provided by a functional interface. The ANSI C function strerror(3C) will return a pointer to the appropriate error string, based on the error number supplied to it. One implementation of this function might be:

$ cat strerror.c
static const char * sys_errlist[] = {
        "Error 0",
        "Not owner",
        "No such file or directory",
        ......
};
static const int sys_nerr =
        sizeof (sys_errlist) / sizeof (char *);

char *
strerror(int errnum)
{
        if ((errnum < 0) || (errnum >= sys_nerr))
                return (0);
        return ((char *)sys_errlist[errnum]);
}

The error routine in foo.c can now be simplified to use this functional interface. This simplification in turn removes any need to perform the original copy relocations at process initialization.

Additionally, because the data are now local to the shared object, the data are no longer part of its interface. The shared object therefore has the flexibility of changing the data without adversely effecting any dynamic executables that use it. Eliminating data items from a shared object's interface generally improves performance while making the shared object's interface and code easier to maintain.

ldd(1), when used with either the -d or -r options, can verify any copy relocations that exist within a dynamic executable.

For example, suppose the dynamic executable prog had originally been built against the shared object libfoo.so.1 and the following two copy relocations had been recorded:

$ nm -x prog | grep _size_
[36]   |0x000207d8|0x40|OBJT |GLOB |15  |_size_gets_smaller
[39]   |0x00020818|0x40|OBJT |GLOB |15  |_size_gets_larger
$ dump -rv size | grep _size_
0x207d8     _size_gets_smaller    R_SPARC_COPY      0
0x20818     _size_gets_larger     R_SPARC_COPY      0

A new version of this shared object is supplied that contains different data sizes for these symbols:

$ nm -x libfoo.so.1 | grep _size_
[26]   |0x00010378|0x10|OBJT |GLOB |8   |_size_gets_smaller
[28]   |0x00010388|0x80|OBJT |GLOB |8   |_size_gets_larger

Running ldd(1) against the dynamic executable will reveal:

$ ldd -d prog
    libfoo.so.1 =>   ./libfoo.so.1
    ...........
    copy relocation sizes differ: _size_gets_smaller
       (file prog size=40; file ./libfoo.so.1 size=10);
       ./libfoo.so.1 size used; possible insufficient data copied
    copy relocation sizes differ: _size_gets_larger
       (file prog size=40; file ./libfoo.so.1 size=80);
       ./prog size used; possible data truncation

ldd(1) shows that the dynamic executable will copy as much data as the shared object has to offer, but only accepts as much as its allocated space allows.

Copy relocations can be eliminated by building the application from position-independent code. See "Position-Independent Code".

Using -Bsymbolic

The link-editor's -B symbolic option enables you to bind symbol references to their global definitions within a shared object. This option is historic, in that it was designed for use in creating the runtime linker itself.

Defining an object's interface and reducing non-public symbols to local is preferable to using the -B symbolic option. See "Reducing Symbol Scope". Using -B symbolic can often result in some non-intuitive side effects.

If a symbolically bound symbol is interposed upon, then references to the symbol from outside of the symbolically bound object will bind to the interposer. The object itself is already bound internally. Essentially, two symbols with the same name are now being referenced from within the process. A symbolically bound data symbol that results in a copy relocation creates the same interposition situation. See "Copy Relocations".


Note - Symbolically bound shared objects are identified by the .dynamic flag DF_SYMBOLIC. This flag is informational only. The runtime linker processes symbol lookups from these objects in the same manner as any other object. Any symbolic binding is assumed to have been created at the link-edit phase.


Profiling Shared Objects

The runtime linker can generate profiling information for any shared objects that are processed during the running of an application. The runtime linker is responsible for binding shared objects to an application and is therefore able to intercept any global function bindings. These bindings take place through .plt entries. See "When Relocations Are Performed" for details of this mechanism.

The LD_PROFILE environment variable specifies the name of a shared object to profile. You can analyze one shared object at a time using this environment variable. The setting of the environment variable can be used to analyze the use of the shared object by one or more applications. In the following example, the use of libc by the single invocation of the command ls(1) is analyzed:

$ LD_PROFILE=libc.so.1  ls -l

In the following example, the environment variable setting is recorded in a configuration file. This setting will cause any application's use of libc to accumulate the analyzed information:

# crle  -e LD_PROFILE=libc.so.1
$ ls -l
$ make
$ ...

When profiling is enabled, a profile data file is created, if it doesn't already exist. The file is mapped by the runtime linker. In the above examples, this data file is /var/tmp/libc.so.1.profile. 64-bit libraries require an extended profile format and are written using the .profilex suffix. You can also specify an alternative directory to store the profile data using the LD_PROFILE_OUTPUT environment variable.

This profile data file is used to deposit profil(2) data and call count information related to the use of the specified shared object. This profiled data can be directly examined with gprof(1).


Note - gprof(1) is most commonly used to analyze the gmon.out profile data created by an executable that has been compiled with the -xpg option of cc(1). The runtime linker's profile analysis does not require any code to be compiled with this option. Applications whose dependent shared objects are being profiled should not make calls to profil(2), because this system call does not provide for multiple invocations within the same process. For the same reason, these applications must not be compiled with the -xpg option of cc(1). This compiler-generated mechanism of profiling is also built on top of profil(2).


One of the most powerful features of this profiling mechanism is to enable the analysis of a shared object as used by multiple applications. Frequently, profiling analysis is carried out using one or two applications. However, a shared object, by its very nature, can be used by a multitude of applications. Analyzing how these applications use the shared object can offer insights into where energy might be spent to improvement the overall performance of the shared object.

The following example shows a performance analysis of libc over a creation of several applications within a source hierarchy.

$ LD_PROFILE=libc.so.1 ; export LD_PROFILE
$ make
$ gprof -b /usr/lib/libc.so.1 /var/tmp/libc.so.1.profile
.....

granularity: each sample hit covers 4 byte(s) ....

                                  called/total     parents
index  %time    self descendents  called+self    name      index
                                  called/total     children
.....
-----------------------------------------------
                0.33        0.00      52/29381     _gettxt [96]
                1.12        0.00     174/29381     _tzload [54]
               10.50        0.00    1634/29381     <external>
               16.14        0.00    2512/29381     _opendir [15]
              160.65        0.00   25009/29381     _endopen [3]
[2]     35.0  188.74        0.00   29381         _open [2]
-----------------------------------------------
.....
granularity: each sample hit covers 4 byte(s) ....

   %  cumulative    self              self    total         
 time   seconds   seconds    calls  ms/call  ms/call name   
 35.0     188.74   188.74    29381     6.42     6.42  _open [2]
 13.0     258.80    70.06    12094     5.79     5.79  _write [4]
  9.9     312.32    53.52    34303     1.56     1.56  _read [6]
  7.1     350.53    38.21     1177    32.46    32.46  _fork [9]
 ....

The special name <external> indicates a reference from outside of the address range of the shared object being profiled. Thus, in the above example, 1634 calls to the function open(2) within libc occurred from the dynamic executables, or from other shared objects, bound with libc while the profiling analysis was in progress.


Note - The profiling of shared objects is multithread safe, except in the case where one thread calls fork(2) while another thread is updating the profile data information. The use of fork1(2) removes this restriction.


 
 
 
  Previous   Contents   Next