    kbuild: allow architectures to use thin archives instead of ld -r · a5967db9
    Stephen Rothwell authored
    ld -r is an incremental link used to create built-in.o files in build
    subdirectories. It produces relocatable object files containing all
    its input files, and these are then pulled together and relocated
    in the final link. Aside from the bloat, this constrains the final
    link relocations, which has bitten large powerpc builds with
    unresolvable relocations in the final link.
    Alan Modra has recommended the kernel use thin archives for linking.
    This is an alternative and means that the linker has more information
    available to it when it links the kernel.
    This patch enables a config option architectures can select, which
    causes all built-in.o files to be built as thin archives. built-in.o
    files in subdirectories do not get symbol table or index attached,
    which improves speed and size. The final link pass creates a
    built-in.o archive in the root output directory which includes the
    symbol table and index. The linker then takes this file to link.
    The --whole-archive linker option is required, because the linker now
    has visibility to every individual object file, and it will otherwise
    just completely avoid including those without external references
    (consider a file with EXPORT_SYMBOL or initcall or hardware exceptions
    as its only entry points). The traditional build works "by luck" as
    built-in.o files are large enough that they're going to get external
    references. However, this optimisation is unpredictable for the kernel
    (due to the above external references), ineffective at culling unused
    code, and costly because the .o files have to be searched for references.
    Superior alternatives for link-time culling should be used instead.
    Build characteristics for inclink vs thinarc, on a small powerpc64le
    pseries VM with a modest .config:
                                      inclink       thinarc
    vmlinux                        15 618 680    15 625 028
    sum of all built-in.o          56 091 808     1 054 334
    sum excluding root built-in.o                   151 430
    find -name built-in.o | xargs rm ; time make vmlinux
    real                              22.772s       21.143s
    user                              13.280s       13.430s
    sys                                4.310s        2.750s
    - Final kernel pulled in only about 6K more, which shows how
      ineffective the object file culling is.
    - Build performance looks improved due to less pagecache activity.
      On IO constrained systems it could be a bigger win.
    - Build size saving is significant.
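    For reference, the observations above follow from the table by simple
    arithmetic. The snippet below just recomputes them from the figures
    quoted in the table; the variable names are mine, and times are
    converted to milliseconds to stay within integer shell arithmetic.

```shell
#!/bin/sh
# Figures copied from the table above (sizes in bytes, times in ms).
vmlinux_inclink=15618680
vmlinux_thinarc=15625028
builtin_inclink=56091808
builtin_thinarc=1054334
sys_inclink_ms=4310
sys_thinarc_ms=2750

# Extra bytes in the final kernel when linking from thin archives.
vmlinux_growth=$((vmlinux_thinarc - vmlinux_inclink))
# How many times smaller the built-in.o files are in total (rounded down).
builtin_ratio=$((builtin_inclink / builtin_thinarc))
# Reduction in system time, as a whole percentage (rounded down).
sys_saving_pct=$(( (sys_inclink_ms - sys_thinarc_ms) * 100 / sys_inclink_ms ))

echo "vmlinux growth: ${vmlinux_growth} bytes"
echo "built-in.o total: about ${builtin_ratio}x smaller"
echo "sys time: about ${sys_saving_pct}% lower"
```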
    Side note, the toolchain understands archives, so there are some tricks:
    $ ar t built-in.o          # list all files you linked with
    $ size built-in.o          # and their sizes
    $ objdump -d built-in.o    # disassembly (unrelocated) with filenames
    Implementation by sfr, minor tweaks by npiggin.
    Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michal Marek <mmarek@suse.com>