Parallel Builds With Gentoo's Emerge

Overview

Gentoo lets one control the degree of concurrency Portage uses when emerging packages in two ways, by specifying:

  • the -j and -l options in the MAKEOPTS variable in /etc/make.conf (/etc/portage/make.conf on current installations), or,
  • the --jobs and --load-average options on the emerge command line or in the EMERGE_DEFAULT_OPTS variable in the same file.

This article describes some of my experiences with using the above parameters.

Determining Hardware Limits On Parallelism

Assuming Portage is being run on Linux, one first needs to determine the number of CPU cores and threads available for parallelism. The following counts the logical processors the kernel reports:

grep '^processor' /proc/cpuinfo | sort -u | wc -l

For brevity, this number will be referred to as NJOBS below, i.e., as if this were run:

NJOBS=$(grep '^processor' /proc/cpuinfo | sort -u | wc -l)
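On systems with GNU coreutils, nproc reports a similar figure more directly. A minimal sketch, assuming a POSIX shell, that prefers nproc and falls back to /proc/cpuinfo; the function name count_cpus is only illustrative and is not part of Portage:

```shell
#!/bin/sh
# count_cpus: print the number of logical processors.
# Hypothetical helper name; not part of Portage.
count_cpus() {
    if command -v nproc >/dev/null 2>&1; then
        # nproc (GNU coreutils) prints the processing units available
        # to the current process.
        nproc
    else
        # Fallback: count the "processor" entries the kernel exposes.
        grep -c '^processor' /proc/cpuinfo
    fi
}

NJOBS=$(count_cpus)
echo "NJOBS=${NJOBS}"
```

Note that nproc respects CPU affinity restrictions, so inside a restricted container or cpuset it may print a smaller number than the /proc/cpuinfo count.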

Maximizing Processor Saturation

Ideally one would want all processors busy performing work as this would allow one to minimize the total amount of time compiling code. However, one does not want the system load to become excessive for the following reasons:

  • to minimize process switching, and,
  • to keep the system responsive to user input.

The first step in this process is to set MAKEOPTS in /etc/make.conf so that the maximum number of parallel tasks is NJOBS+1 and no new jobs are started if the load is NJOBS or higher:

MAKEOPTS="-j$((NJOBS+1)) -l${NJOBS}"

where the -j option sets the maximum number of parallel jobs that can be run via make and the -l prevents any new parallel job starting unless the load is below the amount specified. (The reason the number of jobs is set to one higher than the number of processors is to help ensure saturation of processor utilization.)

There is no need to set an NJOBS variable in /etc/make.conf as the hardware will not change, so one should simply substitute the proper values in MAKEOPTS. For example, MAKEOPTS suitable for an i7 processor with 8 hardware threads would be:

MAKEOPTS="-j9 -l8"

Setting MAKEOPTS is very safe as there are very few packages with parallel make issues. Typically, if there are issues, the ebuild for that package will turn off the option, or, will give notice that -j1 must be used.

Setting MAKEOPTS will improve the build times of a number of packages. However:

  • many packages don't perform parallel makes, and,
  • many packages' build procedures won't saturate all processors with load.

This is most apparent when running long-build tasks such as emerge -eav world or when building large software programs such as Firefox, Chromium, or LibreOffice. In my experience an i7 processor will have a load between 1.00 and 2.00 most of the time if only MAKEOPTS is set. This is not even close to ideal.
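One way to check this observation on your own machine is to sample the 1-minute load average while a build runs and compare it with the processor count. A rough sketch, assuming a POSIX shell and awk; the function name load_report and its optional file argument are illustrative only:

```shell
#!/bin/sh
# load_report NJOBS [LOADAVG_FILE]
# Prints the 1-minute load average and whether it is below NJOBS.
# The optional second argument (default /proc/loadavg) exists only so the
# function can be tried against a sample file.
load_report() {
    njobs=$1
    file=${2:-/proc/loadavg}
    # The first field of /proc/loadavg is the 1-minute load average.
    read -r load1 _rest < "$file"
    # awk does the floating-point comparison that plain sh lacks;
    # it exits 0 when the load is below njobs.
    if awk -v l="$load1" -v n="$njobs" 'BEGIN { exit !(l + 0 < n + 0) }'; then
        state="below"
    else
        state="at or above"
    fi
    echo "load ${load1} is ${state} ${njobs} processors"
}
```

Running load_report 8 in another terminal during an emerge shows how far the load is from the 8 threads of the i7 used as the example here.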

To better utilize processors, one has to tell emerge to also run parallel jobs. This is done using the --jobs and --load-average options with emerge:

emerge --jobs=${NJOBS} --load-average=${NJOBS} world

Here, unlike MAKEOPTS, one need only set the number of jobs and the load limit to the number of processors, because each emerge job runs its own build tasks, which generate load in turn. Since a load average limit is specified both here and in MAKEOPTS, the system should not become overly busy and start thrashing or need excessive amounts of process switching. (If this is an issue, lower the load average settings.)

Since emerge's command line overrides anything set in /etc/make.conf, I set EMERGE_DEFAULT_OPTS in /etc/make.conf as follows:

EMERGE_DEFAULT_OPTS="--jobs=${NJOBS} --load-average=${NJOBS}"

i.e., for an i7 this would be:

EMERGE_DEFAULT_OPTS="--jobs=8 --load-average=8"

so I don't have to specify such on the command line. For packages that have issues being compiled in parallel, one need only override --jobs on the command line setting it to 1:

emerge -j1 world
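To avoid arithmetic slips when filling in the MAKEOPTS and EMERGE_DEFAULT_OPTS values described above, the literal lines can be generated for a given processor count and then pasted into /etc/make.conf. A small sketch; print_parallel_opts is a hypothetical name:

```shell
#!/bin/sh
# print_parallel_opts NJOBS
# Prints the two make.conf lines described in the text for NJOBS processors:
# make gets NJOBS+1 jobs, emerge gets NJOBS jobs, both capped at load NJOBS.
print_parallel_opts() {
    njobs=$1
    echo "MAKEOPTS=\"-j$((njobs + 1)) -l${njobs}\""
    echo "EMERGE_DEFAULT_OPTS=\"--jobs=${njobs} --load-average=${njobs}\""
}

print_parallel_opts 8
```

For 8 processors this prints the same two i7 lines shown earlier in this section.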

Dealing With Failed Builds

Some packages simply will not build when emerged as parallel jobs. This is a significant problem in the middle of an emerge world of hundreds or thousands of packages. Fortunately, emerge can resume an interrupted run and can skip the first package when resuming. Assuming a version of the failed package is already installed on the computer, you can safely skip the package and rebuild it later with -j1. You can even build the failed package yourself (e.g., in another window) and, when it is done, run emerge -rav --skipfirst to resume the original build process. Be extra careful not to do this more than once, though, or you'll lose the ability to resume.
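The skip-and-rebuild-later approach described above can be sketched as two small wrappers. This is only an illustration: the function names and the DRY_RUN switch are made up for this example, and app-misc/example in the usage note is a placeholder package name. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# run: print the command when DRY_RUN=1 (the default here), otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Resume the interrupted emerge, skipping the package that failed.
resume_skipping_failed() {
    run emerge --resume --skipfirst
}

# Rebuild a single package later with one emerge job.
# --oneshot keeps the package from being added to the world set.
rebuild_serially() {
    run emerge --oneshot --jobs=1 "$1"
}
```

For example, resume_skipping_failed continues the world update, and once it has finished, rebuild_serially app-misc/example rebuilds the skipped package. If the failure turns out to come from parallel make rather than parallel emerge, setting MAKEOPTS="-j1" in the environment for that one rebuild may also be needed.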

Want More Output?

When the number of jobs is greater than one, there is only a summary output. To see the tasks actually being performed, run:

qlop -c

or the continuously updating:

while true ; do clear ; qlop -c ; sleep 2 ; done

To see the actual build output that one sees with emerge -j1, simply run tail for the package, PKG, being built:

tail -f /var/tmp/portage/*/${PKG}*/temp/build.log

by replacing ${PKG} with the name of the package.
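Because the glob above can match nothing, or more than one directory, it may help to resolve the log path first. A sketch, assuming a POSIX shell; the function names and the optional directory argument are illustrative, and /var/tmp/portage is simply the path used above:

```shell
#!/bin/sh
# build_log_for PKG [PORTAGE_TMP]
# Prints the path of each build.log matching PKG under the Portage temp dir.
# PORTAGE_TMP defaults to /var/tmp/portage; the second argument is there
# only so the function can be tried against another directory.
build_log_for() {
    pkg=$1
    root=${2:-/var/tmp/portage}
    for log in "$root"/*/"$pkg"*/temp/build.log; do
        # An unmatched glob is left as-is, so check the file really exists.
        [ -f "$log" ] && echo "$log"
    done
    return 0
}

# follow_build_log PKG: tail the first matching log, if any.
follow_build_log() {
    log=$(build_log_for "$1" | head -n 1)
    [ -n "$log" ] || { echo "no build.log found for $1" >&2; return 1; }
    tail -f "$log"
}
```

For example, follow_build_log firefox follows the Firefox build while it is being emerged, and prints an error instead of hanging on a non-existent path when no such build is running.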

Different Values For Jobs and Load Averages

Elsewhere on the Internet, I've seen people use different values, such as taking the number of processors and assigning half to MAKEOPTS and the other half to emerge. Unfortunately, this will on average saturate and utilize only half of the processors. In my experience, even this "average" does not happen frequently!

Logically, it is much better to set the jobs to the number of processors and then use the load average settings to limit the total number of parallel tasks created during the build process. This strategy, as described above, allows all processors to be used without creating excessive loads (e.g., on an i7 the highest load I've seen with the above settings is approximately 9.0). It also allows parallelism when the packages being built don't use parallel make (or are limited in how much parallelism they can exploit). This is why the emerge load average value is set as it is above: the MAKEOPTS load average ensures that parallel tasks only start running if the load is not too high. My experience with these settings has been very positive, with good to excellent load averages whenever things can be done in parallel.

Conclusion

In my experience, using the settings outlined above significantly speeds up compiles on multi-core machines. Only a couple of packages don't like the parallel emerge and need me to intervene; the rest build without a problem, and my cores/processors are mostly utilized instead of being mostly idle.
