reduce build parallelism
The nested for-loops spawned child processes in an unbounded fashion, so let's wait at the end of each inner loop as a compromise. On my Linux 16-threaded machine the whole build process gets slowed down by this from 43 seconds to 64; sounds OK to me.
This should avoid repeated failures we now keep seeing elsewhere
install-sh: fork: retry: Resource temporarily unavailable
The parallelism was introduced in commit a2f3046d (MR !39 (merged)).