GSoC 2011: Parallelising cabal-install

Tags: haskell, gsoc.

This summer, I will be working on parallelising cabal-install under the aegis of the Google Summer of Code program. The aim of the project is to make Cabal use multiple threads for running the build process (à la make -j), so that it can take advantage of those shiny multi-core processors many developers now own. Initially the unit of granularity will be the package; in the second phase of the project I plan to add support for building individual modules in parallel (which will require interaction with ghc --make). Until my patches are accepted into the main repo, they’ll live on Darcsden (not much there yet!).

While I haven’t yet done much work on modifying cabal-install proper, I’ve produced a small prototype that illustrates my approach to the problem. The prototype program consists of several threads which communicate via Chans. There are several worker threads, which compile the packages (threadDelay is used to simulate the actual operations). A single control thread maintains the package graph and assigns tasks to the worker threads. A single logger thread prints out messages received from the worker threads. A single install thread installs the packages into the target directory (this is done serially, but could also be parallelised if deemed safe).

After the install thread installs a package, it notifies the control thread, which then updates the package graph and hands any packages whose dependencies are now all installed to the worker threads. The control thread terminates when the last package has been installed (which leads to the termination of all other threads).
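To make the design concrete, here is a minimal sketch of how such a prototype could be wired together. It is not the actual prototype code: all names (worker, installer, controller, runBuild, exampleGraph) and the package names are made up for illustration, the graph is assumed to be acyclic with every dependency present in it, and for simplicity the control loop runs in the calling thread rather than in a thread of its own.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (forM_, forever)
import qualified Data.Map as Map
import qualified Data.Set as Set

type Pkg = String

-- Each package is mapped to the set of its dependencies that are
-- not yet installed (hypothetical representation).
type Graph = Map.Map Pkg (Set.Set Pkg)

-- Worker: takes a package from the task channel, "compiles" it and
-- hands it over to the install thread.
worker :: Int -> Chan Pkg -> Chan Pkg -> Chan String -> IO ()
worker n tasks built logs = forever $ do
  pkg <- readChan tasks
  writeChan logs ("worker " ++ show n ++ ": building " ++ pkg)
  threadDelay 100000 -- stands in for the real compilation
  writeChan built pkg

-- Install thread: installs built packages one at a time and
-- notifies the control thread after each one.
installer :: Chan Pkg -> Chan Pkg -> Chan String -> IO ()
installer built installed logs = forever $ do
  pkg <- readChan built
  writeChan logs ("installing " ++ pkg)
  threadDelay 20000 -- stands in for the real installation
  writeChan installed pkg

-- Logger thread: the only thread that writes to stdout.
logger :: Chan String -> IO ()
logger logs = forever (readChan logs >>= putStrLn)

-- Control loop: dispatches every package with no pending dependencies,
-- then updates the graph after each installation. Returns the order
-- in which packages were installed.
controller :: Graph -> Chan Pkg -> Chan Pkg -> IO [Pkg]
controller graph0 tasks installed = do
  let (ready, waiting) = Map.partition Set.null graph0
  forM_ (Map.keys ready) (writeChan tasks)
  loop waiting (Map.size graph0) []
  where
    loop _ 0 done = return (reverse done)
    loop waiting remaining done = do
      pkg <- readChan installed
      let updated = Map.map (Set.delete pkg) waiting
          (ready, waiting') = Map.partition Set.null updated
      forM_ (Map.keys ready) (writeChan tasks)
      loop waiting' (remaining - 1) (pkg : done)

-- Start the helper threads and run the control loop to completion.
runBuild :: Int -> Graph -> IO [Pkg]
runBuild nWorkers graph = do
  tasks <- newChan
  built <- newChan
  installed <- newChan
  logs <- newChan
  _ <- forkIO (logger logs)
  _ <- forkIO (installer built installed logs)
  forM_ [1 .. nWorkers] $ \n -> forkIO (worker n tasks built logs)
  controller graph tasks installed

-- A made-up dependency graph: app needs utils and parser,
-- both of which need base-lib.
exampleGraph :: Graph
exampleGraph = Map.fromList
  [ ("base-lib", Set.empty)
  , ("utils", Set.fromList ["base-lib"])
  , ("parser", Set.fromList ["base-lib"])
  , ("app", Set.fromList ["utils", "parser"])
  ]

main :: IO ()
main = do
  order <- runBuild 2 exampleGraph
  putStrLn ("install order: " ++ unwords order)
```

In this sketch the helper threads loop forever and simply die when main returns, which mirrors the "termination of the control thread ends everything" behaviour described above. One consequence is that the last few log messages may not be printed before the program exits; a real implementation would want to flush the logger first.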

I still have exams until the 7th of June, but I’ve already posted my first patch. Though not directly related to the project, it helped me to smooth out some wrinkles in the workflow and get accustomed to the Darcs way of doing things.

