Instructions on how to set up an Ubuntu cluster can be found at https://help.ubuntu.com/community/MpichCluster .
I’ve updated a few outdated commands there myself so it shouldn’t be too hard to follow the instructions. The only thing I personally did differently was that I didn’t create a new user, but instead used my old account on all the machines (the important thing is that the username be the same everywhere).
In this post I’ll explain how to make a Python script to utilize this cluster using the MPI standard for parallel programming.
To prepare your Python interpreter for parallel programming, you first need an MPI interface. Several exist, so the choice is yours; I used mpi4py. Building it requires the Python development headers, which can be installed through Synaptic or with:
sudo apt-get install python-dev # other potential packages to consider - python-mpi mpichpython python-scipy python-numpy
Then you need to install the mpi4py module itself. Note: we won’t install mpi4py from the Ubuntu repo, because that package depends on OpenMPI and we are using MPICH2. Instead, it can normally be installed from the Python Package Index by installing pip and using it to install mpi4py:
sudo apt-get install python-pip
sudo pip install mpi4py
For me, however, the repository for mpi4py was unavailable, so I had to download the latest version from its Google Code page and install it following these instructions; basically it comes down to
sudo python setup.py install
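Whichever install route you take, it may be worth checking which MPI implementation mpi4py was actually built against, since mixing OpenMPI and MPICH2 is the problem the note above tries to avoid. The following is a minimal sketch; the helper name describe_mpi4py is made up for this post, while MPI.get_vendor() is the mpi4py call that reports the underlying MPI library.

```python
def describe_mpi4py():
    """Return basic facts about the installed mpi4py, or None if it is missing."""
    try:
        import mpi4py
        from mpi4py import MPI  # importing MPI loads the underlying MPI library
    except ImportError:
        return None
    name, version = MPI.get_vendor()  # (implementation name, version tuple)
    return {
        "mpi4py_version": mpi4py.__version__,
        "vendor": name,
        "vendor_version": ".".join(str(part) for part in version),
    }


if __name__ == "__main__":
    info = describe_mpi4py()
    if info is None:
        print("mpi4py is not importable from this interpreter")
    else:
        print("mpi4py %(mpi4py_version)s built against %(vendor)s %(vendor_version)s" % info)
```

If the vendor printed here mentions Open MPI rather than MPICH, the module was probably built against the wrong library and should be rebuilt.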
To be able to run an MPI program you first need to boot the mpd daemons (for example on 4 hosts):
mpdboot -n 4
Update: booting the mpd is no longer necessary in the new version of MPICH2, after the switch to the Hydra process manager.
You can then run your program, say as 6 instances (the number doesn’t have to match the number of hosts; for the machinefile see https://help.ubuntu.com/community/MpichCluster ):
mpiexec -n 6 -f machinefile python program.py
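If you don’t have a machinefile yet, it is just a text file with one host per line. As a rough sketch, the snippet below generates one, assuming the host:count form that Hydra accepts (the count is how many processes to place on that host). The function name render_machinefile and the host names ub0 to ub3 are placeholders, not something the cluster guide prescribes.

```python
import sys


def render_machinefile(hosts):
    """Build machinefile text from (hostname, process_count) pairs."""
    lines = []
    for name, slots in hosts:
        if slots < 1:
            raise ValueError("process count for %s must be at least 1" % name)
        lines.append("%s:%d" % (name, slots))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical hosts; replace them with the names from your own /etc/hosts.
    example_hosts = [("ub0", 2), ("ub1", 2), ("ub2", 1), ("ub3", 1)]
    sys.stdout.write(render_machinefile(example_hosts))
```

Redirect the output to a file named machinefile and pass it to mpiexec with -f as in the command above.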
Here’s a sample Python program you can use to get started with mpi4py:
#!/usr/bin/env python
from mpi4py import MPI

comm = MPI.COMM_WORLD
# print(...) with a single argument works in both Python 2 and Python 3
print("Hello! I'm rank %d from %d running in total..." % (comm.rank, comm.size))
comm.Barrier()  # wait for everybody to synchronize _here_
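Once the hello-world runs, the usual next step is to give each rank its own share of the work. Here is a small sketch of one common way to do that: split a range of items into contiguous blocks by rank, compute a partial result, and combine the partial results on rank 0 with comm.reduce. The helper block_range and the value n_items = 20 are made up for illustration, and the fallback to a single process when mpi4py is missing is only there so the script can be tried without a cluster.

```python
def block_range(n_items, size, rank):
    """Return (start, stop) of the contiguous slice of n_items owned by rank."""
    base, extra = divmod(n_items, size)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def main():
    try:
        from mpi4py import MPI
        comm = MPI.COMM_WORLD
        rank, size = comm.rank, comm.size
    except ImportError:
        # No mpi4py available: behave like a single-process run.
        comm, rank, size = None, 0, 1

    n_items = 20  # hypothetical problem size
    start, stop = block_range(n_items, size, rank)
    partial = sum(i * i for i in range(start, stop))

    if comm is not None:
        # Pickle-based reduce: rank 0 receives the sum, other ranks get None.
        total = comm.reduce(partial, op=MPI.SUM, root=0)
    else:
        total = partial

    if rank == 0:
        print("sum of squares below %d: %d" % (n_items, total))


if __name__ == "__main__":
    main()
```

Run it with the same mpiexec command as before; with 6 instances and 20 items, the first two ranks get four items each and the remaining four get three.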
I’m using mpich1 and have already pip-installed mpi4py on my machines, but I get the error:
bash: /home/mpiuser/anaconda/bin/hydra_pmi_proxy: No such file or directory
According to what I’ve read, I need to have hydra_pmi_proxy on all my machines, but only one of my machines has anaconda. Is there a way to work around this other than copying the entire anaconda folder to all my machines?
Is the C example from https://help.ubuntu.com/community/MpichCluster working for you? Did you set up NFS? You first need to get that running and then try the Python solution, as your problem sounds more like an issue with MPICH than with the Python package.