pmHPC - poor man’s HPC
 
* about *
 
pmHPC is a framework for distributed computing. It provides the means to distribute processing code and data, and to gather the results back on the controlling machine.
* how it works *
 
As pmHPC is a framework, it does not do any calculation by itself. Here is a quick summary of the workflow:
 
1) The controller node distributes code to the processing nodes
2) The controller node then initiates job processing
3) Once job processing is initiated, the processing nodes connect to the controller node and ask for data
4) The processing nodes run the code against the received data and produce a result
5) Each result is then submitted to the controller node
6) When all jobs are completed, the controller node calls a process that combines the partial results into one result
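
Steps 3 to 6 above can be sketched with Python's standard XML-RPC modules. This is only an illustration under stated assumptions: the names Controller, get_job, submit_result, combine, process and run_demo are hypothetical and are not pmHPC's actual API, code distribution (steps 1 and 2) is left out, and the "nodes" are threads on one machine.

```python
# Hypothetical sketch of the controller/worker exchange; not pmHPC's real API.
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class Controller:
    """Hands out jobs and collects partial results (illustrative names)."""

    def __init__(self, jobs):
        self.pending = list(enumerate(jobs))  # (job_id, data) pairs
        self.results = {}
        self.lock = threading.Lock()

    def get_job(self):
        # Step 3: a processing node asks for data.
        # An empty list signals that no work is left.
        with self.lock:
            return list(self.pending.pop(0)) if self.pending else []

    def submit_result(self, job_id, result):
        # Step 5: a processing node submits its partial result.
        with self.lock:
            self.results[job_id] = result
        return True

    def combine(self):
        # Step 6: merge the partial results; here simply a sum.
        return sum(self.results.values())


def process(data):
    # Step 4: stand-in for the distributed processing code.
    return data * data


def worker(url):
    # Each worker uses its own proxy and loops until the controller runs dry.
    proxy = ServerProxy(url)
    while True:
        job = proxy.get_job()
        if not job:
            break
        job_id, data = job
        proxy.submit_result(job_id, process(data))


def run_demo(jobs, n_workers=2):
    controller = Controller(jobs)
    # Port 0 lets the operating system pick a free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(controller.get_job)
    server.register_function(controller.submit_result)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d" % server.server_address[1]

    workers = [threading.Thread(target=worker, args=(url,))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    server.shutdown()
    server.server_close()
    return controller.combine()
```

Calling run_demo([1, 2, 3, 4]) squares each input on a worker and sums the partial results on the controller. In a real deployment the workers would run on separate machines and point their proxy at the controller's address.
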
* when to use it *
 
pmHPC is best suited for tasks where the input data can be easily decomposed, processed separately (in parallel), and the partial results then combined. This assumes that changes to neighbouring data do not affect a task's data at all.
 
A good example of such a task is matrix multiplication, where the initial problem (two matrices) can be broken down into "jobs" (row-column pairs) and distributed to the processing machines. The results (cell values) are then submitted and combined into one resulting matrix.
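
The decomposition can be sketched in plain Python as follows. The function names make_jobs, process_job and combine are assumptions made for this illustration, not pmHPC hook names, and everything runs locally instead of being sent to processing machines.

```python
def make_jobs(a, b):
    """Yield one job per output cell: ((i, j), row i of a, column j of b)."""
    columns = list(zip(*b))  # transpose b to get its columns
    for i, row in enumerate(a):
        for j, col in enumerate(columns):
            yield (i, j), row, col


def process_job(row, col):
    """Compute a single cell value: the dot product of a row and a column."""
    return sum(x * y for x, y in zip(row, col))


def combine(results, n_rows, n_cols):
    """Place each ((i, j), value) partial result into the resulting matrix."""
    out = [[0] * n_cols for _ in range(n_rows)]
    for (i, j), value in results:
        out[i][j] = value
    return out


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# Each job depends only on its own row and column, so the jobs could be
# processed in any order and on any machine.
results = [(pos, process_job(row, col)) for pos, row, col in make_jobs(a, b)]
product = combine(results, len(a), len(b[0]))
print(product)  # [[19, 22], [43, 50]]
```

Because no job reads another job's data, this is exactly the kind of problem described above: the partial results only meet again in the final combine step.
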
* advantages *
 
pmHPC is a framework, which means that all components (data producer, processing code, data collector) can be pretty much anything.
 
Hooks have to be Python modules, but they can easily call external tools and/or binaries, which can be distributed along with the Python code.
 
No additional scripting is required for distributing such add-ons.
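
As a rough illustration of a hook that delegates to an external program, the sketch below pipes its input through a child process. The function name process and its signature are hypothetical (the actual hook interface is not described here), and sys.executable merely stands in for whatever binary would be shipped with the module.

```python
import subprocess
import sys


def process(data):
    """Hypothetical hook: pass `data` to an external program, return its output."""
    # sys.executable is a placeholder for any tool or binary distributed
    # alongside the Python module; here it just upper-cases its input.
    cmd = [sys.executable, "-c",
           "import sys; print(sys.stdin.read().upper())"]
    done = subprocess.run(cmd, input=data, capture_output=True,
                          text=True, check=True)
    return done.stdout.strip()
```

With check=True a non-zero exit status from the external program raises an exception in the hook instead of being silently returned as a result.
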
 
Modules have full access to system resources, available modules, system utilities, etc.
 
pmHPC is written in Python, which means it can run on any platform that has a Python interpreter, so the processing cluster can in fact be heterogeneous.
 
The communication protocol is XML-RPC, so it is also possible to run calculations on a geographically distributed system.
* disadvantages *
 
As often happens, the biggest disadvantage follows directly from what is listed in the "advantages" section. Because modules can be anything and have almost no restrictions on what they can do, pmHPC is inherently insecure.
So be cautious about how you use it.
* author *
 
pmHPC is written by Rytis Sileika (reachable via rytis.sileika [AT] gmail.com)