Here is a variation of MSB Radix Sort that I have thought and implemented. This is an embarrassingly parallel algorithm that can be easily translated to the Map Reduce paradigm.
I have implemented this algorithm using Java7’s Fork Join Algorithm.
Also implemented are 2 versions of python code – one Recursive and another Iterative which are NOT implemented using any parallelizable code in python.
Core Ideas of the Algorithm
- Partition Numbers into Buckets based on the number of digits : each bucket is then worked in parallel
- Intra Bucket Parallelism: Numbers in the same bucket (i.e –numbers with the same number of digits) – their sorting can be further parallelized by making 10 more buckets based on their ith –MSB digit. This can happen recursively – giving a huge amount of intra-bucket parallelism.
Java Code usage
Dependencies – SLf4J and Perf4J and Javolution libraries
java MSBRadixSortingWithForkJoin <Number of Items that you would like to Sort>
#Items that you would like to Sort – Integer number. The code internally generates that many random numbers
Python Code Usage
python MSBRadixSortIterative.py filename
python MSBRadixSortRecursive.py fileName
FileName – name of a file – which has 1 integer / line which needs to be sorted