
Code Release


SCALLA 0.1 is integrated with Hadoop 0.20.1 and can run in three modes. In Hadoop mode, it behaves as stock Hadoop, with no changes required. In Turn Off Sort-Merge mode, you can run an existing Hadoop job with a purely hash-based approach without changing the job's code. In Incremental Processing mode, you modify your Hadoop reducer to use our simple incremental API, which enables incremental processing with a second hash-based approach. The hash-based approaches in SCALLA may provide benefits including lower CPU cost, less intermediate I/O, shorter running time, and earlier, incremental output during long-running jobs.
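To give a rough feel for what hash-based incremental processing means, here is a minimal, self-contained Java sketch. It is not SCALLA's API: the names IncrementalHashCount, IncrementalReducer, HashAggregator and newCounter are hypothetical and exist only for this illustration, and the code does not use Hadoop at all. It keeps one running state per key in a hash table and updates it as each record arrives, so no sort-merge step is needed and partial results can be read at any time.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: these types are hypothetical and are NOT part of SCALLA or Hadoop.
class IncrementalHashCount {

    // Hypothetical incremental reducer: builds a per-key state and folds in one value at a time.
    public interface IncrementalReducer<K, V, S> {
        S init(K key, V value);      // state for the first value seen for a key
        S update(S state, V value);  // fold one more value into the existing state
    }

    // Keeps per-key state in a hash table instead of sorting and merging records by key.
    public static final class HashAggregator<K, V, S> {
        private final Map<K, S> states = new HashMap<>();
        private final IncrementalReducer<K, V, S> reducer;

        public HashAggregator(IncrementalReducer<K, V, S> reducer) {
            this.reducer = reducer;
        }

        // Process a single record as soon as it arrives.
        public void accept(K key, V value) {
            S current = states.get(key);
            states.put(key, current == null
                    ? reducer.init(key, value)
                    : reducer.update(current, value));
        }

        // Copy of the current partial results; can be called before all input is seen.
        public Map<K, S> snapshot() {
            return new HashMap<>(states);
        }
    }

    // Example reducer: sums integer values per key (for instance, word counts).
    public static HashAggregator<String, Integer, Integer> newCounter() {
        return new HashAggregator<>(new IncrementalReducer<String, Integer, Integer>() {
            @Override
            public Integer init(String key, Integer value) {
                return value;
            }

            @Override
            public Integer update(Integer state, Integer value) {
                return state + value;
            }
        });
    }
}
```

Calling accept once per incoming record and snapshot at any point yields the counts seen so far, which is the sense in which output can arrive early. A real job would also have to handle state that does not fit in memory, which this sketch ignores.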

The source code and user guide are available at

Cluster Profiler

This is a simple profiler that collects CPU, memory, disk and network usage data from a cluster of nodes and visualizes the data with a set of plots.

Please visit for the source code and user guide.