Scalable Vision is a monitoring and analytics tool for HPC clusters and workload management systems. It leverages big data technology for scalability, fault tolerance, and flexible customization, giving users, administrators, and decision makers insight into cluster resource, workload, and project-based data.
Scalable Vision's unique advantages are:
- Supports multiple clusters through a scalable NoSQL database with built-in fault tolerance
- Provides flexible reporting and data analysis with a customizable dashboard
- Integrates with the system management web portal
- Collects hundreds of data metrics and presents the data across multiple dimensions
- Supports multiple workload management systems, including Scalable Cube, LSF™, etc.
Scalable Vision has the following built-in reports:
Report | Application |
---|---|
Maximum and used job slots in cluster | Overview of cluster resource usage |
Pending, running, and suspended job slots in cluster | Overview of cluster workload activities |
Pending, running, and suspended job slots in each queue | Overview of workload for a specific category (application, user group, etc.) |
Top 20 users with the most running jobs | User workload activities |
Top 20 users with the most pending jobs | User workload activities |
Pending jobs by user in each queue | User service level for a specific category of workload |
Job CPU, memory, and swap usage statistics | Application profile and benchmark |
CPU, memory, and swap usage history of a specific job | Job health |
Job pending reason statistics | Bottleneck identification |
Application license statistics | Critical resource (floating application licenses) management |
Host CPU and memory report | Host resource utilization for capacity planning |
Running and suspended jobs per host | Host resource utilization and capacity planning |
Host load metrics statistics | Capacity planning |
Customizable reports | Custom resource and workload management needs |
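
As an illustration of the kind of aggregation behind a report such as "Top 20 users with the most running jobs", the following is a minimal Python sketch. It does not show Scalable Vision's implementation or data model: the record fields (`user`, `queue`, `state`), the state names (`RUN`, `PEND`), and the function `top_users_by_state` are all hypothetical, chosen only for this example.

```python
from collections import Counter


def top_users_by_state(jobs, state, limit=20):
    """Return (user, job_count) pairs for jobs in the given state, busiest first.

    `jobs` is an iterable of dicts with hypothetical keys "user" and "state".
    """
    counts = Counter(job["user"] for job in jobs if job["state"] == state)
    # Sort by count (descending), then by user name, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Hypothetical sample records; a real deployment would read these from
# the workload manager or the monitoring database.
jobs = [
    {"user": "alice", "queue": "normal", "state": "RUN"},
    {"user": "alice", "queue": "normal", "state": "RUN"},
    {"user": "bob", "queue": "priority", "state": "RUN"},
    {"user": "bob", "queue": "normal", "state": "PEND"},
    {"user": "carol", "queue": "normal", "state": "PEND"},
    {"user": "carol", "queue": "normal", "state": "PEND"},
]

running = top_users_by_state(jobs, "RUN")
pending = top_users_by_state(jobs, "PEND")
```

The same function covers both the running-jobs and pending-jobs reports by changing the `state` argument. A per-queue variant could be sketched by grouping on the assumed `queue` field as well.
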
For more information, please contact us.