
TORQUE Resource Manager
TORQUE is a scalable, fault-tolerant resource manager for HPC environments, supporting job control, scheduling, and system optimization.
Vendor
Adaptive Computing
Product details
TORQUE Resource Manager is an industry-standard solution for managing batch jobs and distributed computing resources in high-performance computing (HPC) environments. Adaptive Computing’s fully developed version of TORQUE (currently at version 7.0.0) includes support for multiple operating systems, enhanced fault tolerance, scalability, and integration with Moab® Workload Manager. TORQUE is used globally across government, academic, and commercial sites to optimize application performance and system utilization.
Features
- OS Compatibility: Supports numerous Ubuntu versions, Red Hat 8, and SUSE 15.
- MIG Support: Includes support for Multi-Instance GPU environments.
- Extensive Testing: Validated with tens of thousands of tests across supported OS versions.
- Fault Tolerance:
  - Additional failure condition checks.
  - Node health check script support.
- Scheduling Interface:
  - Extended query and control interfaces for improved scheduler interaction.
  - Job statistics collection for completed jobs.
- Scalability:
  - Enhanced server-to-MOM communication model.
  - Supports clusters with tens of thousands of nodes and jobs.
  - Handles jobs spanning hundreds of thousands of processors.
  - Multi-threading and TCP-based communication for high responsiveness.
- Usability:
  - Extensive logging improvements.
  - Human-readable error messages.
- Modular Add-ons:
  - Portal-based job submission.
  - Accounting and grid management.
  - Power management and high-throughput submission.
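To illustrate the node health check feature listed above, here is a minimal sketch of what such a script might look like. It assumes the convention, as described in the TORQUE administrator documentation, that a health check reports a problem by printing a line beginning with ERROR to standard output; confirm this against the documentation for your version. The function names, the scratch path, and the free-space threshold are hypothetical and chosen only for illustration.

```python
#!/usr/bin/env python3
"""Illustrative node health check (a sketch, not Adaptive Computing's script)."""
import shutil


def check_node(scratch_path="/tmp", min_free_bytes=1 << 30,
               disk_usage=shutil.disk_usage):
    """Return a list of problem descriptions; an empty list means healthy.

    scratch_path and min_free_bytes are hypothetical defaults.
    disk_usage is injectable so the check can be tested without a real disk.
    """
    problems = []
    try:
        free = disk_usage(scratch_path).free
    except OSError as exc:
        problems.append(f"cannot stat {scratch_path}: {exc}")
    else:
        if free < min_free_bytes:
            problems.append(
                f"low disk space on {scratch_path}: {free} bytes free")
    return problems


def format_report(problems):
    """Build the one-line report; assumed convention: failures start with ERROR."""
    if not problems:
        return ""
    return "ERROR " + "; ".join(problems)


if __name__ == "__main__":
    report = format_report(check_node())
    if report:
        # Print only on failure; a healthy node produces no output.
        print(report)
```

A script like this would typically be registered in the compute node's MOM configuration (for example through a `$node_check_script` entry); the exact parameter names and check interval settings should be taken from the documentation for the installed release.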
Benefits
- Ease of Use: Simplifies job submission with portals, templates, script builders, and web-based file management.
- Customizability: Adapts to specific system configurations and organizational needs.
- High Adoption: Widely used across global HPC installations.
- Improved Reliability: Robust fault tolerance and health monitoring.
- Enhanced Scheduler Control: Provides detailed job data and control interfaces.
- Scalable Architecture: Efficiently manages large-scale clusters and workloads.
- Operational Efficiency: Reduces administrative overhead and improves system performance.
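As a rough illustration of the "script builder" idea mentioned under Ease of Use, the sketch below generates a batch job script from a few parameters. It is not the portal's implementation; the function name `build_job_script` and its parameters are hypothetical. The `#PBS` directives shown (`-N`, `-l nodes=...:ppn=...`, `-l walltime=...`) and the `PBS_O_WORKDIR` variable are standard in TORQUE job scripts.

```python
def build_job_script(name, nodes, ppn, walltime, command):
    """Render a simple TORQUE/PBS job script as a string.

    All parameter names are illustrative assumptions:
    name     -- job name passed to '#PBS -N'
    nodes    -- number of nodes requested
    ppn      -- processors per node
    walltime -- wall-clock limit as 'HH:MM:SS'
    command  -- shell command the job runs
    """
    if nodes < 1 or ppn < 1:
        raise ValueError("nodes and ppn must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#PBS -N {name}",
        f"#PBS -l nodes={nodes}:ppn={ppn}",
        f"#PBS -l walltime={walltime}",
        # Run from the directory the job was submitted from.
        "cd $PBS_O_WORKDIR",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_job_script("demo", 2, 8, "01:00:00", "./run_simulation.sh"),
          end="")
```

The generated text can be saved to a file and submitted with `qsub`; a real portal or template system would add validation, site-specific queues, and resource policies on top of this.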