
Abstract

In high-performance computing (HPC), efficient job scheduling is essential for improving system performance and making the best use of resources. Conventional schedulers, which rely on heuristic priority rules, often struggle to adapt to changing job loads, objectives, and configurations. Reinforcement learning (RL)-based schedulers have been proposed to overcome these limitations. However, training and evaluating them requires extensive job traces, which are often unavailable due to privacy concerns. To bridge this gap, we employ synthetic data generation techniques, including the Tabular Variational Autoencoder (TVAE), Conditional Tabular GAN (CTGAN), and Copula GAN, to generate high-fidelity synthetic job traces. These models produce diverse and realistic synthetic data that captures a range of job patterns and system states. We evaluate the quality and applicability of the synthetic data using cumulative distribution functions (CDFs), correlation heatmaps, statistical analysis, and scheduling simulations. Our findings show that the synthetic data, created with minimal human intervention, closely resembles real-world job traces, which makes it useful for training and assessing RL-based scheduling methods in dynamic HPC settings. Our method thus addresses the lack of available real-world job traces, and integrating synthetic data generation with RL-based scheduling represents a significant step forward in optimizing HPC job scheduling.
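
The CDF-based comparison mentioned in the abstract can be illustrated with a minimal sketch. It builds empirical CDFs for one job attribute in a real and a synthetic trace and reports the largest vertical gap between them (the two-sample Kolmogorov-Smirnov distance). The function names, the log-normal toy runtimes, and the choice of the KS distance as a summary are assumptions made for illustration; the abstract does not state which statistic the authors use.

```python
import random
from bisect import bisect_right


def ecdf(sample):
    """Return a function x -> fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_distance(real, synthetic):
    """Largest vertical gap between the two empirical CDFs.

    Both ECDFs are right-continuous step functions, so the gap only
    changes at sample points; checking those points is sufficient.
    """
    f_real, f_syn = ecdf(real), ecdf(synthetic)
    points = set(real) | set(synthetic)
    return max(abs(f_real(x) - f_syn(x)) for x in points)


# Hypothetical job runtimes in seconds (toy data, not from the paper).
rng = random.Random(0)
real_runtimes = [rng.lognormvariate(6.0, 1.0) for _ in range(500)]
close_synthetic = [rng.lognormvariate(6.0, 1.0) for _ in range(500)]
poor_synthetic = [rng.lognormvariate(8.0, 1.0) for _ in range(500)]

d_close = ks_distance(real_runtimes, close_synthetic)
d_poor = ks_distance(real_runtimes, poor_synthetic)
print(f"KS distance, well-matched generator: {d_close:.3f}")
print(f"KS distance, mismatched generator:   {d_poor:.3f}")
```

A distance near 0 means the two marginal distributions are nearly indistinguishable, while a value near 1 means they barely overlap. A per-attribute check like this covers only marginals, which is why the abstract also lists correlation heatmaps and scheduling simulations.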
