Comparison of the MCTS, MCDDQ, MCDDQ-SA, and Greedy algorithms for parallel machine-load scheduling in production
Abstract
Incoming article date: 08.04.2025
This paper considers task scheduling in manufacturing systems with multiple machines operating in parallel. Four approaches to the problem are compared: pure Monte Carlo Tree Search (MCTS); a hybrid MCDDQ agent that combines reinforcement learning based on a Double Deep Q-Network (DDQN) with MCTS; an improved MCDDQ-SA agent that integrates Simulated Annealing (SA) to improve solution quality; and a greedy algorithm (Greedy). An environment model is developed that accounts for machine speeds and task durations. The methods are compared on two metrics: makespan (maximum completion time) and idle time. The results show that MCDDQ-SA provides the best balance between scheduling quality and computational efficiency, owing to its adaptive exploration of the solution space. Analytical tools for evaluating the dynamics of the algorithms are also presented, underscoring their applicability to real manufacturing systems. The paper offers new perspectives on the application of hybrid methods to resource management problems.
Keywords: machine learning, Q-learning, deep neural networks, MCTS, DDQN, simulated annealing, scheduling, greedy algorithm
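To make the two evaluation metrics concrete, the following is a minimal Python sketch of a greedy baseline on parallel machines with different speeds. It is an illustration under stated assumptions, not the implementation used in the paper: the names `Machine`, `greedy_schedule`, `makespan`, and `total_idle_time` are hypothetical, the processing time of a task is assumed to be its duration divided by the machine speed, the longest-task-first ordering is an assumed heuristic, and idle time is assumed to be the time each machine is not processing before the makespan is reached.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    speed: float                  # assumed: work units processed per unit of time
    free_at: float = 0.0          # time at which the machine finishes its last task
    busy: float = 0.0             # total time spent processing tasks
    tasks: list = field(default_factory=list)


def greedy_schedule(durations, speeds):
    """Assign each task to the machine that would complete it earliest."""
    machines = [Machine(speed=s) for s in speeds]
    # Longest tasks first (an assumed ordering; the paper does not specify one).
    for task_id, work in sorted(enumerate(durations), key=lambda t: -t[1]):
        best = min(machines, key=lambda m: m.free_at + work / m.speed)
        run_time = work / best.speed
        best.free_at += run_time
        best.busy += run_time
        best.tasks.append(task_id)
    return machines


def makespan(machines):
    """Maximum completion time over all machines."""
    return max(m.free_at for m in machines)


def total_idle_time(machines):
    """Sum over machines of the time not spent processing before the makespan."""
    span = makespan(machines)
    return sum(span - m.busy for m in machines)


if __name__ == "__main__":
    schedule = greedy_schedule(durations=[4, 3, 2, 2, 1], speeds=[1.0, 2.0])
    print("makespan:", makespan(schedule))
    print("idle time:", total_idle_time(schedule))
```

A search-based agent such as MCTS or MCDDQ-SA would replace the single `min` choice in the loop with a policy that looks ahead over many possible assignments; the metrics themselves would be computed in the same way.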