News Release

Multi-user reinforcement learning based task migration in mobile edge computing

Peer-Reviewed Publication

Higher Education Press

Image: system model

Credit: Yuya CUI, Degan ZHANG, Jie ZHANG, Ting ZHANG, Lixiang CAO, Lu CHEN

Dynamic service migration is a key technology in Mobile Edge Computing (MEC). In a multi-user service migration scenario, combining the states of all users into a single global state makes the system unstable and ignores the influence among multiple users. Designing an effective migration strategy that balances migration cost and latency in a multi-user distributed environment is increasingly challenging.
To address these problems, a research team led by Degan ZHANG published new research on 15 August 2024 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
The team proposes a multi-user task migration model that accounts for migration cost, quality of service (QoS), server workload, and spectrum resource allocation. Subject to a constraint on migration cost, the team formulates multi-user task migration as an optimization problem that minimizes system delay.
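As a rough illustration of what such a formulation can look like, the constrained problem might be written as below. The release does not give the paper's notation, so every symbol here is an assumption: U is the number of mobile users, \pi the joint migration policy, D_u the delay experienced by user u, C_u that user's migration cost, and C_max a cost budget. Whether the paper bounds the total cost or each user's cost separately is not stated.

```latex
\min_{\pi} \; \sum_{u=1}^{U} D_u(\pi)
\quad \text{subject to} \quad
\sum_{u=1}^{U} C_u(\pi) \le C_{\max}
```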
Deep Deterministic Policy Gradient (DDPG) uses an experience replay mechanism. However, samples are drawn from the experience replay storage at random, ignoring the fact that samples differ in importance, which leads to low sampling efficiency. Low-complexity samples contribute little to neural network learning, while in the early stages of learning the networks have difficulty learning from high-complexity samples. The team therefore assigns a priority weight to each state sample in the experience replay storage and sets its sampling probability according to that weight.
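The idea of sampling by priority weight can be sketched in a few lines of Python. This is a minimal illustration, not the team's implementation: the class name, the `alpha` and `eps` parameters, and the rule that turns a priority into a sampling weight are all assumptions borrowed from generic proportional prioritized replay, and the release does not say how the paper computes its weights.

```python
import random


class PrioritizedReplayBuffer:
    """Toy replay buffer that samples transitions in proportion to a priority weight.

    Hypothetical sketch: names and the weighting rule are illustrative assumptions.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha  # 0 gives uniform sampling, 1 gives fully proportional sampling
        self.eps = eps      # keeps every stored weight strictly positive when eps > 0
        self.transitions = []
        self.priorities = []

    def _weight(self, priority):
        # Map a raw priority (for example a TD error) to a sampling weight.
        return (abs(priority) + self.eps) ** self.alpha

    def add(self, transition, priority):
        if len(self.transitions) >= self.capacity:
            # Buffer is full: drop the oldest transition.
            self.transitions.pop(0)
            self.priorities.pop(0)
        self.transitions.append(transition)
        self.priorities.append(self._weight(priority))

    def probabilities(self):
        # Normalize the stored weights into sampling probabilities.
        total = sum(self.priorities)
        return [p / total for p in self.priorities]

    def sample(self, batch_size, rng=random):
        # Draw indices with replacement, weighted by priority.
        indices = rng.choices(
            range(len(self.transitions)), weights=self.priorities, k=batch_size
        )
        return indices, [self.transitions[i] for i in indices]

    def update_priority(self, index, priority):
        # Refresh a weight after the learner has re-evaluated the sample.
        self.priorities[index] = self._weight(priority)
```

With `alpha=0` every stored sample is equally likely, which recovers the plain random sampling that the paragraph above describes as inefficient.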
The team adopts a framework of offline centralized training with distributed execution, which ensures that each mobile user's environment is fixed. Even in scenarios with frequent environmental changes, users do not need to interact frequently, which effectively addresses the impact of other mobile users' behavior on the environment. The team also uses the interaction between a single mobile user and the population to approximate the interaction between that mobile user and the environment.
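One common way to approximate a user's interaction with many others is to summarize the population by an average, in the style of mean-field methods. The sketch below shows that idea only. The release does not state that the paper uses this exact summary, and the function names and the flat list layout of the critic input are hypothetical.

```python
def mean_population_action(actions, user):
    """Average the action vectors of every user except `user`."""
    others = [a for i, a in enumerate(actions) if i != user]
    if not others:
        raise ValueError("need at least two users to form a population")
    dim = len(others[0])
    return [sum(a[d] for a in others) / len(others) for d in range(dim)]


def critic_input(local_state, own_action, actions, user):
    """Build a fixed-size critic input from one user's view.

    The input is the user's local state, its own action, and the
    population's mean action, so its length does not grow with the
    number of users.
    """
    return list(local_state) + list(own_action) + mean_population_action(actions, user)
```

Because the population is reduced to one averaged vector, adding more users changes the values in the input but not its size.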
Results from real application scenarios and simulation experiments show that the proposed AWDDPG achieves fast and stable convergence and performs well in terms of migration cost and average completion time.
DOI: 10.1007/s11704-023-1346-3
 

