Reinforcement Learning for Self Organization and Power Control of Two-Tier Heterogeneous Networks

Self-organizing networks (SONs) can help manage the severe interference in dense heterogeneous networks (HetNets). Given their need to automatically configure power and other settings, machine learning is a promising tool for data-driven decision making in SONs. In this paper, a HetNet is modeled as a dense two-tier network with conventional macrocells overlaid with denser small cells (e.g., femtocells or picocells). First, a distributed framework based on a multi-agent Markov decision process is proposed that models the power optimization problem in the network. Second, we present a systematic approach for designing a reward function based on the optimization problem. Third, we introduce a Q-learning-based distributed power allocation algorithm (Q-DPA) as a self-organizing mechanism that enables ongoing transmit power adaptation as new small cells are added to the network. Further, the sample complexity of the Q-DPA algorithm to achieve ε-optimality with high probability is provided. We demonstrate that, at a density of several thousand femtocells per km², the required quality of service of a macrocell user can be maintained through the proper selection of independent or cooperative learning and appropriate Markov state models.
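For readers unfamiliar with Q-learning, the per-agent update that a distributed power allocation scheme of this kind builds on can be sketched as follows. This is a minimal illustration of standard tabular Q-learning, not the Q-DPA algorithm itself: the class name `PowerAgent`, the discrete power levels, the integer state encoding, and the parameters `alpha`, `gamma`, and `explore_prob` are all assumptions made for the example, and the reward is treated as an opaque number supplied by the caller. Note that `explore_prob` is an exploration rate and is unrelated to the ε in ε-optimality.

```python
import random


class PowerAgent:
    """Hypothetical tabular Q-learning agent for a single small cell.

    States are integers in [0, n_states); actions index into a list of
    candidate transmit power levels. Both encodings are assumptions.
    """

    def __init__(self, n_states, power_levels_dbm, alpha=0.5, gamma=0.9,
                 explore_prob=0.1, seed=0):
        self.power_levels_dbm = list(power_levels_dbm)
        self.alpha = alpha                # learning rate
        self.gamma = gamma                # discount factor
        self.explore_prob = explore_prob  # probability of a random action
        self.rng = random.Random(seed)
        # Q-table: one row per state, one column per power level.
        self.q = [[0.0] * len(self.power_levels_dbm) for _ in range(n_states)]

    def choose_action(self, state):
        """Epsilon-greedy selection of a power-level index."""
        if self.rng.random() < self.explore_prob:
            return self.rng.randrange(len(self.power_levels_dbm))
        row = self.q[state]
        return max(range(len(row)), key=row.__getitem__)

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update toward the one-step bootstrapped target."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
```

In a multi-agent setting, each small cell would hold its own `PowerAgent`, pick a power level with `choose_action`, observe a reward from the network, and call `update`. Whether agents learn independently or share information is a design choice outside the scope of this sketch.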

