Semi-dynamic Green Resource Management in Downlink Heterogeneous Networks by Group Sparse Power Control

This paper addresses the energy-saving problem for the downlink of heterogeneous networks, which aims at minimizing the total power consumption of the base stations (BSs) while each user's rate requirement is supported. The basic idea of this work is to exploit the flexibility and scalability of the system so that more benefits can be gained through efficient resource management. This motivates us to propose a flexible BS power consumption model, which can control system resources, such as antennas, frequency carriers and transmit power allocation, in an energy-efficient manner rather than through the "on/off" binary sleep mode for BSs. To denote these power-saving modes, we employ the group sparsity of the transmit power vector instead of {0, 1} variables. Based on this power model, a semi-dynamic green resource management mechanism is proposed, which can jointly solve a series of resource management problems, including BS association, frequency carrier (FC) assignment, and transmit power allocation, by group sparse power control based on the large-scale fading values. In particular, a successive convex approximation (SCA)-based algorithm is applied to find a stationary solution to the original non-convex problem. Simulation results verify the proposed BS power model and the green resource management mechanism.


I. INTRODUCTION
The definition of next-generation (5G) networks focuses mainly on providing ubiquitous, high-data-rate services for massive numbers of devices: for example, data rates of several tens of Mbit/s should be supported for tens of thousands of users [1]. To realize this 5G vision, future 5G networks should be planned and deployed based on the peak traffic load in an area, such that all quality of service (QoS) levels throughout the entire network can always be satisfied.
Network densification and offloading, increasing bandwidth (e.g., by spectrum sharing [2] and carrier aggregation [3]), and advanced MIMO (e.g., scaling up the number of antennas [4]) are recognized as the three key technologies in 5G networks to increase the spectral efficiency [5]. By employing these concepts, future 5G networks are likely to become increasingly dense, massive and heterogeneous. However, like a double-edged sword, these advances may in return limit the network performance and increase the energy consumption if proper resource management is not adopted. Therefore, if a heterogeneous network (HetNet) is already planned or deployed in a typical area, a question arises:

Q: How to make a HetNet green by resource management in operation, especially for the partially loaded scenarios?
This question on green resource management has attracted intensive research. A brief, comprehensive, yet non-exhaustive review of related work is given as follows.

A. Related Works
The general base station (BS) and user equipment (UE) association is a popular way to improve the overall network performance by scheduling the connections between BSs and UEs such that the inter-BS interference can be properly managed; see [6], [7] and the references therein for HetNets. When green communication is the goal, an adaptive BS-UE association can be used to reduce the network energy consumption by power control. In [8], both the power allocation and the BS assignment in non-orthogonal downlink code-division multiple-access (CDMA) communication systems are jointly studied, where each UE is allowed to connect to more than one BS. The authors in [9] propose a joint BS association and power control algorithm to simultaneously maximize the system revenue and minimize the total transmit power consumption such that each UE can be served by the right BS. Two types of BS-UE association problems are addressed in [10] by minimizing the total network power consumption (global throughput) and minimizing each user's power consumption (UE equilibrium), respectively.
In [11], BS association and downlink beamforming are jointly optimized by minimizing the sum power consumption while guaranteeing a minimum signal-to-interference-and-noise ratio (SINR) per UE. Instead of studying the BS-UE association under universal spectrum reuse, a joint design of flexible spectrum assignment and BS-UE association might further improve the network performance [12]. Another special case of spectrum reuse is orthogonal frequency division multiple access (OFDMA), which leads to a joint subcarrier assignment and BS-UE association problem. Some recent works on energy efficiency maximization for the downlink multi-cell OFDMA system can be found in [13]-[15] and the references therein.
In addition to the green scheduling and power allocation in the above works, another important way to reduce network energy consumption is to completely or partially turn off some "free" BSs with no or low load, e.g., [16]-[21] and the references therein. For instance, the authors in [19]-[22] introduce and optimize a {0,1} matrix to control the "on/off" status of the BSs, and in particular [21], [22] also consider the scheduling and transmit power minimization. However, the "on/off" two-status decision might be too coarse, since this power model implies that all the "on" BSs consume the same constant circuit power regardless of their traffic loads, which is not true in practical systems. This suggests that the hardware components of the networks should be as flexible and reconfigurable as possible, since this hardware flexibility and scalability can be exploited to improve energy efficiency/saving by reconfiguring the BS components according to the effectively used resources [17], [23]. Thus, flexibly turning off or deactivating some hardware components is preferred, e.g., antenna muting/adaptation [24], [25]. In the time domain, discontinuous transmission (DTX) [26] based on the varying channel quality is another example of BS sleep, which is extended in [27] by combining the scheduling and power control to minimize the BS energy consumption. By adopting BS sleep mode mechanisms, some unnecessary energy consumption, for example, the static power and part of the load-dependent power of partially loaded BSs, can be saved.
However, the systems in most previous works on green HetNets are not as flexible and scalable as possible and are usually based on some of the following assumptions: R1. both BSs  power allocation or fractional power control; R7. the "on/off" two-status BS sleep mode is used. In fact, these "restricted" system assumptions should be and can be relaxed due to recent hardware and signal processing capabilities in order to further improve the green performance.

B. Contributions
In this respect, this work aims to develop a power model of HetNets that captures hardware flexibility and reconfiguration, and to provide a semi-dynamic green resource management mechanism that adjusts the network energy consumption to the varying data traffic load. Inspired by the benefits of centralization in cloud technologies [28], we assume that all the BSs in one HetNet are connected to a central processor (CP) via backhaul links (in fact, this work requires only a low backhaul overhead) such that a central optimization can be implemented.
The idea herein is to throw away the concept or limits of the "cell" such that a more flexible association/access between BSs, UEs, frequency carriers (FCs) is allowed under the following system assumptions: to the per-BS transmit power budget.
These general system operation assumptions allow us to formulate a series of flexible scheduling and efficient resource management problems, such as P1. BS-UE association problem (BS selection and "many-to-many" assignment), P2. BS/UE-FC assignment problem (FC selection and "many-to-many" assignment), P3. downlink transmit power allocation problem, P4. intra-carrier interference management problem (a side-product of P1-P3), and P5. flexible BS power model (multiple sleeping modes enabled). In order to jointly solve the above resource management problems, we consider all the BSs, FCs, time blocks and transmit power as the "resources" in the HetNet, and pour them into the "pool" (i.e., the CP). The output of a predefined central optimization of green resource management based on a flexible and scalable power consumption model will give the answer to Question Q.
The main contributions along with the organization of this paper are listed as follows.
• In Section III, inspired by [29], [30], we employ for the first time the $\ell_0$ norm of the power vector in place of the {0, 1} matrix to control the "on/off" status of hardware components according to the effectively assigned FCs. With this choice, the BSs' signal processing power can be flexibly scaled by group sparse power control. Based on this idea, a flexible and scalable BS downlink power consumption model is proposed.
• In Section IV, we formulate a semi-dynamic downlink network energy consumption minimization problem using the slowly varying large-scale fading (LSF) values. This problem enables us to jointly optimize the green BS-UE association, FC assignment, transmit power allocation and BS deactivation. Since this problem is shown to be NP-hard, we apply a successive convex approximation (SCA)-based algorithm in Section V to solve it efficiently, and its convergence to a stationary solution is proved.
Notations: $|\mathcal{X}|$ and $|\mathbf{x}|$ denote the number of elements of a set $\mathcal{X}$ and of a vector $\mathbf{x}$, respectively; $\mathcal{X}(i)$ denotes the $i$-th element of the set $\mathcal{X}$; $\mathcal{X}_1 \setminus \mathcal{X}_2$ denotes the set $\mathcal{X}_1$ excluding all the elements of the set $\mathcal{X}_2$; $\mathrm{diag}[\mathbf{x}]$ denotes a diagonal matrix with the elements of $\mathbf{x}$ as its diagonal elements; $\binom{L}{n}$ denotes the number of $n$-combinations of an $L$-element set.
Let $\mathcal{K} \triangleq \{1, 2, \cdots, K\}$, $\mathcal{L} \triangleq \{1, 2, \cdots, L\}$ and $\mathcal{F} \triangleq \{1, 2, \cdots, F\}$ denote the index sets of the BSs, UEs and FCs, respectively. This setup is denoted by $K \times L \times F$. Based on the general system assumptions A1-A6 in Section I-B, we let $N_k$ and $W_f$ Hz denote the number of antennas of BS $k \in \mathcal{K}$ and the bandwidth of FC $f \in \mathcal{F}$. Let $p^f_{k,\ell} \ge 0$ denote the downlink transmit power at BS $k$ allocated for the transmission to UE $\ell \in \mathcal{L}$ on FC $f \in \mathcal{F}$. The transmit powers $\{p^f_{k,\ell}\}_{\ell \in \mathcal{L}, f \in \mathcal{F}}$ at each BS $k$ may be flexibly allocated to the $LF$ channels, subject to the per-BS transmit power budget $P^{\max}$

A. Channel Model
We assume that the channel on each FC is quasi-static block-fading, i.e., constant for a number of symbol intervals. A symbol interval is denoted by be the instantaneous channel state information (CSI) from BS $k \in \mathcal{K}$ to UE $\ell \in \mathcal{L}$ on FC $f \in \mathcal{F}$ in a certain time slot, where $\alpha^f_{k,\ell}$ denotes the LSF gain including path loss and shadowing, and $h^f_{k,\ell}$ denotes the corresponding small-scale fading (SSF) vector, whose entries are assumed to be independent and identically distributed (i.i.d.) with zero mean and unit variance.

B. Resource Management Mechanism
In terms of resource management, a dynamic design based on the instantaneous CSI gains significant benefits by adjusting strategies to the varying CSI, but at the cost of high complexity. In most practical mobile communication scenarios, complicated instantaneous transmission strategies (e.g., those requiring high overhead and high-complexity iterative algorithms) are usually ruled out by the limited coherence time. In contrast, long-term fixed transmission strategies have a very low complexity but usually result in a very inefficient usage of the resources because of the mismatch between the fixed strategies and the varying CSI. This motivates us to design a semi-dynamic hybrid resource management mechanism:
M1. Maximum Ratio Transmission (MRT) Beamforming: During each coherence time, the low-overhead and low-complexity MRT downlink beamforming scheme is used. Each BS can design the MRT beamforming patterns for its serving UEs independently and locally based only on the instantaneous CSI of the desired links, which requires little computation time (the remaining time can be left for uplink/downlink transmission) and no backhaul overhead for the SSF. One beamforming design is sufficient for the whole coherence time;
M2. Resource Management: During each A-LSF, the green resource management problem is optimized at the CP based only on the LSF values. Only one implementation is needed for the whole A-LSF, hence we call it "semi-dynamic".
In M1, no optimization but only the computation of the simple MRT beamforming pattern is required. Our main focus will be on the optimization in M2, which only requires that the LSF values are available at the CP.
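To illustrate how light the computation in M1 is, the following minimal sketch computes an MRT beamformer from a channel estimate. It assumes the convention that the received signal is proportional to $h^H w$, so that the unit-norm MRT beamformer is $w = \hat{h} / \|\hat{h}\|$; the function names and the numerical channel values are hypothetical and are not taken from the paper.

```python
import math

def mrt_beamformer(h_est):
    """Return the unit-norm MRT beamformer w = h / ||h|| for a channel estimate h.

    Assumed convention: the received signal is proportional to h^H w.
    """
    norm = math.sqrt(sum(abs(x) ** 2 for x in h_est))
    if norm == 0.0:
        raise ValueError("channel estimate is all zeros")
    return [x / norm for x in h_est]

def effective_gain(h, w):
    """Return |h^H w|^2, the received power gain for unit transmit power."""
    inner = sum(hi.conjugate() * wi for hi, wi in zip(h, w))
    return abs(inner) ** 2

# Hypothetical channel estimate for a 4-antenna BS.
h_hat = [1 + 1j, 0.5 - 2j, -1j, 0.3 + 0j]
w = mrt_beamformer(h_hat)

# With a perfect estimate, MRT achieves the gain ||h||^2.
gain = effective_gain(h_hat, w)
```

Each BS only needs its own estimate of the desired link, which is consistent with the statement that M1 requires no backhaul exchange of SSF information.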

C. Channel Acquisition
In order to implement M1 and M2, the acquisition of the SSF and the LSF is required, respectively. Some symbol intervals within each coherence time might be taken for channel training, e.g., by pilot sequence transmission, and the remainder is left for downlink data symbol transmission.
In this work, the time-division duplexing (TDD) operation scheme is employed, because the feedback phase required under frequency-division duplexing (FDD) operation can be eliminated by using channel reciprocity, and additionally the pilot overhead might be reduced for multi-antenna systems, especially for massive MIMO systems. In the uplink channel training, all UEs transmit pilot sequences to their associated BSs on the assigned FCs. Let $\sqrt{\tau_f}\,\phi^f_\ell$ with $\|\phi^f_\ell\| = 1$ be the training vector of length $\tau_f$ transmitted from UE $\ell$ with the transmit power $p^f_{UE,\ell}$ to its associated BS $k$ on an assigned FC $f$.
Let $\mathcal{U}_{FC,f} \subseteq \mathcal{L}$ denote the set of UEs that reuse FC $f$. Then, a $\tau_f \times |\mathcal{U}_{FC,f}|$ pilot sequence matrix is needed for channel training from $\mathcal{U}_{FC,f}$ to their associated BSs. If $\tau_f \ge |\mathcal{U}_{FC,f}|$, we can generate pairwise orthogonal pilot sequences $\{\phi^f_\ell\}_{\ell \in \mathcal{U}_{FC,f}}$. Otherwise, pilot reuse among the UEs in $\mathcal{U}_{FC,f}$ is needed and pilot contamination exists. To cover both cases, we generally denote by $\mathcal{U}^m_{FC,f} \subset \mathcal{U}_{FC,f}$ the set of UEs who use the same pilot. The $\tau_f \times N_k$ received signal at BS $k$ on FC $f$ can be expressed as where $z^f_{k,n} \in \mathbb{C}^{\tau_f \times 1}$, $\forall n \in \{1, \cdots, N_k\}$, denotes the noise vector at the $n$-th antenna of BS $k$ in the uplink training phase on FC $f$. We assume that $z^f_{k,n} \sim \mathcal{CN}(0, W_f \sigma^2 I)$, $\forall n$, since the thermal noise power increases linearly with the carrier bandwidth.
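The condition $\tau_f \ge |\mathcal{U}_{FC,f}|$ for pairwise orthogonal pilots can be made concrete with a short sketch. The construction below uses normalized DFT columns, which is one common choice and an assumption here; the paper does not specify how the pilot sequences are generated, and the function names are hypothetical.

```python
import cmath
import math

def dft_pilots(tau, num_ues):
    """Return num_ues unit-norm, pairwise orthogonal pilots of length tau.

    Each pilot is a normalized column of the tau-point DFT matrix
    (an assumed construction, not the paper's).
    """
    if num_ues > tau:
        raise ValueError("pairwise orthogonal pilots need tau >= number of UEs")
    scale = 1.0 / math.sqrt(tau)
    return [
        [scale * cmath.exp(-2j * math.pi * n * m / tau) for n in range(tau)]
        for m in range(num_ues)
    ]

def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

# Three UEs share one FC and the training length is four symbols.
pilots = dft_pilots(tau=4, num_ues=3)
```

When more UEs than training symbols share an FC, the function raises an error, which corresponds to the case in which pilots must be reused and pilot contamination appears.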

Lemma 1 The minimum mean square error (MMSE) estimate of the channel from a typical UE
The SSF $h^f_{k,\ell}$ can be expressed as
Proof: This result follows from standard MMSE estimation [31, Chapter 15.8].

Remark 1 When no pilot sequence is reused, the channel estimation error might become negli-
is sufficiently large and W f is not very large. Interestingly, (9) also implies that pilot sequences can be reused on the same FC without significant performance loss by those UEs as long as they have low LSF gains or low uplink training power to the same BS. In addition, the assignment of FCs and their bandwidth also influences the channel estimation δ f k,ℓ in (9), since the same link experiences different LSF on different FCs.
The LSF depends on the communication environment and, because of the path loss, mainly on the geo-locations of the UEs. This motivates us to employ the LSF map [32] to denote the LSF of different geo-locations.

Definition 1 An LSF map is defined as the set of LSF values of the densely sampled geo-locations in a geographic area. A "point" on the LSF map contains the $KF$-dimensional LSF values of the downlink channels from the $K$ BSs to the corresponding geo-location on the $F$ FCs, respectively.
An LSF map can be generated offline by measuring the LSF values of the sampled geo-locations in advance once the deployment of BSs is given [33], and thus it can be used as prior information (stored at the BS or the CP) to perform the cooperative semi-dynamic resource allocation in M2. For example, by combining the LSF map with the current UEs' geo-locations (possibly provided by GPS), the LSF values in the next A-LSF can be determined based on the UEs' mobility prediction [34].
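A minimal sketch of how an LSF map might be queried is given below. It assumes that the map is stored as a dictionary from sampled $(x, y)$ positions to a $K \times F$ table of LSF gains and that a UE is mapped to its nearest sampled point; the data layout, the nearest-neighbor rule and all numerical values are illustrative assumptions rather than the procedure of [32]-[34].

```python
def nearest_lsf(lsf_map, location):
    """Return the K-by-F LSF table of the sampled point closest to `location`.

    lsf_map: dict mapping an (x, y) sample point in meters to a nested list
             lsf[k][f] of LSF gains from BS k on FC f (assumed layout).
    location: predicted (x, y) position of a UE.
    """
    def squared_distance(point):
        return (point[0] - location[0]) ** 2 + (point[1] - location[1]) ** 2

    return lsf_map[min(lsf_map, key=squared_distance)]

# Hypothetical map with two sample points, K = 2 BSs and F = 2 FCs.
lsf_map = {
    (0.0, 0.0): [[1e-7, 8e-8], [2e-9, 1e-9]],
    (100.0, 0.0): [[3e-8, 2e-8], [5e-8, 4e-8]],
}

# LSF values used for a UE predicted to be at (80, 10) in the next A-LSF.
alpha = nearest_lsf(lsf_map, (80.0, 10.0))
```

In practice, a denser sampling grid or an interpolation between neighboring points could replace the nearest-neighbor lookup.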

D. Initial BS-UE Association
Let U f k ⊆ L and B f ℓ ⊆ K denote the UEs set simultaneously served by BS k ∈ K and the BSs set simultaneously serving UE ℓ ∈ L, respectively, on FC f ∈ F . Note that some UEs in Then, the proposed result can be obtained by solving a combinatorial problem.
In order to remove unlikely solutions and thereby reduce the complexity, we propose an initial BS-UE association that shrinks the solution set as follows. Each BS $k$ with $N_k$ antennas initially selects the $N_k$ UEs with the strongest LSF gains on each FC, based on the LSF map and the UEs' mobility prediction, to form its initial UE set. Since the LSF mainly depends on the UEs' geo-locations, the $N_k$ UEs with the strongest LSF gains are generally the $N_k$ UEs closest to BS $k$. After all BSs have selected their UEs, each UE $\ell \in \mathcal{L}$ might be simultaneously selected by multiple BSs for potential CoMP transmission.
We let $\mathcal{B}_\ell \triangleq \mathcal{B}^1_\ell = \mathcal{B}^2_\ell = \cdots = \mathcal{B}^F_\ell$ denote the initial BS set consisting of all the serving BSs that initially select UE $\ell$.
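The initial association can be sketched in a few lines. The example below assumes a single LSF gain per BS-UE link (i.e., the same ranking on every FC, in line with $\mathcal{B}^1_\ell = \cdots = \mathcal{B}^F_\ell$); the function name and the gain values are hypothetical.

```python
def initial_association(alpha, num_antennas):
    """Each BS k selects the N_k UEs with the strongest LSF gains.

    alpha[k][l]: LSF gain from BS k to UE l (one value per link, assumed).
    num_antennas[k]: number of antennas N_k of BS k.
    Returns (ues_of_bs, bss_of_ue): the initial UE set of every BS and the
    initial BS set of every UE.
    """
    num_ues = len(alpha[0])
    ues_of_bs = []
    for k, gains in enumerate(alpha):
        ranked = sorted(range(num_ues), key=lambda l: gains[l], reverse=True)
        ues_of_bs.append(sorted(ranked[:num_antennas[k]]))
    bss_of_ue = [
        [k for k, ues in enumerate(ues_of_bs) if l in ues]
        for l in range(num_ues)
    ]
    return ues_of_bs, bss_of_ue

# Hypothetical gains for K = 2 BSs with two antennas each and L = 4 UEs.
alpha = [
    [0.9, 0.5, 0.1, 0.05],
    [0.2, 0.6, 0.7, 0.01],
]
ues_of_bs, bss_of_ue = initial_association(alpha, num_antennas=[2, 2])
```

In this toy setup UE 1 is selected by both BSs (a candidate for CoMP transmission), while UE 3 is selected by no BS and has an empty initial BS set.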

Remark 2
In general, it is reasonable to assume that each UE $\ell \in \mathcal{L}$ is initially selected by at least one BS, i.e., $|\mathcal{B}_\ell| \ge 1$. In fact, it is rare that a UE cannot be initially selected by any BS, since the BSs are equipped with multiple antennas and the BS deployment is in practice based on the UEs' distribution and density. If it does happen, it means that there are more UEs than the network capacity can support or that the non-selected UEs suffer from very bad channel conditions, and thus they should be deactivated during the next A-LSF. In this case, the proposed initial BS-UE association scheme also includes a simple user selection scheme.
After the initial BS-UE association, the number of feasible solutions to Problem P1 is reduced. The optimal scheduling solution can be further determined in the resource management M2 by power control.

III. BSS ENERGY CONSUMPTION MODEL
For the setup $K \times L \times F$ after the initial BS-UE association, the downlink transmit power $\{p^f_{k,\ell}\}_{k \in \mathcal{B}_\ell, \ell \in \mathcal{L}, f \in \mathcal{F}}$ forms an irregular three-dimensional "tensor" of size $|\mathcal{B}_\ell| \times L \times F$. In particular, the status of a link from BS $k$ to UE $\ell$ on FC $f$ is implied by $p^f_{k,\ell}$. More precisely, the link is on if $p^f_{k,\ell} > 0$. Otherwise, it is off. This motivates us to propose a general BS downlink energy consumption model based on the transmit power control.

A. BSs Downlink Energy Consumption Model
Before showing the energy consumption model, we first give some definitions. Let $T_{BS,k}$ and $T^f_{BS,k}$ denote $F|\mathcal{U}_k| \times F\sum_{k=1}^{K}|\mathcal{U}_k|$ and $|\mathcal{U}_k| \times F\sum_{k=1}^{K}|\mathcal{U}_k|$ selection matrices consisting only of {0, 1} entries such that $p_{BS,k} = T_{BS,k}\, p$ and $p^f_{BS,k} = T^f_{BS,k}\, p$, respectively. In the initial BS-UE association, each BS $k$ is allowed to connect to $N_k$ UEs on all $F$ FCs. However, this maximum-connectivity scenario rarely happens, because it is usually neither necessary nor optimal for meeting the UEs' transmission rate requirements, owing to the intra-carrier interference and the per-BS power constraint. Therefore, many elements of $p_{BS,k}$ and $p$ would be zero, which implies that these transmit power vectors have the (group) sparse property. For example, BS $k$ will be in deep-sleep if $p_{BS,k} = 0$. Otherwise, it will be active. Inspired by this sparsity property, we propose to employ the group sparsity of the transmit power vectors to indicate the status of the BSs or FCs.
Definition 3 A vector is group sparse if it has a grouping of its components and the components within each group are likely to be either all zeros or not. Let $x \triangleq [x_1, x_2, \cdots, x_G]$ be an $M \times 1$ vector with $G$ non-overlapping groups, where the vector $x_g$ denotes the $g$-th group of size $M_g$. The weighted group sparsity of the vector $x$ is defined by where $w \triangleq [w_1, w_2, \cdots, w_G]$ with $w_g$ as the weight of the group $x_g$ and When $w = \mathbf{1}$, we use $\|x\|^{G,M_g}_{0}$ to denote the standard unweighted group sparsity $\ell_0$ norm.
can be used to count the number of active BSs. Therefore, we propose to employ the group sparsity of the transmit power vectors to model the downlink BSs energy consumption.
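As a small numerical illustration of Definition 3, the sketch below evaluates a weighted group sparsity measure under the assumption that it equals the sum of the weights of all groups containing at least one nonzero entry (the displayed definition is not reproduced here, so this indicator form is an assumption, chosen to be consistent with the counting interpretation above). All names and values are hypothetical.

```python
def weighted_group_sparsity(groups, weights=None):
    """Sum the weights of the groups that contain at least one nonzero entry.

    groups: list of groups, each a list of numbers.
    weights: one weight per group; defaults to all ones (unweighted case).
    """
    if weights is None:
        weights = [1.0] * len(groups)
    if len(weights) != len(groups):
        raise ValueError("one weight per group is required")
    return sum(w for g, w in zip(groups, weights) if any(v != 0 for v in g))

# Hypothetical transmit powers of one BS, grouped by FC (F = 3, two UEs each).
p_bs = [[0.0, 0.0], [0.3, 0.0], [0.0, 0.0]]
sp_weights = [10.0, 20.0, 20.0]  # hypothetical per-FC processing power in W

active_fcs = weighted_group_sparsity(p_bs)             # number of active FCs
sp_power = weighted_group_sparsity(p_bs, sp_weights)   # weighted version

# Grouping the network-wide power vector by BS counts the active BSs instead.
p_all = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.2], [0.0, 0.4, 0.0]]
active_bss = weighted_group_sparsity(p_all)
```

The same routine therefore serves both purposes mentioned in the text: counting active FCs of one BS and counting active BSs in the network.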

Proposition 1
The BSs power consumption model can be assumed as where P sleep 0 k denotes the basic static power consumption to support the "deep-sleep" mode of BS k; and µ k [P 1 sp,k , P 2 sp,k , · · · , P F sp,k ] denotes the weights for the weighted group sparsity where P f sp,k denotes the weight for the f -th group of p BS,k and is expressed by [35] where P ′ BB and P ′ RF are some reference baseband and RF related signal processing power consumption per 10 MHz bandwidth; and η k ∈ (0, 1) denotes the downlink power amplifier (PA) efficiency ratio of BS k; and P haul is the reference backhaul power consumption for a backhaul collection of wireless links of 100 Mbit/s capacity [36] and R haul is the average total backhaul transmission rate.

B. Explanation: Terms in Power Consumption Model
The proposed BS power consumption model in (12) is explained term by term as follows:
Remark 3 When BS $k$ is in "deep-sleep" mode, its signal processing power $\{P^f_{sp,k}\}_{f=1}^{F}$ is equal to zero. We employ $\|p_{BS,k}\|^{F,|\mathcal{U}_k|}_{0,\mu_k}$ to count the number of effective FCs assigned to BS $k$, which allows each BS to have (F !+1)-level signal processing power by turning off part of its hardware components according to the effective (assigned) bandwidth. This term is load-dependent.

For example, if a BS is required to support a high data load of UEs, more FCs should be assigned, but at the cost of high signal processing power. In contrast, a BS is placed into "deep-sleep" if no FC is needed. Therefore, the multi-level signal processing power can be determined by the group sparse power control based on the UEs' rate requirements.

Backhaul Power:
This term measures the power consumption caused by the backhaul overhead, which usually includes the exchange of the CSI, the transmission data and the signaling between coordinated BSs (e.g., in iterative processing). The backhaul power consumption highly depends on the mechanism/algorithm itself. For instance, our proposed semi-dynamic resource management mechanism has no need for backhaul communication during the channel training and the local MRT beamforming pattern design. Its main requirement is to deliver the downlink user data to the associated BSs. Therefore, in our scenario the average total resulting backhaul rate for each UE is approximately its average downlink data rate, thereby $R_{haul} \approx \sum_{\ell \in \mathcal{L}} R_\ell(p)$, where $R_\ell(p)$ is defined in bits/s as the average downlink transmission rate for UE $\ell$.
The proposed BS energy consumption model in (12) is expressed as a function of transmit power vector p. This implies that a series of resource management problems, such as the tradeoffs between the BSs energy consumption and downlink transmission rate and Problems P1-P4 in Section I-B, can be jointly solved by optimizing a single variable p.
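To make the dependence on $p$ tangible, the following sketch evaluates an additive power model assembled from the terms described above: a static deep-sleep power per BS, a per-FC signal processing power that is paid only when the FC carries nonzero transmit power, the transmit power divided by the PA efficiency, and a backhaul term scaled per 100 Mbit/s. This additive form is an assumption made for illustration, since the exact expression (12) is not reproduced here; the function name and all numerical values except the PA efficiencies are hypothetical.

```python
def bs_power(p, p_sleep, p_sp, eta, p_haul_per_100mbps, r_haul_bps):
    """Evaluate an assumed additive BS power model (not the paper's exact (12)).

    p[k][f]: list of transmit powers (W) of BS k on FC f, one per served UE.
    p_sleep[k]: static deep-sleep power of BS k (W).
    p_sp[k][f]: signal processing power of BS k when FC f is active (W).
    eta[k]: PA efficiency of BS k, in (0, 1).
    p_haul_per_100mbps: backhaul power per 100 Mbit/s of backhaul rate (W).
    r_haul_bps: average total backhaul rate (bit/s).
    """
    total = 0.0
    for k in range(len(p)):
        total += p_sleep[k]
        for f, group in enumerate(p[k]):
            if any(v > 0 for v in group):  # FC f is active at BS k
                total += p_sp[k][f]
        total += sum(sum(group) for group in p[k]) / eta[k]
    total += p_haul_per_100mbps * r_haul_bps / 100e6
    return total

# Hypothetical two-BS example: BS 0 uses one FC, BS 1 is in deep-sleep.
p = [[[0.5, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
total_power = bs_power(
    p,
    p_sleep=[10.0, 5.0],
    p_sp=[[20.0, 20.0], [4.0, 4.0]],
    eta=[0.35, 0.25],            # PA efficiencies of macro and pico BSs
    p_haul_per_100mbps=50.0,
    r_haul_bps=4e6,
)
```

Setting a whole group of $p$ to zero removes the corresponding signal processing term, which is the mechanism by which power control alone switches FCs and BSs off.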

IV. DOWNLINK TRANSMISSION RATE AND PROBLEM FORMULATION
In this work, we desire to minimize the BSs energy consumption while each UE's required downlink rate is guaranteed. The downlink rate of an individual UE is first derived as follows.

A. Downlink Transmission Rate
Given an initial BS-UE association, the average transmission rate of each UE $\ell \in \mathcal{L}$ during $T_{LSF}$ can be expressed as where 1 − τ f β 2,f denotes the downlink data transmission time fraction, and $R^f_\ell$ denotes the rate contribution from $\mathcal{B}_\ell$ to UE $\ell$ on FC $f$, i.e., where $\mathbb{E}_h\{\cdot\}$ denotes the expectation only with respect to the SSF within each $T_{LSF}$, because the LSF stays constant within $T_{LSF}$; $w^f_{k,\ell} \in \mathbb{C}^{N_k \times 1}$ denotes the instantaneous downlink beamforming vector designed based on the estimated CSI at BS $k$ for UE $\ell$ on FC $f$; and Inter-BS$^f_\ell$ and Intra-BS$^f_\ell$ denote the inter-BS and the intra-BS interference to UE $\ell$ on FC $f$, respectively.
Proof: See Appendix A.

Remark 4 The approximation is because
is used, which is widely used and partially justified in the performance analysis for the multi-antenna systems (e.g., [37]). In particular, the simulations in [38] imply this approximation has a high accuracy, especially for the large antenna array.

B. Problem Formulation
A semi-dynamic green resource management problem of BSs power minimization by group sparse power control is formulated as follows where the objective function P BS is shown in (12), and the R f ℓ in downlink transmission rate constraint (18b) is based on (17), and the constraint (18c) denotes per-BS transmit power constraint because of the hardware limits.
However, it is challenging to solve (18) directly. One reason is that minimizing the group sparsity ($\ell_0$ norm) in (10) is a well-known NP-hard problem. Another reason is that the term (17), whose structure is coupled through the transmit power, resembles the sum rate expression of single-input single-output (SISO) interference networks and also leads to an NP-hard optimization problem. The goal of this work is to efficiently compute high-quality suboptimal solutions of Problem (18).

C. Problem Reformulation
In order to make Problem (18) tractable, a common approach is to relax the group sparsity $\ell_0$ norm to a mixed $\ell_2/\ell_1$ norm. The weighted group sparsity of a vector $x$ in (10) is then approximately expressed as $\|x\|^{G,|x_g|}_{0,w} \approx \sum_{g=1}^{G} w_g \|x_g\|_2$, which is non-smooth but convex (its minimization is known as a group Lasso problem). However, [39] and [40] provided a comparison of several non-convex approximations of the $\ell_0$ norm and suggested that the following log-based approximation usually has a better sparse recovery performance where $\epsilon$ in (19) is set to be a very small constant. The simulations in the paper imply that the choice of $\epsilon$ has only a very slight influence on the performance.
Based on (19) and (14), the BS power consumption in (12) approximately becomes where $t_k \triangleq T^{T}_{BS,k}\mathbf{1}$, $t_{k,f} \triangleq T^{f,T}_{BS,k}\mathbf{1}$ and $R_\ell(p)$ is given in (17). The average individual UE rate on FC $f$ in (17) can be rewritten in vector form where $\alpha^f_{\mathcal{B}_\ell,\ell}$ is an $LF|\mathcal{B}_\ell| \times 1$ all-zeros vector except for the positions corresponding to $\{\alpha^f_{k,\ell}(\delta^f_{k,\ell}(N_k - 1) + 1)\}_{k \in \mathcal{B}_\ell}$, and $\alpha^f_{\mathcal{K},\ell}$ is similarly defined. In (21), we define α f (21) is a difference of two concave (DC) functions of $p$.
Based on the reformulations in (20) and (21) of the objective function and the rate constraint, respectively, and after moving the constant terms out of the objective function, Problem (18) becomes where the total backhaul power consumption term is removed in (22a), because the rate constraint (22b) will be optimally achieved with equality, i.e., $R_\ell(p) = \gamma_\ell$ (a constant term). However, Problem (22) is still difficult to solve, since it is a concave minimization problem with DC constraints.
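The difference between the two relaxations can be seen on a toy vector. The sketch below compares the exact group count with the $\ell_2/\ell_1$ (group Lasso) relaxation and with a log-based surrogate of the assumed form $\sum_g w_g \log(1 + \|x_g\|_2/\epsilon) / \log(1 + 1/\epsilon)$. This normalized form is one common choice and is an assumption here, since the exact expression (19) is not reproduced; the function names and numbers are hypothetical.

```python
import math

def l2_norm(group):
    return math.sqrt(sum(v * v for v in group))

def group_l0(groups, weights):
    """Exact weighted group count: weights of groups with a nonzero entry."""
    return sum(w for g, w in zip(groups, weights) if l2_norm(g) > 0)

def group_lasso(groups, weights):
    """Convex mixed l2/l1 relaxation: sum_g w_g * ||x_g||_2."""
    return sum(w * l2_norm(g) for g, w in zip(groups, weights))

def log_surrogate(groups, weights, eps=1e-6):
    """Assumed log-based surrogate, normalized so a unit-norm group counts w_g."""
    scale = math.log(1.0 + 1.0 / eps)
    return sum(
        w * math.log(1.0 + l2_norm(g) / eps) / scale
        for g, w in zip(groups, weights)
    )

# Toy vector with one empty group, one weak group and one strong group.
groups = [[0.0, 0.0], [0.001, 0.0], [5.0, 0.0]]
weights = [1.0, 1.0, 1.0]

exact = group_l0(groups, weights)
lasso = group_lasso(groups, weights)
log_val = log_surrogate(groups, weights)
```

The mixed norm grows linearly with the magnitude of each active group, whereas the log-based surrogate grows only logarithmically and therefore stays closer to the exact count on this example, at the price of being non-convex.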

V. SCA-BASED ALGORITHMS AND SOLUTIONS
In this section, an SCA-based algorithm is applied to compute locally optimal solutions of the non-convex problem (22). The basic idea of the SCA-based algorithm (in the spirit of [41], [42]) is to iteratively 1) construct a surrogate function as an upper bound of the objective/constraint function at the current solution and then 2) optimize the problem with the surrogate functions, which yields the next estimate of the parameters.

A. Technical Preliminaries
Consider the following non-convex optimization problem:
$\min_{x \in \Omega} \; y(x) \quad \text{s.t.} \quad c_j(x) \le 0, \; j = 1, \cdots, J,$ (23)
where $y, c_j : \mathbb{R}^M \to \mathbb{R}$ are non-convex but smooth functions with the form of
$y(x) = y^+(x) - y^-(x), \quad c_j(x) = c^+_j(x) - c^-_j(x),$ (24)
where $y^+, y^-, c^+_j, c^-_j : \mathbb{R}^M \to \mathbb{R}$ are continuous convex functions, and $\Omega$ is a convex set in $\mathbb{R}^M$. We define $\mathcal{X} \triangleq \{x \in \Omega : c_j(x) \le 0, j = 1, \cdots, J\}$. Problem (23) is a DC program with DC constraints (non-convex in general). In SCA, a common scheme to generate a surrogate function is to linearize the non-convex functions by a first-order Taylor series. For example, either the completely linearized (CL) function
$y^{CL}(x, z) = y(z) + (\nabla y(z))^T (x - z)$ (25)
or the partially linearized (PL) function
$y^{PL}(x, z) = y^+(x) - y^-(z) - (\nabla y^-(z))^T (x - z)$ (26)
can be the surrogate function of $y(x)$, which is tight at a feasible point $z$, i.e., $y^{CL}(z, z) = y^{PL}(z, z) = y(z)$. Similarly, $c^{CL}_j(x)$ or $c^{PL}_j(x)$ is assumed to be the surrogate function of the DC constraint function $c_j(x)$, $\forall j$. Then, the DC program with DC constraints can be approximately formulated as a sequence of convex optimization problems (over multiple iterations), each of which can be solved using algorithms and toolboxes from convex optimization. Therefore, Problem (23) can be suboptimally but efficiently solved by the following Algorithm 1 and its variants.
Algorithm 1 SCA-based Algorithm to Solve DC Program (23)
Initialization: $i = 0$, $x^{(0)} \in \mathcal{X}$ and $\epsilon_{th}$.
repeat
Generate the surrogate functions $y^{PL}(x, x^{(i)})$ and $c^{PL}_j(x, x^{(i)})$ by following (26);
Solve the convex optimization problem

Remark 5 In principle, both PL functions and the CL functions (if they are feasible) can be flexibly used as the surrogate functions of the non-convex objective and constraint functions,
which might lead to some variants of Algorithm 1.
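The mechanics of the SCA iteration can be illustrated on a one-dimensional toy DC program that is unrelated to Problem (22) and chosen only because each convex subproblem has a closed-form solution. We take $y(x) = x^4 - 2x^2$ with $y^+(x) = x^4$ and $y^-(x) = 2x^2$, no constraints, and use the partially linearized surrogate; the function name, starting point and tolerance are hypothetical.

```python
import math

def sca_minimize(z0, tol=1e-10, max_iter=200):
    """Minimize y(x) = x**4 - 2*x**2 by SCA with partial linearization.

    At the current point z, the concave part -2*x**2 is replaced by its
    tangent, giving the convex surrogate x**4 - (2*z**2 + 4*z*(x - z)),
    which upper-bounds y and is tight at z. Its minimizer solves
    4*x**3 = 4*z, i.e. x = cbrt(z).
    Returns the final iterate and the objective value at every iterate.
    """
    def y(x):
        return x ** 4 - 2.0 * x ** 2

    z = z0
    history = [y(z)]
    for _ in range(max_iter):
        # Closed-form solution of the convex subproblem (real cube root).
        x = math.copysign(abs(z) ** (1.0 / 3.0), z)
        history.append(y(x))
        converged = abs(x - z) < tol
        z = x
        if converged:
            break
    return z, history

x_star, history = sca_minimize(0.5)
```

Starting from 0.5, the iterates approach the stationary point $x = 1$ and the objective values never increase, which mirrors the monotone behavior expected from surrogate functions that are tight upper bounds. In a problem such as (22) the closed-form step would be replaced by a call to a convex solver.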

B. Solutions of BS Energy Consumption Minimization
By the above SCA-based algorithm, Problem (22) as a DC program can be solved.
At a feasible point q, based on (25) and (26) and after removing the constant terms, the surrogate function of the concave objective function and the DC constraint in (23) can be expressed by respectively.
After selecting a feasible initial point p (0) , Problem (22) can be suboptimally solved by the following Algorithm 2.
In Algorithm 2, (31) is a convex optimization problem with a linear objective function and convex constraints, which can be optimally and efficiently solved by the CVX toolbox.

Remark 6
The surrogate function R S ℓ (p, p (i) ) in (30) is an upper bound of the real rate function R ℓ (p), but in each iteration it is always achieved that R S ℓ (p ⋆ , p (i) ) = γ ℓ , ∀ℓ where p ⋆ is the Algorithm 2 SCA-based Algorithm to Solve Problem (22) Initialization: i = 0, a feasible p (0) and ǫ th . repeat Solve the convex optimization problem (27a)). This makes that each UE rate requirement can be finally guaranteed.

Proposition 2
The SCA-based algorithm in Algorithm 2 always converges to a KKT stationary solution of Problem (22).
Therefore, a locally optimal solution $p$ to Problem (22) can be obtained by Algorithm 2, although it is not guaranteed to be globally optimal. This solution also gives the answers to Problems P1-P4 in Section I-B.

C. Performance Analysis
We theoretically compare our proposed algorithm, which is based on the flexible assumptions A2-A4 in Section I-B, with baselines that study the same BS power minimization problem with the proposed BS power model but under the assumptions R2-R5 in Section I-A.
Proposition 3 Based on the flexible system assumptions A2-A4 in Section I-B, our proposed green resource management mechanism always outperforms those baselines which are based on the assumptions R2-R5 in Section I-A.
Proof: Similar to Definition 2, we let $p_{UE,\ell} \in \mathbb{R}^{F|\mathcal{B}_\ell| \times 1}$, $p^f_{UE,\ell} \in \mathbb{R}^{|\mathcal{B}_\ell| \times 1}$, and $p_{FC,f} \in \mathbb{R}^{L|\mathcal{B}_\ell| \times 1}$ denote the power of the BS set $\mathcal{B}_\ell$ to UE $\ell$ on all FCs, the power of the BS set $\mathcal{B}_\ell$ to UE $\ell$ on FC $f$, and the power of all the BSs to all the UEs on FC $f$, respectively. The "restricted" assumptions R2-R5 can be equivalently formulated as the following theoretical constraints respectively. Therefore, for example, a baseline with assumption R2 can be formulated as the optimization problem (18) with the extra constraint (32). In optimization, adding constraints to a problem with the same objective can only degrade the performance (or leave it unchanged when the extra constraints are inactive), since the feasible set shrinks. In this work, the constraints (32)-(35) have in fact been relaxed by the general assumptions A2-A4, as shown in Problem (18), which establishes the claim.

D. Implementation
The implementation of the proposed semi-dynamic green resource management mechanism during each A-LSF in a HetNet is summarized as follows.   [43] PA efficiency 35% (macro), 25% (pico) • Step 4: Repeat Step 3a to Step 3c until the end of the A-LSF.

VI. SIMULATIONS AND DISCUSSIONS
In this section, the performance of the proposed algorithm is evaluated on a 3-macro cell two-tier HetNet. Each macro cell is a regular hexagon with a radius of 250 meters and a single macro BS located at the center, where the same number of pico BSs and UEs are randomly deployed within each macro cell with the simulation parameters in Table I.
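The random deployment described above can be sketched as follows. This is a minimal illustration under stated assumptions: the 250 m radius is interpreted as the circumradius of the hexagon, positions are drawn uniformly (the text only says "randomly"), and the function names `sample_in_hexagon` and `drop_cell` are hypothetical.

```python
import math
import random

def sample_in_hexagon(radius, rng):
    """Draw one point uniformly inside a regular hexagon centered at the origin.

    The hexagon has vertices at (+-radius, 0) and (+-radius/2, +-sqrt(3)/2 * radius).
    Rejection sampling from the bounding rectangle is used.
    """
    half_height = math.sqrt(3.0) / 2.0 * radius
    while True:
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-half_height, half_height)
        # The rectangle already enforces |y| <= half_height; this test
        # removes the four corner triangles outside the slanted edges.
        if math.sqrt(3.0) * abs(x) + abs(y) <= math.sqrt(3.0) * radius:
            return x, y

def drop_cell(n_pico, n_ue, radius=250.0, seed=0):
    """Place pico BSs and UEs at random positions in one macro cell."""
    rng = random.Random(seed)
    picos = [sample_in_hexagon(radius, rng) for _ in range(n_pico)]
    ues = [sample_in_hexagon(radius, rng) for _ in range(n_ue)]
    return picos, ues

picos, ues = drop_cell(n_pico=5, n_ue=5)
```

Coordinates are relative to the macro BS at the cell center. Repeating `drop_cell` with different seeds gives independent drops for averaging.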
As shown in Section V-C, our proposed algorithm never performs worse than the baselines based on the restricted BS-UE association and BS/UE-FC assignment assumptions R2-R5 in Section I-A. The focus herein is therefore on two other baselines:
• L 2,1 Approx: The same optimization by Algorithm 2, but using the ℓ1/ℓ2 mixed norm to approximate the ℓ0 norm instead of (19). This baseline shows the impact of the ℓ0 norm approximation.
• Min. T-Power: This baseline minimizes only the downlink transmit power, which is a widely used metric in previous work on energy efficiency and energy saving.
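To make the difference between the two group-sparsity surrogates concrete, the sketch below compares an ℓ1/ℓ2 mixed norm with a log-based surrogate on a small power vector. Equation (19) is not reproduced in this section, so the log form used here, log(1 + ||p_g||/eps) normalized by log(1 + 1/eps), is an assumed, commonly used variant and may differ from the one in the paper. The grouping and the values in `p` are hypothetical.

```python
import math

def group_norms(p, groups):
    """Euclidean norm of each group of entries of the power vector p."""
    return [math.sqrt(sum(p[i] ** 2 for i in g)) for g in groups]

def l0_groups(p, groups):
    """Exact count of groups with nonzero power (the quantity being approximated)."""
    return sum(1 for n in group_norms(p, groups) if n > 0.0)

def l21_surrogate(p, groups):
    """l1/l2 mixed norm: sum of the group norms. It grows with the power level."""
    return sum(group_norms(p, groups))

def log_surrogate(p, groups, eps=1e-3):
    """Assumed log-based surrogate, normalized so a group of norm 1 counts as 1."""
    scale = math.log(1.0 + 1.0 / eps)
    return sum(math.log(1.0 + n / eps) for n in group_norms(p, groups)) / scale

# Hypothetical vector: one silent group, one weak group, one strong group.
p = [0.0, 0.0, 0.02, 0.01, 3.0, 4.0]
groups = [(0, 1), (2, 3), (4, 5)]

count = l0_groups(p, groups)
mixed = l21_surrogate(p, groups)
logged = log_surrogate(p, groups)
```

Here two groups are active. The mixed norm is dominated by the magnitude of the strong group, while the log surrogate stays much closer to the group count, which illustrates why the choice of approximation can affect the resulting sleep-mode decisions.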

A. Performance Evaluation for Deterministic UEs
We first evaluate the performance of Algorithm 2 within a typical A-LSF, where the UEs' locations can be considered fixed because the LSF does not vary during each LSF time period. We assume 5 pico BSs per macro cell. The partially loaded scenario is considered, where 5 UEs are located within each macro cell and each UE has a data rate requirement of 2 Mbit/s.
No pilot sequence is reused. As shown in Table I, a total spectrum of 40 MHz is available.
A result example for Algorithm 2 is shown in Fig. 1. When we assume that two FCs are adopted, each with a bandwidth of 20 MHz (e.g., the multiple access scenario), an energy consumption comparison with the baselines "L 2,1 Approx" and "Min. T-Power" is shown in Fig. 2. Observe that the energy consumption increases with the UE rate requirement and that our algorithm achieves an energy reduction of more than 50% compared with "Min. T-Power", since "Min. T-Power" does not optimize the sleep modes. This implies that our proposed flexible BS power model provides more degrees of freedom for further energy saving. The log-based approximation also outperforms the ℓ1/ℓ2 mixed norm.
In Fig. 3, the convergence behavior of Algorithm 2 is shown, where we set the parameter $\epsilon$ for the ℓ0 norm approximation in (19) as $\epsilon \in \{10^{-1}, 10^{-3}, 10^{-5}, 10^{-7}\}$, and 10 random initializations are used for each $\epsilon$. Fig. 3 shows that the ℓ0 norm approximation in (19) is robust to the choice of $\epsilon$, and that different initializations might lead to different KKT stationary solutions with a similar convergence rate.

B. Average Performance Evaluation
The average performance of the proposed algorithm is evaluated by 100 Monte Carlo simulations, where the locations of 5 UEs are randomly generated within each macro cell. The average energy consumption for 2 FCs with respect to the UEs' data rate requirement is shown in Fig. 4, which behaves similarly to the deterministic scenario in Fig. 2 (also with more than 50% energy reduction). This implies that the performance of the proposed algorithm is not highly influenced by the specific channel values.
In Fig. 5, we illustrate the total energy consumption for the higher UE rate requirement of 20 Mbit/s, where some of the macro BSs are not in deep sleep, and thus the signal processing power scales with the number of macro BS antennas. Note that the signal processing power also depends on the bandwidth of the assigned FCs. By the sparse power control, the proposed algorithm and   Algorithm 2 is slightly decreasing with the carrier splitting, while the two baselines seem insensitive to the amount of carrier splitting. In contrast to carrier aggregation, narrowing an FC sacrifices spectrum efficiency but allows a more flexible resource usage for scheduling. This is particularly important for partially loaded scenarios, where a wide carrier may not be necessary. The trade-off between spectrum efficiency and energy efficiency with respect to the bandwidth and the number of FCs will be studied in our future work.

VII. CONCLUSIONS
In this paper, motivated by the requirement for energy saving in partially loaded HetNets, we propose an optimization scheme that makes the system operation as flexible and scalable as possible. This flexibility provides more freedom to help the network reduce the energy consumption by deactivating unnecessary hardware components. A flexible BS power consumption model is developed to support the scalability, which allows the BS to control the system resources, such as antennas and frequency carriers, for energy saving by group sparse power control. Based on this power model, a BS energy consumption minimization problem that supports each user's rate requirement is formulated and optimized only with respect to a transmit power vector. Solving this problem jointly yields solutions to a series of green resource management problems, such as BS-UE association, BS/UE-FC assignment, the BS signal processing power levels, and the energy minimization. In addition, this work provides a general framework for BS energy minimization, which is independent of the BS tiers/density and the number/bandwidth of the FCs. Simulation results indicate that the proposed algorithm is capable of reducing the BS power consumption by more than 50%.

(22). Considering the properties of the cluster point, we have $p^{(i)} = p^{(i+1)} = p$ as $i \to +\infty$ for the optimization of (31). Therefore, given $p^{(i)} = p$, the optimal solution $p^{(i+1)} = p$ of (31) should satisfy the KKT conditions (46a)-(46d), where $\zeta_\ell, \forall \ell \in L$ and $\theta_k, \forall k \in K$ are the Lagrangian multipliers. Observe that the KKT conditions (46a)-(46d) are exactly the same as the KKT conditions of Problem (22). Therefore, $p$ with the associated Lagrangian multipliers $\{\zeta_\ell, \theta_k\}$ is a KKT stationary solution to the original problem (22).