Abstract
With ever-growing technical advances, performance of complex scientific and engineering applications has arrived at petaflops and exaflops range. However, massive power drawn from the large scale computing infrastructure has caused commensurate rise in electricity consumption, escalating data center ownership costs besides leaving carbon footprints. Judicious scheduling of complex applications with an objective to reduce overall makespan and reduced energy consumption has become one of the biggest confront in the realm of computing architectures. This paper presents a survey on energy efficient scheduling algorithms based on dynamic voltage and frequency scaling (DVFS) and dynamic power management (DPM) techniques. The parameters considered are mainly the makespan, processor energy (dynamic and static) consumption, and network energy (communication) consumption, wherever appropriate during task scheduling.
Keywords
Introduction
For sustainable computing, growing energy consumption tends to be the major obstacles for the proliferation of high performance computing. Over the last decade, performance improvement of computing systems from gigaflops to petaflops [17] resulted in substantial energy consumption rise. In the current computing paradigm, speedup of petaflops is drawing enormous power consumption, and is inevitably obscuring the expansion of future exascale machines. Contemporary exascale machines consume incredible power consumption of the order of 100 MW leading to unsustainable energy utility costs. There are many survey papers in the domain of power aware scheduling, but as the technology, tools, and associated features are developing rapidly, there is a sufficient research space to uncover certain aspects, which is the focus of work in this paper. The performance and power consumption scenario of the worldwide powerful machines can be realized from the latest Top500 list, wherein Japan’s Fugaku is number one supercomputer with power and performance of 28.3 megawatt and 415.5 petaflops respectively [49]. The data centers are predicted to draw more than 1000TWh energy in the next decade, surpassing power dissipation of Japan and Germany. Ultimately, power dissipation has reached at an alarming stage that leads to increase operational cost, poor system reliability, additional cooling costs, and limited battery life besides ecological hazards. In view of the issues, challenges, and constraints in computing paradigms, competent energy efficient scheduling techniques that can sustain the makespan and reduce overall energy consumption need to be investigated. This article intends to help the research fraternity for better insights on energy aware techniques, as well as to develop more competent power aware systems of tomorrow.
As the task scheduling of an application program is crucial, it needs to be managed effectively to utilize the full potency of parallel computing platforms. In the past few decades, researchers are working hard on scheduling problem mainly to get minimal makespan. However, in the light of sustainability, it is pertinent to carefully schedule tasks on an interconnected system of multiprocessor (homogeneous or heterogeneous) machines such that total energy consumption is kept low while meeting quality of service. As injudicious scheduling heuristics may lead to redundant energy consumption, much focus is being paid to optimize the energy consumption nowadays. Sometimes, these objectives are aberrant with each other, and are difficult to attain.
Major part of this paper is divided into two parts, i) Survey of DVFS-based energy aware scheduling on uniprocessor and multiprocessor computing systems, particularly in the context of different task scheduling heuristics, ii) Then, owing to the fabrication of compact and fast transistor sizes for complementary metal oxide semiconductor (CMOS) design, static power dissipation is increasing at an exponential rate. The growing trend of chip scaling to nano-scale causes static power dissipation to come near to the level of dynamic power, and thereby needs careful concerns too. To address this issue, DPM-based energy efficient techniques and associated algorithms are identified and discussed in the second part of the paper.
The remainder of this paper is organized as follows. Section 2 provides a glimpse on task scheduling heuristics, and power consumption in CMOS logic circuit, and an overview of the power management techniques. Next Section presents survey on energy efficient scheduling algorithms based on DVFS and DPM techniques. Finally, Section 4 discusses the concluding remarks. The main contributions of the proposed paper are as:
Paper presents the static scheduling heuristics to reveal them further on energy aware frontage. Energy efficient scheduling algorithms are surveyed based on DVFS and DPM-based techniques. Survey on DVFS-based technique is presented covering, i) different computing platforms (uniprocessor and multiprocessor), ii) homogeneity/heterogeneity of multiprocessor systems, iii) type of application tasks (independent and dependent) to be scheduled, iv) and type of heuristics (list-based, clustering-based, and duplication-based) for scheduling tasks of a parallel application. Survey on DPM-based technique is presented covering, i) different power states of a processor, ii) power aware scheduling considering simple and complex power states of processor. Consideration of optimization parameters that includes makespan/schedule length, dynamic energy, static energy, and communication energy consumption of an application program.
Preliminary on task scheduling heuristics
Scheduling aims at reducing the application’s overall execution time by carefully allocating and assigning tasks to the available processors. The overall time to execute a parallel application is termed as makespan/schedule length. Mapping tasks of an application on a given set of processors can be based on static scheduling or dynamic scheduling [14]. This paper focuses on static scheduling heuristics, wherein execution time of tasks, amount of data-flow, and data dependencies are known at compile-time. Scheduling on multiprocessor system is a well-known nondeterministic NP-complete problem [12] with optimal schedules available under approximation and heuristic-based techniques. Heuristic-based techniques are taken up in this paper, which are further classified as list, clustering, and duplication, each with its own performance and complexity. Over several facets of technology shift and computing paradigms, taxonomy of scheduling techniques into the fundamental categories is shown in Fig. 1.
A brief task scheduling taxonomy.
List-based heuristics use certain priority scheme to assign task priority, and then allocate task to a processor that gives earliest start time. Traditionally, these heuristics were designed for homogeneous multiprocessor systems with and without interprocess communication costs (IPC), and later on, in view of heterogeneity of processors as well. However, these heuristics are not able to well exploit the parallelism of communication intensive graphs due to lack of addressing IPC costs. In pursuit to fulfill the computing power of wider complexity and big simulations, state-of-the-art industrial and scientific society demands heterogeneous computing systems. In this regard, distributed computing resources [13] provide a way to tackle scheduling problems of increased complexity. Some algorithms that fall under list-based heuristic include Highest Level First with Estimated Times (HLFET), Smallest Co-Level First with Estimated Times (SCFET), Critical Path/Most Immediate Successors First (CP/MISF), Heaviest Node First (HNF), Earliest Task First (ETF), Mapping Heuristic (MH), Modified Critical Path (MCP), Fast Critical Path (FCP), Heterogeneous Earliest Finish Time (HEFT), Performance Effective Task Scheduling (PETS), Longest Dynamic Critical Path (LDCP), Lookahead, Constrained Earliest Finish Time (CEFT), Predict Earliest Finish Time (PEFT) etc.
The IPC problem of list-based heuristics is alleviated by clustering-based scheduling as proposed in late eighties and early nineties. Here, the heavily communicating tasks are grouped in cluster, and then mapped on processors. Scheduling algorithms under this heuristic are Edge Zeroing (EZ), Dominant Sequence Cluster (DSC), Clustering and Scheduling System – II (CASS II), Clustering for Heterogeneous Processors (CHP), Triplet, Wrap Cluster Merging (WCM), and Global Load Balancing (GLB), Cluster Pair Priority Scheduling (CPPS), Computation Communication Load Clustering (CCLC), Dynamic Computation Communication Load (DCCL), Randomized Dynamic Computation Communication (RDCC), Bounded Dominant Sequence Clustering (BDSC) etc.
Another static scheduling heuristic is duplication-based scheduling that intends to duplicate predecessor tasks on multiple processors to evade the costly IPC, and thereby gives much reduced makespan over list-based. Some examples from this category are: Task Duplication based Scheduling (TDS), Modified Task Duplication based Scheduling (MTDS), Duplication Scheduling Heuristic (DSH), Bottom-Up Top-Down Duplication Heuristic (BTDH), Critical Path and Fast Duplication (CPFD), Selective Duplication (SD), Heterogeneous Limited Duplication (HLD) etc.
A parallel application is in the form of a directed acyclic graph (DAG). A DAG is decomposed into multiple tasks, and has precedence constraints among the parent and child tasks. For the correct execution of an application program, the tasks must execute as per the precedence constraints relation. A DAG is denoted by a graph
CMOS power consumption
Major sources of power dissipation in CMOS logic circuit [50] are dynamic and static power. Dynamic power dissipation occurs due to the circuit activity as transistors switch states in the logic circuit, and has two main components: i) switched capacitance, and ii) short-circuit current. Switched capacitance, one of the most dominating components, occurs due to charging and discharging of load capacitances during high to low or low to high transitions at the outputs of the circuit, and is defined as:
where
Static power dissipates due to leakage currents, and is independent of transistor switching states. Due to attenuation of CMOS technology to sub micron level and thousands of transistors integration on a chip, static power consumption is increasing more than dynamic power [9]. Leakage current can be of different types – reverse-biased-junction leakage, gate-induced-drain leakage, subthreshold leakage, gate-oxide-tunnelling, gate-current leakage, and punch-through leakage. Out of these, subthreshold leakage and gate-oxide-tunnelling leakage consumes significant total leakage current.
The static power consumption
where
Dynamic voltage and frequency scaling (DVFS)
This technique was introduced in digital circuits in mid-nineties to control processor voltage and frequency based on the system workloads. Conventionally, energy-aware techniques are used to find trade-offs between performance and energy reduction. Selection of lower frequency and voltage pair results in power reduction of processor, but performance may be deteriorated. Thereby, power savings and performance sustain are highly volatile. Several commercial processors (AMD PowerNow and Intel SpeedStep) are driven by this scheme. Many contemporary processors are equipped with built-in support of DVFS to reduce power consumption by means of dynamic control of operating voltage and frequency. As a processor executes several tasks concurrently with inter-task dependencies, idle-times/slacks of varying sizes are typically incurred. In DVFS mechanism, idle-times are exploited with one of the appropriate discrete voltage and frequency to reduce dynamic power consumption of processor. However, lowering down voltage and frequency may increase task execution time, which in turn may elongate the overall makespan. Therefore, exploiting DVFS scheme is beneficial, if reduction in energy consumption does not deteriorate the makespan of an application program.
Dynamic power management (DPM)
Significant amount of power dissipates when processor is in active and idle mode. As peak performance is desired during certain time intervals, processor must not stay in active mode for the entire scheduling periods so as to generate the energy efficient schedules. A CPU/processor supports shutdown and slowdown power-saving modes to reduce power consumption under ambience of certain validation constraints such as context switch overheads. Processor in shutdown mode is considered as highest power saving mode but with more switching overheads. In this paper, slowdown mode is further categorized into standby and dormant modes. Depending on workload of an application program, appropriate power-saving modes could be utilized to reduce much power consumption instead of letting the processor to remain in idle mode [3, 33]. The strategy behind the DPM policy is the judicious selection of available idle slots against one of the feasible power-saving modes. However, switching a processor from a specific power-saving mode to an active mode comes along with certain context switch overheads in terms of transition delay and energy penalty. Consequently, applying DPM is meaningful provided energy consumption reduction for a given power-saving mode succeeds over energy penalty overheads.
Energy efficient scheduling techniques
This paper presents survey on energy efficient task scheduling, in particular to the identification and classification of power/energy aware techniques, types of the computing paradigm, types of an application program, and types of the scheduling heuristics deployed on an application program, as shown in Figs 2 and 3. To begin, energy aware techniques are classified into two prominent categories namely DVFS and DPM. The DVFS-based scheduling is further classified covering many facets such as type of computing platform (homogeneous uniprocessor or homogeneous and heterogeneous multiprocessor system), type of an application (independent or dependent/precedence constrained tasks), type of scheduling heuristics (list-based, duplication-based, clustering-based) deployed on an underlying computing platform, as shown in Fig. 2. Further, DPM-based techniques in relation to different power states of a processor are presented in Fig. 3.
DVFS-based energy efficient scheduling techniques.
Taxonomy of DPM-based energy efficient scheduling.
Earlier works in this sphere were largely focused on real-time scheduling of independent tasks, especially on uniprocessor homogeneous computing platforms. In real-time scheduling, tasks deadline are given and their WCET (Worst-Case Execution Times) are typically known before-hand. Many algorithms developed in this domain have leveraged the performance slack available in real time. Aydin et al. [1] found that using WCET to guarantee the temporal constraints of the workload may not properly utilize the unused computations as WCET and actual execution times of real-time applications can differ. The scheduling algorithms for independent and dependent tasks as proposed by Zhu et al. [54] try to overcome the limitations of prior techniques [1, 21, 53]. In the mid nineties, CMOS technology was not smart enough; the algorithms were evaluated on execution traces and certain simulation models. In papers [29, 44], favorable set of frequency and voltage is computed to reduce energy-delay product of a circuit. The aim was to scale down the operating voltage till the desired clock frequency is reached. By the year 2000, semiconductor technology seemed mature with the inclusion of DVFS-enabled commodity processors.
DVFS-based algorithms can be classified as on-line and off-line. The former one relies on past behavior/history of the system to forecast future workloads to decide voltage-frequency scaling along with possible scaled value. Interval-based DVFS is one of the on-line schemes that divide the time into fixed length intervals so that voltage scaling is activated only at the start of each interval. Weiser et al. [51] suggested an on-line scheme for energy efficient scheduling of non real-time tasks. Later on, Govil et al. [16] extended the work with a sophisticated prediction policy to discover the processor busy interval time, and then smoothing speed to balance power consumption and delay. The main issues with interval-based DVFS are irregular prediction of future workloads and certain performance degradations. To address this issue, Sinha and Chandrakasan [47] suggested adaptive filtering to smooth out noises in the past workload patterns. On contrary, an off-line DVFS algorithm fixes voltage scaling in advance. Task-based DVFS is an off-line scheme that controls voltage and frequency per task basis instead of fixed time intervals, making it more practical and suitable for real-time tasks having varying timing requirements. Based on this approach, Yao et al. [53] proposed an optimal, off-line polynomial time scheduling algorithm for real-time independent tasks on a variable speed uniprocessor system to compute optimal minimum energy schedule. In [21], authors suggested an integer linear programming by which voltage selection varies as per the task switching activity. Tasks with more switch activity were exploited with lower voltages and vice-versa. Ultimately, a task may exhibits different voltage and frequency for different cycles. In contrast, Manzak and Chakrabarti [38] assumed a set of continuous voltages and a single voltage-frequency for a task. They presented an off-line scheduling algorithm based on the switching activity of tasks with random arrivals and deadlines.
DVFS scheduling on multiprocessor systems
In this part of survey, the approach and track on energy efficient techniques on multiprocessor system is discussed for scheduling dependent tasks on homogeneous and heterogeneous systems, and with respect to different scheduling heuristics such as list-based, duplication-based, and clustering-based. It is gathered that most of the available works exploited list-based heuristics over duplication and clustering to reduce dynamic energy consumption.
Energy aware list-based scheduling algorithms
With the inclusion of matured CMOS technology in the year 2000, Gruian and Kuchcinski [15] proposed a list based Low Energy Scheduling (LENES) algorithm for scheduling dependent tasks on homogeneous computing paradigms, as opposed to the prior works mainly done for independent task scheduling. LENES makes use of an energy sensitive priority function to run processor at varying voltages to generate energy efficient static schedules. Its performance for certain random and real graphs was shown to be quite well. An Energy Reduction Algorithm [24] was proposed on a power scalable homogeneous cluster for directed acyclic as well as tree task graphs. The algorithm selects lower and uniform voltage and frequency along the critical path in task graph. Dynamic voltage scaling is applied at an average for two consecutive tasks, provided they exhibit certain slack time. Simulation results on a real platform showed energy reduction with some makespan degradation.
The ignorance of IPC costs in this algorithm was handled by Mori et al. [43] with a Power Aware Scheduling Algorithm (PASA) to schedule dependent tasks on homogeneous cluster. PASA selects an appropriate frequency for the tasks that are not on the critical paths. Based on task selection based mechanism, wherever feasible, power consumption was reduced during heavy and low load computation times. For consecutive tasks with slack times, a task that can reduce maximum amount of power consumption is selected and exploited with DVFS. However, simulation results of PASA were conducted for few random task graphs only.
Another list-based algorithm was formulated by Wang et al. [52] known as Power Aware List Scheduling (PALS) for dependent tasks on homogeneous clusters. This algorithm finds the slack times of non-critical tasks, extends their execution times, and applies voltage scaling during idle and communication phases. In PALS, a list-based earliest task first (ETF) algorithm was implemented to generate a schedule list, and then voltage scaling is applied for all the non-critical tasks. A green service level agreement (SLA) was also formulated that showed negotiation of performance degradation and energy savings. This algorithm was found to be effective over other well-known algorithms namely Energy Reduction Algorithm, LENES, and ECS [37]. Low Power Heterogeneous Machine (LPHM) [8] algorithm combined traditional list-based HEFT algorithm with voltage scaling for continuous voltage scalable processors. It identifies the available scheduling slots on a processor and exploited dynamic voltage scaling technique to reduce dynamic energy consumption without violation of precedence constraints. The simulation results on a large set of random graphs showed a considerable energy consumption reduction over HEFT.
Kaur et al. [28] extended LPHM algorithm, and suggested list-based LDVS algorithm, wherein dynamic voltage scaling is exploited for idle and communication times in the schedule of HEFT algorithm. The total energy consumption was considered for both computation and communication energy. The simulation results of LDVS on an exhaustive set of random and regular graphs were shown better than related list-based HEFT and duplication-based HLD algorithms.
To balance trade-off between makespan and energy consumption, an Energy Conscious Scheduling (ECS) [37] algorithm was implemented for parallel tasks on heterogeneous distributed computing system. Two parameters namely relative superiority (RS) and makespan-conservative energy reduction (MCER) were included for scheduling of tasks. Further, this approach was extended by ECS
Baskiyar and Abdel-Kader [7] presented an Energy Aware DAG Scheduling (EADAGS) algorithm for scheduling dependent tasks on heterogeneous system. The idle slots generated by Decisive path scheduling (DPS) algorithm were exploited with discrete voltage levels to achieve energy gains. For heterogeneous computing platforms, an energy-efficient elastic (3E) [55] algorithm was suggested for scheduling aperiodic, independent non-real time tasks. This algorithm can adjust supply voltages and frequencies of a processor depending on the system workload, and thereby makes a nice compromise between expected finish time and energy consumption. A survey of popular list-based DVFS scheduling algorithms on homogeneous and heterogeneous multiprocessor systems is shown in Table 1.
List-based scheduling heuristics with DVFS technique
List-based scheduling heuristics with DVFS technique
As duplication and clustering-based scheduling heuristics give more reduced makespan, some energy management techniques are also investigated for these heuristics. Power Aware Task Clustering (PATC) [52] was formulated, in which an edge zeroing (EZ) algorithm was made energy aware via DVFS technique. Traditional EZ reduces makespan by zeroing the edges of high communication costs in a task graph. However, PATC guides edge zeroing process from the perspective of power consumption reduction. Ruan et al. [46] suggested Energy Efficient Scheduling algorithm TDVAS for the parallel applications running on homogeneous clusters by utilizing voltage scaling to baseline Task Duplication Scheduling (TDS) algorithm without makespan increase. An Energy Optimization Scheduling for Task Dependent Graph (EOTD) [42] was formulated for homogeneous computing platforms. Unlike the previous works, EOTD mainly focused on targeting communication and static energy of processing elements. EOTD employed task clustering, dynamic voltage scaling, and binary search based power management. For energy optimization, this algorithm make use of clustering scheduling to reduce transmission time and communication energy, DVFS scheme to reduce dynamic energy, and then DPM on a binary search to reduce static energy consumption.
To balance makespan and energy consumption on homogeneous clusters, Zong et al. [56] presented Energy Aware Duplication (EAD) and Performance-Energy Balanced Duplication (PEBD) algorithms. The main motive of these algorithms was to judiciously replicate the predecessors of a task such that replication/duplication reduces makespan without increasing energy consumption. For this, fixed set of threshold values were selected to control the number of duplications. Only those duplications were allowed that do not increase system energy consumption. However, the suggested algorithm did not exploited DVFS technique. The performance of EAD and PEBD were shown to perform well than existing TDS and MCP algorithms for some regular task graphs. An extension to Zong’s work was carried by Liu at al. [34] with an Adaptive Energy Efficient Scheduling (AES) algorithm. AES combined adaptive threshold approach with dynamic voltage scaling. The performance of algorithm was observed good over TDVAS, EAD, and PEBD algorithms.
Some alternatives of Zong’s algorithm were proposed to reduce communication energy consumption of dependent tasks on homogeneous cluster. Liang et al. [31] implemented an Energy Aware Dynamic Clustering Scheduling algorithm, which made use of clustering and dynamic re-clustering techniques. Preliminary clustering aims to create clusters of tasks into multiple groups prior to application execution. However, due to uncertainty of execution time, dynamic re-clustering can adjust the clustering groups based on a threshold and communication overhead. Another Novel Energy Aware Scheduling Algorithm (NEASA) [30] was suggested based on two-phase frequency scaling algorithm. NEASA adopted the DVFS technique, wherein both static as well as dynamic workload variation slack times were reclaimed, and exploited with different frequencies. NEASA is shown to be energy efficient than TDVAS for compute intensive applications. However, the algorithm was not at an obvious advantage for communication intensive applications.
The aforesaid duplication-based algorithms were applicable on homogeneous clusters with an assumption of an unbounded number of processors. But, most of these algorithms are not pertinent on clustering-based scheduling if number of processors tends to be less than as required to schedule multiple cluster groups. In that case, some dummy processors are utilized to assign the leftover clusters. Moreover, for heterogeneous platforms, owing to different capacities of processors, the suggested solutions may not perform well. Some works on duplication-based heuristics are available on heterogeneous platforms, which are discussed below.
Baskiyar and Abdel-Kader [7] suggested an Energy Aware Graph Scheduling with Duplication (EAGS-D) algorithm, wherein duplication-based heterogeneous N-predecessor duplication (HNPD) algorithm was integrated with DVFS. In EAGS-D, duplication is retained if it helps to lower finish time; else it is discarded so as to reduce computation energy consumption. The proposed technique neglected communication energy consumption of interconnects. In Energy Efficient Task Duplication Schedule (EETDS) algorithm, two-phase strategy was employed to balance makespan and energy consumption tradeoff for heterogeneous cluster. In first phase, grouping was done to create group of tasks that belong to a critical path to reduce communication energy consumption. In second phase, duplication-based strategy was applied to reduce more power dissipation. EETDS was observed better than TDS and NDS algorithms with certain makespan degradation.
An Energy Aware minimized Duplication (EAMD) for heterogeneous computing environment was formulated by Mei and Li [40], which is an extension to baseline HLD algorithm. EAMD attempts to remove the superfluous tasks to reduce the processor’s energy consumption. However, removal of redundant tasks might increase the communication energy consumption of its child tasks that was ignored in EAMD. The idea of exploiting DVFS onto the primary and duplicated tasks is implemented in Energy Aware Duplication Scheduling (EADS) [27] algorithm. EADS analyzed computation and communication energy consumption against list-based HEFT, LPHM and duplication based HLD algorithms. Recently, Kaur et al. [26] proposed an Energy Conscious Scheduling with Controlled Threshold (ECSCT) that integrated controlled threshold-based duplication with dynamic voltage scaling. The simulation results were found significant over list-based (HEFT, LPHM) and duplication-based (HLD, EADS, EAMD) for an exhaustive set of random and real task graphs. Maurya and Tripathi [39] developed an Energy Aware Edge Priority Scheduling (EAEPS) for multiprocessors. This algorithm makes the EPS (Edge Priority Scheduling) algorithm energy efficient by zeroing the edges of higher priorities, and reduces the energy consumption by exploiting DVFS technique. Hu et al. [19] developed an algorithm Energy Aware Scheduling with Service Level Agreement EASLA for cluster systems having DVFS enabled processors. The main idea behind their algorithm is to distribute slacks of set of tasks and scale down voltage/frequencies to achieve energy reduction. The survey of popular duplication and clustering-based algorithms for dependent tasks on multiprocessor system is given in Table 2.
Clustering and duplication-based energy aware scheduling
Clustering and duplication-based energy aware scheduling
Owing to compact and fast transistor sizes in the design of CMOS circuit, it is evident that static power consumption is increasing at a fast track, and much research attention is needed to manage it. The trend towards scaling down of VLSI technology to deep sub micron level causes transistors integrated on a chip to consume substantial static power. Austin et al. [4] predicted leakage current to amount to 50 percent of total power dissipation in 90 nm technologies, thereby contributing to the majority of total energy consumption. A sharp rise in leakage currents in nanometer designs is the prime cause of power dissipation, which is causing a five-fold increase in static power consumption with each technology generation. Accordingly, static power is projected to surpass dynamic power in near future [23]. The main causes of static power consumption is due to the leakage currents flowing through a transistor, and it exists whether a processor is doing some computations or is in idle state. Before delving into survey on DPM-based techniques, an overview of the different power states of a processor are shown in Fig. 3.
For execution of an application program, the processor is considered to be in two possible states i) active-power mode, ii) low-power mode. In the former state, processor is doing normal mode of operation with full support functionality. The processor logic and embedded RAM arrays are clocked and powered on. As in active-power mode, processor is doing computations; it is also referred to be in dynamic state. Conversely, when processor is not doing any computations, because it is waiting for some communication actions to complete, it is said to in an idle state. Off-the-shelf computing systems are nowadays equipped with shutdown policies. Shutdown is a simple and effective method to save maximum amount of energy saving. However, this is in general not applicable at large scale owing to context switch constraints (shutdown and wake-up cost).
Processor supports several low-power modes namely standby, dormant, and shutdown. Such low-power states can be exploited during idle periods to reduce static power dissipation regardless of keeping the processor in idle state. In a standby state, processor clocks are disabled while still keeping the logic powered up. As a result, power drawn to leakage current reduces and a small clock power overhead is incurred to switch the processor from standby to active-power state. In dormant mode, the processor is powered off while leaving the data caches powered on. Shutdown mode is similar to dormant, but it powered off the device and all its possible states.
The second part of survey provides DPM-based scheduling techniques to address static power dissipation. Based on dynamically switching on and off the logic circuit or system components, the static power dissipation can be regulated. As a processor is equipped with several low-power modes (shutdown, standby, and dormant), its power consumption can be reduced by switching to one of these modes during idle periods. These techniques are based on three polices – timeout, predictive and stochastic optimum control [5]. In timeout policy, the component is placed in sleep mode if it remains idle for more than a defined timeout interval. The minimum idle-time that pays off the cost linked with low-power mode is known as breakeven time. In predictive policy, strategy is based on the futuristic component’s usage. This policy makes use of correlation between past history and near future workload patterns to adjudge the expected idle periods in the future course of time. The parameter of interest is to set up a time threshold that decides the transition of component from active to low-power state during unused intervals. The predicted idle periods should be long enough for efficient utilization by sleep mode. The judgment of the predicted idle periods is crucial for exploiting the maximum potency of DPM. Non-judicious calculation of idle period can lead to performance penalty or energy wastage. Predictive-based DPM policy can be further classified as adaptive and non-adaptive. In non-adaptive strategy, time threshold is fixed once and is independent of input patterns. Adaptive strategy uses past history patterns to govern future idle periods and thus changes time threshold as per the varied input patterns.
Leakage Control EDF (LC-EDF) and Leakage Control Dual Priority (LC-DP) algorithms aim to extend the idle time enough to be exploited well by sleep state [35]. However, Jejurikar and Gupta [22] found that procrastination scheduling as used in LC-DP may have the chances of deadlines miss in real time tasks. To cope up, they suggested a better technique with DVFS scheme while accounting for transition delay and energy penalty. A simple power model with only shutdown as a single low-power state is considered in their work. Baptiste [6] developed a dynamic control algorithm for aperiodic real-time events to decide when to switch on and off the devices supporting single sleep mode. For multiple low-power modes, authors presented the operating mode that a processor should enter for aperiodic real-time events and developed an on-line algorithm [2, 20]. For energy optimization, integration of DVFS and DPM as combinational approach is suggested by various researchers [20, 23, 45]. However, the integration of both DVFS and DPM in a framework poses some challenges, with a trade-off factor covering the merits and associated costs [11]. With DVFS, scaling down processor frequency reduces dynamic power consumption. Nevertheless, it also tends to elongate task execution time and less idle-time intervals in the schedule, which limits the provision of transiting processor to low-power sleep mode. On the contrary, operating processor at higher frequencies creates enough idle-times for sleep mode transition but at the price of increased processor energy consumption and more energy penalty.
Swaminathan and Chakrabarty [48] proposed DPM to control processor shutdown and wake-up of real-time events for energy savings. Coupling of voltage scaling and shutdown approach to find trade-off between DVFS, processor shutdown, and number of processors employed is realized by Langen and Juurlink [30]. Chen and Thiele [10] developed leakage-aware DVFS scheduling, wherein tasks are executed with decelerating and accelerating frequencies to initiate sleep state to reduce power consumption. However, it still makes use of simplified power model. The assumption of simplified power model and negligible transition delay has been overcome as adopted in some of the recent works [3, 18, 33]. In [3], breakeven time is computed during different low-power states, and an energy aware slack management algorithm was proposed to reduce static energy consumption of real-time tasks on uniprocessor systems. Legout et al. [33] utilized different low-power states of a processor to conserve static energy for real-time and mixed critical tasks. They relied on the use of different power states of a processor to account for transition delay and energy penalty, which were neglected in most of the prior works.
The above-mentioned algorithms have stressed upon the static energy efficiency of periodic or aperiodic real-time tasks. For precedence constrained tasks and scheduling, Ma et al. [41] discussed integration of task clustering with duplication for homogeneous clusters. They developed an Energy-Efficient Scheduling Algorithm of Data Dependent Tasks (ESTD) for DVFS-unable cluster system. The algorithm employs task clustering and duplication to reduce communication cost and then DPM strategy to reduce static energy consumption. The suggested ESTD algorithm was shown to give better performance than duplication-based EAD and PEBD for some random graphs. Kaur et al. [25] suggested Duplication-Controlled Static Energy-Efficient Scheduling (C-SEED) algorithm that couples adaptive threshold- based duplication with system level dynamic power management technique to reduce static energy consumption.
Conclusions
This paper surveys the state-of-the-art DVFS and DPM-based scheduling algorithms largely to address processor’s dynamic and static energy consumption. The former algorithms are surveyed for uniprocessor, and homogenous as well as heterogeneous multiprocessor systems in the context of different scheduling heuristics-list, duplication, and clustering. As analyzed, DVFS-centric scheduling algorithms were primarily focused for scheduling independent tasks on uniprocessor systems, and later on for multiprocessor systems. Much of the works target on dynamic energy consumption by means of managing processor’s active energy but overlooking idle energy consumption. Further, many list-based scheduling algorithms were explored with DVFS scheme to reduce computation energy but did not work towards reducing inter-task communication cost/delay. In addition, static/leakage and network’s communication energy is also neglected in most of the available works. Recently, energy management is carried out for duplication and clustering heuristics mainly to reduce communication energy consumption of network interconnect. As these heuristics have the potential to reduce makespan, they are being investigated on energy efficiency fronts, particularly for heterogeneous computing system. DVFS-centric algorithms are mainly utilized to reduce the processor’s dynamic energy consumption/computation energy.
Additionally, the quantum factors contributing to leakage currents are the main causes of shrinking of silicon feature sizes. Accordingly, static power dissipation is increasing more than dynamic power. DPM-based solutions are gaining importance in managing static power consumption, and are pertinent on the computing paradigms that may not be DVFS-enabled. Most of the DPM-based algorithms are restricted to real-time tasks and that too for homogeneous computing systems. Some works deal with simplified power models without consideration of transition delays and energy penalty while implementing DPM scheme. Only very limited works has facilitated to reduce static energy of dependent tasks in duplication-based schedules for off-the-shelf heterogeneous platforms. The major contribution of the survey is that it provides the trends in energy management techniques and their evolutionary roadmap at a computing platform.
Footnotes
Author’s Bios
