Abstract
As the performance of modern multi-core processors increases, the total energy consumption of these systems also rises sharply. Dynamic Voltage and Frequency Scaling (DVFS) is regarded as one of the most effective schemes for saving energy. In this paper, we consider scheduling a task set, whose release times, deadlines and execution requirements are given, on a DVFS-enabled multi-core processor system. Our main aim is to meet the execution requirements of all tasks and to minimize the overall energy consumption of the processor while utilizing resources effectively. Instead of seeking optimal solutions with high complexity, we aim to design algorithms that are suitable for real-time systems and perform well. We propose a simple algorithm for energy-aware task scheduling under deadline constraints. We further consider deadline distribution and task ordering, which guarantee that all tasks meet their execution requirements while trying to minimize the overall energy consumption. Case-based simulations for various applications and task characteristics, together with evaluations using a practical processor's power configuration, indicate that the proposed algorithm achieves lower energy consumption and better resource utilization despite its low complexity. In addition, the proposed algorithm is easy to implement in practical systems.
Introduction
Recently, the use of high-performance systems in industrial and embedded applications has grown rapidly. For such applications, multi-core architectures are considered a promising and reliable solution [1]. These architectures are used in various embedded system applications and help to improve performance by using DSP (Digital Signal Processing) processors and multiprocessor models. Hence, multi-core or multiprocessor architectures accelerate the processing of large and complex data, and they improve computing performance for large-scale scientific applications and computational platforms. However, these architectures require more power to operate. In designing them, the power constraint is a key consideration because of on-chip power dissipation [2]. In these devices, power consumption can be addressed by supply voltage control schemes in which the workload is distributed among all available processors. Furthermore, efficient task scheduling can also reduce power consumption [3].
The power consumption of processors is divided into two main parts: (a) dynamic power consumption ($P_{dyn}$) and (b) static power consumption ($P_{stat}$). Dynamic power consumption ($P_{dyn} = \alpha V^2 F C$) is caused by switching activity in the processor, whereas static power consumption arises from leakage current ($P_{leak} = V I_{leak}$) [7]. In recent CMOS technology, dynamic power is the dominating issue. Chip density depends on the number of processing cores: placing more cores on the chip increases the density, and a higher chip density is a well-known cause of static power consumption. Because reducing power consumption is challenging, both static and dynamic power need to be considered for the energy optimization of any multicore system. User demand in this field is increasing every day, which raises various issues such as the total implementation cost of processors, makespan minimization and processor reliability [8].
Various techniques have been proposed to address the issue of power consumption. The processor manufacturing industry is moving towards architectures based on Chip Multi-Processors (CMP), where multiple cores are embedded into one chip to optimize power consumption [4]. It has been reported that multicore systems can improve throughput and power consumption compared with uniprocessor systems [5]. Due to the rapid growth of technology and its use in real-time applications, the demand for energy-efficient processors is also growing rapidly. In real-time applications, dynamic voltage/frequency scaling (DVFS) is known as a promising technique for power consumption optimization [6]. Multicore DVFS falls into two main categories: (a) local DVFS and (b) global DVFS. With global DVFS, the processor's voltage and clock frequency are scaled together to meet the dynamic performance demand. With local DVFS, the clock frequency and voltage of individual cores can be set separately [9].
Conventional multicore processor architectures integrate multiple power converters, which support the performance of the architecture. Switched-capacitor converters are compatible with traditional CMOS architectures, but they suffer from lower conversion efficiency. Hence, conversion efficiency and power consumption become challenging issues that need to be addressed to support real-time applications. To overcome this, a dynamic voltage and frequency scaling (DVFS) scheme is combined with switched capacitors to improve performance. This technique allows the output voltage to ripple, with the frequency of the processor core tracking it. The approach uses switching between converter modes and body-bias tuning to achieve efficient energy consumption [10]. Similarly, another approach based on voltage regulators and switched capacitors is presented in [11]. According to this work, conventional DVFS schemes degrade performance because of slow voltage transitions. To overcome this and improve performance, the authors presented voltage-regulator-based architectures in which the regulators are integrated on the chip in chip multiprocessors (CMP). This on-chip integration of voltage regulators provides nanosecond-scale voltage control and per-core voltage control. A contrasting study presented in recent years discusses fine-grained and coarse-grained DVFS systems. According to this study, fine-grained DVFS systems can provide more energy savings for varying workloads, whereas coarse-grained DVFS systems are not affected by scaling speed and time constraints. Motivated by this conclusion, a new architecture based on fine-grained DVFS is presented [12]. Such per-core voltage architectures incur higher manufacturing and packaging costs, which is not acceptable for large-scale industrial manufacturing.
To address this issue, a cost-effective processor architecture has been developed that is based on core-to-core voltage variation and per-core power-gating devices, which helps to obtain a lower implementation cost and higher accuracy [13].
However, implementing voltage regulators on the chip in this way is very challenging. The efficiency of the voltage regulator and the characteristics of its output voltage are key issues faced by researchers and can limit the applicability of on-chip voltage regulators.
Dynamic Power Management (DPM) is another well-known technique for reducing the dynamic power of multi-core processors. With this in mind, Chung et al. [14] developed a dynamic power management scheme that achieves low power consumption for conventional battery-powered devices. However, with the rapid growth of technology and industry, DPM techniques are no longer able to provide efficient performance. This degradation is caused by the handling of multiple requests and by the DPM policy misjudging whether the processor is in the idle state. A dynamic power management system requires a power management unit, which holds all the collected information and analyzes it to support the hardware while an application executes. Furthermore, a predictive power-aware management scheme has also been presented to improve performance [15]. These issues of DVFS and DPM cause errors in task completion time. In multicore systems for real-time applications, task completion time is a main factor in performance. During the execution of an application, raising the priority level of an urgent task alone cannot guarantee that the task completion criteria are met. Hence, a deadline-based task completion strategy is presented, which provides both task completion and energy awareness for the application.
To address the issues of DVFS and DPM, we present a new approach for improving the performance of multi-core processors. The main aim of the proposed study is to optimize power consumption for multi-core systems and to improve performance and stability using DVFS techniques. The proposed approach consists of three main stages, which are as follows:
Deadline consideration and distribution
In this stage, a sub-deadline is assigned to each task in the workflow, which helps to meet the desired or user-defined deadline.
Task definition and assignment
In this stage, we use precedence constraints to order the tasks based on the bottom distance of each task.
Final planning of task completion
Finally, we choose the best service, that is, the combination of frequency and voltage that is capable of finishing each task by its given deadline.
In this work, our main contribution is an energy-aware scheduling approach based on DVFS that results in lower energy consumption for any given task.
The rest of the article is organized as follows: section two describes the most recent studies in this field, section three presents the proposed model, section four presents the result analysis, and section five gives concluding remarks.
Related work
In this section, we discuss recent studies on power management and energy-aware scheduling for multicore embedded systems. Various techniques have been presented for this purpose; they are classified into two main categories: (a) DVFS (Dynamic Voltage and Frequency Scaling) and (b) PMM (Power Mode Management).
DVFS is a well-known technique for energy awareness in multi-core systems; it alters the voltage or frequency according to the performance and power requirements of the system. The CMOS industry is attracting attention because of the huge demand for electronic products, but it faces challenges in terms of power consumption. In CMOS-based devices, the relationship between power, voltage and frequency is given as $P \propto F V^2$. According to this relationship, reducing the frequency or voltage results in energy saving. Various commercial processors that use DVFS are on the market, such as those from AMD and Intel. Kim et al. [11] discussed the drawbacks of DVFS and concluded that the technique can harm device performance, which can lead to missed deadlines. Moreover, DVFS uses a clock generator and a DC-DC converter, which may increase energy consumption. In this study, the authors noted that multi-core processors are now widely used and are capable of reducing leakage energy and clock frequency issues. Gang et al. [16] also studied energy-saving techniques for real-time multicore systems using DVFS. This work defines a new problem, the voltage setup problem, which concerns the selection of voltage levels for better energy awareness in DVFS. In general, DVFS provides various voltage levels, which results in overhead such as the area and power overhead of voltage regulators. Hence, the authors presented an approach for selecting the optimal number of voltage levels in this type of multiple-voltage system, resulting in energy efficiency. In multimedia application scenarios, a missed deadline of a given task remains invisible to users. Using this concept, Bhattacharyya et al. [17] proposed a new approach for task execution and energy awareness in multicore embedded systems. 
To achieve this objective, the authors used dynamic voltage scaling techniques. The work consists of two phases: in the first phase, completion ratio and energy-aware parameters are considered, and in the second phase, the authors present a scheme that saves energy in task execution by creating slack. This approach helps to develop a system in which specific tasks can be accomplished without affecting the QoS (Quality of Service) of the system.
In [18], Llamocca et al. reviewed and discussed energy saving and performance improvement in video processing applications. In this study, the authors take advantage of dynamic partial reconfiguration (DPR), which helps to control the available resources by considering energy, performance and accuracy constraints according to the design requirements. To improve performance, a 2D FIR filter architecture is modified by incorporating dynamic reconfigurability. This dynamic nature allows the architecture to vary the coefficients and their values according to the control strategy. The work reports higher performance for the GPU-based architecture than for the FPGA implementation. In [19], Gheorghita et al. introduced a new approach for energy saving in multicore embedded systems. The authors built on a concept based on operation mode information. For example, an infotainment device such as a mobile phone can be used for multiple applications. Consider a music player application that offers mono and stereo playback. For an energy-saving application, the stereo mode can be used. This technique can further be used for designing devices based on their specific modes of operation.
Zeng et al. [20] suggested the use of DVFS for load balancing in geographically distributed data centers. Cloud-based applications are increasing day by day, and this demand increases the operational expenditure of cloud service providers. Electricity prices also vary across geographic regions, which motivated the authors to study task assignment for geographically distributed data centers. To carry out this work, dynamic frequency scaling is applied and an optimization problem is formulated to obtain the best solution for frequency scaling scenarios.
As discussed above, energy-aware techniques for multicore systems fall into two categories. Having discussed DVFS-based techniques, we now present PMM (Power Mode Management) techniques. In embedded and multicore systems, the hardware plays an important role by providing information about the operating mode of the device, which can be used for energy saving. Different operating modes require different amounts of energy and different times to reach the idle state. Generally, returning to the normal mode of the processor requires more time. To save energy, these modes should be used efficiently.
Chou et al. [21] introduced a power mode selection technique based on a power management model that considers timing and power constraints for multicore systems. This technique focuses mainly on scheduling, and the authors reported that scheduling mode changes can satisfy the power and system constraints.
Shin et al. [22] developed an efficient technique to improve energy-aware performance by combining DVFS and PMM. The authors reported that real-time embedded systems experience idle intervals due to inherent slack, and they further argued that maintaining priority is more important than estimating the worst case. Energy saving using slack is performed with an offline DVFS technique, and the two techniques are then combined for energy saving.
Bhatti et al. [23] considered power consumption minimization for multicore embedded systems in real-time applications using dynamic power management (DPM) and dynamic voltage and frequency scaling (DVFS). Previous works have reported that DPM and DVFS provide better energy savings only under certain conditions, such as particular workload variations and configurations of the multicore architecture. It was also reported that no single policy performs best under all operating conditions. Hence, rather than developing new policies for different workload configurations, the authors developed a new approach to power management called the Hybrid Power Management technique. This technique combines DVFS and PMM and selects the best policy for any given set of conditions.
Over the last decade, mobile embedded systems have grown tremendously. As already discussed, these systems are equipped with limited battery resources, which causes system lifetime and reliability issues in real-time applications. Supporting such devices while reducing power consumption is a challenging task for researchers. To address this issue, the authors of [24] developed an improved interface model for power management in software and hardware components. This technique uses hierarchical modeling of components and avoids models that are costly to implement. It provides information on whether a component is in use. This information can be obtained for the complete system or for a sub-system, and it can be used to move components from a high-power mode to a low-power mode [24].
Proposed model
In the previous section, we presented a brief study of recent work on energy-saving schemes for multi-core processors. From this study, we conclude that some challenges in this field still need to be addressed. Hence, in this work we present a new deadline- and energy-aware task scheduling algorithm for multi-core embedded systems.
Hardware modeling
In this subsection, we present the multi-core architecture considered for the implementation of the proposed technique.
Figure 1 shows the multi-core architecture model [25], in which multiple cores are connected to a bus through local memory modules, and all cores and buses are controlled by a bus controller module. The set of cores is denoted by M.

Fig. 1. Multi-core architecture.
In Fig. 1, the multi-core architecture is represented as $M = \{m_1, m_2, \ldots, m_R\}$, which contains R DVFS-enabled resources connected through a communication link. This model can be called the multicore processor model. The resources can differ from each other in terms of memory size, processing speed, operating frequencies and so on. In this model, it is assumed that the execution of a task cannot be stopped once it has started and that the communication bandwidth may vary for each task and processor. To account for the link bandwidth, let an $R \times R$ matrix describe data transfer, where the entry for resources $R_s$ and $R_j$ denotes the time required to deliver data between them.
Let us consider a DVFS-enabled system that requires a supply voltage V and operates at a frequency F. Multiple resources are present, each of which requires its own voltage supply. The supply voltages of resource j form a discrete set $V_j = \{v_{j,1}, v_{j,2}, \ldots, v_{j,N(j)}\}$, where N(j) denotes the number of supply voltage levels of the resource. Similarly, we denote the set of operating frequencies of the resource as $F_j = \{f_{j,1}, f_{j,2}, \ldots, f_{j,N(j)}\}$.
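The resource model above can be sketched as a small data structure. This is only an illustration: the class name `Resource`, the pairing of the k-th voltage with the k-th frequency, and all numeric values are assumptions made for the example, not part of the original model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One DVFS-enabled resource m_j with N(j) voltage/frequency levels."""
    name: str
    voltages: tuple     # V_j = {v_j1, ..., v_jN(j)}, in volts (illustrative)
    frequencies: tuple  # F_j = {f_j1, ..., f_jN(j)}, in GHz (illustrative)

    def __post_init__(self):
        # Assumption: level k pairs voltages[k] with frequencies[k].
        if len(self.voltages) != len(self.frequencies):
            raise ValueError("each voltage level needs a matching frequency")

    def levels(self):
        """Return the (voltage, frequency) pairs in the order given."""
        return list(zip(self.voltages, self.frequencies))


# Two hypothetical resources with different numbers of levels.
resources = [
    Resource("m1", (0.8, 1.0, 1.2), (0.6, 0.8, 1.0)),
    Resource("m2", (0.9, 1.1), (1.0, 1.4)),
]

# R x R matrix of data delivery times between resources (made-up values);
# the diagonal is zero because no transfer is needed on the same resource.
transfer_time = [
    [0.0, 2.0],
    [2.0, 0.0],
]
```

Keeping the levels as paired tuples makes it straightforward for a scheduler to iterate over every voltage/frequency option of a resource.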
Multicore task modeling
In this sub-section, we present the task model, which contains multiple tasks to be executed on the multicore embedded system. In the proposed approach, we consider a parallel job application whose tasks are represented by a Directed Acyclic Graph (DAG). This can be expressed as $G = (T, E)$, where T denotes the vertices and E denotes the edges of the graph. The vertices represent the n tasks, such that $t_i \in T$ $(1 \le i \le n)$. Our main aim is to execute each task on an available processor so as to save energy and complete the tasks within a given deadline. The edges in E represent task dependencies, given as $e_{ij} = (t_i, t_j) \in E$ $(1 \le i \le n,\ 1 \le j \le n,\ i \ne j)$. In such a dependency, $t_i$ is the parent task and $t_j$ is the child task. A task without any predecessor is known as an entry task, and a task without any successor is known as an exit task.
In this work, we use a weight assignment for each task and each edge. The weight $w(t_i)$ assigned to task $t_i$ gives the total number of instructions to be executed, whereas the weight $w(q_j)$ assigned to an edge gives information about the data transferred from one task to another. For better understanding, a simple DAG is depicted in Fig. 2.

Fig. 2. Simple DAG architecture.
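A weighted task graph of this kind can be represented with two dictionaries. The sketch below is a minimal illustration, assuming made-up task names and weights; the helper names (`parents`, `children`, `entry_tasks`, `exit_tasks`, `topological_order`) are hypothetical and do not come from the original work.

```python
# Task weights w(t_i): number of instructions to execute (illustrative values).
task_weight = {"t1": 4.0, "t2": 3.0, "t3": 5.0, "t4": 2.0}

# Edge weights: data transferred along e_ij = (t_i, t_j) (illustrative values).
edge_weight = {
    ("t1", "t2"): 1.0,
    ("t1", "t3"): 2.0,
    ("t2", "t4"): 1.5,
    ("t3", "t4"): 0.5,
}


def parents(task):
    """Tasks that must finish before `task` can start."""
    return sorted(p for (p, c) in edge_weight if c == task)


def children(task):
    """Tasks that depend on `task`."""
    return sorted(c for (p, c) in edge_weight if p == task)


def entry_tasks():
    """Tasks without any predecessor."""
    return sorted(t for t in task_weight if not parents(t))


def exit_tasks():
    """Tasks without any successor."""
    return sorted(t for t in task_weight if not children(t))


def topological_order():
    """Order the tasks so every parent precedes its children (Kahn's algorithm)."""
    indegree = {t: len(parents(t)) for t in task_weight}
    ready = sorted(t for t, d in indegree.items() if d == 0)
    order = []
    while ready:
        task = ready.pop(0)
        order.append(task)
        for child in children(task):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(task_weight):
        raise ValueError("the task graph contains a cycle")
    return order
```

For this example graph, `t1` is the only entry task and `t4` is the only exit task.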
Before proceeding to the proposed work, we describe the important parameters used here.
Time taken for execution
This is the execution time of a given task on an available resource. In other words, it is the ratio of the instruction weight of the task, $w(t_i)$, to the operating frequency f of the resource, that is, $w(t_i)/f$.
Time taken for communication or cost of communication
This is the estimated time required to transfer data from the parent task $t_p$ to the current task. It can be expressed as
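One plausible reading of the two timing quantities above is sketched here. The execution time follows the stated ratio of instruction weight to operating frequency. The communication time formula is an assumption (data volume divided by link bandwidth, and zero when both tasks run on the same resource), since the expression itself is not given here; the function and parameter names are hypothetical.

```python
def execution_time(weight, frequency):
    """Execution time as the ratio of instruction weight to operating frequency."""
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    return weight / frequency


def communication_time(data_volume, bandwidth, same_resource=False):
    """Assumed model: transfer time = data volume / link bandwidth.

    No transfer cost is charged when parent and child run on the same resource.
    """
    if same_resource:
        return 0.0
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    return data_volume / bandwidth
```

Under this reading, halving the operating frequency doubles the execution time of a task, which is the trade-off the scheduler has to balance against the deadline.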
First initialization time
This can be computed as
Finish time
Here we present the DVFS energy consumption model used in this work. The total power consumption of multi-core processors depends on dynamic and static power consumption. Dynamic power consumption is caused by the charging and discharging of the node capacitances of the logic gates. It can be expressed as
$$P_{dyn} = A C v^2 f$$
where A denotes the number of switching events per clock cycle, C is the capacitance, v is the supply voltage and f is the operating frequency of the multicore system. Equivalently, it can be expressed as
$$P_{dyn} = C_{eff} v^2 f$$
where $C_{eff}$ is the average switched capacitance per clock cycle, computed as $C_{eff} = A \times C$.
Further, we use a relationship between the frequency level and the voltage level of the multi-core processor, given as:
where $V_{th}$ is the threshold voltage, $\alpha = 1.5$, K is a constant and $L_d$ denotes the logic depth.
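A form of this relation that is widely used in the DVFS literature, and that matches the symbols defined here, is shown below. It is offered as an assumption about the intended equation rather than a quotation of it.

```latex
f = \frac{(v - V_{th})^{\alpha}}{L_d \, K}
```

Under this form, the achievable frequency grows with the margin of the supply voltage over the threshold voltage, so a lower frequency target permits a lower supply voltage.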
From Equation (4), it is clear that v is an important factor in modeling the dynamic power consumption of multi-core systems. The other component is static power consumption ($P_{static}$), which is caused by leakage current, biasing and so on, but the overall power consumption is dominated by dynamic power consumption. Hence, in this work, we mainly focus on dynamic power consumption. The dynamic energy of a given processor can be computed as follows:
$$E_{dyn} = P_{dyn} \cdot \Delta t$$
where $\Delta t$ denotes the time taken for execution.
The energy consumption of any given task on the resource provided to it can be computed with this model. It can be expressed as:
where $w^*(t_i)$ denotes the number of cycles required to complete the task, f is the operating frequency and v denotes the supply voltage.
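Combining the dynamic power expression with an execution time of $w^*(t_i)/f$ gives the following worked form. The symbol $E(t_i)$ is introduced here for illustration only, and the derivation assumes that the task runs at a single voltage and frequency level.

```latex
E(t_i) = P_{dyn} \cdot \Delta t
       = C_{eff} \, v^2 f \cdot \frac{w^*(t_i)}{f}
       = C_{eff} \, v^2 \, w^*(t_i)
```

The frequency cancels, so for a fixed cycle count the dynamic energy depends on the square of the supply voltage; running slower saves energy only because it allows a lower voltage.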
In a multicore system, the tasks in T are defined for various applications. These tasks are assigned to the R resources, and the allocation is recorded in the matrix X. The total energy consumption is defined as:
This equation does not include the energy consumed by processors when they are idle, i.e. when no task is being executed. To account for the idle case, we use another model, which is expressed as:
where $\max\{FT(t_i, r_j)\}$ is the finish time.
Total energy consumption can be expressed as:
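The accounting described above can be sketched as follows. This is a hedged illustration, not the original formulation: the dynamic term uses $C_{eff} v^2 w^*$ per task, and the idle term is assumed to be each resource's idle power multiplied by the time it is not executing a task up to the maximum finish time. The function names, the dictionary layout and all numbers are hypothetical.

```python
def task_energy(c_eff, voltage, cycles):
    """Dynamic energy of one task: C_eff * v^2 * (number of cycles)."""
    return c_eff * voltage ** 2 * cycles


def total_energy(assignments, idle_power, c_eff=1.0):
    """Return (dynamic, idle, total) energy for a finished schedule.

    assignments: list of dicts with keys resource, start, finish, voltage, cycles.
    idle_power:  dict resource -> power drawn while idle (assumed constant).
    """
    # The schedule ends at the maximum finish time over all tasks.
    makespan = max(a["finish"] for a in assignments)
    busy = {r: 0.0 for r in idle_power}
    dynamic = 0.0
    for a in assignments:
        dynamic += task_energy(c_eff, a["voltage"], a["cycles"])
        busy[a["resource"]] += a["finish"] - a["start"]
    # Assumption: a resource is idle whenever it is not running a task.
    idle = sum(idle_power[r] * (makespan - busy[r]) for r in idle_power)
    return dynamic, idle, dynamic + idle


# Made-up schedule with two resources.
schedule = [
    {"resource": "r1", "start": 0.0, "finish": 4.0, "voltage": 1.0, "cycles": 4.0},
    {"resource": "r2", "start": 0.0, "finish": 2.0, "voltage": 1.0, "cycles": 3.0},
]
dynamic, idle, total = total_energy(schedule, {"r1": 0.1, "r2": 0.2})
```

In this example `r2` finishes two time units before the schedule ends, so it contributes idle energy while `r1` does not.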
In this section, we describe the problem formulation for minimum energy consumption. In the section above, we discussed workload scheduling using the DVFS model, where n tasks are allocated to m processors. The main objective is to develop an approach that minimizes energy consumption during task execution while completing the tasks within the deadline D. Energy consumption can be reduced by lowering the supply voltage or operating frequency of a processor and by running processors at different frequencies. In the next section, we develop an energy-aware task scheduling approach for multicore systems that considers deadline constraints.
Efficient approach for task scheduling
Here we discuss the proposed approach for energy-aware task scheduling under deadline constraints.
As discussed before, the proposed approach consists of two phases: deadline distribution and task ordering. In the first phase, the user-defined deadline is assigned to the task set and each individual task is assigned a sub-deadline so that the execution criteria are met. In the second phase, each task is numbered according to its priority and forwarded to the best processor to meet the energy consumption criteria.
The complete process of the proposed approach is presented in Algorithm 1.
Algorithm 1 describes the complete working process of the proposed approach for energy-aware task scheduling on multicore processors. First, all available resources are collected in a list denoted by R. The task graph, the deadline constraint and the resources are given as input, and the expected output is a schedule with minimal energy consumption. Each task is then taken in turn from the given task list. We compute the mean execution time and communication time for the considered task $t_i$. Once the task is assigned, we compute the start time and finish time of the task being executed. Further, the sub-deadline of each individual task is assigned as described in step 6. After the sub-deadline has been assigned, the task is marked as assigned and the next task is considered.
After these steps, the deadline distribution approach is applied, in which the overall deadline is distributed over the tasks according to the available tasks, resources and processors. In the next stage, we apply priority-based ordering, which gives the final list of tasks to be executed on the processors. This yields the overall criteria for each task. From the list of processors, we then select the best available processor for each task.
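The final selection step can be illustrated with a small sketch: among all resources and voltage/frequency levels that finish a task by its sub-deadline, pick the one with the lowest dynamic energy. This is an assumption about how the "best" processor might be chosen, not the original Algorithm 1; the function name `pick_level` and all values are hypothetical.

```python
def pick_level(cycles, start, sub_deadline, resources, c_eff=1.0):
    """Choose the lowest-energy (resource, voltage, frequency) that meets the sub-deadline.

    resources: dict name -> list of (voltage, frequency) levels.
    Returns (energy, finish, name, voltage, frequency), or None if no level is feasible.
    """
    best = None
    for name, levels in resources.items():
        for voltage, frequency in levels:
            finish = start + cycles / frequency
            if finish > sub_deadline:
                continue  # this level would miss the sub-deadline
            energy = c_eff * voltage ** 2 * cycles
            candidate = (energy, finish, name, voltage, frequency)
            # Tuples compare by energy first, then by finish time as a tie-breaker.
            if best is None or candidate < best:
                best = candidate
    return best


# Made-up resources: m1 is slow but efficient, m2 is fast but power-hungry.
example_resources = {
    "m1": [(0.8, 0.5), (1.0, 1.0)],
    "m2": [(1.2, 2.0)],
}
relaxed = pick_level(4.0, 0.0, 5.0, example_resources)
tight = pick_level(4.0, 0.0, 3.0, example_resources)
```

With the relaxed sub-deadline the task fits on `m1` at its higher level, while the tight sub-deadline forces the faster and more expensive `m2`.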
Energy aware
As noted in Algorithm 1, deadline distribution is a main step of the energy-aware approach. Hence, we present the deadline distribution algorithm here.
Algorithm 2 describes the deadline distribution process for each individual task. It implements the deadline distribution applied in step 8 of Algorithm 1. Here we perform deadline assignment for the given task: once a task is assigned, a sub-deadline is assigned to it, and the same process is applied to all tasks in the task list. Meanwhile, the start and finish times are computed and updated for every individual task. Finally, we compute the priority of each task and apply priority-based task ordering to obtain energy-efficient performance of the multicore processor.
Distribution of task deadlines
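One common way to distribute an overall deadline is to scale each task's earliest finish time so that the longest path ends exactly at the deadline D. The sketch below follows that proportional rule as an assumption; it is not a reproduction of Algorithm 2, it ignores communication time and resource contention, and the function name `distribute_deadline` is hypothetical.

```python
def distribute_deadline(exec_time, edges, deadline):
    """Assign a sub-deadline to every task by proportional scaling.

    exec_time: dict task -> mean execution time.
    edges:     list of (parent, child) pairs of the task graph.
    deadline:  overall user-defined deadline D.
    """
    parents = {t: [] for t in exec_time}
    for parent, child in edges:
        parents[child].append(parent)

    finish = {}

    def earliest_finish(task):
        # A task can start once all of its parents have finished.
        if task not in finish:
            ready = max((earliest_finish(p) for p in parents[task]), default=0.0)
            finish[task] = ready + exec_time[task]
        return finish[task]

    critical = max(earliest_finish(t) for t in exec_time)
    if critical > deadline:
        raise ValueError("deadline is shorter than the critical path")
    scale = deadline / critical
    return {t: finish[t] * scale for t in exec_time}


sub_deadlines = distribute_deadline(
    {"t1": 2.0, "t2": 3.0, "t3": 1.0, "t4": 2.0},
    [("t1", "t2"), ("t1", "t3"), ("t2", "t4"), ("t3", "t4")],
    deadline=14.0,
)
```

Here the critical path takes 7 time units, so every earliest finish time is stretched by a factor of 2 and the exit task receives the full deadline of 14.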
Furthermore, we develop an efficient approach for task prioritization. As described before, tasks are initially assigned to processors for execution. For better performance, we apply a primary task assignment model to each task by considering the minimum time required for task completion.
This can be expressed as:
The computation cost of each task is given by
To prioritize the tasks, the computation cost of each individual task is computed and given as
Here it is assumed that each core runs at its maximum operating frequency when task prioritization is applied. The priority of each task can be expressed as
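Since the introduction mentions ordering tasks by their bottom distance, one plausible reading of the priority is the bottom level of a task: its own computation cost plus the longest cost path from it to an exit task. The sketch below implements that reading as an assumption; the function names and the example costs are hypothetical.

```python
def bottom_levels(cost, edges):
    """Bottom level of each task.

    cost:  dict task -> computation cost at maximum frequency.
    edges: dict (parent, child) -> communication cost.
    """
    children = {t: [] for t in cost}
    for (parent, child), comm in edges.items():
        children[parent].append((child, comm))

    level = {}

    def visit(task):
        # Own cost plus the most expensive path through any child.
        if task not in level:
            tail = max((comm + visit(c) for c, comm in children[task]), default=0.0)
            level[task] = cost[task] + tail
        return level[task]

    for task in cost:
        visit(task)
    return level


def priority_order(cost, edges):
    """Tasks sorted by decreasing bottom level (ties broken by name)."""
    level = bottom_levels(cost, edges)
    return sorted(cost, key=lambda t: (-level[t], t))


example_cost = {"t1": 2.0, "t2": 3.0, "t3": 1.0, "t4": 2.0}
example_edges = {
    ("t1", "t2"): 1.0,
    ("t1", "t3"): 1.0,
    ("t2", "t4"): 1.0,
    ("t3", "t4"): 2.0,
}
levels = bottom_levels(example_cost, example_edges)
order = priority_order(example_cost, example_edges)
```

With positive costs a parent always has a larger bottom level than its children, so this ordering respects the precedence constraints of the graph.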
In this section, we present the experimental analysis of the proposed scheduling approach for multicore processors. As discussed in the proposed model section, we first implement the DVFS scheduling model, in which various tasks and task graphs are generated with different specifications for each task. Tasks are selected randomly from the task list using the task graphs. The approach is implemented in MATLAB on a 3.40 GHz Intel Core i7 processor.
We implement two algorithms, conventional DVFS scheduling and the proposed energy-aware approach, and compare their performance in terms of energy consumption and resource utilization. The implementation process is as follows:
For any given set of input tasks, randomly select one task for processing, together with the execution unit of the task.
Apply initial scheduling to the task graphs using the maximum execution frequency.
Compute the energy required for scheduling and task completion.
Compute the resource utilization for scheduling and task completion.
The task graph generator produces a graph that consists of a list of tasks with different characteristics. The parameters considered for this model are:
Total number of tasks in the task graph
Overall density of edges in the task graph
Total number of processor cores available for processing the tasks
Average time for task completion and operating frequency of the processor
Task start time
Task finish time
Average time required for task completion
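A task graph generator driven by the first two parameters (number of tasks and edge density) could look like the sketch below. It is written in Python for illustration, whereas the experiments reported here use MATLAB; the function name, the weight range and the seeding are assumptions.

```python
import random


def random_task_graph(num_tasks, edge_density, seed=0):
    """Generate a random DAG with integer task weights.

    Edges only go from a lower-numbered task to a higher-numbered one,
    which guarantees that the graph is acyclic.
    """
    rng = random.Random(seed)
    tasks = [f"t{i}" for i in range(1, num_tasks + 1)]
    # Illustrative weight range; the original specification is not given.
    weight = {t: rng.randint(1, 10) for t in tasks}
    edges = [
        (tasks[i], tasks[j])
        for i in range(num_tasks)
        for j in range(i + 1, num_tasks)
        if rng.random() < edge_density
    ]
    return weight, edges


weights, graph_edges = random_task_graph(num_tasks=20, edge_density=0.2, seed=42)
```

Fixing the seed makes a generated graph reproducible, which helps when two scheduling algorithms are to be compared on identical inputs.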
In this experiment, we consider 3 heterogeneous cores. Core 1 is a low-power processor, whereas core 3 is a high-performance core. Let $P_k$ be the power consumption of core k; the values are set as $P_1 = 1$, $P_2 = 2$ and $P_3 = 4$ and are assumed at the highest operating frequencies. First, we evaluate the energy consumption while varying the number of tasks in the task list. The results are presented in Fig. 3 and in Tables 1 and 3.

Fig. 3. Energy consumption performance.
Table 1. Energy consumption performance for 3 cores
The number of tasks is varied from 20 to 100 for the complete multicore processor system.
This evaluation is carried out using 3 processors. According to the simulation analysis, the proposed approach consumes less energy than the conventional DVFS scheduling model. The average energy consumed by the conventional model is 125.5 mJ, whereas the proposed approach consumes 106.2 mJ, which improves the energy performance of multicore processing systems.
For performance measurement, we consider another parameter, the resource utilization of the system. Resource utilization is depicted in Figs. 4 and 6, where the proposed scheduling model utilizes more resources than the conventional approach. Higher resource utilization leads to faster task completion, resulting in lower energy consumption.
Table 2 presents the resource utilization of the conventional and proposed approaches; the proposed approach outperforms the conventional mode of job scheduling.

Fig. 4. Resource utilization performance.
Table 2. Resource utilization performance for 3 cores
Table 3. Energy consumption performance for 6 cores
Similarly, we carried out another simulation study with the number of processors increased to 6 and compared the conventional and proposed scheduling approaches.
Fig. 5 shows the comparative analysis of the proposed and conventional scheduling approaches for 6 processors. According to the simulation study, the conventional DVFS scheduling approach consumes 169.1 mJ on average, whereas the average energy consumption of the proposed approach is 128.9 mJ, which shows that the proposed approach also performs better with a larger number of processors.
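The relative saving implied by the reported averages can be worked out directly. The percentages below are derived from the figures quoted in the text and are not stated in the original results.

```latex
\frac{125.5 - 106.2}{125.5} \approx 0.154 \quad (3\ \text{cores}), \qquad
\frac{169.1 - 128.9}{169.1} \approx 0.238 \quad (6\ \text{cores})
```

That is, the average energy is reduced by roughly 15.4% with 3 cores and by roughly 23.8% with 6 cores.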

Fig. 5. Energy consumption performance.

Fig. 6. Resource utilization performance.
Finally, we evaluate the performance in terms of resource utilization, where the proposed approach again gives better results.
The experimental study shows that the proposed approach performs better in various scenarios. Performance is evaluated in terms of energy consumption and resource utilization with 3-core processors and, as a second scenario, with 6-core processors.
In this paper, we have studied task allocation using DAGs, energy consumption and resource utilization on real-time multicore systems, with 3-core and 6-core test cases. All nodes in the DAG allocation system are heterogeneous, with each node having a different power consumption. We proposed an efficient algorithm for energy-aware task scheduling under deadline constraints. The algorithm determines the distribution of the deadline and the ordering of the tasks: in the first phase we distribute the deadline, and in the second phase we calculate the power consumption and resource utilization. Examples are given to demonstrate the impact of the algorithm. Furthermore, we employ both simulation and practical evaluation to show that the theoretical results are consistent with the practical results. The work can be extended to cloud environments by including various cloud parameters.
