A smarthome conversational agent performing implicit demand-response application planning

Abstract

In recent years, the growing use of Intelligent Personal Agents across human activities and domains has led research to focus on the design and development of agents that are not limited to interacting with humans and executing simple tasks. The latest research efforts have introduced Intelligent Personal Agents that utilize Natural Language Understanding (NLU) modules and Machine Learning (ML) techniques in order to hold complex dialogues with humans, execute complex plans of actions and effectively control smart devices. To this aim, this article introduces the second generation of the CERTH Intelligent Personal Agent (CIPA), which is based on the RASA framework and utilizes two machine learning models for NLU and dialogue flow classification. CIPA-Generation B provides a dialogue-story generator based on the idea of adjacency pairs, together with multiple-intent models that classify complex sentences expressing two user intents into two automatic operations. More importantly, the agent can form a plan of actions for implicit Demand-Response and execute it, based on the user’s request and by utilizing AI Planning methods. CIPA-Generation B has been deployed and tested in a real-world scenario at the nZEB SmartHome of the Centre for Research & Technology Hellas (CERTH) in two different domains, energy and health, for multiple intent recognition and dialogue handling. Furthermore, in the energy domain, a scenario that demonstrates how the agent solves an implicit Demand-Response problem has been applied and evaluated. An experimental study with 36 participants further illustrates the usefulness and acceptance of the developed conversational agent-based system.

Keywords

Intelligent personal agents, virtual assistants, natural language understanding, multiple intents, smarthome, implicit demand-response, planning

1. Introduction

Recent advances in Artificial Intelligence (AI), Natural Language Understanding (NLU), Speech Processing and Recognition, and the Internet of Things (IoT) enable advanced human-computer interaction and the translation of human commands and intentions into actions performed by Intelligent Personal Agents (IPAs). Leveraging these advances, IPAs are able to communicate with humans, send them reminders, provide entertainment, execute custom tasks and plans, and control a large number of SmartHome devices over different communication protocols such as Wi-Fi, ZigBee and Bluetooth. The fast-growing demand for IPAs in everyday life has led market leaders to create a wide variety of commercial IPAs, such as Google Home Assistant, Amazon Alexa and Apple Siri, building on earlier significant research results such as ELIZA [1] and IBM Shoebox [2], which are considered two of the most representative first attempts to create IPAs. In order to interact effectively with humans and assist them, all these solutions and systems face three major challenges: (a) natural language understanding, (b) management of the dialogue/conversation flow and (c) effective execution of a series of actions and tasks. To address these challenges, a wide variety of IPA frameworks have been developed. These frameworks equip developers and researchers with tools that enable the design and development of Virtual Agents/Assistants and chatbots for various domains. RASA, Google Dialogflow, Facebook WIT.AI and Amazon Lex are among the most popular IPA frameworks. Many of these frameworks provide high-level services through web Application Programming Interfaces (APIs), so developers and researchers have access neither to their internal ML models nor to their source code. These high-level services enable faster development of IPAs.

By exploiting the aforementioned domain advances, frameworks and corresponding research results, CIPA Gen-A was developed and introduced in our previous work [3]. CIPA Gen-A extends the architecture of RASA’s StarSpace [4] adaptation and introduces a dialogue generator for adjacency-pair dialogue scenarios. The work presented in this article is the extended version of CIPA Gen-A, namely CIPA Gen-B. The new agent inherits all the core functionalities of its predecessor, such as multi-intent recognition, the novel dialogue generator based on adjacency pairs and its application in different domains of the CERTH nZEB SmartHome. CIPA Gen-B introduces new functionality related to the execution of complex plans by an AI Planner [5]. The newly added feature enables the agent to plan implicit Demand-Response (DR) actions, based on its conversation with a resident of the SmartHome, towards increasing a system’s efficiency (e.g., in terms of energy consumption management). To the authors’ knowledge, the proposed agent-based system is the first of its kind to combine methods for NLU multi-intent recognition, dialogue generation for adjacency pairs, and AI planning to form plans of actions for implicit Demand-Response. Furthermore, a novel experimental study with 36 participants has been conducted in order to assess the application, usefulness, and acceptance of the developed agent. In summary, this paper aims to answer the following questions:

Q1: How easily can humans with minimal prior knowledge of a domain interact with an Intelligent Personal Agent by using only a dialogue generator to define the conversational tree?

Q2: Can an Intelligent Personal Agent support a multi-domain AI Planner to form complex plans of actions towards solving effectively an implicit Demand-Response problem?

The remainder of this article is structured as follows. A brief review of the related work is presented in Section 2. The architecture of CIPA Gen-B, which is based on Gen-A, is documented in Section 3, together with all the core functionalities such as the dialogue generator, the multi-intent models and the planning generator for implicit DR. Section 4 provides the evaluation results for the NLU models, the dialogue generator and implicit DR planning, as well as the results of the acceptance study. Section 5 discusses the findings of this work. Finally, Section 6 presents the conclusions.

2. Related works

This work presents the release of CIPA Gen-B and its application in a SmartHome for the optimal planning of an implicit Demand-Response (DR) scheme. Since, to the best of the authors’ knowledge, there is no published research on Intelligent Personal Assistants/Virtual Assistants that enable residential implicit DR schemes, a brief literature review of conversational task-based agents is presented, along with a short overview of methods used to apply DR plans in residential buildings.

Conversational task-based agents execute tasks based on verbal interaction with humans, exploiting natural language processing and AI/ML capabilities. The introduced agent is based on dialogue generation, as modeling the generation of future dialogue turns is crucial for task execution. To this aim, agents learn to map messages to responses, with recurrent neural networks (RNNs) used for data-driven conversation modeling [6]. The industry-leading assistants Amazon Alexa, Google Home Assistant and Apple Siri are based on ML models for dialogue generation. Recently, Google introduced Meena,¹ an end-to-end neural conversational model that learns to respond sensibly to a given conversational context. A single Evolved Transformer encoder block processes the conversation context, and the actual response is formulated by Evolved Transformer decoder blocks. Dialogue generation based on the Encoder-Decoder model is widely used. A neural network-based architecture with hierarchical latent variables that capture dependencies over an extended conversation history was introduced by Serban et al. [7].

To this aim, Park et al. [8] introduce a variational hierarchical conversation RNN model. Reinforcement Learning (RL) approaches [9] are also popular for human dialogue generation; they train a model to produce response sequences. Recently, a Deep RL approach [10] was introduced. It is based on the optimization of long-term rewards and on the conversation between two agents for the discovery of possible actions and the maximization of the expected rewards. Another Deep RL approach [11] proposes a multi-agent system of independent single agents that have a partial understanding of the environment and overcome their limited knowledge by using a centralized referee during the learning phase. Besides the aforementioned general approaches for dialogue generation and task execution, applications based on agents and dialogue systems have been introduced for SmartHome automation, which is the application domain of the current work. Besides the applications of the market leaders (Amazon Alexa and Google Home), Dumitrescu [12] introduced Cassandra, a voice assistance system that enables users to control SmartHome appliances. NLU and automatic speech recognition are used alongside a knowledge base containing predefined scenarios to handle the dialogue with the SmartHome occupant. A predefined scenario is selected after analysis of the occupant’s voice command. This contrasts with CIPA Gen-B, whose novel dialogue generator does not require any historical knowledge of predefined scenarios.

In another work in this domain, Park et al. [13] proposed a framework for the development of task-oriented dialogue systems in a SmartHome environment. The framework builds a dialogue system by editing dialogue knowledge that is expressed ontologically. It is also equipped with a rule-based dialogue management system that defines the possible states of a dialogue and the behavior of the system in each state. Both information-state [14] and finite-state [15] based systems are used. The finite-state methodology can be considered related to the adjacency-pairs methodology [16] that CIPA Gen-B uses for dialogue generation and that enables the generation of all possible dialogue trees. Personal assistants that contribute to users’ planning, in line with the presented application of CIPA Gen-B, form a field with limited applications but some interesting results.

In another approach related to home applications, Lera et al. [17] introduce a context-awareness component, based on ANNs, for labeling users’ activities in a human-robot shared environment. The work provides an enhanced inference engine that supports the robot’s decision making: labeling is based on dialogue flow, time and date, and localization information, and is complemented by an environment recognition component supported by acoustic signals.

Yu et al. [18] introduced Uhura, a personal assistant that supports planning for complex human requests. Uhura integrates a knowledge base and a dialogue manager with a planner that provides reasoning and supports negotiation with the user until a resolution is reached for competing requirements. In order to simplify planning for users, Uhura provides features such as multiple tasks and constraints, natural language communication mechanisms and goal-directed interactions. Based on these features, the assistant receives semantic, spatial and temporal constraints as input, extracts the corresponding tasks from the knowledge base and, after evaluating the alternatives for each task, provides the plans that best meet the users’ requirements. At the time of writing this article, Uhura had not been implemented.

A multi-intent planner for story generation in a multi-agent system is proposed by Riedl et al. [19]. The planning algorithm uses causal reasoning and a simulated intention recognition process in order to generate narratives with plot coherence and strong character believability. The authors propose a solution that merges methodologies derived from the Belief-Desire-Intention agent framework (which enables the formulation of intent and intention recognition for the problem of narrative generation) with partial-order planning techniques. However, the implemented intent-driven planner has not been fully tested, and the testing process is an empirical evaluation. In another approach, Geib et al. [20] introduced a model that uses plan recognition and automated planning for the creation of collaborative virtual agents. The model is based on two agents with different roles, initiator and supporter, that use a shared action representation. The model has been tested in three experimental domains. The plans and actions of a domain are defined using Combinatory Categorial Grammars (CCGs) [21]. Both plan recognition and automated planning are enabled by the CCG lexicon created per domain. The approach finally delivers a virtual robot platform in which the supporter agent proposes a set of goals and plans to achieve in order to support the initiator agent.

In addition to the aforementioned planning approaches, work on agent-based decision support can be considered. In [22], an agent-based modeling framework is proposed for deciding the execution of an evacuation plan in two domains, a museum and a train platform. In particular, the authors introduce a human behavior model built from blocks such as desires, emotion, memory and belief. The blocks are matrices or weighted vectors. On top of them, a final decision-making block combines the outputs of all the previous blocks. In another work [23] related to agent systems for decision control, the authors propose a decentralized system. More precisely, the proposed solution is a multi-agent controller for vibration control of smart structures whose reasoning is enabled by replicator dynamics from game theory.

In the context of the Smart Grid [24], Demand-Response is defined as “the changes in electric usage by end-use customers from their normal consumption patterns in response to changes in the price of electricity over time” [25]. DR programs are a valuable tool for the Distribution System Operator (DSO) for maintaining the stable operation of the power grid during peak load periods [26]. DR programs are divided into two large categories, incentive-based and price-based, each of which comprises many subcategories [27]. For consumers, DR can take two main forms: (a) peak shaving, i.e. load reduction during critical time periods, a practice that implies temporary loss of comfort [28], and (b) load shifting, i.e. in times of high energy prices, customers move part of their demand to off-peak periods [29, 30].

How these plans are decided and implemented is out of the scope of this work. The question at hand is how these plans are applied on the customer/consumer side. From the customer’s point of view, when a DR request arrives, the user is expected to comply with a message requesting either a specific operation of some appliances (e.g., it is recommended to use the washing machine from 19:00 to 21:00 [31]) or a reduction of the overall household consumption by a specific amount of power. Compliance with the DR schedule happens either in an automated manner (explicit user), via the control of the loads by a Building Management System (BMS), or in a more abstract way, where the user shuts down appliances without knowledge of their impact on the overall consumption. An important aspect is the nature of the household loads: these are divided into shiftable and non-shiftable loads [32] (also called controllable and non-controllable loads). This categorisation is arbitrary because it is directly connected to the residents’ needs or preferences and also depends on the functional possibilities of the loads [33].

It is evident that any kind of DR scheme applies to shiftable/controllable loads. When an automation system (BMS) is present, the control of shiftable loads can be implemented in a rule-based manner or by employing some form of optimisation with specific objectives (e.g. minimisation of user discomfort). Khorram et al. [34] proposed an optimised planner that reduces the total lighting consumption while satisfying comfort-related constraints. The optimiser was validated over a set of scenarios at simulation level, and all control actions were designed to be applied without user intervention. In contrast, in [35], direct load control of the HVAC, instead of lighting, was utilised for providing intra-hour load balancing services to the DSO. In an aggregated manner, the proposed control signals could lead aggregators to additional revenue from participating in the DR market. A similar approach was followed in [36], where the setpoints of the installed HVAC of a commercial building were adjusted to analyse peak demand reduction capabilities.

3. CIPA Gen-B architecture

Figure 1. Schema of CIPA.

In this section, the design of the CERTH Intelligent Personal Agent Gen-B architecture is presented. The conceptual architecture of the system is illustrated in Fig. 1. The user can interact with the agent either through speech and voice commands or through text messages. Three components of the agent are then utilized in order to execute the requested actions: (a) the RASA² Core with the conversational tree produced by the Dialogue Generator, which aims to capture all adjacency-pair based dialogue branches specific to the agent’s domain; (b) the RASA NLU with the modified StarSpace Module, which enables higher recognition accuracy for phrases denoting multiple user intents; and (c) the Demand Response Planning Module, which produces and executes plans of actions that lower (or increase) the resident’s energy consumption by a specific amount by controlling the SmartHome’s appliances, while attempting to minimize the disturbance these actions cause to the resident. The RASA framework was chosen as the basis of CIPA due to its popularity, extensible architecture and open-source nature.

A design decision behind CIPA-gen B is the support of multiple task domains. This capability is enabled by metaprogramming techniques: Python functions take as input a series of mappings that define the agent’s domain and generate the required Python code (in the RASA format). Moreover, CIPA-gen B offers an API that supports the exchange of messages in JSON format. Furthermore, it includes an SQL database for the storage of users’ messages and an action history. CIPA-gen B is provided with a multiple intent model for each of the two supported domains. The novel dialogue generator from CIPA-gen A, the supported multi-intent model, as well as the planning system for implicit DR are explained in detail in the following subsections. The technical descriptions are accompanied by examples related to the domains supported by the assistant and their datasets, so as to better explain the application of the proposed agent and make the concepts easier for the reader to follow.
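The idea of generating code from domain mappings can be illustrated with a minimal Python sketch. It is not the CIPA implementation: the template, the function `generate_actions_source`, the class naming rule and the intent, action and slot names are all assumptions made for illustration, and the generated classes are plain Python objects rather than classes of the RASA SDK.

```python
# Hypothetical template: one handler class per intent (not the RASA SDK API).
ACTION_TEMPLATE = '''
class {class_name}:
    """Generated handler for the {intent} intent."""
    required_slots = {slots!r}

    def name(self):
        return "{action}"
'''


def generate_actions_source(intent_to_action, intent_to_slots):
    """Return Python source text defining one handler class per intent."""
    chunks = []
    for intent, action in intent_to_action.items():
        # "action_turn_hvac" becomes "ActionTurnHvac" (assumed naming rule).
        class_name = "".join(part.capitalize() for part in action.split("_"))
        chunks.append(ACTION_TEMPLATE.format(
            class_name=class_name,
            intent=intent,
            slots=list(intent_to_slots[intent]),
            action=action,
        ))
    return "\n".join(chunks)


# Hypothetical domain mappings for a single energy-domain operation.
intent_to_action = {"turn_hvac": "action_turn_hvac"}
intent_to_slots = {"turn_hvac": ["room", "switch"]}

source = generate_actions_source(intent_to_action, intent_to_slots)
namespace = {}
exec(source, namespace)  # load the generated classes into a namespace
handler = namespace["ActionTurnHvac"]()
print(handler.name(), handler.required_slots)
```

Supporting a new domain in such a scheme amounts to supplying a different pair of mappings; the generator itself stays unchanged.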

The remainder of this section, which describes the modules that comprise the architecture of CIPA, is structured as follows. Section 3.1 describes the “Dialogue Generator” module, which is added on top of RASA Core to handle the dialogues of the agent with the user, that is, to provide the rules governing the conversation tree. Section 3.2 describes the “RASA NLU with modified StarSpace Module”, which tackles natural language understanding in the supported domains and is responsible for understanding the user’s words at the sentence level. Section 3.3 describes the supported domains of the agent and their datasets. Sections 3.4–3.6 describe the “Demand Response Planning Module”, which is responsible for producing and executing plans of actions to lower or increase the user’s energy consumption.

3.1 Dialogue generator

A novel dialogue-story generator that is based on the idea of adjacency pairs has been designed and developed for CIPA-gen B. The proposed story generator models dialogue trees that consist of a subset of adjacency pairs based on the following two assumptions: (a) the user may omit information required by an action and thus the Agent will have to ask for that information by interacting with the user in a dialogue and (b) the user may not cooperate by following up the conversation. For example, instead of replying to a question posed by the Agent, the user may request another action. These assumptions are based on studies of human-chatbot conversation patterns [37, 38], which reveal that conversations are multi-turn and that missing information (e.g., “location”) may exist in users’ phrases, which needs to be addressed by the agent before executing the required action. In addition, according to further studies on human-chatbot dialogue patterns, switching between different intents seems to be natural behaviour for users [39].

For every action requested, the story generator produces conversation flows in which various pieces of the required information are missing, and defines how the agent interacts with the user to obtain them. This produces all valid conversation flows for each action. For example, consider the “Turn on HVAC” action from the nZEB SmartHome energy domain. The introduced generator will produce the following valid stories for this action:

• The user gives the room and the on/off switch.
• The user gives only the room and the agent asks for the on/off operation.
• The user gives only the on/off operation and the agent asks for the room.
• The user doesn’t give the required information and the agent first asks for the room and after getting a correct reply asks for the on/off option.
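The four stories listed above correspond to the four ways of splitting the action’s required slots into those given in the first message and those the agent must ask for. A minimal Python sketch of that enumeration follows; the slot names `room` and `switch` and the helper name `slot_combinations` are illustrative assumptions, not identifiers taken from the CIPA code base.

```python
from itertools import combinations


def slot_combinations(slots):
    """Return every (included, excluded) split of the required slots.

    `included` holds the slots present in the user's first message;
    `excluded` holds the slots the agent still has to ask for, in order.
    """
    splits = []
    for size in range(len(slots), -1, -1):
        for included in combinations(slots, size):
            excluded = tuple(s for s in slots if s not in included)
            splits.append((included, excluded))
    return splits


hvac_slots = ("room", "switch")  # assumed slot names for "Turn on HVAC"
for included, excluded in slot_combinations(hvac_slots):
    given = ", ".join(included) or "nothing"
    asked = ", ".join(excluded) or "nothing"
    print(f"user gives: {given:<12} | agent asks for: {asked}")
```

An action with n required slots yields 2^n such splits, which is why the stories are generated rather than written by hand.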

In addition, the generator produces dialogues for handling valid multi-intents by the user. For example, for the turn-HVAC+change-HVAC-mode multi-intent, the generator will produce all valid adjacency-pair flows for the union of the two intents’ slots. At the end of each story it will call each action sequentially. If a slot is common to both intents and the user has not provided it, CIPA-gen B will ask for it only once.
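The “ask once” behaviour follows directly from taking the union of the slots of the individual intents. The sketch below shows one way to compute such a union while preserving order; the intent names, the slot names and the helper `union_of_slots` are assumptions for illustration only.

```python
def union_of_slots(intents, slot_fun):
    """Merge the slots of several intents, keeping first-seen order and
    dropping duplicates so that a shared slot is requested only once."""
    merged = []
    for intent in intents:
        for slot in slot_fun[intent]:
            if slot not in merged:
                merged.append(slot)
    return merged


# Assumed intent and slot names, for illustration only.
slot_fun = {
    "turn_hvac": ["room", "switch"],
    "change_hvac_mode": ["room", "mode"],
}
multi_intent = "turn_hvac+change_hvac_mode"
slots = union_of_slots(multi_intent.split("+"), slot_fun)
print(slots)  # ['room', 'switch', 'mode']
```

Here `room` is shared by both intents, so it appears once in the merged list and would be asked for once.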

Moreover, it produces invalid conversation flows in which the user does not cooperate: instead of following up the conversation with the expected intent, he/she continues with a different intent. The invalid conversation flows restart the conversation after the agent tells the user that it did not understand his/her intentions.

Write story

procedure Write_Story(intent, intentToProcess, included, excluded, runRepeatedAct, mappings, $c$, repeat)
    $s \leftarrow 0$
    if $\textit{repeat} > -1$ then
        write repeat signature number $c$
    else if multiIntent(intent) then
        write multi intent signature number $c$
    else
        write single intent signature number $c$
    end if
    WriteStoryII(intent, included, mappings)
    for $i \leftarrow 0$ to excluded.length do
        write ’-’ slots_to_act_map[excluded[$i$]]
        if $\textit{repeat} = i$ and $\textit{runRepeatedAct} = \textit{False}$ and $s = 0$ then
            WriteStoryII(intentToProcess, [], mappings)
            write ’ - utter_repeat’
            break
        else if $\textit{repeat} = i$ and $s = 0$ then
            WriteStoryII(intentToProcess, [excluded[$i$]], mappings)
            $s \leftarrow 1$
            continue
        end if
        write ’* inform{’ (excluded[$i$], slot_example_fun(excluded[$i$])) ’}’
    end for
    if $!\textit{multiIntent}(\textit{intent})$ and ($\textit{repeat} = -1$ or $\textit{runRepeatedAct} = \textit{True}$) then
        write ’-’ intent_to_act_map[intent]
    else if $\textit{repeat} = -1$ or $\textit{runRepeatedAct} = \textit{True}$ then
        for each intent inte_ of multi intent intent do
            write ’-’ intent_to_act_map[inte_]
        end for
    end if
    write ’- action_restarted’
end procedure

The proposed story-generation algorithm (procedures Write_Story, Gen_Intent and Story_Gen) takes as inputs an intent-to-action mapping (intent_to_action_map), a function that maps intents to the set of slots needed for the intent (slot_fun), a slot example relation that maps slots to random slot instances (slot_example_fun), a slot-to-action mapping that specifies which action requests each slot (slots_to_act_map) and a mapping of valid multi-intents (multi_intent_map). The algorithm automatically generates the story file (in the RASA format)³ for training the dialogue models, and could be ported to generate the story file in other formats. It includes a tree-expansion phase that produces all possible story combinations in which slot information is missing from the original user message, as well as all invalid variations of them in which the user’s reply is classified to a different intent from the expected one. Write_Story generates a single RASA Core story, that is, one branch of the dialogue tree, providing a possible path of user-agent interaction. Gen_Intent generates all possible stories, that is, all branches of the dialogue tree that flow from a specific user intent. Finally, Story_Gen is the main function of the generator, which produces the full dialogue tree. The auxiliary functions used are (a) SlotsCombinations, which generates all possible (included, excluded) pairs from a list of slots, and (b) WriteStoryII, which writes the signature of a story along with the slots given as input.
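To make the output of a single story more concrete, the following Python sketch writes one cooperative (valid) branch only, without the repeat and multi-intent cases. It is a simplified illustration rather than the authors’ generator: the `## story` header, the `*`/`-` line layout and all intent, action and slot names are assumptions modelled on the lines written by the pseudocode, and `slot_example_fun` is represented here by a fixed dictionary instead of a random sampler.

```python
import json


def write_valid_story(intent, included, excluded, mappings, number):
    """Return the lines of one cooperative story: the user states `intent`
    with the `included` slots, then answers each question about an
    `excluded` slot with an `inform` message."""
    examples = mappings["slot_example_fun"]
    first = {slot: examples[slot] for slot in included}
    payload = json.dumps(first) if first else ""
    lines = ["## story_" + intent + "_" + str(number), "* " + intent + payload]
    for slot in excluded:
        # The agent asks for the missing slot, the user informs it.
        lines.append("  - " + mappings["slots_to_act_map"][slot])
        lines.append("* inform" + json.dumps({slot: examples[slot]}))
    lines.append("  - " + mappings["intent_to_act_map"][intent])
    lines.append("  - action_restarted")
    return lines


# Hypothetical mappings for the "Turn on HVAC" example.
mappings = {
    "intent_to_act_map": {"turn_hvac": "action_turn_hvac"},
    "slots_to_act_map": {"room": "utter_ask_room", "switch": "utter_ask_switch"},
    "slot_example_fun": {"room": "kitchen", "switch": "on"},
}
story = write_valid_story("turn_hvac", ["room"], ["switch"], mappings, 2)
print("\n".join(story))
```

The printed story is the branch in which the user gives only the room and the agent asks for the on/off operation before running the action.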

Generate all stories of an intent

procedure Gen_Intent(intentToProcess, $rc$, mappings)
    if multiIntent(intentToProcess) then
        Split intentToProcess to individual intents
        slots $\leftarrow$ union of slots of the individual intents
    else
        slots $\leftarrow$ slot_fun(intentToProcess)
    end if
    $c = 1$
    slotsComb $=$ SlotsCombinations(slots)
    for included, excluded in slotsComb do
        Write_Story(intentToProcess, intentToProcess, included, excluded, runRepeatedAct $=$ False, mappings, $c$, $-1$)
        $c \leftarrow c + 1$
        for $\textit{excludedItem}[i]$ in excluded do
            for intent in intent_to_act_map do
                if $\textit{intent} \neq \textit{intentToProcess}$ then
                    Write_Story(intentToProcess, intent, included, excluded, runRepeatedAct $=$ False, mappings, $rc$, $i$)
                    Write_Story(intentToProcess, intent, included, excluded, runRepeatedAct $=$ True, mappings, $rc$, $i$)
                    $rc \leftarrow rc + 1$
                end if
            end for
        end for
    end for
    return $rc$
end procedure

Story Gen – Main function of the algorithm

procedure Story_Gen(domain_mappings)
    Write a story of no understand classified to utter default
    $rc \leftarrow 1$
    for intent in intent_to_action_map do
        $rc \leftarrow$ Gen_Intent(intent, $rc$, domain_mappings)
    end for
    for ($\textit{inte}1 \rightarrow \textit{intents}$) in multi_intent_map do
        for $\textit{inte}2$ in intents do
            $rc \leftarrow$ Gen_Intent(inte1 ’$+$’ inte2, $rc$, domain_mappings)
        end for
    end for
    for slot in slots_to_act_map do
        Write a story consisting of a single inform and a slot classified to no-understand
    end for
    Write a story consisting of a single inform classified to no-understand
end procedure

3.2 Single and multi intent models

Both single and multi-intent models are supported by CIPA-gen B. For the single-intent model, Chatito,⁴ a third-party open-source natural language generator, is used to create a training set for the NLU module. Chatito models patterns of phrases, which are then expanded into a training dataset by generating various combinations of words and phrases. It is a widely used tool for this task and has been used in other works on Conversational Agents for generating training data [40]. For the NLU model, we did not select RASA NLU’s default model, which uses the SpaCy Natural Language Processing system, but opted for a RASA NLU model based on StarSpace [4] for intent classification. We then defined, in Chatito’s DSL (Domain Specific Language), examples of intents for the nZEB SmartHome domain. These intents do not correspond one-to-one to the agent’s operations, as we defined some extra intents. The two most important of them are the inform and nounderstand intents. The inform intent should match all user messages that inform the agent about a slot (a parameter) of an action, such as the location of an action, when it was not included in the original message. The nounderstand intent should match all user messages that resemble messages of other intents but do not make sense. For example, consider the user messages “turn on the lights” and “turn on the door”: the first should be classified to the turn-the-lights intent, whereas the second should be classified to the nounderstand intent. By specifying these intents and text patterns in the Chatito DSL, we used the generator to produce our training set for the NLU model.

For the dialogue model, we used RASA Core’s memorization model. Furthermore, we added a fallback policy for both the intent classification and dialogue classification models, so that the agent is able to reply that it does not understand in cases of low classification confidence.
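The effect of such a fallback can be sketched as a simple confidence threshold. The Python sketch below covers only the intent-classification side and is not RASA’s fallback policy itself: the function `choose_action`, the threshold value 0.4, the action names and the confidence scores are all assumptions chosen for illustration.

```python
def choose_action(intent_scores, intent_to_action, threshold=0.4,
                  fallback_action="utter_default"):
    """Pick the action of the most confident intent, or the fallback
    action when that confidence is below `threshold`."""
    intent = max(intent_scores, key=intent_scores.get)
    if intent_scores[intent] < threshold:
        return fallback_action
    return intent_to_action[intent]


# Hypothetical mapping and classifier confidences.
intent_to_action = {
    "turn_lights": "action_turn_lights",
    "get_temperature": "action_get_temperature",
}
confident = {"turn_lights": 0.91, "get_temperature": 0.05}
uncertain = {"turn_lights": 0.22, "get_temperature": 0.19}

print(choose_action(confident, intent_to_action))
print(choose_action(uncertain, intent_to_action))
```

In the second call no intent reaches the threshold, so the agent would answer with its “did not understand” response instead of executing an action.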

For training the multi-intent model, we developed an algorithm that takes as input a Chatito DSL file together with a mapping of valid multi-intents and produces an extended Chatito DSL file that includes patterns of these multi-intents by combining the patterns of their single-intent counterparts. In addition, it uses a random variable to decide whether duplicated words are omitted when combining messages from two intents. For example, when combining “please turn on the HVAC in the kitchen” with “turn the HVAC mode on”, the second “HVAC” word will probably be omitted in the training example for the multi-intent.
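The random omission of duplicated words can be sketched on plain phrases as follows. This is an illustration of the idea, not the authors’ algorithm, which operates on Chatito DSL patterns: the connector “and”, the probability value, the `omittable` set of candidate words and the function name `combine_patterns` are assumptions.

```python
import random


def combine_patterns(first, second, omittable, omit_prob=0.8, rng=None):
    """Join two single-intent phrases into one multi-intent phrase.

    A word of `second` that is in `omittable` and already occurs in
    `first` is dropped with probability `omit_prob`.
    """
    rng = rng or random.Random()
    first_words = set(first.split())
    kept = []
    for word in second.split():
        duplicated = word in omittable and word in first_words
        if duplicated and rng.random() < omit_prob:
            continue  # drop the repeated word
        kept.append(word)
    return first + " and " + " ".join(kept)


first = "please turn on the HVAC in the kitchen"
second = "turn the HVAC mode on"
print(combine_patterns(first, second, {"HVAC"}, rng=random.Random(7)))
```

With `omit_prob=1.0` the repeated “HVAC” is always dropped, and with `omit_prob=0.0` it is always kept, so both variants can end up in the training data for intermediate values.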

Thereafter, we used the output of this algorithm to generate our NLU training examples using Chatito. The resulting multi-intent examples outnumbered the single-intent ones. To balance the classes, we post-processed this output by replicating the single-intent observations three to four times. Furthermore, we extended RASA NLU’s adaptation of the StarSpace method by adding more hidden layers to the NLU model and trained an NLU model for multi-intent classification. We used a 512 $\times$ 384 $\times$ 256 $\times$ 128 configuration for network A and a 384 $\times$ 128 configuration for network B. We also changed the embedding parameters embed_dim and num_neg to 30. We performed a grid search to set these parameters.
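The balancing step can be sketched as simple oversampling. In the Python sketch below, the representation of samples as `(text, intent)` pairs, the use of `+` in the intent name to recognise multi-intents, and the choice of drawing a replication factor of 3 or 4 per sample are assumptions; the text only states that single-intent observations were tripled to quadrupled.

```python
import random


def oversample_single_intents(samples, low=3, high=4, rng=None):
    """Replicate every single-intent sample `low` to `high` times and keep
    multi-intent samples (intent names containing '+') unchanged."""
    rng = rng or random.Random(0)
    balanced = []
    for text, intent in samples:
        copies = 1 if "+" in intent else rng.randint(low, high)
        balanced.extend([(text, intent)] * copies)
    return balanced


# Hypothetical generated samples.
samples = [
    ("turn on the lights in the kitchen", "turn_lights"),
    ("turn on the HVAC and set the mode to cooling",
     "turn_hvac+change_hvac_mode"),
    ("turn off the HVAC and set the mode to auto",
     "turn_hvac+change_hvac_mode"),
]
balanced = oversample_single_intents(samples)
single = sum(1 for _, intent in balanced if "+" not in intent)
multi = len(balanced) - single
print(single, multi)
```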

3.3 Supported domains and datasets

CIPA-gen B has been deployed in the CERTH nZEB SmartHome in two different domains, energy and health. The CERTH nZEB SmartHome is a rapid prototyping and novel technology demonstration infrastructure that resembles a real domestic building, where occupants can experience actual living scenarios while exploring various innovative smart IoT-based technologies with provided Energy, Health, Big Data, Robotics and Artificial Intelligence (AI) services. Table 1 presents the intents recognized for both domains of the SmartHome.

Table 1. Summary of available operations of CIPA-gen B

Energy domain
• Get temperature of a room
• Set temperature of a room to a specific value
• Turn on or off the lights of a room
• Dim the lights of a room to a specific value
• Turn on or off the HVAC of a room
• Change the mode of the HVAC of a room to cooling/heating/auto/dry/fan
• Change the fan of the HVAC of a room to low/medium/high/auto
• Get the humidity of a room
• Get options
• Get available rooms
• Increase/Decrease consumption by X Watts

Health domain
• Get the oxygen rate
• Get either the systolic or diastolic blood pressure
• Get glucose levels
• Get the heart rate
• Get pulse
• Get the motion in a room
• Get the door open status of a room
• Get the temperature in a room
• Get the movement in a room
• Get the luminance in a room

CERTH nZEB SmartHome – Health Domain: Health-related IoT devices monitor a variety of physiological attributes and enable the extraction of valuable data through intelligent processing, with the aim of preventing situations that could lead to harmful outcomes. A dataset based on the testers’ inputs has been created, hereafter referred to as the gold set of the health domain. This set is continuously extended and updated for the nZEB SmartHome’s health domain. Due to its size (1972 samples) it is used solely as a test set.

CERTH nZEB SmartHome – Energy Domain: The SmartHome is equipped with energy-related IoT devices that monitor the energy consumption and production and the conditions of the entire building, while various algorithms can support automation and energy efficiency scenarios. A dataset has been created from the testers’ (SmartHome occupants) inputs, hereafter referred to as the gold set of the energy domain. This set is continuously extended for the nZEB SmartHome’s energy domain. Due to its size (1608 samples) it is used solely as a test set. A common pattern found in the data was the omission of information crucial for an operation to be performed. For example, a user could utter “Turn on the lights” without providing the room where he/she wanted the action to be performed. Moreover, there were observations such as “Turn the lights” which did not fully specify the action, such as whether the user wanted the lights on or off. Since the user could be requesting a change in the state of the lights of another room, this is crucial information.

Table 2

Action costs for light level settings

Level of change	Level of change cost	Human presence cost	TV operation cost	Outdoor lights cost
20	1	2 (true)/0 (false)	2 (true)/0 (false)	2 (true)/0 (false)
40–60	2	2 (true)/0 (false)	2 (true)/0 (false)	2 (true)/0 (false)
80–100	3	2 (true)/0 (false)	2 (true)/0 (false)	2 (true)/0 (false)

Table 3

Action costs for HVAC speed settings

Level of change	Level of change cost	Human presence cost	Extreme outdoor temperature cost
20	1	2 (true)/0 (false)	2 (hot/cold)/0 (medium)
40–60	2	2 (true)/0 (false)	2 (hot/cold)/0 (medium)
80–100	3	2 (true)/0 (false)	2 (hot/cold)/0 (medium)

3.4 Implicit DR application problem

In order to demonstrate the applicability of the designed agent to the energy planning problem, the following use case is demonstrated: the residents of the SmartHome participate in an implicit DR scheme. The information arrives as a notification on their smartphone, which essentially comprises a prompting signal to increase or decrease the total energy consumption of their household loads by a specific amount of watts for a specific period of time. The examined household consists of 9 different rooms (i.e. Living Room, Kitchen, Double Bedroom, Single Bedroom, Playroom, Guest Room, Hall, WC, Corridor) where a variety of smart loads are installed, while the power consumption of each load is measured by a smart energy meter. In general, DR schemes are implemented by explicit directions to the residents to turn specific appliances on or off at specific time intervals, as suggested by Jovanic et al. [31]. However, in the absence of a BMS, such an approach may cause discomfort to users because it demands actual actions on their part (e.g. gradually shutting down loads, dimming lights, changing the temperature and fan speed of the HVAC etc.). Most BMS, on the other hand, are able to monitor and control the installed loads individually, but they lack planning functionality. To address this gap, the proposed agent is utilised as a virtual assistant for DR handling. The agent collaborates with the BMS to retrieve information on the current status of the loads and to send set-points to these loads. In order to demonstrate the aforementioned proof of concept, the two most utilized loads in implicit DR schemes were selected: lighting and HVAC control. It should be noted that, since a centralised HVAC system is installed in the SmartHome, we have selected the control of the HVAC fan coils, which are independent in each space, in order to demonstrate the intelligence of the agent planner regarding the different selected actions for each room.

The lights in each room of the SmartHome are dimmable, and their consumption has been measured at different levels with a 20% step (“Off”, 20%, 40%, 60%, 80%, 100%). Similarly, the consumption of each room’s fan coils has been measured at the speed levels predefined by the manufacturer (“Off”, “Slow”, “Low”, “Medium”, “High”, “Power”), which were assigned to the same 20% step levels as for the lights. Finally, smart sensor measurements reporting the human presence in each SmartHome room and the outdoor luminance and temperature are also employed in order to fully define the scenarios described next. It should be noted that the sensor measurements were validated in order to ensure their proper operation.

3.5 Agent operations and cost definition

In order for the agent to respond to the user’s request, a set of specific operations from which the agent can choose has been defined. These operations are changes in the percentage of operation of the SmartHome’s electrical devices. Specifically, these changes can vary from 0% to 100% and from 100% to 0%, in steps of 20% (e.g. from 20% to 80%, from 60% to 0% etc.). The two groups of electrical devices that can be adjusted are lights and HVAC systems: the lights can be dimmed and the fan speed can be altered. In each group different conditions are considered, particularly human presence, TV operation and outdoor light for the light group, as well as human presence and outdoor temperature for the HVAC group. Based on the above, the possible scenarios of actions to be performed by the agent are generated.

For the agent to be able to choose the ideal combination of actions for achieving the user’s request, some specific costs are defined. The costs are defined arbitrarily according to the following factors: hot/cold/medium for the outdoor temperature, true/false for human presence, TV operation and outdoor light, and finally the level of change of the light or HVAC operation. The total cost of every action is derived from the sum of the condition costs. The actions that include human presence have higher cost, since the agent must avoid causing disturbances in rooms where humans are present (e.g. turning off the lights). Also, because actions changing the lighting lead to higher power consumption than actions changing the fan speed level, they have been associated with higher cost, especially when residents are present in the room. The costs of the possible actions are presented in Table 2 for lights and in Table 3 for HVAC systems.

$\displaystyle \textit{AC}_{lt}^{i}=\begin{cases}\textit{LHPM}\,(\textit{LOCC}_{i}+\textit{HPC}_{i}+\textit{TOC}_{i}+\textit{OLC}_{i})&\text{if }\textit{HP}_{i}\\ \textit{LOCC}_{i}+\textit{HPC}_{i}+\textit{TOC}_{i}+\textit{OLC}_{i}&\text{otherwise}\end{cases}$ (1)

$\displaystyle \textit{AC}_{hc}^{i}=\begin{cases}\textit{HHPM}\,(\textit{LOCC}_{i}+\textit{HPC}_{i}+\textit{EOTC}_{i})&\text{if }\textit{HP}_{i}\\ \textit{LOCC}_{i}+\textit{HPC}_{i}+\textit{EOTC}_{i}&\text{otherwise}\end{cases}$ (2)

where:

• $\textit{AC}_{lt}^{i}$ : Associated Cost of lights action $i$ ,
• $\textit{AC}_{hc}^{i}$ : Associated Cost of HVAC action $i$ ,
• LHPM: Lights Human Presence Modifier,
• HHPM: HVAC Human Presence Modifier,
• $\textit{LOCC}_{i}$ : Level of Change Cost of action $i$ ,
• $\textit{HPC}_{i}$ : Human Presence Cost of action $i$ ,
• $\textit{TOC}_{i}$ : TV Operation Cost of action $i$ ,
• $\textit{OLC}_{i}$ : Outdoor Lights Cost of action $i$ ,
• $\textit{EOTC}_{i}$ : Extreme Outdoor Temperature Cost of action $i$ .

Equations (1) and (2) compute the cost of an action according to the costs defined in Tables 2 and 3. There exists an action cost $\textit{AC}_{d}^{i}$ for each $d\in\{lt,hc\}$ , denoting the lights and HVAC, and each combination of the logical variables whose costs are defined in Tables 2 and 3. The EOTC cost applies when the outdoor temperature is either hot or cold, while it is zero when the outdoor temperature is mild (15 ${}^{\circ}$ C–30 ${}^{\circ}$ C). Through trial and error, we set $\textit{LHPM}=10$ and $\textit{HHPM}=2$ . The reasoning behind the difference between the two modifiers lies in the fact that the lights consume a lot more energy than the HVAC in the nZEB SmartHome.
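The cost rules of Eqs. (1) and (2) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the agent's actual code: the function and argument names are hypothetical, the condition costs are read from Tables 2 and 3, and the modifiers use the reported values $\textit{LHPM}=10$ and $\textit{HHPM}=2$.

```python
LHPM = 10  # Lights Human Presence Modifier (value reported in the text)
HHPM = 2   # HVAC Human Presence Modifier (value reported in the text)


def level_of_change_cost(change):
    """LOCC as listed in Tables 2 and 3: 20 -> 1, 40-60 -> 2, 80-100 -> 3."""
    if change not in (20, 40, 60, 80, 100):
        raise ValueError("change must be one of 20, 40, 60, 80, 100")
    if change == 20:
        return 1
    return 2 if change <= 60 else 3


def lights_action_cost(change, human_presence, tv_on, outdoor_light):
    """Eq. (1): sum of condition costs, multiplied by LHPM under human presence."""
    total = (level_of_change_cost(change)
             + (2 if human_presence else 0)   # HPC
             + (2 if tv_on else 0)            # TOC
             + (2 if outdoor_light else 0))   # OLC
    return LHPM * total if human_presence else total


def hvac_action_cost(change, human_presence, extreme_outdoor_temp):
    """Eq. (2): sum of condition costs, multiplied by HHPM under human presence."""
    total = (level_of_change_cost(change)
             + (2 if human_presence else 0)         # HPC
             + (2 if extreme_outdoor_temp else 0))  # EOTC
    return HHPM * total if human_presence else total


# A 60% lights change in an occupied room, outdoor lights on, TV off:
occupied_lights = lights_action_cost(60, True, False, True)   # 10 * (2 + 2 + 0 + 2)
# A 100% lights change in an empty room with no other condition active:
empty_lights = lights_action_cost(100, False, False, False)   # 3
```

With these values, disturbing an occupied room is at least an order of magnitude more expensive for lights than acting on an empty one, which is what steers the planner towards unoccupied rooms first.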

Tables 4 and 5 present the sensitivity analyses of the two formulas using the Sobol method [41]. The input data for the sensitivity analysis were generated using Saltelli’s sampling scheme [42, 43]. The number of samples generated in both cases was in the order of $10^{6}$ . Both formulas were most sensitive to the HPC factor (high first-order sensitivity).

Table 4

Sobol sensitivity analysis of Eq. (1)

Parameter	S1	S1_conf	ST	ST_conf
LOCC	0.016616	0.000429	0.027739	0.000102
HPC	0.861305	0.002015	0.916918	0.002276
TOC	0.033233	0.000695	0.055479	0.000212
OLC	0.033233	0.000611	0.055479	0.000209

Table 5

Sobol sensitivity analysis of Eq. (2)

Parameter	S1	S1_conf	ST	ST_conf
LOCC	0.070312	0.000774	0.078124	0.000254
HPC	0.765625	0.001831	0.789063	0.001833
EOTC	0.140624	0.000912	0.156250	0.000389

3.6 Planning problem for implicit DR

To have the agent perform implicit Demand-Response actions, the user has to ask it to “Decrease consumption by $X$ Watts” (the user can also ask the agent to increase the consumption by a specified amount of watts). When the agent classifies either the decrease or the increase consumption intent and obtains the user-specified amount of watts, it generates a problem instance in the Planning Domain Definition Language (PDDL), according to the PDDL domain described in the next sub-section. The problem instance records the SmartHome’s current state (i.e., which lights are on and at which level, which HVACs are on and at what fan speed, the locations of people in the house, the outside temperature etc.) and a goal interval according to the following formula:

$\displaystyle[W_{s},W_{e}]=[\lfloor T-CT\rfloor,\lfloor T+CT\rfloor]$ (3)

where $W_{s}$ and $W_{e}$ denote the start and the end of the interval, $T=O-X$ (or $T=O+X$ when an increase of consumption is requested) is the target amount of total watts ( $O$ denotes the current energy consumption in watts) and $C$ is a constant controlling the interval width. Subsequently, the agent calls a PDDL Planner to obtain a plan with a list of actions that achieve this goal (while minimizing the total cost of the actions) and executes them in the SmartHome. The PDDL Planner used [44] searches the space of all reachable states using a variation of hill-climbing, and evaluates each search state heuristically by solving a relaxed task with a variation of the GRAPHPLAN algorithm [45]. The relaxed plans produced inform the search by providing a goal distance estimate.
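As a worked illustration of Eq. (3), the following Python sketch computes the goal interval. The function name and signature are assumptions made for this example; the default $C=0.03$ is the value reported in Section 3.6.3, and the sample request corresponds to scenario 1 of Table 9 (1500 Watts initial consumption, 1025 Watts goal, i.e. $X=475$ ).

```python
import math


def goal_interval(current_watts, requested_change, decrease=True, c=0.03):
    """Return (W_s, W_e) from Eq. (3) for a decrease or increase request."""
    # T = O - X for a decrease request, T = O + X for an increase request.
    target = current_watts - requested_change if decrease else current_watts + requested_change
    # The interval half-width is proportional to the target itself (C * T).
    return math.floor(target - c * target), math.floor(target + c * target)


# Scenario 1 of Table 9: O = 1500 W, X = 475 W, so T = 1025 W.
w_start, w_end = goal_interval(1500, 475)
print(w_start, w_end)  # 994 1055
```

Any plan whose resulting total consumption falls between 994 and 1055 Watts would therefore satisfy the goal in this example.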

3.6.1 PDDL domain for implicit DR

To define the PDDL domain of the energy domain for performing implicit DR, we categorized the PDDL actions according to Table 6. We consider normal outside temperature values to range from 15 ${}^{\circ}$ C to 30 ${}^{\circ}$ C.

Table 6
Categories of PDDL domain actions

Id	Device	Human presence	Normal outside temperature	TV operation	Outdoor light
1	HVAC	Yes	No	–	–
2	HVAC	No	No	–	–
3	HVAC	Yes	Yes	–	–
4	HVAC	No	Yes	–	–
5	Lights	Yes	–	Yes	No
6	Lights	No	–	–	–
7	Lights	Yes	–	Yes	Yes
8	Lights	Yes	–	No	Yes
9	Lights	Yes	–	No	No

For each of the nine categories we have a multitude of PDDL actions that correspond to the level of change of the lights or HVAC fan percentage (i.e., 100 to 80, 100 to 60, 80 to 20, 40 to 0, 0 to 20, 0 to 100 etc.), covering all combinations, both ascending and descending, in steps of 20, each with its corresponding action cost and total watts change amount. Formally, for the set $A=$ {0, 20, 40, 60, 80, 100}, we obtain all 2-tuple permutations without repetition ( $n=6,r=2$ ), that is 30 actions for each of the 9 categories, for a total of 270 actions.
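The count above can be checked with a few lines of Python. The variable names are chosen for this illustration only.

```python
from collections import Counter
from itertools import permutations

LEVELS = (0, 20, 40, 60, 80, 100)  # the set A
NUM_CATEGORIES = 9                 # rows of Table 6

# All ordered (from, to) pairs with from != to: 6 * 5 = 30.
level_changes = list(permutations(LEVELS, 2))
total_actions = len(level_changes) * NUM_CATEGORIES

# Number of pairs per absolute level difference (ascending plus descending).
pairs_per_diff = Counter(abs(to - frm) for frm, to in level_changes)

print(len(level_changes), total_actions)  # 30 270
```

Grouping the pairs by absolute difference gives 10, 8, 6, 4 and 2 pairs for differences of 20, 40, 60, 80 and 100 respectively, which again sums to 30.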

The PDDL Domain contains 2 types (room and presence) and 3 predicates (sroom ?y - room, human-presence ?x - presence, at-place ?x - presence ?y - room). It also contains 7 PDDL function definitions (total-cost, fan-level, light-level, total-watts, outdoor-temperature, tv-operation, outdoor-light), where the last two take binary values (0 or 1). These predicates and PDDL functions handle integer values. To handle real numbers for the total watts change, we multiplied all numbers that correspond to total watts change (both in the PDDL Domain and in PDDL problem instances) by a factor of 10 (as the total watts change values had one decimal place). We defined nine PDDL actions which include some meta-variables (not part of the PDDL specification) that capture the values differentiating the multitude of actions that belong to each category. For example, Action 8 is defined as:

Lights-fromx-toy-withHP-Outdoorlight-without-TV (?room,?presence): An action which changes the light-levels at room ?room by {diff}, increases/decreases ({inc}) the total-watts by {w}, and increases the total cost by {c}, when there is human presence in the room, the outdoor lights are on but there is no TV operation.

The meta-variables {x} and {y} correspond to the lights percentage before and after the action is applied, {inc} corresponds to increase when $y>x$ and to decrease otherwise, {diff} corresponds to $|y-x|$ , {w} corresponds to the action’s total watt change and {c} to that particular action’s cost. A generator was implemented as a Python script which takes as input the nine PDDL actions with the meta-variables, and a table defining the total watts change and action cost of the nine categories of actions for each percentage change difference of the lights or HVAC fans from 20 to 100 in steps of 20 (this corresponds to 45 table entries). The generator outputs the final PDDL actions of the domain with the meta-variables grounded, together with their corresponding total watts change and cost amounts. The generated actions follow the PDDL specification. The other 8 categories of actions are defined in a similar manner to category 8.
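A minimal sketch of such a generator is given below for category 8. It is not the original script: the PDDL template layout, the way the conditions are expressed in the precondition, and the watt values in the table are all assumptions made for illustration. Only the meta-variable names, the function names (light-level, total-watts, total-cost, outdoor-light, tv-operation) and the cost values, which follow Eq. (1) with $\textit{LHPM}=10$ , come from the text.

```python
from itertools import permutations

LEVELS = (0, 20, 40, 60, 80, 100)

# Hypothetical template for category 8 (lights, human presence, outdoor
# lights on, no TV). The exact PDDL layout is an assumption.
TEMPLATE = """(:action lights-from{x}-to{y}-withHP-outdoorlight-without-TV
  :parameters (?r - room ?p - presence)
  :precondition (and (at-place ?p ?r)
                     (= (light-level ?r) {x})
                     (= (outdoor-light) 1)
                     (= (tv-operation) 0))
  :effect (and ({inc} (light-level ?r) {diff})
               ({inc} (total-watts) {w})
               (increase (total-cost) {c})))"""

# diff -> (total watts change multiplied by 10, action cost).
# Watt values are placeholders, not measurements from the SmartHome.
# Costs follow Eq. (1) for this category: 10 * (LOCC + 2 + 0 + 2).
CATEGORY_8_TABLE = {
    20: (55, 50),
    40: (110, 60),
    60: (165, 60),
    80: (220, 70),
    100: (275, 70),
}


def ground_actions(template, table):
    """Ground the meta-variables for every ordered pair of distinct levels."""
    actions = []
    for x, y in permutations(LEVELS, 2):
        diff = abs(y - x)
        watts, cost = table[diff]
        actions.append(template.format(
            x=x, y=y, diff=diff,
            inc="increase" if y > x else "decrease",
            w=watts, c=cost))
    return actions


actions = ground_actions(TEMPLATE, CATEGORY_8_TABLE)
print(len(actions))  # 30
```

Because PDDL uses parentheses rather than braces, Python's `str.format` can substitute the meta-variables directly without any escaping.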

3.6.2 PDDL problem generation

To define a problem instance for implicit Demand-Response we define a PDDL Problem with the 9 rooms of the SmartHome and a PDDL object for human presence. For each room we define if there is human presence currently in that room or not. In addition, the current energy consumption in watts (total-watts variable) is set. For each room we define the fan-levels of the HVAC (a value of 0 denotes that the HVAC is not currently operational in the room), as well as the light levels for each room (a value of 0 denotes that the lights are off in the room). Moreover, we define the outdoor temperature, the status of the outdoor lights (boolean variable), and if there is a TV on in a room where there is human presence we set the TV-operation flag to 1.

To define the goal we subtract the user’s requested change in total consumption from the total-watts variable when the user tells the agent to decrease the consumption (or add the user’s requested change when the user requests an increase) and compute the goal interval using Eq. (3). We use an interval instead of a specific value for the goal because the actions’ change in total consumption may not add up exactly to the goal value. Finally, we minimize the problem according to the total-cost variable, that is the sum of the selected actions’ costs.

3.6.3 Generating and executing a plan

When CIPA-gen B classifies an intent for either decreasing or increasing the energy consumption, it queries the SmartHome for (a) the total energy consumption, (b) the light status of each room, (c) the HVAC status of each room, (d) the outside temperature, (e) the status of the outdoor lights, (f) human presence in each room, and (g) whether there is a TV on in a room where there is human presence.

Subsequently, the agent generates the problem instance with the current status and the goal according to the previous sub-section. For computing Formula (3) we set $C=0.03$ . Lowering the value of this parameter reduces the difference between the total energy consumption achieved by the generated plan and the energy consumption goal. Too small a value, though, may result in no valid plan being found that achieves the goal. To generate a plan, the agent calls the Metric-FF Planner [44] with the generated problem instance and the problem domain, performing a best-first search to optimize the total-cost metric. Metric-FF was chosen due to its stability and because it is a well-established PDDL Planner. Afterwards, CIPA-gen B parses the Planner’s output, and for each action it obtains: (a) the target action’s device (lights or HVAC), (b) the target room, (c) the target percentage. If the target percentage is 0, it turns off the device in the action’s room. Moreover, if the target percentage is greater than 0 and the device is off in the action’s room, it turns on the device. Finally, the agent executes these actions in the SmartHome while informing the user of the plan executed.
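The on/off handling described above can be sketched as a small Python function. The function name, the command tuples and the "set_level" step are hypothetical; the actual SmartHome interface is not described in the text.

```python
def commands_for_step(device, room, target_pct, device_is_on):
    """Translate one parsed plan action into a list of device commands.

    The command names ("turn_off", "turn_on", "set_level") are placeholders
    for whatever the SmartHome interface actually exposes.
    """
    if target_pct == 0:
        # A target of 0% means the device is switched off in that room.
        return [("turn_off", device, room)]
    commands = []
    if not device_is_on:
        # A device that is currently off has to be switched on first.
        commands.append(("turn_on", device, room))
    commands.append(("set_level", device, room, target_pct))
    return commands


# First two steps of the plan for scenario 1 in Table 11:
step_1 = commands_for_step("lights", "living room", 20, device_is_on=True)
step_2 = commands_for_step("lights", "WC", 0, device_is_on=True)
```

Here `step_1` contains a single level change to 20%, while `step_2` contains only a switch-off command.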

Table 7
Evaluation metrics

Evaluation	Accuracy*	Lowest conf	Precision	Recall	$F_{\beta}$
M-intent (energy) default StarSpace model	97.39%	0.61	0.9714	0.9714	0.9714
M-intent (energy) extended StarSpace model	99.00%	0.53	0.9838	0.9838	0.9838
M-intent (health) default StarSpace model	99.39%	0.64	0.9904	0.9904	0.9904
M-intent (health) extended StarSpace model	99.49%	0.79	0.9873	0.9873	0.9873

4. Evaluation

CIPA-gen B has been deployed on the CERTH nZEB SmartHome. In the following sub-sections we evaluate its various components, that is (a) the NLU multi-intent models for the Energy and Health Domains, (b) the dialogue generator based on adjacency pairs and (c) CIPA’s Planning module for the energy domain for implicit Demand-Response.

4.1 NLU models & dialogue generator

In this sub-section the evaluation results of the multi-intent models for both domains, energy and health, are presented. Prior to introducing the results of the evaluation, an assumption about misclassifications should be taken into consideration. Most of the misclassifications were of the type no-understand to inform. This misclassification type is not a problem, as it is handled by the dialogue model. We can classify orphan informs (that is, informs that do not occur in the middle of a dialogue) as no-understands, whereas incorrect informs in the middle of a dialogue that do not match any of the valid dialogue flows generated by the generator presented in this paper are handled by the produced stories for invalid scenarios. So, if we consider this type of misclassification as correct, our accuracy increases (column Accuracy* in Table 7).

Table 7 presents the results of the multi-intent NLU model on the gold sets of the two domains. The Precision/Recall/ $F_{\beta}$ scores are micro-averaged. We have included various text patterns for these intents in the Chatito DSL and generated our training data for the NLU Models, as we did not have access to enough real-world user messages for these domains. Training our NLU Models on real-world data should provide a stronger model for these domains. We also provide comparisons to multi-intent NLU models for these domains which use the unaltered RASA StarSpace adaptation. Our extended model has higher Accuracy* scores in both domains.
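For readers unfamiliar with micro-averaging, the sketch below shows how the pooled scores are obtained. It is a generic illustration with made-up labels, not the evaluation code used for Table 7. When every message receives exactly one (possibly combined) intent label, each error counts once as a false positive and once as a false negative, so the three micro-averaged scores coincide; this may be why the Precision, Recall and $F_{\beta}$ columns are equal in each row of Table 7.

```python
def micro_scores(y_true, y_pred, beta=1.0):
    """Micro-averaged precision, recall and F-beta for single-label data."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1   # correct prediction of this label
            elif pred == label:
                fp += 1   # label predicted but not true
            elif truth == label:
                fn += 1   # label true but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


# Made-up intent labels: one of four messages is misclassified.
y_true = ["turn_lights", "hvac", "turn_lights", "set_temperature"]
y_pred = ["turn_lights", "turn_lights", "turn_lights", "set_temperature"]
precision, recall, f_beta = micro_scores(y_true, y_pred)
```

In this toy example all three scores equal 0.75, the fraction of correctly classified messages.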

To evaluate the dialogue generator we present examples of valid use cases. Table 8 presents indicative examples of possible user messages for the actions supported by CIPA-gen B for the nZEB SmartHome energy domain. A user can phrase a command for the agent in various ways. Moreover, the agent can converse with the user if it recognizes an intent for which information is missing. For example, if the user commands the agent to “turn the lights”, the agent will reply with either “where?” or “in which room?”. If the user replies with a valid room (i.e., “In the kitchen”), the agent will reply “Do you want them on or off?”. If the user replies with either “on” or “off”, the agent will execute the turn-lights action. The generative process of the algorithm is illustrated through the conversational flow between the agent and a user for the single-intent set-temperature command, using two kinds of nodes:

• Active node: an active state that has not fulfilled all requirements and is able to generate new scenarios.
• Final node: a final state that cannot generate new scenarios.

Table 8
A subset of nZEB SmartHome energy multi-intent domain actions and examples

Multi-intent	User message
Turn-lights $+$ HVAC	Turn on the lights in the kitchen and the air condition
Turn-lights $+$ HVAC	Turn off the lights and the air condition in the kitchen
HVAC $+$ HVAC-change-mode	Turn on the HVAC in the kitchen and set mode to cool
HVAC $+$ HVAC-set-fan	Turn on the HVAC in the kitchen and set fan to high
Turn-lights $+$ set-temperature	Turn on the lights in the kitchen and set the temperature to 20

Table 9
Scenarios’ initial conditions

S-id	Season	Initial consumption	Goal	Human presence	Outdoor temperature	Outdoor light	TV
1	Summer	1500 Watts	1025 Watts	Kitchen	40	Yes	Yes
2	Autumn	1000 Watts	725 Watts	Living room	22	No	Yes
3	Summer	1550 Watts	1150 Watts	Living room and kitchen	40	Yes	Yes
4	Summer	1550 Watts	925 Watts	Living room and kitchen	40	Yes	Yes

Figure 2.
Finite automata of story generator example.

A case example was used as part of the evaluation process. The example aims to demonstrate that the Novel Dialogue Generator of CIPA-gen B is able to produce all the possible use case scenarios in its conversation with a user by adopting the concept of adjacency pairs. In the presented example the agent requires information about the value of the temperature and the place (room). There are 4 initial cases: the user defines the intent accompanied by (a) the value, (b) the place or (c) both of them, or (d) the user inputs only the intent. In the next step of the discourse, the user can either provide valid information that fulfills a slot or provide an invalid input, which is categorized as one of the 14 invalid operations in the energy sector (lights, HVAC, etc.), including the no-understand operation when the input is not relevant. Every single invalid input in the Fig. 2 diagram is translated to multiple alternative rejected scenarios. The node that contains both value and place is a final node. The rest of the nodes, on the contrary, need at least one more slot to be fulfilled in order to end up in a final state. Following the same logic, every active node generates more complex scenarios until a final node is reached. In this example, the set_temperature intent generates 64 unique scenarios that cover all possible contingencies; they are illustrated by the coloured nodes in Fig. 2. Each node represents the input given by the user. The calculation of all possible generated scenarios for a multi-intent command is identical to that for a single intent and differs only in the number of slots. Multi-intent commands have as slots the union of the slots of the two intents. For example, for the set_temperature_lights intent, which sets the temperature in a room and turns the lights on or off, the available slots comprise the union of (value, place) from set_temperature and lights_on_off from light. The total number of generated scenarios is 281.

The CIPA story generator was evaluated experimentally by comparing the enumeration of all possible scenarios that can arise from a dialogue between an agent and a user, as illustrated in Fig. 2, with the actual scenarios generated by CIPA (the parsed data from the generated file and the sum of the generated scenarios for every intent). CIPA has been evaluated as a competent agent that behaves properly and covers all contingencies, and as such we consider our first research question positively answered.

4.2 Implicit demand-response planning

To evaluate CIPA’s Planning module for implicit Demand-Response we showcase 4 scenarios executed by CIPA-gen B on CERTH’s nZEB SmartHome. In Tables 9 and 10 we present the initial conditions of the scenarios and the goal consumption in watts. Table 11 presents the plans generated and executed by CIPA to reach the target goal consumption. The CPU time elapsed for generating these plans by the AI Planner ranged from 0.01 (Scenario 2) to 0.63 seconds (Scenario 4).

Table 10
Scenarios’ device initial conditions

S-id	Device	Living room	Kitchen	S-bedroom	WC	D-bedroom	Hall	Corridor	Play-room	Guest room
1	HVAC	100%	60%	80%	100%	100%	80%	100%	20%	0%
1	Lights	100%	80%	80%	100%	100%	100%	80%	0%	20%
2	HVAC	0%	0%	60%	60%	20%	0%	80%	0%	0%
2	Lights	100%	40%	100%	60%	0%	80%	100%	0%	0%
3	HVAC	80%	80%	60%	100%	100%	100%	80%	0%	60%
3	Lights	20%	40%	20%	80%	40%	60%	80%	0%	20%
4	HVAC	80%	80%	60%	100%	100%	100%	80%	100%	100%
4	Lights	100%	20%	60%	80%	80%	60%	80%	20%	0%

Table 11

Plans for scenarios 1–4

Step	Plan for s-id 1	Plan for s-id 2	Plan for s-id 3	Plan for s-id 4
1	Lights 100%–20% at living room	HVAC 60%–40% S-bedroom	Lights 60%–0% hall	Lights 100%–40% living room
2	Lights 100%–0% at WC	Lights 100%–0% S-bedroom	Lights 40%–0% D-bedroom	Lights 80%–0% WC
3	Lights 100%–0% at hall	Lights 100%–0% corridor	Lights 80%–0% corridor	Lights 60%–0% S-bedroom
4	Lights 100%–0% at D-bedroom	–	Lights 20%–0% S-bedroom	Lights 60%–0% hall
5	–	–	HVAC 100%–0% WC	Lights 20%–0% play-room
6	–	–	Lights 80%–0% WC	HVAC 100%–0% WC
7	–	–	–	Lights 80%–0% D-bedroom
8	–	–	–	HVAC 100%–20% hall
9	–	–	–	HVAC 100%–0% D-bedroom
10	–	–	–	Lights 80%–0% corridor

In the first three scenarios the agent avoids disturbing the humans, whereas in the fourth scenario the goal is more ambitious and the agent has to disturb the humans in the living room by dimming the lights. However, even in that scenario the agent does not turn the lights completely off, but instead dims them. In Scenario 1 the agent avoids lowering the HVAC fans as the outside temperature is high, whereas in Scenario 3 it executes only one fan-lowering HVAC action, in a room without human presence. CIPA reached the goal in all scenarios and as such we consider our second research question positively answered.

4.3 Acceptance study

Herein, the results of an experimental evaluation study assessing the usefulness and acceptance of the agent are presented. We consider this as an important first step towards conducting further longitudinal studies in real-world settings in the future.

4.3.1 Methodology

Participants were recruited through an electronic invitation to the staff of a research centre in Greece (CERTH), in which the rationale of the study was explained. An online questionnaire was administered to the study participants. The questionnaire comprised three main sections. In the first section, demographic and personal information was requested from the participant. Questions were asked about education, familiarity with speech interactions with a virtual agent, and familiarity with smart home automation and voice control automation.

In the second section of the questionnaire, the participants were requested to perform a series of actions in interaction with the intelligent agent, and indicate the success or failure of the actions. More specifically, the participants were asked to connect to the online SmartHome platform, through which the agent communicates with the users, and the energy management of the smart home takes place. The requested series of actions was the following:

1.
Change the status (on, off or dim) of the lights in a room.
2.
Change the status (on, off, fan speed, temperature) of the HVAC in a room.
3.
Change the status (on, off) of both the lights and HVAC in a room.

Subsequently, the participants were asked whether the agent identified the missing parts of the user’s voice commands and whether the right action was executed in the right room, about the level of ease or difficulty in communicating with the agent, and about the frequency of interactions with the agent which they could tolerate.

In the third section of the questionnaire, questions about the usefulness and acceptance of the agent were asked. More specifically, participants were asked about their concern for the environment, whether the agent could help them save energy, their awareness of Demand-Response programs, their willingness to change their electricity consumption to gain monetary benefits, their intention to use the agent to adjust the electricity consumption, and their tolerance in permitting the agent to handle loads in empty rooms or in rooms in which they are present. The questionnaire used can be found in the Appendix.

4.3.2 Results

In total, 36 people (21 male, 15 female) participated in the study. The mean age of the participants was 28.5 $\pm$ 5.1 years. All participants were university graduates. On a 1–5 scale, participants overall had a medium familiarity with speech interactions with a virtual agent (mean: 2.7 $\pm$ 1.1). 39% of the participants had no smart home automation at their home, while 14% were familiar with heating automation, and another 14% were familiar with smart lights. Mean familiarity with voice control automation was 2.4 $\pm$ 1.0 on a 1–5 scale.

In the questions following the execution of the requested interactions with the agent, the vast majority of the participants found that the agent executed the right action (94%), and most found that the agent correctly identified missing parts of their voice commands (75%). Participants reported being concerned about the environment (mean: 3.9 $\pm$ 0.9 on a 1–5 scale), and most of them were aware of Demand-Response programs (58%). 97% of the participants were willing to change their electricity consumption in order to gain monetary benefits.

Participants overall found it easy to communicate with the agent (mean: 3.8 $\pm$ 0.8 on a 1–5 scale). In terms of the frequency of interactions they could tolerate, 50% of the participants responded that they could have daily interactions with the agent, and another 30% answered that they could try it a few times to see if it works for them. Furthermore, participants were inclined towards the opinion that the agent could help them save energy (mean: 3.9 $\pm$ 0.6 on a 1–5 scale), and 83% of the participants would like the agent to control the loads in their home. In summary, the results showed that the intelligent agent was well accepted by participants and regarded as useful.

A correlation analysis on the participants’ answers, using Spearman’s rank correlation coefficient and setting the significance level at $p<$ 0.05, further revealed some interesting findings. Concern about the environment was positively correlated with perceived ease of communicating with the agent ( $p=$ 0.02), and ease of communication with the agent was correlated with perceived usefulness of the agent in energy savings ( $p=$ 0.002). These findings suggest that concern about the environment may be an important behavioural factor in assessing the ease of use of such systems targeting energy optimisation. Moreover, the correlation between perceived ease of use and perceived usefulness shows the importance of developing easy-to-use systems in this domain. Furthermore, there was a high correlation between answers on letting the agent handle loads in empty rooms versus loads in rooms where the participant is present ( $p<$ 0.001). This finding suggests that potential users of such a system may not be particularly concerned about their presence in rooms where the agent is operating. However, further real-world studies are needed to confirm this finding.
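To make the analysis above concrete, the following is a minimal sketch of Spearman’s rank correlation coefficient, computed as the Pearson correlation of average ranks so that tied Likert answers are handled. The answer vectors are invented for illustration and are not the study data, and the significance test is deliberately omitted.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sd_x * sd_y)


# Hypothetical 1-5 Likert answers (NOT the study data).
environment_concern = [5, 4, 3, 4, 2, 5, 3, 4]
ease_of_communication = [4, 4, 3, 5, 2, 5, 3, 3]
rho = spearman_rho(environment_concern, ease_of_communication)
```

A value of `rho` close to 1 indicates that participants who rank high on one question also tend to rank high on the other; a p-value would still have to be obtained separately, for example from a statistics package.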

5. Discussion

We presented a task-based agent that (a) enables conversation with humans without prior knowledge of the environment and (b) provides an AI planner to form complex plans for solving a DR problem. A general-purpose task-based agent was introduced, equipped with a novel dialogue generator, and applied to two domains of the CERTH nZEB SmartHome: energy and health. The developed CIPA-gen B is based on the RASA framework and supports multi-intent classification for these domains, recognizing up to two intents per user command. Aiming at a domain-agnostic agent, the code base was engineered so that it can be easily applied to different domains. CIPA-gen B uses an embedding method for its NLU model that does not rely on pre-trained word vectors of a specific language, so that it can be generalized to any natural language. The training data would have to be generated for that language, using the tools selected and the algorithms developed, while omitting the dependency on SpaCy for slot extraction. This constitutes the only limitation on training a model for a different language. In addition, messages from real-world users should be collected in order to build a real-world usage data set that can be used both for training the model to achieve greater accuracy and for evaluation purposes.
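As an illustration of how a two-intent classification result could be turned into two operations, the following sketch splits a compound intent label and dispatches each part to a handler. The `+` separator, the intent names, and the handlers are assumptions made for this example and are not taken from the CIPA code base.

```python
MAX_INTENTS = 2  # the agent recognizes up to two intents per command


def split_intents(label, separator="+"):
    """Split a compound label such as 'a+b' into its single intents."""
    parts = [p.strip() for p in label.split(separator) if p.strip()]
    if len(parts) > MAX_INTENTS:
        raise ValueError(f"at most {MAX_INTENTS} intents are supported")
    return parts


def dispatch(label, handlers):
    """Run one handler per recognized intent, in order of appearance."""
    return [handlers[name]() for name in split_intents(label)]


# Hypothetical handlers for two energy-domain intents.
HANDLERS = {
    "turn_off_lights": lambda: "lights switched off",
    "report_consumption": lambda: "consumption reported",
}

results = dispatch("turn_off_lights+report_consumption", HANDLERS)
```

Keeping the split separate from the dispatch means a single-intent label passes through the same path and simply yields one operation.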

Furthermore, a novel dialogue generator, based on the idea of adjacency pairs, has been implemented in order to enable the proposed agent to generate all the possible scenarios in a conversation between the agent and a user. The generated dialogue tree can be utilized by the agent during operation using a variety of methods. In CIPA we apply the RASA framework’s Memoization policy, which follows the generated dialogue tree. More advanced methods, such as calculating the distances to the various dialogue tree branches using a distance metric, could also be applied. These methods could be investigated in future work.
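The two lookup strategies mentioned above can be sketched on a toy example. The code below enumerates every story obtainable from a list of adjacency pairs, then selects the next turn either by exact prefix match (in the spirit of a memoization lookup) or, failing that, by the branch with the smallest edit distance. The pair representation, the turn names, and the choice of Levenshtein distance are assumptions for illustration; this is neither the CIPA generator nor RASA’s implementation.

```python
from itertools import product

# Hypothetical adjacency pairs: each agent prompt (first pair part)
# admits several user replies (second pair parts).
ADJACENCY_PAIRS = [
    ("ask_device", ["inform_light", "inform_heater"]),
    ("ask_room", ["inform_kitchen", "inform_bedroom"]),
]


def generate_stories(pairs):
    """Enumerate every path through the pairs as a flat tuple of turns."""
    stories = []
    for replies in product(*(seconds for _, seconds in pairs)):
        story = []
        for (first, _), reply in zip(pairs, replies):
            story += [first, reply]
        stories.append(tuple(story))
    return stories


def edit_distance(a, b):
    """Levenshtein distance between two sequences of turns."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def next_turn(history, stories):
    """Exact prefix lookup first, nearest-branch fallback second."""
    history = tuple(history)
    n = len(history)
    for story in stories:
        if len(story) > n and story[:n] == history:
            return story[n]
    # Fallback: the longer story whose prefix is closest to the history.
    best = min(
        (s for s in stories if len(s) > n),
        key=lambda s: edit_distance(history, s[:n]),
        default=None,
    )
    return best[n] if best is not None else None


stories = generate_stories(ADJACENCY_PAIRS)
```

With this toy tree, a history containing an unseen reply still resolves to a next turn through the fallback, whereas a pure exact-match lookup would return nothing.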

In addition, CIPA-gen B is connected to an AI Planner and utilizes AI Planning research [5] to form complex plans of actions. A PDDL Domain has been defined for an implicit Demand-Response scheme on the CERTH nZEB SmartHome, and CIPA-gen B can form and execute plans of actions to reach the required energy consumption. In this light, our research question Q2 has been answered positively. CIPA-gen B is, to the best of our knowledge, the first implemented Intelligent Personal Agent that utilizes an AI Planner. Furthermore, CIPA-gen B is, to the best of our knowledge, the only Intelligent Personal Agent that solves an implicit Demand-Response problem.
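The idea of forming a plan of actions that reaches a required consumption can be illustrated with a small state-space search. In the sketch below a breadth-first search stands in for the PDDL planner; the device names, their power draw, and the single `switch_off` action are invented for this example and do not reproduce the PDDL Domain defined for the SmartHome.

```python
from collections import deque

# Hypothetical loads and their power draw in watts (illustrative values).
LOADS = {"heater": 1500, "dishwasher": 1200, "lights": 200, "tv": 100}


def plan_load_reduction(active, target_watts):
    """Breadth-first search for a shortest sequence of switch-off actions
    that brings total consumption down to at most target_watts."""
    start = frozenset(active)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        # Goal test: consumption of the devices still on meets the target.
        if sum(LOADS[d] for d in state) <= target_watts:
            return plan
        for device in sorted(state):
            successor = state - {device}
            if successor not in seen:
                seen.add(successor)
                queue.append((successor, plan + [("switch_off", device)]))
    return None  # no reachable state satisfies the target


plan = plan_load_reduction(list(LOADS), 1600)
```

Breadth-first search returns a plan with the fewest actions, which is a simple proxy for minimal user disturbance; a numeric planner could instead optimize a cost that reflects user preferences.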

An experimental user evaluation study was conducted to assess the usefulness and the acceptance of the intelligent agent. The outcomes of the study showed that the agent was regarded as helpful in saving energy and that the vast majority of the participants would like the agent to control loads in their home. Furthermore, the users did not find any major difficulty in communicating with the agent. It is also important to note that, despite the medium familiarity of the participants with speech interactions with a virtual agent and the high number of participants who had no smart home automation at home, the ease of communication with the agent was rated highly. In this context, our research question Q1 has been answered positively. Overall, the results suggest that the proposed system is highly acceptable. However, further longitudinal real-world studies are necessary to show the value of the proposed system in daily living.

Future research and development on the CIPA agent will focus on extending the agent’s capabilities. In the next releases of the agent we plan to integrate PDDL problem generation with more advanced dialogue trees on the agent’s side, so that the agent forms the planning problem instance after a dialogue with the user. Moreover, we plan to extend the planning problem into a temporal planning one. This will allow the agent to schedule actions and activate them for specific time intervals, thus allowing us to implement a more complex implicit Demand-Response scheme. In addition, we consider using Machine Learning to learn the user’s preferences when generating the planning problem instance in more complex domains. Finally, we aim for a more advanced conversion model from intents/entities to planning actions/variables, as well as support for Probabilistic Planning and Probabilistic Reasoning over Time. New dynamic ensemble methods could also be tested for the NLU models [46]. Our future work also involves testing the health domain agent with users in real-life settings.

6. Conclusion

In conclusion, we have presented a conversational agent-based system for application in a SmartHome. By integrating components for multi-intent NLU, dialogue generation based on adjacency pairs, and AI planning to form plans of actions for implicit Demand-Response, the system is able to effectively interact with humans, execute complex actions, and control smart devices. Experimental results demonstrated the usefulness and acceptance of such an intelligent system.

Footnotes

https://ai.googleblog.com/2020/01/towards-conversational-agent-that-can.html.

rasa-core (0.10.4), rasa-nlu (0.13.2).

http://35.196.60.7/docs/core/0.10.4/stories/.

https://github.com/rodrigopivi/Chatito.

Acknowledgments

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreements No 643607 (myAirCoach), No 732679 (ACTIVAGE) and No 773960 (DELTA).

Appendix

Figure 3. Page 1 of Questionnaire.

Figure 4. Page 2 of Questionnaire.

Figure 5. Page 3 of Questionnaire.

References

1. Weizenbaum. ELIZA – a computer program for the study of natural language communication between man and machine. Commun ACM. 1966; 9(1): 36-45. Available from: http://doi.acm.org/10.1145/365153.365168.

2. Dersch. Shoebox – A voice responsive machine. Datamation. 1962; 8(6): 47-50.

3. Alexiadis, Nizamis, Koskinas, Ioannidis, Votis, Tzovaras. Applying an intelligent personal agent on a smart home using a novel dialogue generator. in: Artificial Intelligence Applications and Innovations. Cham: Springer International Publishing; Maglogiannis, Iliadis, Pimenidis, eds, 2020; pp. 384-395.

4. Fisch, Chopra, Adams, Bordes, Weston. StarSpace: Embed all the things! in: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, Louisiana, USA, February 2-7, 2018. 2018. Available from: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16998.

5. Nau, Ghallab, Traverso. Automated planning: Theory & Practice. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.; 2004.

6. Vinyals. A neural conversational model. arXiv preprint arXiv:1506.05869. 2015.

7. Serban, Sordoni, Lowe, Charlin, Pineau, Courville, et al. A hierarchical latent variable encoder-decoder model for generating dialogues. in: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. 2017; pp. 3295-3301.

8. Park, Cho, Kim. A hierarchical latent structure for variational conversation modeling. in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018; pp. 1792-1801.

9. Galley, Brockett, Spithourakis, Gao, Dolan. A persona-based neural conversation model. in: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2016; pp. 994-1003.

10. Monroe, Ritter, Jurafsky, Galley, Gao. Deep reinforcement learning for dialogue generation. in: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016; pp. 1192-1202.

11. Simões, Lau, Reis. Exploring communication protocols and centralized critics in multi-agent deep learning. Integr Comput Aided Eng. 2020; 27: 333-351.

12. Dumitrescu. Cassandra smart-home system description. in: 2017 International Conference on Speech Technology and Human-Computer Dialogue (SpeD). IEEE. 2017; pp. 1-6.

13. Park, Kang, Seo. An efficient framework for development of task-oriented dialog systems in a smart home environment. Sensors. 2018; 18(5): 1581.

14. Traum, Larsson. The information state approach to dialogue management. in: Current and new directions in discourse and dialogue. Springer. 2003; pp. 325-353.

15. Goddeau, Meng, Polifroni, Seneff, Busayapongchai. A form-based dialogue manager for spoken language applications. in: Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP’96. 1996; 2: pp. 701-704.

16. Schegloff, Sacks. Opening up closings. Semiotica. 1973; 8: 289-327.

17. Lera, Martín, Olivera. Neural networks for recognizing human activities in home-like environments. Integr Comput Aided Eng. 2019; 26: 37-47.

18. Shen, Yeh, Williams. Towards personal assistants that can help users plan. in: Intelligent Virtual Agents. Cham: Springer International Publishing. Traum, Swartout, Khooshabeh, Kopp, Scherer, Leuski, eds, 2016; pp. 424-428.

19. Riedl, Young. An intent-driven planner for multi-agent story generation. in: Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems – Volume 1. AAMAS ’04. USA: IEEE Computer Society. 2004; pp. 186-193.

20. Geib, Weerasinghe, Matskevich, Kantharaju, Craenen, Petrick. Building helpful virtual agents using plan recognition and planning. in: Twelfth Artificial Intelligence and Interactive Digital Entertainment Conference. 2016.

21. Steedman. The syntactic process. Cambridge, MA, USA: MIT Press. 2000.

22. Cimellaro, Mahin, Domaneschi. Integrating a human behavior model within an agent-based approach for blasting evacuation. Comput-Aided Civ Infrastruct Eng. 2019; 34(1): 3-20. Available from: https://doi.org/10.1111/mice.12364.

23. Gutierrez Soto, Adeli. Multi-agent replicator controller for sustainable vibration control of smart structures. Journal of Vibroengineering. 2017; 19(6): 4300-4322. Available from: https://doi.org/10.21595/jve.2017.18924.

24. Colak. Introduction to smart grid. in: 2016 International Smart Grid Workshop and Certificate Program (ISGWCP). 2016; pp. 1-5.

25. Albadi, El-Saadany. Demand response in electricity markets: An overview. in: 2007 IEEE Power Engineering Society General Meeting. 2007; pp. 1-5.

26. Tian, Yan, Sun, Cao. A framework for dispatching operation of the flexible load in the urban core area. in: 2017 International Conference on Computer Technology, Electronics and Communication (ICCTEC). IEEE. 2017; pp. 738-743.

27. Sebastian, Margaret. Application of demand response programs for residential loads to minimize energy cost. in: 2016 International Conference on Circuit, Power and Computing Technologies (ICCPCT). 2016; pp. 1-4.

28. Wang, Gong. User-side load fast precise dispatching model based on contracts and direct load control. in: 2017 2nd International Conference on Power and Renewable Energy (ICPRE). IEEE. 2017; pp. 747-751.

29. Uddin, Romlie, Abdullah, Abd Halim, Kwang, et al. A review on peak load shaving strategies. Renewable and Sustainable Energy Reviews. 2018; 82: 3323-3332.

30. Duman, Güler, Deveci, Gönül. Residential load scheduling optimization for demand-side management under time-of-use rate. in: 2018 6th International Istanbul Smart Grids and Cities Congress and Fair (ICSG). 2018; pp. 193-196.

31. Jovanovic, Bousselham, Bayram. Residential demand response scheduling with consideration of consumer preferences. Applied Sciences. 2016; 2: 6.

32. Melhem, Grunder, Hammoudan, Moubayed. Optimal residential load scheduling model in smart grid environment. in: 2017 IEEE International Conference on Environment and Electrical Engineering and 2017 IEEE Industrial and Commercial Power Systems Europe (EEEIC/I&CPS Europe). IEEE. 2017; pp. 1-6.

33. Rosin, Hõimoja, Möller, Lehtla. Residential electricity consumption and loads pattern analysis. in: Proceedings of the 2010 Electric Power Quality and Supply Reliability Conference. IEEE. 2010; pp. 111-116.

34. Khorram, Faria, Vale. Optimizing lighting in an office for demand response participation considering user preferences. in: 2019 International Conference on Smart Energy Systems and Technologies (SEST). IEEE. 2019; pp. 1-6.

35. An evaluation of the HVAC load potential for providing load balancing service. IEEE Transactions on Smart Grid. 2012; 3(3): 1263-1270.

36. Cai, Ramdaspalli, Pipattanasomporn, Rahman, Malekpour, Kothandaraman. Impact of hvac set point adjustment on energy savings and peak load reductions in buildings. in: 2018 IEEE International Smart Cities Conference (ISC2). IEEE. 2018; pp. 1-6.

37. Mensio, Rizzo, Morisio. Multi-turn QA: A RNN contextual approach to intent classification for goal-oriented systems. Companion Proceedings of the The Web Conference 2018. 2018.

38. Zamanirad, Benatallah, Rodriguez, Yaghoubzadehfard, Bouguelia, Brabra. State machine based human-bot conversation model and services. in: Advanced Information Systems Engineering. Cham: Springer International Publishing. Dustdar, Salinesi, Rieu, Pant, eds, 2020; pp. 199-214.

39. Stolcke, Ries, Coccaro, Shriberg, Bates, Jurafsky, et al. Dialogue act modeling for automatic tagging and recognition of conversational speech. Computational Linguistics. 2000; 26(3): 339-374. Available from: https://www.aclweb.org/anthology/J00-3003.

40. Chittò Báez, Daniel, Benatallah. Automatic generation of chatbots for conversational web browsing. in: Conceptual Modeling - 39th International Conference, ER 2020, Vienna, Austria, November 3-6, 2020, Proceedings. vol. 12400 of Lecture Notes in Computer Science. Springer. Dobbie, Frank, Kappel, Liddle, Mayr, eds, 2020; pp. 239-249. Available from: https://doi.org/10.1007/978-3-030-62522-1_17.

41. Sobol. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Mathematics and Computers in Simulation. 2001; 55(1-3): 271-280.

42. Saltelli. Making best use of model evaluations to compute sensitivity indices. Computer Physics Communications. 2002; 145(2): 280-297.

43. Saltelli, Annoni, Azzini, Campolongo, Ratto, Tarantola. Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Computer Physics Communications. 2010; 181(2): 259-270. Available from: https://www-sciencedirect-com-443.web.bisu.edu.cn/science/article/pii/S0010465509003087.

44. Hoffmann. The metric-FF planning system: Translating “Ignoring Delete Lists” to numeric state variables. J Artif Intell Res. 2003; 20: 291-341. Available from: https://doi.org/10.1613/jair.1144.

45. Blum, Furst. Fast planning through planning graph analysis. Artificial Intelligence. 1997; 90(1-2): 281-300.

46.

Alam

KMR

Siddique