Abstract
Traditional urban planning methodologies cannot solve the planning challenges that today’s cities pose. Citizens’ needs have changed substantially over the last decade, while urban planning methods have remained the same since the 1970s. This paper presents a more citizen-centric approach to urban planning. To that end, and to deal effectively with the larger volume of citizen data that this methodological reinvention requires, interdisciplinary work is supported with information and communication technologies. We conducted a survey of citizens on urban time-use (Marsal MLl and López MB (2014) Smart urban planning: Designing urban-land-use from urban time-use. Journal of Urban Technology 21(1): 39–54). The results on urban-time-use distribution were converted into urban-land-uses, as a pioneering methodology for the reinvention of urban planning methodologies. To obtain the highest voluntary participation from citizens and to ensure a good representation of all ages and social groups, the survey has to be designed with adaptive hypermedia techniques. The adaptive hypermedia techniques we propose in this paper combine stereotype and feature-based models, which we explore with a view to including them in the survey. The combination of stereotype and feature-based models has several advantages, among others: stereotype techniques avoid having to initiate survey profiles from scratch, and feature-based techniques allow a personalized questionnaire to be employed. Moreover, personalization, in combination with user profiles, allows prediction, which is of great interest for this research given its planning purposes.
Introduction: The need for a reformulation of urban planning methodologies with information and communication-based systems
Urban planning standards (urban master plan’s design parameters, the binding delimiters of urban land and use distribution) have not been updated since their introduction in the 1970s or even earlier. This is the case in Spain and in many other countries that follow the Roman law tradition. These standards place mandatory and individualized values on amounts of public space, public facilities, roads and parking areas, social housing, etc. They are a set of indices that urban planners have to use when working on the conceptual design of a city’s master plan. Because these standards have not been updated, such indices cannot provide an appropriate response to citizens’ urban needs in the ever-more complex world we live in. For instance, it is hard for urban planning standards for public space design to match citizens’ needs for public spaces when such spaces are used very differently today from how they were used when the standards were introduced 40 years ago.
Urban planning instrumental methodologies (urban master plans and urban planning acts, in other words) implement such standards as the planning process’s basic design reference. Urban planning standards need a radical update and revision, and this will have an impact on urban planning instrumental methodologies. This need for an update and revision, framed within the strong relationship between urban planning standards and urban planning methodologies, sets the conceptual framework of our research. The main goal is to revise urban planning standards and to find valid references on which to base new standards.
In our understanding, the only valid reference for deciding on the amounts of land needed for housing, public spaces, facilities, etc. is the use that citizens make of those city elements (Marsal and López, 2014). The more an urban element is used, the more space has to be devoted to it (e.g. public space vs. facilities). The indicator which best documents how citizens utilize the different city elements is their urban time-use, in other words, how citizens distribute their time to perform their daily activities in the city space. Daily activities are performed as a part of an urban process. An urban process (similarly to business or industrial processes) consists of more than one urban activity, which can also include e-activities. Today’s citizens not only perform urban activities in cities, but also urban processes. For example, we can take “shopping” as an urban process because it means not merely the activity of buying the product but also traveling to the shop, maybe getting information about the product prior to the purchase, sometimes going to different shops to compare products before making the final decision, and even conducting the purchase on-line to have the product delivered to the home. Today we can rarely talk about isolated activities, as their growing complexity involves more than one action, resulting in a process which also often includes technologies. When e-activities are involved the idea of a process is further strengthened. In this research we take into special account the role of e-activities when redefining urban standards, as remote activities reduce the physical urban space needed for the corresponding activity.
The debate about the intersections and correlations between time and space has generated increasing interest in the academic-research and policy-making contexts. In the academic-research environment, the study of time and space has traditionally been cross-disciplinary, with an epistemological focus, establishing correlations between the fields of sociology, geography, economy, urban planning, and related fields. These studies began with analysis of the geographical, sociological, and economic implications of urban time-use for people’s lives (Kellerman, 1989; Szalai, 1972, 1978). The consideration of urban time-use in urban planning activity arrived later, in the early 1990s. Accounting for urban time-use habits in urban planning practice took place hand in hand with the introduction of participative policies which included all actors who are subjects of urban time-use routines (Mückenberger, 2011). The most relevant examples of the introduction of participatory local time policies in urban planning activity can be found in Italy (under the so-called ‘tempi della città’ movement; Belloni, 2013). The introduction of urban time plans and time policies into Italian urban planning practice occurs through the development of architectural and urban design methodologies sensitive to time and space (Bonfiglioli, 2004). More recent Italian strategies to introduce the space and time dimensions in urban planning explore these relationships in a creative, innovative way. For instance, they take the perspective of young city users, to learn how they tackle today’s processes of acceleration and compression of urban times and spaces (Camozzi, 2013).
Since the 90s, the impact of local time-use has spread to policies accompanying urban planning practice. Examples of this expansion into policy making can be found in Italy too and in other West-European countries, mainly Germany, Spain, Finland, and France (Mückenberger, 2011). In Germany, the penetration of participatory urban time-use policies into urban planning instruments is made through specific projects that turn specific policies into tangible projects. Examples include making working time more flexible, establishing a better balance between work and life, and promoting gender equality through childcare arrangements (Mückenberger, 2011). In Spain and Finland, urban time-use policies included in urban planning practice are mainly designed to reduce gender differences (Carrasco and Rodriguez, 2000; Goodin et al., 2004) and to better balance employment among the population (Adams and Van Erde, 2010; Natti et al., 2011). In France, policies are more focused on how to introduce well-being through the planning practice (Cardoso et al., 2010; Krueger et al., 2009).
The research presented in this paper corresponds to the methodological description of a hypermedia adaptive survey on urban time-use. This urban time-use survey is the basis for the Smart Urban Planning (SUP) method, to obtain more citizen-centric planning standards (Marsal and Lopez, 2014). A pilot test of the SUP method has already been conducted, with published results (Marsal and Lopez, 2014). In that pilot test we did not use hypermedia adaptive techniques while carrying out the survey on urban time-use. Thus, we experienced all of the inconveniences of regular web-based and face-to-face surveys. In this paper we present a more efficient way to conduct the urban time-use survey, which is the informational basis for the SUP method, by making use of hypermedia adaptive techniques.
The SUP method contributes to the study of participatory urban time and space and its implications for urban planning practice by developing a new method to design urban planning standards. Our method transforms urban time-use into urban-land-use. The new values on urban-land-use will be the new urban planning standards, varying from one city to another (contrary to the current fixed national values), and therefore better respond to local realities. Collection of urban time-use habits in each city will be performed through a participatory survey, open to all citizens. In this research we present the characteristics of this participatory adaptive hypermedia (AH) survey on urban time-use.
The urban planning standards’ revision proposed by SUP does not consider city spaces as simple land pieces where certain activities can be performed. Our citizen-centric standards’ revision transforms urban processes into land amounts and their uses. These complex transformations are computer-based and begin with the setting of a “rule of correspondence” fixing a qualitative correlation between urban time-use and urban lands, followed by “indicators of equivalence” to set the quantitative correlation between urban time-use and urban lands.
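As a rough illustration of how this two-stage transformation could be computed, the sketch below first applies a qualitative rule of correspondence (which land type each urban process draws on) and then a quantitative indicator of equivalence (how much land one hour of use translates into). All process names, land categories, units, and numeric values are hypothetical placeholders chosen for the example; they are not values from the SUP method, and hours spent on e-activities are assumed to have been excluded beforehand.

```python
# Hypothetical rule of correspondence: urban process -> land type it draws on.
RULE_OF_CORRESPONDENCE = {
    "shopping": "commercial",
    "leisure outdoors": "public space",
    "studying": "facilities",
}

# Hypothetical indicators of equivalence: square metres of land per
# aggregated weekly hour of use. Placeholder values for illustration only.
INDICATORS_OF_EQUIVALENCE = {
    "commercial": 0.5,
    "public space": 2.0,
    "facilities": 1.0,
}


def land_use_from_time_use(weekly_hours):
    """Convert aggregated weekly hours per urban process into land amounts."""
    land = {}
    for process, hours in weekly_hours.items():
        land_type = RULE_OF_CORRESPONDENCE[process]           # qualitative step
        area = hours * INDICATORS_OF_EQUIVALENCE[land_type]   # quantitative step
        land[land_type] = land.get(land_type, 0.0) + area
    return land


# Hypothetical city-wide survey totals (physically performed hours only).
survey_totals = {"shopping": 4000.0, "leisure outdoors": 6000.0, "studying": 3000.0}
standards = land_use_from_time_use(survey_totals)
```

Because the survey totals differ from one city to another, the same two tables would yield different land amounts per city, which is the sense in which the resulting standards are local rather than fixed national values.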
Our main goal is to learn about citizens’ urban needs through their urban-time-use distribution, and to this end a hypermedia adaptive survey on urban time-use is designed as the most effective method to collect citizens’ information. This paper aims to explain why a hypermedia survey is chosen as the method for the citizens’ participatory process, how the hypermedia adaptive survey on urban time-use is designed, and how it would be conducted. With that purpose, the paper is structured as follows:
In the following section, Adaptive Hypermedia (AH) systems’ techniques selection for survey purposes: stereotype and feature-based models combination, we describe the design of a stereotype survey as the start of a feature-based survey on urban time use. The characteristics of a feature-based survey and the methodology of how it should be conducted are also described. In the same section, the combined technique of stereotype and feature-based models is explained. In the following section, The design of the AH survey in this research, the existing normalized standards for hypermedia adaptation are described as well as how to implement them for survey purposes. In the subsequent sections, Stereotyped survey: user groups’ construction and Feature-based survey: personalization, a description of how to conduct both surveys is provided. Finally, in the last section, Conclusions and future research: exploring predictive functionalities of AH surveys, the predictive potential that the blending of stereotype and feature-based models brings is presented as a topic for further research.
Adaptive Hypermedia (AH) systems’ techniques selection for survey purposes: Stereotype and feature-based models combination
AH is generally accepted as the crossroads in hypermedia and user modeling (UM) research (Brusilovsky, 1996, 2002; de Bra et al., 2004a, 2004b). Adaptive Hypermedia Systems (AHS) belong to the class of user-adaptive software systems (Schneider-Hufschmidt, 1993). These technologies allow an individual user of a hypermedia application to personalize the content and presentation of the application according to their preferences, objectives, and knowledge (Perkowitz and Etzioni, 1999, 2000). Personalization is defined as how an application’s content, navigation, and interface are tailored to match the specific needs of an individual (User Modeling (UM)) or a community (Group Modeling (GM)) (Callan et al., 2001).
User and group models are created through UM and GM processes in which unobservable information about users is inferred from observable information from such users (Zukerman et al., 1999; Zukerman and Albrecht, 2001) by modeling user interactions with the system for a specific domain concept. A user model, as a representation of information about an individual user, is essential in order for an adaptive system to provide an adaptation effect (Brusilovsky and Millán, 2007). User interaction modeling represents and defines the interaction between the user and the application. The data stored in the interaction model are used to infer user characteristics with the objective of updating and validating the UM. For this purpose, this component includes evaluation, adaptation, and inference mechanisms (Benyon, 1993; de Bra et al., 2004a). The domain model represents a set of domain concepts in which each concept is related with other concepts and represents a semantic net. The most important function of this model is to provide a structure for the representation of user domain knowledge (i.e. to store the estimated level of the user's knowledge for each concept) (Benyon, 1993; de Bra et al., 2004a).
To better understand the newest AHS applications of interest to us, which include information systems, volunteered geographic information, and geodata collection, a reference to the earliest and best-known AHS applications is a must. Adaptive Educational Systems (AES), commonly used in e-learning, were the first AHS deployments and paved the way for later developments. A typical AES operates as follows: when a user is studying content supported by AES (e.g. the meaning of a word), the user can select the depth of the information displayed by the application, which prioritizes and highlights the most important information based on the user’s existing knowledge (such as “expert, intermediate or novice”), thanks to adaptive navigation techniques such as web-based mechanisms to “hide, sort or annotate” details of a piece of information. In AES, user knowledge is frequently the only user feature being modeled, as it is the sole AES objective.
User knowledge modeling in AES is similar to a scalar model, where the level of a user’s knowledge is estimated either by rating values on a numerical scale or by qualitative ranking on a categorical scale. Scalar models, especially qualitative ones, are quite similar to stereotype models (Brusilovsky and Millán, 2007), which are of interest to the research presented in this paper as a method to initiate the public participatory survey proposed here. Stereotype models, developed by Elaine Rich (1979, 1989) in the late 1970s and extensively used in early adaptive systems, especially between 1989 and 1994 (Hoic-Bozic et al., 2009), attempt to cluster all possible users of an adaptive system into different groups or stereotypes, with one adaptive solution being developed for each stereotype. What makes scalar models and stereotype models slightly different is that knowledge assessment in the former is made by the users themselves, while in the latter the assessment is made by a third party. The shortcoming of both models is their low level of precision (Brusilovsky and Millán, 2007), because both rely on coarse averages of what is being modeled.
More recently, in search of more precision, many of the AHS that focus on advanced UM use more elaborate techniques based on user feature modeling, the so-called feature-based models. Feature-based modeling is today’s dominant approach in AHS UM, with new contributions constantly coming from different researchers. Although feature-based models have overtaken stereotype models, the latter are useful for UM and GM purposes in combination with the former. One of the most popular combinations is the use of stereotypes to initialize a feature-based model (Tsiriga and Virvou, 2003). This technique avoids the difficulties of building a user feature model from scratch. As we will see in the next section, we utilize this combined technique in our research: a two-step approach where feature-based models help us to elaborate the results of the participatory process of the previously formed stereotype groups.
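The idea of using a stereotype to initialize a feature-based model can be sketched in a few lines. The example below is only an illustration under stated assumptions: the stereotype names, the feature names, the default values, and the blending weight are all hypothetical and are not taken from Tsiriga and Virvou (2003) or from our survey.

```python
# Hypothetical stereotype defaults: each stereotype carries initial guesses
# for every feature, so no user model has to start from scratch.
STEREOTYPE_DEFAULTS = {
    "novice": {"web_familiarity": 0.2, "free_text_ability": 0.3},
    "expert": {"web_familiarity": 0.9, "free_text_ability": 0.7},
}


def init_feature_model(stereotype):
    """Start an individual feature-based model from the stereotype's defaults."""
    return dict(STEREOTYPE_DEFAULTS[stereotype])  # copy, so defaults stay intact


def update_feature(model, feature, observed, weight=0.5):
    """Blend the current estimate with a new observation about this user.

    The simple weighted average is an assumption made for this sketch.
    """
    model[feature] = (1 - weight) * model[feature] + weight * observed
    return model


# A user first assigned to "novice" turns out to be fairly web-literate.
user_model = init_feature_model("novice")
update_feature(user_model, "web_familiarity", observed=0.8)
```

After the update the individual model has drifted away from the stereotype for one feature while still relying on the stereotype default for the other, which is the practical benefit of the combination.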
An interesting application for us of stereotype and feature-based techniques in combination is found in information systems. The AVANTI project (Fink et al., 1997) provides a good example. The project aims to develop and evaluate a distributed information system providing hypermedia information about a metropolitan area (e.g. public services, transportation, buildings) for a variety of users with different needs (e.g. tourists, citizens, travel agency clerks, elderly people, the blind, wheelchair-bound people, and users with less serious forms of muscular dystrophy). A stereotype model is constructed by means of an initial interview that provides the basis for primary assumptions about the user. This is a valuable source of information for initially assigning the user to certain user groups, or stereotypes. The final feature-based user model will contain explicitly modeled assumptions which represent the important characteristics of an individual user, like preferences and interests, domain knowledge, and physical, sensorial, and cognitive abilities (Fink et al., 1997).
The AVANTI project demonstrates that stereotype and feature-based techniques in combination can be very powerful for distributing information. Similarly, our research will show how a combination of stereotype and feature-based models can be extremely useful not only in distributing information but also in gathering user information. In our research, users are citizens who are surveyed using an AH survey to collect information about their time-use distribution to rethink urban planning methodologies focusing on citizens’ urban needs. To discover these needs we will use the above presented two-step adaptive survey on citizens’ time-use distribution in relation to their urban processes. As mentioned earlier, the results will be transformed into the amounts of different land types (based on the land’s different urban uses) needed for citizens’ urban processes to be performed through a “rule of correspondence” (to elaborate qualitative pairs between different time-uses and different types of land use) and “indicators of equivalence” (to elaborate quantitative correlations between time-use and land-use amounts).
The design of the AH survey in this research
The AH survey on urban time-use designed in this research is based on normalized AH techniques and designed according to the existing standards of adaptation. Standardization is very important, not because the main goal of this research is to rethink urban planning standards but because it provides a framework on which further developments of a given topic can build. We will design our citizens’ participatory process through the survey on urban time-use, capitalizing on the existing standards of AH techniques.
Recently, adaptive systems, also called personalized systems, have been developed in many fields other than education, such as e-commerce, information distribution, and e-scheduling. What all these applications of adaptive systems have in common is that they are user-centric and every user is modeled to obtain the different user profiles. Depending on the field of application, the user profile will contain different descriptive information (e.g. personal identification, preferences, habits, etc.). From that profile, the system usually provides user services or information appropriate to it (Duc-Long et al., 2010). In contrast, user profiles in what we call the “adaptive survey method” help not to provide but to retrieve user information in an accurate and precise way. This research proposes an “adaptive hypermedia survey method” as an information retrieval technique in two steps, from a stereotype-based user survey to a feature-based user survey. The research will model feature-based survey results relating to citizens’ urban-time-use distribution for the urban planning methodological reinvention.
As has been seen, AES are older than the rest of AHS. The educational field can be considered mature in comparison with other AHS areas (commerce, information systems, scheduling management, etc.) since certain standards have been already developed and some organizations are applying different regulations. The “Learning Technology Standards Committee” (LTSC) and the “Global Learning Consortium” (IMS Global) are examples of organizations that define specifications and standards for e-learning. More specifically, the “Computer Society Standards Activity Board” of the LTSC has defined data-centric specifications (such as P1484 and P1484.2) to simplify interoperability between different systems and to facilitate the reuse of learning tools and contents (Martins-Antonio et al., 2008).
Another standardization initiative is the one promoted by several working groups, where certain specifications (ISO/IEC JTC1/SC36) for student information exchange between different systems are defined. Such specifications, called the “IMS Learner Information Package Information Model”, define data models and the syntax and semantics describing user characteristics, knowledge, and abilities. User acquisition of knowledge, capacities, aptitudes, personal information, relations, security parameters, preferences and learning style, performance, portfolio, etc. are also described (Martins-Antonio et al., 2008). In a similar way to IMS, “Public and Private Information for Learners” (PAPI Learner) standardizes not students’ curricula information (IMS) but their performance (Cobaleda and Duitama, 2009).
Within the adaptive educational field, various standards and normalized methods can be found in the area of tools and techniques. It is widely accepted inside the AH research community that the first benchmark technique for AH applications was a tool called AHA! (de Bra et al., 2003). This seminal technique provided a generic architecture that led to further research in many different directions. A complete list of normalized methods and tools for adaptive educational purposes can be found in Knutov et al. (2009). Among these methods, and bearing in mind our adaptive survey design, it is particularly worth highlighting the AtoL technique (Brusilovsky, 2002; Perkowitz and Etzioni, 1999). AtoL models groups of students according to a sequence of answered questions and their level of correctness, so that educational content can be adjusted to the students’ learning patterns. As will be seen in the following pages, we will capitalize on this technique when designing the survey’s stereotype step by using a sequence of answered questions, taking into account not the degree of correctness but the user’s cognitive skills, in order to better adjust the survey’s feature-based step.
In contrast with the educational field, in generic AHS there are no regulations or specifications for data exchange and system interoperability, but standard methods and widely accepted tooling do exist. A complete list of the most recognized methods and tools for AHS design can be found in Martins-Antonio et al. (2008). The previously referenced AVANTI system is also listed there. On the list, we found two methods of particular interest for our urban time-use survey design. Both methods are based on stereotype models, either to initiate a feature-based model (as in our case) or as a single technique. HYPERTUTOR (Kavcic, 2000) is a system that describes the user strictly on the basis of stereotypes. It uses exercises to obtain information about users and employs stereotypes for UM. The user can belong to one of three groups: novice, medium, or expert. INTERBOOK (Kavcic, 2000) is a tool for authoring and delivering adaptive electronic textbooks on the web. This AHS initiates a user model using stereotypes. For the purposes of this research, we will consider the HYPERTUTOR method when forming stereotype groups. INTERBOOK will be considered as a reference technique to initiate a feature-based survey with stereotype groups.
Finally, to conclude this review of the methods and techniques on which we build, we need to mention what has been standardized and normalized in generic AHS to date. It is widely agreed within the AHS research community that the adaptive process is divided into three levels or layers: direct adaptation techniques, adaptation language, and adaptation strategies. This classification is aimed at standardizing adaptation techniques at the different levels. It therefore works towards exchanging adaptive techniques between different systems, as well as helping the authors of AH by giving them higher-level handlers of low-level adaptation techniques (Cristea and Calvi, 2003). Direct adaptation techniques are the lowest level of adaptation, including all existing AH applications: from adaptive presentation (inserting/removing fragments, altering fragments, stretching text, sorting fragments, dimming fragments) to linkage adaptation techniques (adaptive guidance and adaptive navigation support: direct guidance, link sorting, link hiding/removal/disabling, link annotation, link generation, map adaptation). For an extended summary, see Brusilovsky (2002) and Wu (2002). Adaptation Language is the medium level of adaptation. Here, higher level techniques are grouped into standard adaptation mechanisms and structures, commonly known as programming language (Cristea and Calvi, 2003). Lastly, Adaptation Strategies are the highest level of adaptation. At this last level, adaptation techniques are based on cognition, modeling the user’s profile and processing user information. This research is closest to the Adaptation Strategies level, as the survey’s feature-based step is based on the user’s cognition level. Therefore, we will consider the Adaptation Strategies standard framework when building our survey’s feature-based part.
By using the described existing standards for AHS in the design of both stereotype and feature-based surveys, we ensure this research can help other survey initiatives to build on existing standards as well as to enlarge the current framework of hypermedia system standards implementation.
Stereotyped survey: User groups construction
As a first step, a stereotype survey is constructed by means of an initial interview with citizens, which provides the basis for primary assumptions about the user and is a valuable source of information for initially assigning the user to a certain user group or stereotype (Fink et al., 1997). The object of this survey is not to uncover information about urban time-use but specific details of citizens’ technological motivation and knowledge, in order to assign them to more precise stereotype subgroups prior to conducting an AH feature-based survey on time use.
Based on primary assumptions about the user (stereotype groups) and additional information collected about his or her knowledge of the application domain (stereotype subgroups), and once the right technique is selected, the system will be able to draw further inferences in order to create more precise assumptions about the user (a feature-based model). For instance, if the user is a retired person (primary assumption: stereotype group definition) and they are familiar with the web because they are used to talking with relatives using internet telephone systems (user knowledge of the application domain: stereotype subgroup definition), this information can be exploited and a precise feature-based model designed using clustering techniques, as we will see in the following section.
Although the stereotype survey will focus on retrieving information about a user’s technological knowledge and motivation in order to better design the feature-based survey’s applications and presentation, there will be some preliminary questions about time-use in order to collect information for the feature-based survey’s content adaptation. The user’s cognition depth, concision, explanatory skills, etc. revealed in these basic time-use questions will be the basis for constructing the feature-based survey’s user model. However, it could happen that a number of user models would be created because a feature-based survey needs to adapt survey content to several user profiles.
F2F Stereotype Survey Questionnaire: Questions for initial stereotypes groups and subgroups formation and feature-based survey’s user model profiles construction.
Let us continue with the example of a retired person (stereotype group) interested in and knowledgeable about technology (stereotype subgroup). This person may have good explanatory skills, so the use of free text in the feature-based survey, for example, could be a good option for him or her and for other users with this profile or, what amounts to the same thing, for all users attached to the same user model. On the other hand, another elderly person with the same amount of knowledge about and interest in technology could have no explanatory skills, and here a feature-based survey would be conducted with predefined options only, for example. Therefore, as will be seen shortly, the user model will accumulate all explicitly modeled assumptions gathered in the early stages of the stereotype phase representing the significant characteristics of the individual user. In this research these are age and occupation (initial stereotype group formation), technological preferences and interests (domain knowledge), and cognitive and communicative abilities (explanatory skills); the last two are used for stereotype subgroup formation and feature-based survey initiation.
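The example of the two retired users can be expressed as a small user model whose attributes drive the question format. This is a minimal sketch under our own assumptions: the class name, the field names, and the two format labels are hypothetical, and explanatory skills are reduced to a yes/no flag purely to keep the illustration short.

```python
from dataclasses import dataclass


@dataclass
class SurveyUserModel:
    """Hypothetical user model accumulated during the stereotype phase."""
    age_group: str            # initial stereotype group
    occupation: str           # initial stereotype group
    tech_knowledge: bool      # stereotype subgroup (domain knowledge)
    explanatory_skills: bool  # inferred from the preliminary time-use answers


def question_format(user):
    """Pick the answer format for the feature-based survey."""
    if user.explanatory_skills:
        return "free-text"
    return "predefined-options"


# Two users in the same stereotype group and subgroup, but different user models.
articulate = SurveyUserModel("65+", "retired", True, True)
terse = SurveyUserModel("65+", "retired", True, False)
formats = (question_format(articulate), question_format(terse))
```

Both users share the same stereotype group and subgroup, yet they receive different questionnaires, which is why several user models may coexist within one subgroup.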
To achieve the above, the stereotype survey is constructed in the following steps:
Stereotype group formation: Stereotype groups are formed using age groups and occupation as first assumptions in a citizen’s profile.
Stereotype survey design: Technological interests and preferences are surveyed in order to define stereotype subgroups. Basic questions about time-use are asked in order to gather information about a user’s cognitive and communicative abilities, so that accurate user models can be constructed and used to conduct a feature-based survey on time-use that is as personalized as possible.
Conducting the stereotype survey and data collection: This first survey is conducted face to face at the universities participating in the research. Undergraduate students are asked both to take part in the survey and to administer it in their immediate context (family, friends, etc.). In addition, they are asked to conduct the survey among a representative sample in a different context (e.g. a family in a different neighborhood). Students store all the collected data in databases specifically designed to fit the standards of AHS.
Data pre-processing and information extraction: Information obtained in the previous stage cannot be processed directly. Noise and inconsistencies have to be cleaned. In this phase it is very important to prepare the data for the pattern discovery algorithms used for feature-based modeling in the steps that follow. Pre-processed data are pre-classified: all training items (data) receive a unique label signifying the class (stereotype group) to which they belong. Given these data, a supervised learning algorithm builds a characteristic description for each class, covering all its training items. The process of assigning each training item to a unique class is called pre-classification, and supervised learning techniques apply here.
Data processing: In contrast to data pre-classification into pre-set stereotypes, the construction of groups of individuals who share the same technological interests and knowledge (the same characteristics featured in the former stereotype group) is performed using clustering techniques. Clustering techniques belong to the group of unsupervised learning methods which do not require pre-classification of the training items. The main difference between these and supervised learning methods is that the categories (classes) are not known in advance but constructed after the clustering. When the cohesion of a cluster is high, it means that the training items in it are very similar and thus define a new class (in our case, stereotype subgroup definition).
Stereotype Survey Pilot Test: Most common correlations between initial stereotypes groups, stereotype subgroups, and feature-based survey’s user model profiles obtained in a pilot test.
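The data pre-processing and pre-classification step listed above can be sketched as follows. The text does not name a specific supervised learning algorithm, so the nearest-centroid classifier used here is our own assumption, chosen because its per-class mean vector is a simple instance of a "characteristic description for each class". Record layout, labels, and feature values are hypothetical.

```python
from collections import defaultdict
from math import dist


def clean(records):
    """Drop noisy records: missing labels or answers outside the [0, 1] scale."""
    return [r for r in records
            if r["label"] and all(0.0 <= v <= 1.0 for v in r["features"])]


def class_descriptions(records):
    """Describe each class (stereotype group) by the mean of its training items."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r["label"]].append(r["features"])
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in grouped.items()}


def classify(features, descriptions):
    """Assign a new item to the class whose description is closest."""
    return min(descriptions, key=lambda label: dist(features, descriptions[label]))


# Hypothetical survey records: label = stereotype group, features = scaled answers.
raw = [
    {"label": "retired", "features": [0.2, 0.4]},
    {"label": "retired", "features": [0.4, 0.2]},
    {"label": "student", "features": [0.9, 0.8]},
    {"label": "student", "features": [0.7, 0.8]},
    {"label": "", "features": [0.5, 0.5]},         # missing label: removed
    {"label": "student", "features": [7.0, 0.8]},  # out-of-range answer: removed
]
training = clean(raw)
descriptions = class_descriptions(training)
predicted = classify([0.25, 0.35], descriptions)
```

In contrast to the clustering step, the class labels here are fixed before learning starts; the algorithm only learns how to describe and recognize them.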

A clustering algorithm finds the set of concepts ensuring that: 1) the similarity between training data of the same concept is maximized and 2) the similarity between training data of different concepts is minimized. In a clustering algorithm, the key question is how to establish the similarity between two items in the training data set (Frias-Martinez et al., 2006). Clustering techniques can be classified into hard clustering and fuzzy clustering. In hard clustering, data are divided into crisp clusters, where each data point belongs to exactly one cluster. In fuzzy clustering, a data point can belong to more than one cluster, and each data point has an associated membership grade that indicates the degree to which it belongs to each of the different clusters (Frias-Martinez et al., 2006).
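The difference between crisp and graded membership can be illustrated with a minimal sketch. It assumes that profiles are numeric vectors compared by Euclidean distance, and it borrows the membership formula commonly used in fuzzy c-means with a fuzzifier `m` greater than 1; the function names and the example values are hypothetical and are not part of the method described in this paper.

```python
import math


def hard_assignment(point, centers):
    """Crisp membership: return the index of the single nearest center."""
    distances = [math.dist(point, c) for c in centers]
    return distances.index(min(distances))


def fuzzy_memberships(point, centers, m=2.0):
    """Graded membership: one grade per center, summing to 1.

    Uses the fuzzy c-means membership formula (an assumption of this
    sketch); m > 1 controls how soft the grades are.
    """
    distances = [math.dist(point, c) for c in centers]
    zeros = distances.count(0.0)
    if zeros:
        # A point lying exactly on a center belongs fully to that center.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
        for d_i in distances
    ]


# Hypothetical profile lying closer to the first of two cluster centers.
centers = [(0.0, 0.0), (4.0, 0.0)]
print(hard_assignment((1.0, 0.0), centers))    # index of the nearest center
print(fuzzy_memberships((1.0, 0.0), centers))  # grades for both centers
```

With hard assignment the profile counts for one subgroup only, whereas the fuzzy grades spread it across both.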
In this research, stereotype survey results will be clustered using hard clustering techniques, because each data item must belong to a single cluster if the resulting subgroups are to serve as the basis for the subsequent feature-based groups. Hard clustering techniques may be grouped into two categories: hierarchical and non-hierarchical (Jain and Dubes, 1999). A hierarchical clustering procedure involves the construction of a hierarchy or tree-like structure, which is basically a nested sequence of partitions, while non-hierarchical or partition procedures result in a particular number of clusters in a single step (Frias-Martinez et al., 2006). Non-hierarchical techniques will be applied in this research to the stereotype survey results, with the aim of obtaining a specific number of clusters.
The main non-hierarchical clustering techniques are: 1) k-means clustering and 2) self-organizing maps (SOM) (Frias-Martinez et al., 2006). In the k-means clustering technique (Jain and Dubes, 1999) the number of clusters, k, is given as an input. The algorithm then picks k items, called seeds, from the training set in an arbitrary way. Then, in every iteration, each input item is assigned to the most similar seed, and the seed of each cluster is recalculated to be the centroid of all items assigned to that seed. This process is repeated until the seed coordinates stabilize (Frias-Martinez et al., 2006). The SOM algorithm (Kohonen, 1998), apart from being used in a wide variety of fields, is an interesting tool for exploratory data analysis, particularly for partitioned clustering and visualization. It is capable of representing high-dimensional data in a low-dimensional space that preserves the structure of the original data. SOM’s main advantage, in comparison to k-means clustering, is that similar input vectors are mapped to geometrically close winner nodes on the output map. This property is called neighborhood preservation, and it has turned out to be very useful for clustering similar data patterns (Frias-Martinez et al., 2006). The SOM algorithm has been selected to process the stereotype survey results in this research because it works in one step and gives faster results, and because it can map an illustrative gradient of the data distribution within each cluster.
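The k-means procedure described above can be sketched in a few lines. This is only an illustration of the seed, assignment and centroid steps, assuming that survey answers have already been encoded as numeric tuples and that similarity is measured by Euclidean distance; the function name `k_means`, its parameters and the sample data are hypothetical. It is not the SOM technique selected for this research.

```python
import math
import random


def k_means(items, k, max_iter=100, seed=0):
    """Cluster numeric tuples into k groups; returns (seeds, clusters)."""
    rng = random.Random(seed)
    seeds = rng.sample(items, k)  # arbitrary initial seeds from the data
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each item goes to its most similar (nearest) seed.
        clusters = [[] for _ in range(k)]
        for item in items:
            nearest = min(range(k), key=lambda i: math.dist(item, seeds[i]))
            clusters[nearest].append(item)
        # Update step: each seed becomes the centroid of its assigned items.
        new_seeds = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else seeds[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_seeds == seeds:  # seed coordinates have stabilized
            break
        seeds = new_seeds
    return seeds, clusters


# Hypothetical two-feature encodings of six respondents.
answers = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, groups = k_means(answers, k=2)
print(centroids)
print(groups)
```

Because the initial seeds are arbitrary, different values of `seed` may lead to different partitions on less clearly separated data.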
Successful examples of the application of the SOM algorithm can be found in adaptive educational projects for student modeling. In “Student modeling using principal component analysis (PCA) of SOM clusters” (Lee and Singh, 2004), the SOM algorithm is applied as a pre-processor for PCA. The authors chose the SOM technique because it enables natural clustering without a priori knowledge. SOM quantizes the data set into clusters, grouping vectors with greater degrees of similarity together, which facilitates cluster analysis. They chose PCA because it does not require the number of clusters to be predetermined, as is the case with k-means clustering. PCA relies on the degree of variance to determine the number of clusters: PCA eigenvalues indicate the strength of the eigenvectors, thus verifying the number of clusters in the data set (Lee and Singh, 2004). Examples that use fuzzy clustering can also be found, as in the educational data mining review by Romero and Ventura (2010), where the fuzzy technique is chosen to support mobile formative assessment and so help teachers understand the main factors influencing learner performance (Chen and Chen, 2009). These application examples show that fuzzy clustering is mainly chosen for mobile or changing data environments.
In general terms, clustering techniques are used to discover usage clusters and page clusters (Frias-Martinez et al., 2006). The usage model is a subcomponent of the user model that contains relevant characteristics of the environment (e.g. terminal location, user interface characteristics) in order to support a user’s technical motivation and to provide a convenient usage-oriented adaptation (Fink et al., 1997). Usage clusters aim to establish groups of users exhibiting similar technological interests and knowledge. Among the well-known and normalized models for usage clustering, we have paid particular attention to the Fu model (McGuinness and Van Harmelen, 2004). This model groups different users by taking into account their behavior and access patterns when using a web server. A complete list of normalized models for usage clustering can be found in de Virgilio et al. (2007). In this research we will use the SOM clustering technique to build usage clusters within the predefined stereotype groups, while considering the Fu model for usage pattern identification. These usage clusters group profiles of users that require similar technical adaptations to match their technological interests and knowledge, which greatly simplifies the adaptation process (de Virgilio et al., 2007). Page clusters, on the other hand, are not of interest in this research, because the clustering of pages only reveals groups of pages with related content, which is useful, for example, for providing personalized Web content to users in an e-commerce environment (Frias-Martinez et al., 2006).
In this research, the introduction of a threshold of similarity in the design of stereotype clusters is particularly interesting. As de Virgilio et al. (2007) state, a threshold of similarity avoids situations in which many clusters with few profiles exist or, conversely, situations in which a limited number of clusters includes many profiles. In the former case, the selection of a profile would require an exhaustive search. In the latter, update operations would become rather inefficient. Under the threshold of similarity, a profile can be included in a cluster only if its distance from the cluster’s root is lower than the threshold of that cluster (de Virgilio et al., 2007).
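The inclusion rule can be sketched as follows. The sketch assumes that profiles are numeric tuples, that distance is Euclidean, and that a single threshold applies to every cluster; it also assumes that a profile matching no existing cluster starts a new one. None of these choices is specified in the text, and the name `assign_profile` is hypothetical.

```python
import math


def assign_profile(profile, clusters, threshold):
    """Add profile to the nearest cluster whose root lies within threshold.

    Each cluster is a dict with a "root" profile and a list of "members".
    If no root is closer than the threshold, the profile becomes the root
    of a new cluster (an assumption of this sketch).
    """
    best_distance, best_cluster = None, None
    for cluster in clusters:
        distance = math.dist(profile, cluster["root"])
        if distance < threshold and (best_distance is None or distance < best_distance):
            best_distance, best_cluster = distance, cluster
    if best_cluster is None:
        clusters.append({"root": profile, "members": [profile]})
    else:
        best_cluster["members"].append(profile)
    return clusters


# Hypothetical two-feature profiles and a threshold of 1.0.
clusters = []
for profile in [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0)]:
    assign_profile(profile, clusters, threshold=1.0)
print(len(clusters))
```

A lower threshold produces more clusters with fewer profiles each, and a higher one produces fewer clusters with more profiles each, which is the trade-off the threshold is meant to balance.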
Feature-based survey: Personalization
As stated in the previous section, certain user actions recorded in the stereotype survey will be used to derive the primary assumptions on which the feature-based model is built. Specifically, we refer to the cognitive and communicative skills that users have shown in the short questionnaire about time-use contained in the stereotype survey.
The aim of the feature-based model is to refine the stereotype subgroups already obtained from usage clusters by introducing users’ cognitive and communicative styles. The model will contain relevant characteristics from the former stereotype groups (e.g. “retired/elderly”) and from the stereotype subgroups (e.g. “internet expert/quite familiar with internet/never used internet”), together with the user’s cognitive and communicative skills, so that a feature-based survey on urban time-use that is as personalized as possible can be designed. For that purpose, precise user models will be set. When the conditions of a feature-based subgroup are met, a specific profile will be activated (or created) in order to classify the user, and the features contained in that subgroup will be assigned to the user.
Every user model will need a different adaptation so that a personalized feature-based survey on time-use can be conducted in accordance with it. For example, although several users could be classified as “retired (stereotype group) + internet expert (stereotype subgroup)”, some of them might have good communication skills that will allow them to give excellent explanations of how they use their urban time. For these users it makes sense to activate some kind of free text application in the web-based survey. Such adaptation to cognitive style implicitly involves three kinds of adaptation: adaptive content selection, adaptive navigation support, and adaptive presentation (Brusilovsky and Maybury, 2002). When the user is asked for information in the survey, the system will adaptively select and prioritize the most relevant requests according to the user’s profile. When the user navigates from one item to another, the system will manipulate the links (for example, by hiding, sorting, or annotating them) in order to provide adaptive navigation support according to the user’s profile. When the user gets to a particular page, the system will present its content adaptively and will offer adaptive ways of interacting with the user according to his/her profile.
Some user models will yield low-hypermedia profiles (e.g. “retired + quite familiar with internet”) or even non-hypermedia profiles (e.g. “retired + never used internet”). For low-hypermedia profiles, blended solutions can be used, similar to the already proven blended learning (BL) techniques. BL is becoming an increasingly popular form of e-learning, particularly suitable for use in the transition from traditional forms of learning and teaching towards e-learning (Alonso et al., 2005; Bubas and Kermek, 2004; Thorne, 2003). In this model of teaching and learning, a significant number of face-to-face (f2f) elements are combined with technology-mediated teaching (Hoic-Bozic et al., 2009). For non-hypermedia profiles, f2f solutions or telephone conversations are the only possible ways of conducting the feature-based survey in this research.
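The way a profile could drive the choice of survey delivery can be sketched as a simple lookup. The subgroup labels are taken from the examples in the text; the function name `choose_survey_mode`, the returned fields, and the rule that free text is offered only to web respondents with good or excellent communication skills are assumptions made for illustration.

```python
def choose_survey_mode(internet_level, communication_skill):
    """Map a user profile to a delivery mode for the feature-based survey.

    internet_level: stereotype subgroup label, as used in the text.
    communication_skill: "excellent", "good", "regular" or "insufficient".
    """
    if internet_level == "never used internet":
        mode = "face-to-face or telephone"  # non-hypermedia profile
    elif internet_level == "quite familiar with internet":
        mode = "blended"                    # low-hypermedia profile
    elif internet_level == "internet expert":
        mode = "web"                        # full hypermedia survey
    else:
        raise ValueError(f"unknown internet level: {internet_level!r}")
    # Assumption: free text is only activated in the web-based survey for
    # users who have shown good communication skills.
    free_text = mode == "web" and communication_skill in ("excellent", "good")
    return {"mode": mode, "free_text": free_text}


print(choose_survey_mode("internet expert", "good"))
print(choose_survey_mode("never used internet", "good"))
```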
As in the case of stereotype group construction using clustering techniques, the key question will be what knowledge needs to be captured in order to choose the most suitable machine learning technique. Additionally, the choice of the learning method will depend greatly on the type of training data to be processed. As noted above, what distinguishes supervised from unsupervised methods is that the former require the training data to be pre-classified. Unlike in the previous case of usage cluster definition, when modeling a user’s cognitive style the categories, or classes, may be known in advance, as they will generically represent standard levels of cognitive skills (e.g. “excellent, good, regular, insufficient”). As a result, supervised learning techniques will apply.
The main supervised learning techniques are decision trees, classification rules, neural networks, k-NN, and SVM algorithms, all of which are used to model user behavior (Frias-Martinez et al., 2006). We have selected decision trees as they are typically used to execute classification tasks. The classification tree will be used to construct a personalized user model that takes into account levels of expertise in various areas, cognitive styles, etc. (Beck et al., 2003; Tsukada et al., 2001). Due to their ability to group users with similar characteristics, classification trees can also be used to implement recommendation tasks (Paliouras et al., 1999; Webb et al., 2001; Zhu et al., 2003). Decision trees go together with classification rules, as the latter are an alternative, written representation of the knowledge that classification trees express in graphical form.
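The correspondence between a classification tree and classification rules can be illustrated with a small hand-built tree. The tree below is not learned from data: its features (`completed_all_questions`, `answer_detail`) are hypothetical stand-ins for whatever the short time-use questionnaire would actually measure, and only the class labels follow the example levels mentioned in the text.

```python
# A hand-built tree: inner nodes test a feature, leaves are class labels.
TREE = {
    "feature": "completed_all_questions",
    "branches": {
        "no": "insufficient",
        "yes": {
            "feature": "answer_detail",
            "branches": {"low": "regular", "medium": "good", "high": "excellent"},
        },
    },
}


def classify(tree, user):
    """Follow the branches matching the user's features down to a leaf."""
    while isinstance(tree, dict):
        tree = tree["branches"][user[tree["feature"]]]
    return tree


def to_rules(tree, conditions=()):
    """Rewrite every root-to-leaf path as one IF ... THEN ... rule."""
    if not isinstance(tree, dict):
        premise = " AND ".join(f"{f} = {v}" for f, v in conditions) or "TRUE"
        return [f"IF {premise} THEN {tree}"]
    rules = []
    for value, subtree in tree["branches"].items():
        rules.extend(to_rules(subtree, conditions + ((tree["feature"], value),)))
    return rules


user = {"completed_all_questions": "yes", "answer_detail": "high"}
print(classify(TREE, user))
for rule in to_rules(TREE):
    print(rule)
```

Each printed rule corresponds to exactly one leaf of the tree, so both forms carry the same knowledge.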
Conclusions and future research: Exploring predictive functionalities of AH surveys
The main conclusion of this paper is that AH surveys are a successful participatory process for collecting citizens’ information on their urban time-use for urban planning purposes. Moreover, AH surveys have inherent predictive functionalities, which will be explored in the next steps of our research.
We have presented an innovative two-step technique for conducting personalized surveys based on UM. The proposed technique can be summarized as a blend of stereotype and feature-based models, so that the user model does not have to be initiated from scratch. What makes the mixed technique and its application to surveys even more interesting is its capacity for accurate prediction, a feature that is extremely important for urban planning aims.
Starting with a survey based on the ages and occupations (the initial stereotype groups) of a certain population set, and then obtaining accurate results on that population’s urban time-use thanks to the feature-based model, the method allows urban time-use results to be extrapolated to the remaining, non-surveyed population of a specific city merely on the basis of its citizens’ age and occupation. Age and occupation are readily available, as they are collected in periodic population censuses. In other words, the extrapolation functionalities of the method, based on initial user profiles and refined with personalization, will make it possible to extend the results obtained from a significant sample to the remaining citizens, and so to represent the whole population of a city. This functionality is extremely important for urban planning purposes, as planning cannot be biased: the data used to represent urban features must be complete. Because the whole population of the city to be planned is represented, the method meets this requirement. This indicates that the blended technique of stereotype and feature-based models applied to surveys is an extremely powerful tool. It constitutes an important functionality in terms of our research’s aim of rethinking urban planning methodologies based on citizens’ urban needs, using urban time-use as the unit of measurement for these needs.
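One simple way to picture this extrapolation is to average the surveyed time-use within each stereotype group and weight those averages by census counts. The sketch below is purely illustrative: it is not the extrapolation technique of this research, which has not yet been chosen, and the group labels, hours and census counts are invented values.

```python
from collections import defaultdict


def extrapolate(sample, census_counts):
    """Estimate city-wide mean time-use from per-group sample means.

    sample: list of (stereotype_group, hours) pairs from surveyed citizens.
    census_counts: number of citizens per stereotype group in the census.
    Returns (mean hours per group, census-weighted city-wide mean).
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for group, hours in sample:
        totals[group] += hours
        counts[group] += 1
    group_means = {group: totals[group] / counts[group] for group in totals}
    # Weight each group's mean by how many citizens the census puts in it.
    population = sum(census_counts[group] for group in group_means)
    weighted = sum(group_means[g] * census_counts[g] for g in group_means)
    return group_means, weighted / population


# Invented example: daily hours spent in public space, by stereotype group.
sample = [("retired", 2.0), ("retired", 4.0), ("student", 1.0)]
census = {"retired": 100, "student": 300}
means, city_mean = extrapolate(sample, census)
print(means, city_mean)
```

Groups that appear in the census but not in the sample are ignored here, which is one reason a real extrapolation would need a more careful technique.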
As the prediction possibilities remain to be studied, we have not yet decided on a specific technique for extrapolating data to a set of users larger than the sample, or on which prediction technique to use. If an unsupervised technique is selected, we will choose association rules. Association rules capture sets of actions that are causally related. A typical application of association rules in UM is the capture of pages that are accessed together, and they are typically used to implement recommendation tasks (Frias-Martinez et al., 2006). On the other hand, the k-NN algorithm and neural networks would be the most suitable supervised techniques. Neural networks are commonly used to predict user behavior (Sicilia and Garcia, 2004), but as behavior is not what is to be predicted here, the best technique would be a k-NN algorithm. k-NN is a predictive technique suitable for classification (Friedman et al., 1975). Unlike other learning techniques in which the training data are processed to create the model, in k-NN the training data represent the model itself (Frias-Martinez et al., 2006). This special feature would make the k-NN technique very suitable for this research.
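The remark that in k-NN the training data represent the model itself can be seen in a minimal sketch: there is no training step, only a lookup of the nearest stored examples at prediction time. The sketch assumes numeric feature tuples, Euclidean distance and a majority vote; the function name `knn_predict` and the example data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Classify query by majority vote among its k nearest training items.

    training: list of (features, label) pairs, kept as they are; the
    stored examples are the model.
    """
    neighbors = sorted(training, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Invented two-feature profiles labeled with a time-use category.
training = [
    ((0, 0), "mostly at home"),
    ((0, 1), "mostly at home"),
    ((1, 0), "mostly at home"),
    ((10, 10), "mostly in public space"),
    ((10, 11), "mostly in public space"),
]
print(knn_predict(training, (9, 9)))
```

In practice the features would need to be put on comparable scales before distances are computed, since otherwise the feature with the largest range dominates the vote.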
As already mentioned, stereotype and feature-based blended techniques applied to surveys introduce a predictive element into the model, allowing the urban-time-use distribution to be estimated not only for a sample but also for the overall population of a specific city. The predictive functionalities of stereotype and feature-based surveys will make it possible to plan the dimensions and uses of a city’s spaces as if we had survey results for each and every citizen.
Under the procedures of this research, once citizens’ urban-time-use distribution is known, a rule of correspondence can be set that establishes the qualitative relation between urban time-use and the places where such time is spent. Furthermore, indicators of equivalence will give the quantitative values of the correspondences. Although further research is needed to confirm this, it is likely that recommendation techniques would be suitable for establishing such correlations. Nowadays, recommendation techniques are being applied to complex and challenging research where many equivalences must be set in order to produce the right recommendation (an interesting example can be found in Ardissono et al., 2003). This suggests that recommendation techniques would provide plenty of scope for application to the data extrapolation and prediction aims of this research.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
