Abstract
Various content-based methods (e.g., audio-visual feature recognition and dialog analysis) have been proposed for analyzing, indexing, and understanding movies. Our previous approach introduced a method for character indexing based on manual annotation. However, this method has several shortcomings in indexing performance. To address this issue, in this work we use image processing techniques for semi-automatic character-based indexing. In addition, a movie ontological model is created to connect character appearances with the characters' roles in the movie. We also propose a system that assists the user in manual indexing, and we introduce a searching and browsing tool with which the user can issue character-based semantic queries over the index. Experimental results show that the proposed method reduces the user's indexing time and provides a means of automatic indexing, searching, and browsing based on semantic queries.
Introduction
Nowadays, discovering useful information from a movie has become a challenging task. Since the number of movies has been increasing rapidly, effective methods for movie analysis and understanding are needed. Movie content indexing and retrieval are therefore required to describe, store, and organize movie content and to help people find information quickly and conveniently [9]. A movie has the following characteristics: 1) characters and their relationships; 2) the movie's storytelling; and 3) the roles characters play in telling the story [2, 6].
Movie content-based analysis aims to obtain a structured organization of the original movie content and to understand its embedded semantics. Movie content-based indexing is the task of tagging the semantic units obtained from content analysis to enable convenient and efficient information retrieval. In recent years, many approaches to this problem have been proposed, including the extraction and integration of low- to mid-level audio-visual features such as visual-audio classes [14], color, textures, shapes, motion features, shots [27], key-frames [28], object trajectories, human faces [19], and so on to analyze video semantics. However, such methods have shown shortcomings in content analysis and indexing.
In general, a video is structured as a hierarchy of video clips, scenes, shots, and frames. Video analysis focuses on indexing and segmenting the video into structures that carry semantic meaning. Research on image processing for video indexing has a long history, including shot boundary detection, key-frame extraction, and video scene segmentation [9].
Although many methods for video and movie content indexing, browsing, and searching have been proposed, the problem remains a significant challenge because of the complexity of the information and the variety of features involved, including character appearances and roles. Character-based indexing, browsing, and searching are complex because they require detecting when characters appear and how they relate to one another. In addition, the storytelling of a movie is also conveyed by character co-occurrence and dialogs. Existing character-based indexing and browsing methods still face challenges, especially in detecting camera shots and character features such as appearance time, roles, and relationships. Therefore, the study of character-based indexing, browsing, and searching is important for helping users understand a movie.
In our previous study, we proposed a manual indexing system and introduced a method for extracting a character network from indexed data (CoCharNet) [21, 24]. There, indexing was the task of annotating character appearances in the movie. First, we proposed a system for manually indexing character appearances during movie playback. The data from a session are stored in XML with the attributes Name and Starting/Stopping time, recording when each character appears visually on screen. Second, we discovered relationships among characters from the indexing data by calculating the total co-occurrence time and the number of co-occurrences of characters. Using these data, we analyzed the character network to determine the main character (protagonist). Third, we segmented the movie using the average of the interval values. Finally, we extracted a summarized version of the movie based on the appearances of the protagonist and main characters.
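A minimal sketch of how such an index file might be read is given here. The source states only that Name and Starting/Stopping time are stored in XML; the element name `appearance`, the attribute names `name`, `start`, and `stop`, and the use of seconds as the time unit are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical index file content; tag and attribute names are assumptions.
SAMPLE_INDEX = """
<movie title="Example">
  <appearance name="Rose" start="10.0" stop="25.5"/>
  <appearance name="Jack" start="20.0" stop="40.0"/>
  <appearance name="Rose" start="60.0" stop="75.0"/>
</movie>
"""

def load_appearances(xml_text):
    """Return a list of (name, start, stop) tuples sorted by start time."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for node in root.iter("appearance"):
        start = float(node.get("start"))
        stop = float(node.get("stop"))
        if stop < start:
            raise ValueError("stop time precedes start time")
        records.append((node.get("name"), start, stop))
    return sorted(records, key=lambda r: r[1])

appearances = load_appearances(SAMPLE_INDEX)
```

Each tuple corresponds to one on-screen interval of one character, which is the unit from which co-occurrence statistics can later be derived.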
Building on these methods, in this study we propose a system for semi-automatically indexing character appearances and camera shots using image processing techniques, including feature matching and face detection/recognition. We also develop a system for automatic indexing, browsing, and searching based on character appearances and roles in the movie. We further focus on how to apply an ontological model to indexing, browsing, and searching. In this work, appearance refers only to characters appearing visually on screen.
The main contributions of this paper are as follows: 1) Our approach addresses the character-based indexing problem, including camera shot detection and character detection/recognition. 2) We use an ontological model for indexing, browsing, and searching in order to query useful information from a movie. 3) The experiments show that our approach is effective in improving the quality of character indexing and in providing an automatic indexing, browsing, and searching system.
The remainder of this paper is organized as follows. In Section 2, we describe related work and the basic idea of indexing theory. Section 3 presents the architecture of our system and the proposed algorithms for indexing, browsing, and searching useful information from a movie. The evaluation of the proposed approach is described in Section 4. Finally, we conclude this paper in Section 5.
Related work
Recently, many approaches have been proposed to deal with movie/video content analysis and indexing. State-of-the-art approaches focus on applying social network techniques to identify characters and index the movie. Several papers have optimized signal-based solutions for this problem. In our previous study, we proposed CoCharNet [20, 21] for determining the protagonist, main characters, and minor characters. The approach in [14] used audio-visual analysis for semantic indexing based on discovering speaker dialogs such as 2-speaker dialogs, multiple-speaker dialogs, and hybrid events. Perperis et al. [18] provided a method for detecting violent scenes in movies based on ontologies and reasoning that combines audio-visual cues with violence and multimedia ontologies. Moreover, some approaches [15, 16] used motion information, color information, and color histograms to index the video and built an ontology editor for indexing video content. In this regard, Bloehdorn et al. [5] showed a method for building a semantic ontology for multimedia analysis.
Recent research on ontologies for movie analysis has focused on providing a controlled vocabulary to semantically describe movie-related concepts (e.g., Movie, Genre, Director, Actor) and the corresponding individuals (e.g., “Ice Age”, “Drama”, “Steven Spielberg” or “Johnny Depp”). Other ontologies have been connected to the Linked Data cloud to take advantage of synergy effects. The movie ontology MO mainly focuses on concepts and their semantic relationships [1]. However, this approach only considers basic information about a movie such as Title, Character Name, Director, Released Date, and so on.
Camera shots play an important role in a movie. Each shot contains a series of continuous frames, and there is a strong content correlation among the frames within a shot. To detect shots, similarities between frames are computed through feature extraction and similarity measurement, and a boundary is identified where consecutive frames are not similar. The features used for shot detection include edge change ratio, motion vectors, corner points, histograms, and so on [6, 9].
Nowadays, content-based research has become an important part of movie analysis, and many approaches have been proposed for indexing and understanding movies. Weng et al. [26] provided a method for analyzing movie videos by using social network techniques. This research focused on semantic movie analysis to discover hidden semantic information, including character roles and story segmentation. Park et al. took character dialogs into account to extract a character network and index protagonist, antagonist, and supporting characters [17]. Moreover, Li provided a method for indexing events in the movie [14].
As mentioned before, this study aims to improve the quality of character appearance indexing by using video and image processing, including camera shot boundary detection. We also provide a new method for automatic indexing, browsing, and searching of useful information by applying an ontological model, with automatic indexing based on character roles, appearances, and camera shots. Figure 1 depicts the proposed system architecture. First, a list of characters is parsed from the IMDb database. This list is then used to search for a training data set through Google image search and to identify the characters' faces from the retrieved images. Second, feature matching is used to detect and identify who appears in a camera shot. The indexing data are then refined manually by the user for each unrecognized shot. In addition, using these data, including camera shots and character appearances, we apply our previous study to determine character roles [20, 21]. Moreover, we create a movie ontology for semantic extraction from a movie. Details of the system are introduced in the next section.
Character-based indexing
In this section, we describe a method for building a semi-automatic indexing system. We also describe a movie ontological model for automatic indexing, browsing, and searching based on the characters in the movie, including character appearance, co-occurrence, and roles. In this study, we adopt the face detection/recognition and feature matching methods of the OpenCV library [3] to identify camera shots and characters in the movie.
Camera shot of movie
In movie making, a camera shot is a set of frames that runs uninterruptedly. Characters and themes are conveyed through camera shots. In movie analysis, camera shots are very important in shaping the meaning of a movie [13, 23]. In this study, we apply camera shot boundary detection to index character appearance in the movie.
Many approaches have been proposed for camera shot boundary detection. In this study, we focus on a feature matching method that compares two consecutive frames to detect camera shot boundaries in the movie. If the two frames have different features, a shot boundary is detected.
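The boundary rule above can be sketched in a few lines. This is an illustration only, not the system's OpenCV pipeline: each frame is assumed to be already reduced to a set of hashable feature descriptors, the Jaccard ratio stands in for the feature matching score, and the threshold of 0.5 is an arbitrary choice.

```python
def jaccard(a, b):
    """Similarity of two feature sets in [0, 1]."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def detect_shot_boundaries(frame_features, threshold=0.5):
    """Return the indices of frames that start a new shot.

    frame_features: list of sets, one per frame; each set holds hashable
    feature descriptors (a stand-in for matched keypoints).
    """
    boundaries = [0] if frame_features else []
    for i in range(1, len(frame_features)):
        # A low similarity between consecutive frames marks a boundary.
        if jaccard(frame_features[i - 1], frame_features[i]) < threshold:
            boundaries.append(i)
    return boundaries

# Toy data: frames 0-1 share most features, frames 2-3 form a second shot.
frames = [{"a", "b", "c"}, {"a", "b", "c", "d"}, {"x", "y"}, {"x", "y", "z"}]
shots = detect_shot_boundaries(frames)
```

Comparing only adjacent frames keeps the pass linear in the number of frames, at the cost of missing gradual transitions such as fades.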
Face detection and recognition
Characters play an important role in understanding a movie. When the audience watches a movie, they usually analyze the story based on character appearances and relationships, so determining these facts is the most important task in movie analysis. Many approaches have been proposed to address these problems. Haar-like features are a general method used for human face detection and recognition [8, 25].
In this work, we apply the Haar-like method to detect and identify characters through face detection and recognition. The training data set for face detection and recognition is parsed from Google image search and taken directly from the given movie.
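For readers unfamiliar with Haar-like features, the following sketch shows how a single two-rectangle feature can be evaluated with an integral image. It illustrates the general technique only; the detector used in this work comes from OpenCV, and the tiny image and function names here are illustrative assumptions.

```python
def integral_image(img):
    """Summed-area table with one extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

# Bright left half, dark right half: the feature responds strongly.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
feature = two_rect_horizontal(ii, 0, 0, 4, 2)
```

Because every rectangle sum costs four table lookups regardless of its size, a detector can evaluate many such features per window, which is what makes cascade-based face detection practical.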
Semi-automatic indexing
Recently, many approaches have been proposed to address the automatic indexing problem, including image and audio processing/recognition, data mining, and so on [12, 26]. However, such methods have shortcomings in content analysis and indexing, especially in character identification, including the accuracy of object detection and recognition, feature indexing, and so on. To address this issue, we introduce a semi-automatic indexing approach that indexes characters in the movie in two ways: automatically and manually. The architecture of this system is shown in Fig. 3. The character list is parsed from IMDb, and the training data for face detection/recognition is parsed from Google image search. Camera shots are indexed by using character detection/recognition. Finally, the unrecognized shots are indexed manually. Note that the character-based index of a movie is a sequence of camera shots, each of which contains at least one character.
Algorithm 1 Character-based semi-automatic indexing (surviving steps)
...
detect the face from f;
recognize the face from f;
...
recognized(s_i);
...
manually identify the character for s_i;
...
Algorithm 1 shows the character-based semi-automatic indexing process. Each shot in the movie is indexed according to the characters that appear visually on screen.
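A hedged sketch of this process follows. It mirrors only the steps stated in the text: try to recognize a face in each frame of a shot, mark the shot as recognized if that succeeds, and otherwise fall back to manual identification. The callables `recognize` and `ask_user`, the dictionary layout, and the toy data are hypothetical stand-ins, not the system's actual interfaces.

```python
def index_shots(shots, recognize, ask_user):
    """Index each shot with the characters seen in it.

    shots: dict mapping shot id -> list of frames.
    recognize: callable(frame) -> character name or None (hypothetical).
    ask_user: callable(shot_id) -> list of names, used as manual fallback.
    Returns dict shot id -> {"characters": sorted names, "manual": bool}.
    """
    index = {}
    for shot_id, frames in shots.items():
        found = set()
        for frame in frames:
            name = recognize(frame)  # face detection + recognition step
            if name is not None:
                found.add(name)
        manual = not found  # no face recognized in any frame of the shot
        if manual:
            found = set(ask_user(shot_id))
        index[shot_id] = {"characters": sorted(found), "manual": manual}
    return index

# Toy stand-ins: frames are strings, and the "recognizer" is a lookup table.
known_faces = {"frame_rose": "Rose", "frame_jack": "Jack"}
shots = {"s1": ["frame_rose", "frame_jack"], "s2": ["frame_blur"]}
result = index_shots(shots, known_faces.get, lambda shot_id: ["Cal"])
```

Recording whether a shot was indexed manually makes it possible to report how much of the work the automatic stage actually saved.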
Semantic indexing structure of movies
In order to satisfy the needs of users, we define the structure of movie semantic indexing as shown in Fig. 2. A camera shot in the movie is indexed by describing its contents. When the user watches a movie, the contents of interest include characters, time intervals, and relationships. In this work, the subject of a camera shot refers to the time at which a character appears during movie playback. In addition, the characters who appear in certain camera shots are also seen as the most important ones (e.g., Rose DeWitt Bukater in the movie Titanic). For movie understanding, users are likely to index and search by character attributes such as appearance time, relationships, roles, and so on. Thus, the key character (i.e., the protagonist) is a very effective and efficient attribute for users to index, search, and manage the movie.
For semantic indexing, we apply our previous study [20, 21] to determine character roles in the movie. Let be a set of character roles in the movie. Let be a set of characters. The character role in the movie is represented as
Let be a set of camera shots in a movie. A shot is represented as
Let be the set of camera shots in a movie. We can define a movie as a sequence of camera shots as the following.
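One possible way to write these definitions is sketched below. The set names $R$, $C$, $S$, the mapping $\mathrm{role}$, and the tuple layout of a shot are assumptions; they are chosen only to agree with the symbols $c_i$ and $s_i$ used elsewhere in the text, with the three roles named in this paper (protagonist, main, minor), and with the statement that a shot carries a starting time, an ending time, and the characters who appear.

```latex
\begin{align*}
R &= \{\text{protagonist}, \text{main}, \text{minor}\}, \\
C &= \{c_1, c_2, \dots, c_n\}, \qquad \mathrm{role} : C \to R, \\
S &= \{s_1, s_2, \dots, s_m\}, \qquad
     s_i = (t_i^{\mathrm{start}}, t_i^{\mathrm{end}}, C_i), \quad C_i \subseteq C, \\
M &= \langle s_1, s_2, \dots, s_m \rangle, \qquad
     t_i^{\mathrm{end}} \le t_{i+1}^{\mathrm{start}}.
\end{align*}
```

Under this reading, a movie $M$ is the time-ordered sequence of its shots, and the role of any character appearing in shot $s_i$ can be looked up through $\mathrm{role}$.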
Various methods for applying an ontological model to semantic extraction have been proposed [7, 22]. By using an ontological model, we can provide automatic semantic indexing of camera shots, including starting time, ending time, who appears, and their roles, based on a user query. Characters play an important role in a movie. To index characters, we propose a new ontological model for automatic indexing, browsing, and searching based on semantics, including camera shots, character roles, and so on, driven by the user's semantic queries. The Movie Ontology is defined as follows.
Based on this definition, we can represent the concepts and relations of the movie ontology as follows.
Class “Movie” represents the information of a movie including Title, Released Date, Director, Length. This concept is defined as
Class “Character” represents its characteristics in a movie including Name, Social value. This concept is defined as
Class “Camera-Shot” represents time intervals of a shot. This concept is defined as
A movie has a set of characters. Relationships between movie concept and character concept are determined as
The relationship between a character $c_i$ and a character $c_j$ in a movie is represented by a pair $(\alpha_{ij}, \beta_{ij})$, where $\alpha_{ij}$ is the total co-occurrence time and $\beta_{ij}$ is the number of co-occurrences of $c_i$ and $c_j$. Relationships between two character concepts are determined as
A movie is a sequence of camera shots. Relationships between the movie concept and the camera shot concept are determined as
Using this model, we can connect character-based indexing data to other semantic features, including characters' roles. In this work, we apply our previous work [21] on character network analysis to determine character roles and relationships. More detail on the use of the movie ontological model is given in the next section.
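The pair (α_ij, β_ij) described above can be computed directly from indexed shots. The sketch below assumes that co-occurrence is measured at shot granularity, so that two characters indexed in the same shot co-occur for the whole duration of that shot; the tuple layout and the toy data are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations

def co_occurrence(shots):
    """Compute (alpha, beta) for every character pair.

    shots: list of (start, stop, characters) tuples, times in seconds.
    alpha[(a, b)]: total time a and b share a shot.
    beta[(a, b)]: number of shots a and b share.
    Pairs are keyed in alphabetical order.
    """
    alpha = defaultdict(float)
    beta = defaultdict(int)
    for start, stop, characters in shots:
        for pair in combinations(sorted(set(characters)), 2):
            alpha[pair] += stop - start
            beta[pair] += 1
    return dict(alpha), dict(beta)

# Toy index: three shots with their on-screen characters.
toy_shots = [
    (0, 10, ["Rose", "Jack"]),
    (10, 14, ["Rose"]),
    (14, 20, ["Jack", "Rose", "Cal"]),
]
alpha, beta = co_occurrence(toy_shots)
```

The two dictionaries together form a weighted character network, which is the input that role analysis needs.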
In order to use the movie ontological model for semantic extraction, we introduce an SQL-like language for querying information. This language keeps the basic SELECT-FROM-WHERE structure and treats the concepts and properties of the movie ontology like tables and columns in a relational database. The syntax and semantics of this language are as follows. SELECT: identifies the concept attribute values to be returned. FROM: specifies the concept or concepts to query from; this clause always follows SELECT. WHERE: declares logic operations and comparisons between concept attributes that restrict the answers returned by the given query; this clause follows FROM.
The operators of the query language are as follows. AND: connects two expressions in WHERE and returns the values satisfying both expressions. OR: connects two expressions in WHERE and returns the values satisfying at least one expression. EQUAL_TO|NOT_EQUAL_TO: checks whether the time of a camera shot matches or does not match the value specified in WHERE. LIKE: checks whether a datatype value matches the value specified in WHERE. IS_ROLE: checks whether the role of a character matches the value specified in WHERE. NOT_NULL|IS_NULL: checks whether a concept is null or not in WHERE. DISTINCT: returns only distinct values in SELECT.
Algorithm 2 presents a method for exploring data in the movie ontology. With this algorithm, useful information can be retrieved from a user query.
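To make the semantics of IS_ROLE, AND, and OR concrete, here is a minimal evaluator sketch. It is not the system's query engine: it handles only conditions of the form `Role IS_ROLE <name>`, assumes AND binds tighter than OR as in SQL, assumes a shot satisfies a condition when some character in it has that role, and ignores the SELECT and FROM parts. The record layout and role assignments are toy assumptions.

```python
import re

def parse_where(query):
    """Split the WHERE clause into OR-groups of AND-ed role names."""
    match = re.search(r"\bWHERE\b(.*)$", query, flags=re.DOTALL)
    if match is None:
        return []  # no WHERE clause: no restriction
    groups = []
    for disjunct in re.split(r"\bOR\b", match.group(1)):
        roles = re.findall(r"Role\s+IS_ROLE\s+(\w+)", disjunct)
        if not roles:
            raise ValueError("unsupported condition: " + disjunct.strip())
        groups.append(set(roles))
    return groups

def select_shots(query, shots, role_of):
    """Return shots in which every role of at least one OR-group appears."""
    groups = parse_where(query)
    selected = []
    for shot in shots:
        present = {role_of[name] for name in shot["characters"]}
        if not groups or any(group <= present for group in groups):
            selected.append(shot)
    return selected

# Toy ontology instance data (role assignments are illustrative).
role_of = {"Rose": "protagonist", "Jack": "main", "Cal": "minor"}
shots = [
    {"id": "s1", "start": 0, "stop": 10, "characters": ["Rose", "Jack"]},
    {"id": "s2", "start": 10, "stop": 14, "characters": ["Rose"]},
    {"id": "s3", "start": 14, "stop": 20, "characters": ["Jack", "Cal"]},
]
query = ("SELECT Camera-shot.* FROM Camera-shot, Character "
         "WHERE Role IS_ROLE protagonist AND Role IS_ROLE main")
hits = [s["id"] for s in select_shots(query, shots, role_of)]
```

With the toy data, only shot `s1` contains both a protagonist and a main character, so it is the single hit; replacing AND with OR would widen the result to every shot containing either role.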
Character-based indexing, browsing and searching tool
A movie is a sequence of camera shots, each of which contains useful information including the time interval, character appearances, and relationships. To index, search, and browse information from a movie, we propose a system based on the movie ontological model. This system assists the user in issuing semantic queries about characters in the movie.
Using Definition 1 and the ontological model, we can build a query grammar for searching semantic information. Table 1 presents the context-free grammar of queries over the movie ontology, based on character appearance characteristics such as character name, role, and so on. Table 2 shows several examples of using the SQL-like language to retrieve information from the movie ontology.
Character-based indexing
In this study, we perform semi-automatic indexing of character appearances and relationships by applying image processing technologies. As shown in Fig. 1, the indexing data are used to determine co-occurrence data among characters. By applying our previous study, we can discover the protagonist, main characters, and minor characters. By applying the movie ontology, we index, search, and browse semantic information in the movie based on the user's semantic query.
Figure 4 illustrates the movie ontology used for semantic extraction from a movie. The user's semantic query is analyzed and linked to ontological concepts in order to index the movie by character roles, time intervals, and relationships. In addition, we can connect character roles to the indexing data for semantic searching and browsing.
Browsing tool
Our approach provides a new method for searching and indexing based on the characters in the movie. The search function uses the user's semantic query to search, index, and browse data. The indexing function displays camera shots and character information, as shown in Fig. 5. With the movie ontological model, semantic data can be retrieved according to the user's semantic query, which makes retrieval a more interactive process through queries on, for example, character roles and/or relationships. In addition, the relationships among characters are used to extract a network. Moreover, we can connect the indexed data with other semantic contents based on the characters in the movie. The user can also pose queries on the semantics of the contents. These queries may be based on character roles, appearance, name, or strength of relationship.
Results and discussion
For the experiments, the semi-automatic indexing system is implemented using Java, VLCJ, Prefuse, and the OpenCV API. The OpenCV API is used for camera shot detection and indexing through feature matching and face detection/recognition. The Google API and Google Custom Search are used to obtain face training data, and JSOUP is used to parse the character list and movie information from IMDb. The movie ontology is implemented using Protégé.
We used three movies for evaluation: Titanic (1997), Frozen (2012), and Star Wars: A New Hope (1977). To evaluate the proposed method, we selected 19 characters parsed from the IMDb database for each movie.
Results
Figure 4 shows the semi-automatic indexing module. This module consists of several components, described in the following
As a result, this approach performs better than the manual system we proposed in previous work. As shown in Table 3 and Fig. 6, the indexing time is reduced when this method is used. For example, in processing 13 minutes of the movie Titanic, a total of 150 camera shots are detected. When the number of face training images is increased from 180 to 220 and 260, the number of recognized shots increases from 30 to 45 and 60, and the time consumed increases from 3.5 to 5 and 7.5 minutes, respectively. This result shows that more training data yields a higher recognition rate. However, using face detection/recognition for character identification is quite expensive in time because processing is done frame by frame: it took 4 hours to process 13 minutes of the given movie.
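The figures reported above can be turned into recognition rates with a few lines of arithmetic. The sketch assumes that the recognized-shot counts all refer to the same 150 detected shots, and that every shot not recognized automatically is left for manual indexing; neither assumption is stated explicitly in the text.

```python
TOTAL_SHOTS = 150  # shots detected in the 13-minute Titanic segment

# (face training images, recognized shots), as reported in the text
runs = [(180, 30), (220, 45), (260, 60)]

def summarize(runs, total_shots):
    """Derive the recognition rate and the remaining manual workload."""
    rows = []
    for images, recognized in runs:
        rows.append({
            "images": images,
            "recognition_rate": round(recognized / total_shots, 3),
            "manual_shots": total_shots - recognized,
        })
    return rows

summary = summarize(runs, TOTAL_SHOTS)
```

Under these assumptions the recognition rate rises from 20% to 30% and 40% of the detected shots, leaving 120, 105, and 90 shots for manual indexing, respectively.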
Table 4 and Fig. 7 show that camera shot detection also reduces the time the user spends indexing the given movie. Using this system, indexing time is reduced by 33.5% on average.
Figure 8 illustrates the indexing, searching, and browsing module. Using this module, the user can search by the semantic meaning of character appearances. Table 2 shows example queries for the semantic meaning of automatic indexing. This module consists of several components, described in the following
As an example, consider the query “SELECT Camera-shot.* FROM Camera-shot, Character WHERE Role IS_ROLE protagonist AND Role IS_ROLE main”, which means “Give me all camera shots in which the protagonist and a main character appear”. After this query is processed, all camera shots that contain the protagonist and a main character are indexed. In addition, the network between the protagonist and the main characters is extracted and visualized. Figure 8 shows the result of this query for the movie Titanic. The user can determine character relationships and browse the camera shots by playing them and observing the character network visualization for a clearer understanding.
Discussion
Recently, movie indexing has been identified as an important task. Automatic indexing involves many issues, including the analysis of movie storytelling and the automatic identification of other features such as activities, events, and so on. Character-based indexing is one methodology for assisting the user in movie understanding. Although this approach can index by character appearance, many other informative features in the movie, such as audio features and character dialogs, could be applied for better performance.
The proposed system is feasible and flexible for character-based semi-automatic indexing using image processing techniques. The training data are parsed from the IMDb database and Google image search and taken directly from the movie. In this way, the characters can be indexed. However, the accuracy of face detection/recognition and the processing time are major issues that need to be addressed. Moreover, other semantic contents of the movie, such as characters' activities and events, are also needed.
There are some limitations to the proposed method. In this study, we only took into account characters appearing visually on screen. Other features in the movie, including character activities, character emotions, and character dialogs, need to be considered for deeper understanding and indexing. There are many interesting issues for future work in this research area.
Conclusion
Movie indexing remains a challenging task. To understand a movie, users tend to index characters and their relationships during movie playback. In our previous work [21], we provided a method for manual indexing and used the resulting data to extract a character network and a summarized version of the movie based on the protagonist and main characters.
In this paper, we provided a method for semi-automatic indexing and introduced a movie ontological model for semantic extraction. First, we introduced a system for indexing character appearances in the movie by applying feature matching and character detection/recognition based on the OpenCV library. The character list and face training data are parsed from IMDb and Google image search, and the user can also obtain more training data directly from the given movie. Second, we introduced a new method for creating a movie ontological model. Using this model, we were able to automatically index the semantic meaning of the movie based on character appearances. Finally, we introduced a tool for automatic semantic indexing, searching, and browsing. The results show that the proposed method is more effective than the manual indexing system in reducing indexing time and in the correctness of character appearance indexing.
The proposed method is expected to be useful not only for movies but also for other classes of video, where it can likewise be applied to indexing, searching, and browsing. Future work will improve indexing performance by integrating other features of the movie, including character dialogs, audio classification, and so on.
Acknowledgments
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (NRF-2014R1A2A2A05007154).
