Abstract
As the time babies and toddlers spend on YouTube grows explosively, it is important to analyze what they are watching. Indexes that evaluate the content of YouTube channels for infants and toddlers have been developed, but because they rely on the evaluations of educators and parents, it is difficult to find out what content children are actually watching. In this study, we conducted a content analysis of the YouTube videos that infants mainly watch in three developmental areas: cognition, emotion, and socialization. Specifically, language destruction was analyzed for the cognitive area; verbal and physical violence, emotional expression, understanding of others’ emotions, and emotional control for the emotional area; and antisocial behavior, prosocial behavior, and prosocial expression for the socialization area. As a result, the emotional index was the highest, and the physical violence index was very low. In general, emotional expression, understanding of others’ emotions, and prosocial behavior, which have a positive effect on early childhood development, scored higher than linguistic destruction, verbal violence, and physical violence, which have a negative effect.
Introduction
According to the 2020 Children’s Media Use Survey conducted by the Korea Press Foundation, the average daily use time of YouTube by children aged 3 to 9 in Korea is 115.4 min, exceeding the average daily use time of YouTube by domestic teenagers. The American Academy of Pediatrics (AAP) guidelines recommend that children under 18 months old should not use video devices and that screen time for children aged 2 to 5 years be limited to 1 hr per day of high-quality programs. The roughly 115 min that Korean children spend in front of a screen each day is therefore nearly double the screen time recommended by the AAP. Another problem is that children are being exposed to visual media at an increasingly early age. About 75% of children aged 3 to 4 watched television before 24 months, and about 50% used smartphones before 24 months (Korea Press Foundation, 2020). As the viewing time of infants increases, the number of YouTube channels targeting infants is also increasing. Various kinds of children’s content, such as animation, unboxing videos, and English songs, are actively produced and consumed on YouTube. Nox Influencer, a YouTube marketing and service platform, shows that at least three or four children’s channels are always ranked in the top 10 by YouTube subscribers and users.
Considering the high proportion of programs for young children on YouTube, it is crucial to analyze what they are watching. Only recently have indices been developed and applied to analyze content aimed at infants and young children (Neumann & Herodotou, 2020), but these analyses are not systematic enough. They rely on scholars’ evaluations of factors such as age appropriateness, adequateness of video formats, and educational values, making it difficult to obtain a precise understanding of what children are actually watching. In this study, we analyzed YouTube content aimed at infants across three areas: cognitive development, emotional development, and socialization development. In particular, instances of language destruction involving incorrect grammar and nonstandard language use were examined within the cognitive domain. Meanwhile, verbal and physical violence, emotional expression, comprehension of others’ emotions, and emotional regulation were investigated in the emotional sphere. In addition, anti-social behavior, prosocial behavior, and prosocial expression were scrutinized within the realm of socialization.
Although YouTube has created a separate platform for children’s channels, YouTube Kids, to protect children’s viewing experience, it is important to analyze the content they are actually consuming, as many young children consume a wide variety of content. For example, content that is intended for adults, such as “Fake Soldier,” has become popular among Korean children, leading to instances where children mimic the military-style culture and catchphrases from the content (A. Kim, 2020). There have been many instances where children access inappropriate content on YouTube, such as provocative content, sexual content, discriminatory remarks, and profanity. Recently, videos featuring the young and rich lives of children wearing luxury clothes became popular among Korean children (E. Kim, 2023). Such videos can instill materialistic values in young children, leading them to prioritize money and material goods over intellectual, ethical, and aesthetic values. The data collected in this study can serve as an important foundation for understanding what children actually watch and for developing a more fine-grained quality index for children’s programs in the future.
Literature Review
Content Analysis of Kids’ Programs
According to Y. Lee and Yoo (2020), there are several factors contributing to the growing use of children’s content on YouTube. First, legacy media fails to provide enough content that children prefer, while YouTube offers a wide range of genres of children’s content uploaded from across borders. This includes nonverbal content, simple, easy-to-follow narratives, and characters that can be accessed without cultural or language barriers. Second, YouTube is the easiest platform for content providers to reach a global audience, as anyone can access and watch content for free. Third, while broadcast media is heavily regulated by governments to protect children and youth, online platforms like YouTube have the advantage of being relatively free from content regulation. This has led to an increase in sexually explicit and stimulating content for children, as creators generate profits based on views, subscribers, and comments within the YouTube platform. Due to these characteristics of YouTube, creators often create videos with provocative content to attract viewers’ attention and generate revenue. In addition, children’s content on YouTube can promote various commerce-related businesses by introducing products related to the content, such as stationery and toys, and earning ad revenue based on video views. An example of this is the highest-earning YouTuber in 2018, a 7-year-old with a channel called “Ryan ToyReview,” who partnered with toy companies to produce licensed toys and books for his 17 million followers. Consequently, YouTube has become a favored platform for babies and toddlers, but it is important to note that anyone can create and broadcast video content on the platform, leading to a potentially excessive commercial or stimulating environment.
Considering the volume of YouTube viewing among toddlers and the status of children’s channels on YouTube, it is crucial to analyze and study the videos children are watching on YouTube. A recent study by Neumann and Herodotou (2020) presented YouTube’s evaluation criteria for children’s channels in four dimensions: age appropriateness, content quality, design features, and learning objectives. Age appropriateness refers to whether the content is suitable for a child’s ability and maturity to understand or process. Content quality refers to how meaningful and child-friendly the message is, with an emphasis on nonviolent messages. Design features refer to the presentation of structural and technical characteristics of video images, such as pacing and the use of computer graphics, to children. Learning objectives refer to the extent to which children can learn cognitive, physical, social, and emotional skills and abilities through interaction with videos. Based on this developed scale, Yang and Jeon (2021) analyzed 60 videos from six Korean children’s YouTube channels. The results showed that out of the 60 videos, 40 were recommended for infants and toddlers, and 20 were not recommended for them. The channels with the most recommended content were “Toymong TV” and “Mini Commando TV.”
Another study by Kwak (2019) evaluated YouTube kids’ content as focusing on stimulating videos that emphasize interest and fun rather than educational value. Some of the selected popular content used violence as a means to avoid or escape from problems or awkward and difficult situations. Shin (2020) qualitatively analyzed the content of four famous YouTube channels for infants and toddlers in South Korea: “Boram Tube,” “Seoeun Story,” “Lime Tube,” and “Toy Pudding TV.” The analysis revealed two points of socio-cultural significance. First, children’s YouTube content commercialized private family experiences in the home, showcasing sponsored commercial toys as the main focus of children’s and parents’ play activities. Moreover, homes were represented as spaces overflowing with product-placement toys sponsored by toy companies. The second was the production of materialistic discourse and the deepening wealth gap between the rich and the poor. Many stories implied that economic costs were involved in experiencing fun and interesting play. For example, in a vlog on “Boram Tube,” when a child begged his busy father to play with him, the father refused but continuously bought new toys, and the child opened and enjoyed them. In addition, playing with toys, eating snacks, and visiting kids cafes were all presented as product-placement advertisements, and most of the content, such as close-up exposure of character names, had commercial elements. Play was portrayed as artificial and stimulating rather than natural, and noncommercialized play, such as parents and children singing songs, telling stories, or playing with everyday objects, was absent.
A study that analyzed advertisements inserted into children’s YouTube programs collected and analyzed 859 advertisement videos in 242 children’s videos (Y. Lee & Yoo, 2020). The study revealed that out of the 859 advertisements, 36 were advertisements for products harmful to children. Among them, there were 16 pawnshop advertisements, and advertisements for children’s favorite foods with high calories and low nutrition were also executed. Fried chicken, a food that can cause obesity or hinder growth, was featured in 36 cases, and even coffee advertisements were found in 29 cases. Considering that children’s viewing patterns are repetitive, advertisements can have a lasting effect on them, stimulating their desire to make purchases. YouTube has been the primary video platform for children for quite some time. It is evident that many videos tend to prioritize materialistic and commercial content, focusing on entertainment and fun rather than educational values.
Kids’ Cognitive, Emotional, and Socialization Development
In this study, we categorize the developmental areas of infants into cognitive, emotional, and social development and conduct a content analysis of YouTube videos with respect to each developmental area. First, the cognitive domain includes perception, attention, thinking, learning, memory, and creativity. Cognitive processes significantly contribute to a child’s development as they intersect with various aspects of growth, including emotional and behavioral development (Im et al., 2006). Cognitive development promotes positive play behavior in young children by allowing them to engage positively with peers and resolve conflicts effectively (Woo, 2016). Among the areas of cognitive development, YouTube mainly affects language development. Therefore, this study focuses on language development.
In early childhood development, emotional intelligence is a crucial ability that supports the development of various behaviors. It involves the ability to control and regulate one’s emotions through thinking or reasoning skills (Mayer & Salovey, 1995). Children with high emotional intelligence demonstrate positive self-regulation, decision-making, a positive self-concept, and prosocial behavior (Choi, 2018). According to Erikson (1950), expressing suppressed emotions through play helps relieve negative emotions such as conflict, anxiety, and aggression. Therefore, it can be said that the expression of emotions by protagonists in videos can contribute to children’s emotional development. This study examines how emotions are expressed, understanding of others’ emotions, and emotional regulation in YouTube videos for young children. In addition, violence in videos is a factor that influences the emotional domain of children. Thus, this research analyzes both emotional expression and violence in the emotional domain.
During early childhood, children establish various social relationships and learn essential prosocial behaviors, ethics, and gender roles necessary for living with others (Johnson et al., 2005). They also learn to navigate problems that arise in peer relationships and develop coping strategies (E. Kim, 2013). In play situations, children learn about social relationships through interactions with peers and improve their social skills, such as helping, sharing, caring, and cooperating with others (E. Kim, 2013). In addition, while competing and engaging in play conflicts, they learn about living harmoniously with others, moral standards, and social rules (Nelson et al., 2005). Therefore, various prosocial behaviors expressed in YouTube videos can influence children’s social development. This study analyzes both prosocial and antisocial behaviors expressed in the videos.
Cognitive Development
Although watching TV by infants under the age of 2 has a negative effect on language acquisition and improvement in mathematical abilities (Barr et al., 2010), it has been observed that viewing age-appropriate children’s programs by infants aged 2.5 years and older consistently increases their understanding of video media until the age of 12 (Anderson & Hanson, 2010). Since the cognitive effects of video content are strongly influenced by the content, it is important to study the formality of the video, the use of vocabulary in the program, and the mathematical content. In this study, we focus on the linguistic representation.
Early childhood is a period of rapid language development and a critical time for laying the foundation of language abilities (S. Han & Kwak, 2013). The language acquired during this time is not easily corrected (Kim, Choi, & Ko, 2017). Therefore, it is crucial to use accurate language in the content that infants and young children watch. Previous studies have indicated that increased screen time for infants has a negative impact on their language development (Cho, 2016; Chonchaiya & Pruksananonda, 2008; Williford et al., 2007). As infants spend more time watching screens, they miss opportunities for interactive conversations in real-life situations. In addition, increased screen time makes it difficult for parents to control and regulate the content their children watch, leading to indiscriminate exposure to various types of content. However, when watching educational programs primarily, viewing has a positive effect on language development (Linebarger & Vaala, 2010), so it is important to identify the content that children watch.
Several studies have analyzed the language used in children’s YouTube videos. In a study by Yang and Jeon (2021), it was found that the interactions of the characters in the “LARVA TUBA” series consisted mostly of expressing emotions through continuous utterances of single syllables (ah, uh), shouting, and laughing, rather than clear dialogues. In addition, in the content of “ToyMon TV,” the narrator mimics the speech of a child by mumbling and using unclear pronunciation. In a study that analyzed the language used in the “Carrie and Toy Friends” channel, which is gaining explosive popularity among children, creator Carrie uses explanations and voices suitable for infants and young children, and incorporates onomatopoeia and mimetic words that can attract infants and young children’s interest (Kim, Choi, & Ko, 2017). However, during the course of the video, Carrie, the creator, was found to have a high frequency of mispronunciations of standard and nonstandard words, such as “wipe” for “wippe” and “chocolate” for “zzocolatte.” The show also featured instances of hitting friends, shouting at them, and blaming friends.
Emotional Development
Media-related emotion research focuses on the process of perceiving various emotional content expressed in media, such as emotional empathy (Farrant et al., 2012) and emotion recognition (Bierman et al., 2008). It does not primarily focus on the subjective development of emotions. Emotional expression refers to the way individuals express their emotions through verbal and nonverbal behaviors, including facial expressions. Within the family, emotional expression plays an important role in providing a context for children to learn about emotional regulation and expression rules. Parents serve as models for their children, reinforcing their children’s emotional expression (Saarni, 1989). The way parents express emotions also influences children’s development of self-regulation by sensitizing them to nonverbal cues and helping them better understand emotions (Dunsmore & Karn, 2001). It also enhances their understanding of others’ emotional reactions (Halberstadt et al., 1999) and teaches them appropriate emotional responses in social interactions (Woo & Chong, 2003). Previous studies have shown that when mothers express positive emotions like joy and affection, it positively impacts their children’s emotional expressiveness and emotional literacy (Cassidy et al., 1992). However, excessive intensity of negative emotions such as anger, upset, and sadness can impair children’s emotional competence (Denham et al., 1997). When children watch YouTube and are exposed to emotional expressions of other children and adults, it can positively influence their emotional development by helping them understand the emotions of others. This study aims to examine how emotions are expressed by characters appearing in YouTube videos.
Another factor that affects emotional development is the presence of video violence. Repeated exposure to violent contents can lead to habituation of certain natural emotional reactions, which is called desensitization (Huesmann & Kirwil, 2007). Research and discussions on the harmful effects of video content have primarily focused on violent language, behavior, and sensationalism. For instance, Gerbner conducted an analysis of prime-time drama and entertainment programs on three major U.S. television networks (ABC, CBS, and NBC) in the late 1960s. The findings revealed that these programs depicted more violence than what occurs in real life. Based on this, Gerbner proposed cultivation theory, which suggests that heavy TV viewers, in comparison with light viewers, tend to “cultivate” a worldview that resembles the symbolic world portrayed on television—a world that is more violent than reality. Subsequent studies have reaffirmed that prolonged exposure to television leads to frequent exposure to violent content, thereby shaping the perception of society as malevolent and perilous (e.g., Gerbner et al., 1980; Morgan & Signorielli, 1990). Social cognitive theory, as proposed by Bandura (2002), explains that individuals can learn about violence through observing violent acts depicted in video content. This theory suggests that individuals can imitate violent behavior projected in real life, even without direct experience of violence.
Violent content is also prevalent on YouTube. For example, the Elsagate phenomenon shocked many parents, as it involved the mass distribution of YouTube videos depicting characters from the animated film “Frozen” in sexualized and violent ways (S. Kim, 2017). Studies have also shown that clicking on related content on YouTube can increase the risk of accessing adult content, regardless of the initial content selected (Kaushal et al., 2016). Individual YouTubers often include profanity, vulgar expressions, and abusive language in their broadcasts, alongside explicit content (Lee & Yu, 2017). This problematic language is found in an average of 19.7 instances per video and 1.9 instances per minute in individual creator content. Among these instances, approximately 35% consisted of profanity, 32% included vulgar expressions, and 22% involved informal language, shouting, or abusive language. Although this content, created by individual creators, may not specifically target children, it is concerning that many children are exposed to such programs.
Socialization Development
According to Bandura’s social learning theory (Bandura, 2001), the mass media plays a role in shaping children’s values and behavior. Children tend to observe and imitate the actions of television characters, especially when they have confidence in their ability to imitate, when they are in similar situations to what they see on television, and when they perceive imitation as beneficial (Bandura, 1986). Many studies have demonstrated that children often identify with television protagonists and try to imitate their behavior and appearance (Bussey & Bandura, 1999), which can reinforce the process of socialization.
Cultivation theory suggests that children view the world portrayed on television as the real world, as they are continuously exposed to consistent values in television messages (Gerbner et al., 1980). This prolonged exposure to television influences the formation of children’s values and worldview. For instance, research has shown that consistent depiction of stereotypical gender roles on television is associated with a lower proportion of women entering the workforce and a higher proportion of women perceiving themselves as homemakers (Shrum, 2004; Signorielli, 2001). A study by Huber (2004) focused on prosocial behavior and found that first-graders who frequently watched videos portraying prosocial behavior exhibited more altruistic behaviors like sharing and helping others (Rosenkoetter, 1999). However, even educational programs and the Disney Channel only show about four instances of prosocial behavior per hour (Smith et al., 2006), indicating that it is important to examine the types, frequency, and context of prosocial behavior depicted in children’s video media.
Research Questions
Various evaluation criteria have been proposed for assessing children’s programs. However, the purpose of this study is to propose an evaluation index of children’s YouTube videos focusing on cognitive, emotional, and social development. Therefore, we will investigate the presence of cognitive, emotional, and social content in popular children’s YouTube videos (RQ1) and also assess how these aspects differ across different genres on YouTube (RQ2).
In addition, it is crucial to examine the context in which cognitive, emotional, and social outcomes occur. For example, it is important to identify whether verbal disruption involves adults or children, who the victims and perpetrators of violence are, and the gender of characters engaging in prosocial behavior. Hence, we will also analyze the contextual factors related to cognitive, emotional, and social aspects in popular children’s YouTube videos (RQ3).
Methods
Analysis Target
The analysis included 169 popular YouTube videos that were collected through top-ranked YouTube videos and parent surveys. The YouTube Top Ranking videos were determined based on ranking data from channels in the Kids/Children category on YouTube Ranking (youtube-rank.com). This website gathers statistical data such as views, likes, and comments from YouTube videos to rank channels. Out of the 200 children’s channels in the rankings, 54 channels were analyzed to ensure that the videos actually contained children-related content and were not solely uploaded to YouTube channels. In addition, a survey was conducted on 1,020 parents with children aged 4 to 6 from March 31 to April 8, 2021. The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Institutional Review Board in South Korea (ewha 202103-0028-01) on March 29, 2021. The parents were asked to provide the titles of the programs their children usually watched on weekdays and weekends. A total of 6,252 responses were received. To avoid duplication and ensure relevance, 34 channels were added as analysis targets by checking if the content was uploaded exclusively on YouTube and if any duplicate programs were found among the top-ranking channels. Ultimately, out of the 88 channels collected from YouTube rankings and parent surveys, 85 channels were selected for analysis, excluding three channels that primarily served as toy manufacturers’ advertisements. From each channel, two videos were chosen as analysis targets: the most recently uploaded video and the most viewed video. This resulted in a total of 169 videos being analyzed, excluding one duplicate video.
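The sample size can be retraced from the counts reported in this subsection. The following is a minimal sketch of that arithmetic; the variable names are ours and purely illustrative, and only the numbers come from the text.

```python
# Counts as reported in the sampling description; names are illustrative.
ranking_channels = 54      # channels retained from the 200 ranked children's channels
survey_channels = 34       # channels added from the parent survey
excluded_ad_channels = 3   # channels that mainly served as toy manufacturers' ads
videos_per_channel = 2     # most recently uploaded video + most viewed video
duplicate_videos = 1       # one duplicate video was excluded

collected_channels = ranking_channels + survey_channels
analyzed_channels = collected_channels - excluded_ad_channels
analyzed_videos = analyzed_channels * videos_per_channel - duplicate_videos

print(collected_channels, analyzed_channels, analyzed_videos)  # 88 85 169
```

The result agrees with the 88 collected channels, 85 analyzed channels, and 169 analyzed videos stated above.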
Analyzed Items and Index Calculation Method
The analysis focused on the content and genres of videos related to cognitive, emotional, and socialization development that appeared in popular YouTube videos for children. Specifically, we coded one subitem for cognition: Linguistic disruption. For emotion, we included five subitems: verbal and physical violence, emotional expression, understanding the emotions of others, and emotional control. For socialization, three subitems were considered: antisocial behavior, prosocial behavior, and prosocial communication. The literature was referenced to provide specific details for each subitem. Regarding linguistic disruption in the cognitive field, the analysis examined whether the video content contained nonstandard pronunciations, ungrammatical expressions, slang words (slang, coinage, communication), and nonstandard vocabulary. Studies by Park and Hwang (2016) were consulted in this regard. Nonstandard pronunciation refers to intentionally deviating from standard language and standard pronunciation. Ungrammatical expressions involve the usage of grammatically incorrect phrases. Profanity encompasses sensationalized and degrading expressions such as “fuck” and “hell.” Nonstandard language refers to dialects. Each category (nonstandard pronunciation, ungrammatical expression, profanity, and nonstandard language) was assigned one point. A score of 4 was given if all four categories were present, while a score of 0 indicated no linguistic disruption.
In the field of emotional development, we defined violence based on the definitions provided by J. Lee et al. (2007) and Greenberg et al. (1980). According to these definitions, violence refers to overt threats or the actual use of physical force with the intention of causing harm to an individual or group. Verbal violence involves using language to express anger, insult, threat, or shame toward others. We coded hate, deception, threat, toughness, and slang as individual subitems, assigning a score of 1 to each. The maximum score for verbal violence is 6. Physical violence refers to instances where physical force is used to cause injury or express a clear threat. We specifically coded four behaviors: throwing objects, pushing forcefully, using one’s body to cause harm, and using tools to cause injury. Each behavior was assigned a score of 1, giving physical violence a maximum score of 4. The intentionality of the behavior was determined based on the context, and acts without violent intent, such as playful behavior, were not considered physical violence. For the remaining emotional subitems, we used operational definitions based on three components: emotional expression, understanding others’ emotions, and emotion regulation, as described by Lee et al. (2015). For emotional expression, we checked for verbal or physical expressions of typical infant emotions such as joy, anger, fear, jealousy, and curiosity. We considered the contextual cues provided by the video and coded the corresponding behaviors associated with these emotions. For example, joy was identified by smiling, laughing, jumping up and down, and clapping. Anger was identified by crying, tantrums, stubbornness, rebellion, retaliation, disobedience, and silence. Fear was identified by crying and screaming. Jealousy was identified by crying, pouting, and attacking the object of jealousy. Curiosity was identified by showing interest or asking questions about something new, strange, or unknown.
To distinguish between anger, fear, and jealousy, we took into account the context of the video. Each emotion (joy, anger, fear, jealousy, and curiosity) was assigned a score of 1, resulting in a maximum score of 5. Understanding others’ emotions was examined based on previous studies, considering verbal and behavioral responses to others’ emotional expressions, inferring others’ emotions, understanding emotional triggers and situations, and understanding emotions and behaviors (E. Han, 2006). Each item was coded individually, and a score of 1 was assigned for each response, resulting in a maximum score of 4. Emotion regulation was defined as the ability to delay or control the expression of one’s current emotions, taking into account situational characteristics. We coded suppression of negative emotions and suppression of positive emotions separately, following the research by Choi (2003). Emotion was coded with respect to genre, frequency of analysis, age, and gender. Notably, for the subitems of verbal and physical violence, we analyzed both the perpetrators and the victims, and age and gender were examined separately for each group.
Socialization was analyzed in three different ways: antisocial behavior, prosocial behavior, and prosocial expressions. Antisocial behavior was examined in terms of hate and rule violations. Hate was defined as “content that promotes social prejudice by using derogatory expressions against specific individuals or groups based on race, gender, place of origin, etc. without reasonable grounds” according to the Korea Communications Commission (2015). We identified instances of misogyny, transphobia, objectification, insults, defamatory statements about targeted groups, discriminatory harassment, public ridicule, contempt, degradation, incitement to hatred, stigmatization, dehumanizing rewriting of history, and promotion of prejudice. Rule breaking was coded based on whether there were violations of family rules, social rules, or peer group rules. Hateful expressions were coded using six items, and rule violations were coded using three points, resulting in a total index of nine points. Prosocial behaviors were coded based on Huang’s (1989) criteria, which included six items: rescuing, helping, sharing, caring, giving way, and comforting. Prosocial expressions were defined by Lee and Kim (2018) as “a type of selfless and caring human behavior that enables individuals to form and maintain good interpersonal relationships, helps them to solve problems in their lives independently, and has a positive impact on living harmoniously in the environment.” We coded each video for the presence of five items: persuasion, compromise, friendship, sharing, and expressing interest. The videos were categorized into several genres: learning, vlog, unboxing/product review, animation, game, music, and other. This classification is based on C. Kim’s (2021) genre classification of children’s video content.
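The scoring scheme described in this subsection can be collected into a small codebook. The sketch below is one possible representation and not the authors' instrument: the dictionary layout and the `is_valid_score` helper are hypothetical, the maximum points are those stated in the text, and the maximum of 2 for emotion regulation is our assumption (two separately coded suppression items), since the text gives no total for it.

```python
# Maximum presence/absence points per subitem, as stated in the text.
# Layout and names are illustrative, not taken from the original codebook.
MAX_POINTS = {
    "cognitive": {
        "linguistic_disruption": 4,
    },
    "emotional": {
        "verbal_violence": 6,
        "physical_violence": 4,
        "emotional_expression": 5,
        "understanding_others_emotions": 4,
        "emotion_regulation": 2,  # ASSUMPTION: negative + positive suppression items
    },
    "socialization": {
        "antisocial_behavior": 9,   # 6 hate items + 3 rule-violation items
        "prosocial_behavior": 6,
        "prosocial_expression": 5,
    },
}


def is_valid_score(domain, subitem, points):
    """Return True if a coded score lies within the range allowed for the subitem."""
    return 0 <= points <= MAX_POINTS[domain][subitem]


print({domain: len(subitems) for domain, subitems in MAX_POINTS.items()})
# {'cognitive': 1, 'emotional': 5, 'socialization': 3}
```

A range check of this kind is useful mainly for catching data-entry errors before the indexes are computed.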
The analytical methods were as follows. First, for Research Question 3, we summed the scores (1 for present, 0 for absent) assigned to each video while coding the cognitive, emotional, and socialization-related details to obtain the frequency with which each detail appeared. Because we coded only whether a detail appeared, a detail that appeared multiple times in a video was still coded as 1. The frequency of a detail across the 169 videos was then divided by the total number of videos (169) to calculate the proportion of videos in which the detail appeared. Note that the summed frequency of the details is not the same as the number of videos in which any detail appeared, because multiple details may appear in a single video. For example, if nonstandard pronunciation and ungrammatical expression both appeared in a video, each detail was coded as 1, resulting in a frequency of 2, whereas the video was counted only once in the number of videos.
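The distinction between the frequency of details and the number of videos can be illustrated with a small sketch. The coding sheet below is hypothetical toy data, and the function names are ours; only the presence/absence logic follows the description above.

```python
# Hypothetical coding sheet: one dict per video, 1 = detail present, 0 = absent.
coded_videos = [
    {"nonstandard_pronunciation": 1, "ungrammatical_expression": 1},
    {"nonstandard_pronunciation": 0, "ungrammatical_expression": 0},
    {"nonstandard_pronunciation": 1, "ungrammatical_expression": 0},
    {"nonstandard_pronunciation": 0, "ungrammatical_expression": 0},
]


def detail_frequency(videos, detail):
    """Number of videos in which the given detail was coded as present."""
    return sum(video[detail] for video in videos)


def detail_proportion(videos, detail):
    """Frequency of the detail divided by the total number of videos."""
    return detail_frequency(videos, detail) / len(videos)


def videos_with_any_detail(videos):
    """Number of videos in which at least one detail was present."""
    return sum(1 for video in videos if any(video.values()))


total_frequency = sum(detail_frequency(coded_videos, d) for d in coded_videos[0])

print(detail_proportion(coded_videos, "nonstandard_pronunciation"))  # 0.5
print(total_frequency, videos_with_any_detail(coded_videos))         # 3 2
```

In this toy sheet the summed frequency (3) exceeds the number of videos containing any detail (2) because the first video contains two details.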
Second, in relation to Research Question 1, we obtained the cognitive, emotional, and socialization indexes for each video, as well as the indexes for each subcategory. The index for each subcategory was calculated by dividing the number of subcategory items that appeared in a video by the total number of items in that subcategory. For example, if a video contains nonstandard words and slang, 2 (1 nonstandard word + 1 slang) is divided by the total number of details, 4 (nonstandard pronunciation, ungrammatical expression, nonstandard word, slang), and the language destruction index of the video is 0.5. The subcategory indexes for each video were summed and divided by the total number of videos, 169, to get the overall subcategory index.
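As a rough sketch of this index calculation, the following Python snippet computes a per-video subcategory index and its mean across videos. The item keys, function names, and the three-video sample are illustrative assumptions; only the four language destruction details and the division rule come from the text.

```python
from typing import Dict, List, Sequence

# The four language destruction details named in the text; the key spellings are assumptions.
LANGUAGE_DESTRUCTION_ITEMS = (
    "nonstandard_pronunciation",
    "ungrammatical_expression",
    "nonstandard_word",
    "slang",
)


def subcategory_index(video: Dict[str, int], items: Sequence[str]) -> float:
    """Number of items present in one video divided by the number of items in the subcategory."""
    return sum(video.get(item, 0) for item in items) / len(items)


def mean_subcategory_index(videos: List[Dict[str, int]], items: Sequence[str]) -> float:
    """Sum the per-video indexes and divide by the number of videos."""
    return sum(subcategory_index(video, items) for video in videos) / len(videos)


# Worked example from the text: nonstandard word + slang present, so 2 / 4 = 0.5.
example_video = {"nonstandard_word": 1, "slang": 1}
example_index = subcategory_index(example_video, LANGUAGE_DESTRUCTION_ITEMS)

# Made-up sample of three videos (not study data).
sample_videos = [example_video, {}, {"slang": 1}]
sample_mean = mean_subcategory_index(sample_videos, LANGUAGE_DESTRUCTION_ITEMS)
```

Missing keys are treated as absent (0) here, which is a convenience of the sketch rather than a documented coding rule.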
Third, Research Question 2 was addressed by cross-tabulating the subcategory indexes calculated for Research Question 1 by genre.
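One plausible way to carry out such a cross-analysis is to group the per-video indexes by genre and report the mean and standard deviation per group. The sketch below assumes the sample standard deviation and returns no SD for a genre with a single video; both choices, along with the function name and the sample records, are assumptions rather than details confirmed by the study.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, List, Optional, Tuple


def index_by_genre(
    records: List[Tuple[str, float]],
) -> Dict[str, Tuple[float, Optional[float], int]]:
    """Group (genre, index) pairs and return (mean, SD or None, n) for each genre."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for genre, index in records:
        grouped[genre].append(index)

    summary: Dict[str, Tuple[float, Optional[float], int]] = {}
    for genre, values in grouped.items():
        # A standard deviation is undefined for a single observation.
        sd = stdev(values) if len(values) > 1 else None
        summary[genre] = (mean(values), sd, len(values))
    return summary


# Made-up (genre, per-video index) pairs, not study data.
records = [("vlog", 0.5), ("vlog", 0.0), ("game", 0.25)]
summary = index_by_genre(records)
```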
The analysis was conducted between 7 and 12 July 2021 by three master’s students majoring in communication and childhood studies. Prior to the main coding, three rounds of training were conducted, and inter-coder reliability was checked on 18 videos, which accounted for about 10% of the sample. The main coding commenced once Cronbach’s alpha values averaged .70 or higher. The specific analyzed items, their operational definitions and examples, and their reliabilities are attached as Appendix 1.
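For readers unfamiliar with the statistic, the following sketch computes Cronbach's alpha with the standard formula, treating each coder as an "item" and each video as a case. The text does not state how the authors set up the calculation, so this arrangement, the function name, and the sample ratings are all assumptions.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(ratings_by_coder: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha: k / (k - 1) * (1 - sum of per-coder variances / variance of totals)."""
    k = len(ratings_by_coder)
    if k < 2:
        raise ValueError("At least two coders are required.")
    n_cases = len(ratings_by_coder[0])
    if any(len(ratings) != n_cases for ratings in ratings_by_coder):
        raise ValueError("Every coder must rate the same set of videos.")

    # Sample variance of each coder's ratings across videos.
    coder_variance_sum = sum(variance(ratings) for ratings in ratings_by_coder)
    # Variance of the per-video totals summed over coders.
    totals: List[float] = [sum(column) for column in zip(*ratings_by_coder)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("Alpha is undefined when the per-video totals do not vary.")

    return (k / (k - 1)) * (1 - coder_variance_sum / total_variance)


# Made-up binary codes (1 = present, 0 = absent) from three coders on four videos.
perfect_agreement = [[1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0]]
one_disagreement = [[1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 0, 0]]
alpha_perfect = cronbach_alpha(perfect_agreement)
alpha_partial = cronbach_alpha(one_disagreement)
```

Alpha is undefined when an item never varies across the reliability sample, so the sketch raises an error in that case instead of returning a value.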
Results
Cognitive, Emotional, and Socialization Index of Popular Kid’s YouTube
We analyzed the cognitive, emotional, and social content of 169 popular YouTube videos for children and calculated the mean indexes (Table 1), which were 0.06 (SD = 0.16) for cognitive, 0.07 (SD = 0.09) for emotional, and 0.06 (SD = 0.06) for social, with the emotional index being the highest. Specifically, the cognitive development-related items had an index of 0.06 (SD = 0.16) for linguistic destruction, and the emotional subscales had an index of 0.02 (SD = 0.11) for verbal violence, 0.01 (SD = 0.04) for physical violence, 0.20 (SD = 0.19) for emotional expression, 0.11 (SD = 0.28) for understanding others’ emotions, and 0.02 (SD = 0.09) for emotional regulation.
Cognitive, Emotional, and Socialization Indexes of Popular Kid’s YouTube.
N = 169.
Among the emotional subscales, emotional expression was the highest and physical violence was the lowest. Scores were lower for verbal and physical violence, which are generally negative, and relatively higher for emotional expression and understanding others, which are positive. For the socialization subscales, the scores were 0.11 (SD = 0.09) for antisocial behavior, 0.07 (SD = 0.12) for prosocial behavior, and 0.07 (SD = 0.12) for prosocial communication. Overall, positive items were associated with higher scores than negative items for emotional and social development.
Kid’s YouTube Cognitive, Emotional, Sociality Index by Genre
As presented in Table 2, when we categorized the cognitive, emotional, and social indexes by genre, we observed predominantly positive subscales in the animation genre. More specifically, within the learning genre, we identified five subscales: linguistic disruption, emotional expression, antisocial behavior, prosocial behavior, and prosocial communication. Among these, emotional expression and antisocial behavior scored higher than the other subscales, and all the subscales related to social aspects were notably higher. Vlogs stood out as the only genre that encompassed all dimensions in the cognitive, emotional, and social aspects. Vlogs featured a mix of positive and negative content, with physical violence (0.01, SD = 0.05) and understanding others’ emotions (0.16, SD = 0.33) scoring higher compared with other genres. The emotional expression index in the vlog genre was notably higher at 0.22 (SD = 0.19) than other indexes. In the unboxing/product review genre, we observed the presence of linguistic disruption (0.08, SD = 0.18), emotional expression (0.17, SD = 0.18), and prosocial behavior (0.01, SD = 0.04). The animation genre exhibited mainly positive subscales: apart from linguistic disruption (0.06, SD = 0.15), it featured emotional expression (0.31, SD = 0.20), understanding others’ emotions (0.05, SD = 0.15), prosocial behavior (0.14, SD = 0.15), and prosocial communication (0.11, SD = 0.16). The game genre was distinctive in that it had no linguistic disruption but featured a higher level of verbal violence (0.14, SD = 0.25). Emotional expression (0.27, SD = 0.12) was also more prominent in this genre compared with others. Within the music genre, we found all social-related items (antisocial behavior, 0.13; prosocial behavior, 0.04, SD = 0.12; prosocial expression, 0.04, SD = 0.11), as well as emotional expression (0.05, SD = 0.09), understanding others’ emotions (0.03, SD = 0.10), and linguistic disruption (0.01, SD = 0.05).
Kid’s YouTube Cognitive, Emotional, Socialization Index by Genre.
Note. N = 169. 1 = Learning (n = 10), 2 = Vlogs (n = 115), 3 = Unboxing/Product Reviews (n = 15), 4 = Animation (n = 7), 5 = Game (n = 3), 6 = Music (n = 15), 7 = Others (n = 4). The SDs for the Learning, Music, and Others genres in the Antisocial Behavior Index are not calculated because each of these genres has only one such video. Genre 7 (Others) includes topics such as eating, ASMR, introducing car prices, and taking care of newborns.
Cognitive Development Content on Kid’s YouTube
After analyzing 169 popular kids’ YouTube videos, we found that 14.2% (24 videos) exhibited linguistic disruptions. Specifically, ungrammatical expressions were the most common at 11.2% (19 cases), followed by profanity at 9.5% (16 cases), nonstandard pronunciation at 7.1% (12 cases), and nonstandard language at 1.2% (2 cases).
Emotional Development Contents on Kid’s YouTube
We categorized emotional development into five components: verbal violence, physical violence, emotional expressions, understanding others, and emotional control. As presented in Table 3, the results showed that emotional expression and understanding others’ emotions, which are positive aspects of emotional development, appeared in 19.9% and 16.0% of the kids’ YouTube videos, respectively. In contrast, verbal violence, physical violence, and emotional control were observed in 6.5%, 5.9%, and 4.1% of the videos, respectively. Each of these components is examined in turn below.
Emotional Development-Related Contents on Popular Kid’s YouTube.
Note. The total number of contents analyzed is 169, but since we only coded videos with emotional development-related content, the sum of the frequencies is 88.
Verbal violence was identified in 11 videos, making up 6.5% of the total 169 videos. Hate speech, which included degrading, insulting, putting down, and mocking, accounted for 6.5% (11 cases), followed by rude remarks, abusive language, informal speech to adults, shouting, profanity, and slang, which accounted for 2.4% (4 cases).
Physical violence was found in 10 out of the 169 videos, constituting 5.9% of the total. The most common type of physical violence observed was hard pushing, occurring in four cases (2.4%). This was followed by bodily injury in three cases and injury with a tool in one case. There were no instances of objects being thrown.
Emotional expression was present in 19.9% (33 cases) of the 169 video contents analyzed. Among the various emotional expressions observed, joy was the most prevalent, appearing in 55.0% (93 cases) of the total videos, surpassing half of the videos. In addition, curiosity accounted for 23.1% (39 cases) of the emotional expressions, indicating more positive emotions compared with negative emotions such as anger (10.1%, 17 cases), fear (8.3%, 14 cases), and jealousy (3.0%, five cases).
Understanding others’ emotions was found in 16.0% (27 cases) of the videos. Behavioral reactions to emotions expressed by others and guessing about others’ emotions were the most common at 12.4% (21 cases) each, followed by understanding the causes and situations that trigger emotions (11.2%, 19 cases), verbal reactions to emotions expressed by others (10%, 18 cases), and understanding the connection between emotions and behavior (8.9%, 15 cases).
Emotional control was observed in seven contents, accounting for 4.1% of the 169 analyzed videos, appearing less frequently compared with other aspects. Specifically, 4.8% (8 cases) of the contents exhibited tolerance toward negative emotional expressions such as anger, sadness, and frustration, while only 0.6% (one case) of the contents showed tolerance toward positive emotional expressions such as joy and pleasure.
We also conducted further analyses to examine differences by age and gender for videos with verbal violence, physical violence, emotional expressions, understanding others, and emotional control. In particular, for verbal and physical violence, we analyzed victims and perpetrators separately.
As shown in Table 4, our analysis of the verbal violence index, categorized by the gender of both perpetrator and victim, revealed that the highest index, reaching 0.57, occurred when a female perpetrator verbally targeted a male victim. Notably, the verbal violence index was approximately twice as high when the perpetrator was female compared with when the perpetrator was male (0.57 for female perpetrators, 0.46 for male perpetrators; 0.29 for male victims, and 0.46 for female perpetrators to female victims).
Index of Violence Perpetrators/Victims by Gender.
Note. The total number of content analyzed is 169, but only videos with verbal/physical violence are used to calculate the violence index. In Verbal Violence, Male Perpetrator–Male Victim SD = 0.20, Range = 0–0.29, Male Perpetrator–Female Victim SD = 0.10, Range = 0–0.14, Female Perpetrator–Male Victim SD = 0.60, Range = 0–0.86, Female Perpetrator–Female Victim SD = 0.18, Range = 0–0.57. In Physical Violence, Male Perpetrator–Female Victim SD = 0.18, Range = 0–0.25, Female Perpetrator–Male Victim SD = 0(1 case), Range = 0–0.00, Female Perpetrator–Female Victim SD = 0.14, Range = 0–0.25.
As indicated in Table 5, our analysis focused on the verbal violence index concerning the age of both perpetrator and victim. It was found that the highest index was observed in cases where the perpetrator was a nonadult and the victim was an adult, while the lowest index was observed when the perpetrator was a nonadult and the victim was a nonadult. In general, nonadult victims (0.29, 0.14) showed a lower index compared with adult victims (0.36, 0.57).
Index of Violence Perpetrators/Victims by Age.
Note. The total number of content analyzed is 169, but only videos with verbal/physical violence are used to calculate the violence index. In Verbal Violence, Adult Perpetrator–Nonadult Victim SD = 0.20, Range = 0–0.29, Adult Perpetrator–Adult Victim SD = 0.25, Range = 0–0.57, Nonadult Perpetrator–Nonadult Victim SD = 0.00, Range = 0–0.00, Nonadult Perpetrator–Adult Victim SD = 0.61, Range = 0–0.86. In Physical Violence, Nonadult Perpetrator–Nonadult Victim SD = 0.13, Range = 0–0.25.
Regarding the physical violence index by age, all instances involved both the perpetrators and victims being nonadults. There were no cases involving adult perpetrators, nor any involving nonadult perpetrators and adult victims. The physical violence index for nonadult perpetrators and nonadult victims was 0.13.
We conducted a further analysis to examine whether emotional expression varies depending on the gender and age of the individuals in the content. As shown in Figure 1, the results revealed that females generally scored higher in fear (male 0.43, female 0.56) and jealousy (male 0.00, female 0.52), whereas males scored higher in joy (male 0.30, female 0.03) and anger (male 0.36, female 0.05). There was no gender difference in curiosity (0.42 for both males and females).

Gender Index of Emotional Expressors.
In terms of emotional expression by age, anger (adult 0.38, nonadult 0.49) and fear (adult 0.18, nonadult 0.20) were more prevalent in nonadults than in adults (see Figure 2). However, adults exhibited higher levels of joy (adult 0.34, nonadult 0.29), jealousy (adult 0.60, nonadult 0.47), and curiosity (adult 0.46, nonadult 0.37) than nonadults.

Index by Age of Emotional Expressors.
When it comes to understanding others’ emotions, adults and males scored higher than nonadults and females (nonadult 0.60, adult 0.81, male 0.73, female 0.65) (see Figure 3). In terms of emotional control, females and nonadults scored higher than males and adults (nonadult 0.50, adult 0.40, male 0.44, female 0.33).

Differences in Understanding Other’s Emotion and Emotional Control Index by Age and Gender.
Socialization
We categorized socialization into three types: antisocial behavior, prosocial behavior, and prosocial expression. As shown in Table 6, the findings demonstrated that prosocial behavior was the most common, comprising 27.2% (46 cases), followed by antisocial behavior at 20.1% (34 cases), and prosocial expression at 17.8% (30 cases).
Socialization Development-Related Contents on Popular Kid’s YouTube.
N = 169.
When we further analyzed antisocial behavior, we divided it into hate and rule violation. It was found that antisocial behavior was present in 20.1% (34 cases) of all videos. In the case of hate, the most common form was inappropriate body exposure causing disgust at 3.6% (six cases), followed by public ridicule, belittlement, or demeaning at 3.0% (five cases), discriminatory harassment and prejudice promotion at 2.4% (four cases), offensive and emotionally damaging expressions at 1.2% (two cases), and insults and defamatory expressions against target groups at 0.6% (one case). In terms of rule violations, 6.5% (11 cases) involved violations of household rules such as meal rules or curfews, 10.7% (18 cases) involved violations of social rules such as dumping trash or cutting in line, and 5.9% (10 cases) involved violations of rules within peer groups, such as breaking promises.
Prosocial behavior was found in 46 instances, accounting for 27.2% of the 169 videos. Among specific prosocial behaviors, caring was the most common at 14.8% (25 cases), followed by helping at 12.4% (21 cases), which was also a prosocial behavior that occurred more than 10% of the time. Sharing was found in 6.5% (11 cases), rescuing in 3.0% (five cases), and consoling in 3.6% (six cases), but yielding was not observed.
Prosocial communication was present in 17.8% (30 cases) of the content. The most common form of prosocial communication was expressing interest, accounting for 13.0% (22 cases), followed by sharing at 5.9% (10 cases), persuasion at 3.6% (six cases), and compromising and expressing the desire to be friends, both at 1.2% (two cases each).
Discussion
As Korean toddlers spend an average of nearly 2 hr a day on YouTube, we conducted an analysis of popular YouTube videos for children to better understand the impact of these videos on children’s cognitive, emotional, and social development. The analysis included nine subcategories and 46 detailed items, consisting of one cognitive subcategory (language development), five emotional subcategories (such as portrayal of violence, emotional expression, understanding of others’ emotions, and emotional control), and three social subcategories (antisocial behavior, prosocial behavior, and prosocial communication).
The results of the analysis of cognitive, emotional, and socialization indexes are presented in Table 7. First, the cognitive, emotional, and socialization indexes of popular YouTube videos for children were examined. It was found that the emotional index was the highest, followed by cognitive and socialization. Among the nine subscales, emotional expression was the highest, while physical violence was the lowest. Generally, positive aspects such as emotional expression, understanding others’ emotions, and prosocial behavior, which have favorable effects on early childhood development, were more prevalent than negative aspects such as linguistic destruction, verbal violence, and physical violence. In addition, a high proportion of videos contained content related to socialization, with a significant number featuring prosocial behavior and prosocial expression. Emotional expression and understanding others’ emotions were also frequently observed, indicating positive evaluations.
The Ratio, Number, and Index of Content Related to Cognitive, Emotional, and Social Development in Popular Kid’s YouTube.
Note. LD = Linguistic destruction, VV = Verbal violence, PV = Physical violence, EE = Emotional Expressions, UOE = Understanding other’s emotion, EC = Emotional control, AB = Antisocial behavior, PB = Prosocial behavior, PC = Prosocial communication.
These findings contrast with previous studies that identified violent and overly commercialized content in YouTube videos targeting infants and toddlers (H. Lee & Yu, 2017; Shin, 2020; Yang & Jeon, 2021). In contrast to previous research, the prevalence of violence was low, while emotional and prosocial behaviors were frequently exhibited. This discrepancy may be due to this study’s focus on the videos frequently watched by parents’ toddlers and popular content on YouTube Kids. In addition, since the study specifically examined programs watched by 4- to 5-year-olds, parents may have intentionally chosen educational and informative programs, as they had control over channel selection. Based on this study, it can be concluded that the content consumed by infants and toddlers under the age of 5 through YouTube has relatively positive effects on their cognitive, emotional, and social development.
When examining the cognitive, emotional, and socialization indexes by genre, it was found that vlog videos contained a variety of cognitive, emotional, and social content, along with animation and learning/educational genres. However, vlogs also exhibited negative aspects such as language destruction, verbal and physical violence, and antisocial behavior, whereas animation and educational programs primarily contained positive aspects like emotional expression, understanding others, prosocial behavior, and prosocial communication.
YouTube features creators with various purposes and expertise, ranging from established broadcasters to individual creators. Animation and educational programs are often produced by broadcasting companies or production studios similar to those of traditional broadcasters. These programs are typically first aired on terrestrial or cable channels and then uploaded to YouTube, thus undergoing regulatory review and censorship to some extent concerning language, physical violence, and appropriateness. However, YouTube also hosts a vast number of videos created by individual content creators who can easily produce and upload videos without systematic monitoring. Vlogs, the most frequently created genre by these individual creators, tend to contain more violent and antisocial content compared with programs produced by broadcasters. Popular vlog programs, such as “BoramTube,” “Minko Balal,” and “Wonder Kids TV,” often exhibit more violent and antisocial content than their broadcast counterparts.
In summary, emotional content was found to be the most prevalent among cognitive, emotional, and social development areas, with emotional expression being the most frequently observed aspect. This suggests that many YouTube programs for infants and toddlers focus more on emotional expression through storytelling and emotional development, rather than cognitive development such as language and math learning.
Children’s programs serve as an effective medium to convey and teach emotional content to young children, as they often incorporate stories that encompass a wide range of emotions. Infants and toddlers are highly susceptible to the emotional content presented in these videos, making the high frequency of emotional expression a desirable phenomenon. However, the lack of content related to cognitive and social development is disappointing. Several studies have suggested limitations in acquiring language skills through videos, as they reduce opportunities for children to acquire and practice language skills through interaction with their peers (Cho, 2016; Chonchaiya & Pruksananonda, 2008; Williford et al., 2007). Thus, the prevalence of emotional content in children’s programs may reflect the medium’s characteristics.
When examining the violence index and distinguishing between perpetrators and victims based on age and gender, it was found that in cases of verbal violence, the index was highest when nonadults were the perpetrators and adults were the victims, and lowest when both the perpetrators and the victims were nonadults. Analyzed by gender, the index of verbal violence was higher when the perpetrator was female than when the perpetrator was male. The index was also lower when the victim was female. Similar patterns were observed for physical violence, with the index being higher when the perpetrator was female compared with when it was male. Regarding emotional expressions, children scored higher on emotions such as anger and fear, while adults scored higher on joy, curiosity, and jealousy. When analyzed by gender, women showed high levels of fear and jealousy, while men showed high levels of joy and anger. Adult males scored higher on understanding others’ emotions, while nonadult females scored higher on emotional control. It is widely acknowledged that girls are more sensitive to emotional content and more likely to express emotions compared with boys (K.-H Kim, 1999; B.-R Lee, 1997; Yoon, 1997). Many studies have demonstrated that girls have higher emotional intelligence than boys. Expressing anger has been considered inappropriate for women (S. Kim, 2014), and women are shown to express fear and jealousy more than men in the media (Cantor, 2009). Moreover, parents’ attitudes toward emotional expression have been found to be more permissive toward girls expressing emotions like anger, curiosity, and disgust, but more controlling and repressive toward boys (Lee & Chung, 2002). This study also observed that female characters appearing in YouTube videos expressed emotions more actively, while male characters regulated their emotions.
Children exposed to these videos may internalize the idea that it is natural for women to express emotions, while men should suppress them. Therefore, it would be desirable to see more gender diversity in emotional expression. The significance of this study lies in its quantitative analysis of YouTube video content. The study analyzed YouTube programs for infants and toddlers based on three developmental areas: cognitive, emotional, and social development. Nonetheless, this study is subject to a limitation as inferential statistics could not be executed due to the relatively small number of subjects analyzed. It is worth noting that conducting inferential statistics could potentially reveal significant disparities in the content across various genres; however, this was precluded by the presence of approximately 30 cells with values of 0 and 1. In future research, further refinement and categorization of the subdomains within each developmental area, along with the evaluation of YouTube programs accordingly, will be necessary. In addition, there is a need to develop indices that can assess YouTube programs for infants based on cognitive, emotional, and social development.
Appendix
Analytic Items and reliability.
| Developmental area | Subcategory | Item | Reliability | Analytic items |
|---|---|---|---|---|
| Cognitive development | Linguistic destruction | Nonstandard pronunciation | 1.00 | Frequency, genre |
| | | Ungrammatical expressions | 1.00 | |
| | | Profanity (swear words, slang) | 1.00 | |
| | | Nonstandard language | 1.00 | |
| Emotional development | Verbal violence | Hate speech | 0.72 | Frequency, genre, age, gender |
| | | Lies | 1.00 | |
| | | Threat | 0.76 | |
| | | Rude remarks, abusive language, and informal speech to adults | 1.00 | |
| | | Shout | 1.00 | |
| | | Profanity and slang | 1.00 | |
| | Physical violence | Throwing things | 1.00 | Frequency, genre, age, gender |
| | | A hard push | 1.00 | |
| | | Bodily injury | 1.00 | |
| | | A tool injury | 1.00 | |
| | Emotional expression | Joy | 0.84 | Frequency, genre, age, gender |
| | | Anger | 0.84 | |
| | | Fear | 0.84 | |
| | | Jealousy | 0.73 | |
| | | Curiosity | 0.78 | |
| | Understanding others’ emotions | Verbal responses to others’ emotional expressions | 0.82 | Frequency, genre, age, gender |
| | | Behavioral responses to others’ emotional expressions | 0.80 | |
| | | Speculation about others’ emotions | 0.72 | |
| | | Understanding emotional causes and situations | 0.78 | |
| | | Understanding emotion and behavior relevance | 0.71 | |
| | Emotional control | Tolerate expressing negative emotions | 1.00 | Frequency, genre, age, gender |
| | | Tolerate expressing positive emotions | 1.00 | |
| Socialization | Antisocial behavior | Improper body exposure that causes disgust, etc. | 1.00 | Frequency, genre |
| | | Publicly ridiculed, belittled, or demeaned | 1.00 | |
| | | Discriminatory harassment | 1.00 | |
| | | Offensive language, such as promoting bias | 0.73 | |
| | | Offensive and emotionally damaging language | 1.00 | |
| | | Insults and defamatory expressions against the target group | 0.92 | |
| | | Violating household rules, such as meal rules or curfews | 0.73 | |
| | | Violating social rules such as dumping trash, cutting in line, etc. | 1.00 | |
| | | Violating rules of peer groups, such as breaking promises | 1.00 | |
| | Prosocial behavior | Rescuing | 0.84 | Frequency, genre |
| | | Helping | 1.00 | |
| | | Sharing | 1.00 | |
| | | Caring | 1.00 | |
| | | Making concessions | 1.00 | |
| | | Consoling | 1.00 | |
| | Prosocial communication | Persuading | 1.00 | Frequency, genre |
| | | Compromising | 0.70 | |
| | | Expressing to be friends | 0.73 | |
| | | Sharing | 1.00 | |
| | | Expressing interest | 0.70 | |
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2020S1A3A2A02095619).
