Abstract
This study examines teenagers’ online engagement and the resulting change in their online buying behaviour. It also assesses whether any spike in online buying is stable and sustainable. Statistical tools, namely the Kolmogorov-Smirnov (K-S) test, multiple linear regression (M.L.R.) and Pearson correlation, are used to support the analysis, and machine learning algorithms are used to construct a predictive model of behaviour and to evaluate its efficiency. The study will help online retailers understand the stability of their sales figures and plan their marketing so that the online space remains attractive even after the world comes out of the pandemic. The increasing use of Android smart devices and relatively cheap data has increased the penetration of online engagement across all age groups. Youngsters are engaging online more, which is bringing about a considerable transformation in buying behaviour and patterns, and a collective change in how marketers strategize in response to ever-evolving market forces.
Keywords
Introduction
It is evident that universities, schools, colleges, tuition centres and entertainment parks have remained closed for the last two academic sessions because of the fatal pandemic that hit the world at the end of 2019. Many spaces have changed, and many business operations have been transformed massively, including the education system in our country (India). Under the restrictions imposed by the Central and State governments at various points in time, and with people working in a hybrid model (online and offline), the frequency of outings with friends and family has also gone down considerably. School-going teenagers, who mostly rely on their parents for outings, are almost locked in their homes, and the situation is the same for college-goers. Studying, the most critical engagement in teenagers’ lives, has been online for the past two academic sessions. Teenagers spend a good share of their parents’ money and their own pocket money on favourite items such as stationery, games, clothes, shoes, books and electronic accessories. The US-based investment bank and asset management firm Piper Jaffray claims that teenagers are highly engaged in online buying and that the numbers are rising each day. Total spending by US teenagers was $77 billion in 2018, and teens were found to be spending more money and time with online-only retailers. The world is a global village, and the impact of digitization on Indian teenagers is no smaller. According to a survey conducted by Fampay, 84% of Indian teenagers love shopping online, but 67% of them continue to pay in cash and 52% pay through their parents’ debit or credit cards. This further confirms that teenagers are technologically sound and that the lockdown has made them more attracted to the digital world. It is evident that when we are in the 2
Objective
The study’s objective is to understand teenagers’ online engagement pre and post COVID-19. It compares online buying behaviour among teenagers based on existing buying determinants and on the associated factors arising from COVID-19 (Shokouhyar et al., 2021). Machine learning classifiers are then used to build models and to compare which model gives the best predictions on the consumer behaviour dataset used.
Conceptual framework
Consumer behaviour is the set of consumer actions and reactions involved in searching for, purchasing, using, evaluating and disposing of products and services to satisfy one’s own needs (Schiffman, 2002). Strategies regarding store image, influencers, brand image, and online or web advertising have positive effects on brand equity, which influences buying decisions, mostly among young consumers when it comes to apparel. Features such as price, aesthetics, uniqueness, rarity and symbolism are essential in attracting young minds when appropriately branded. Purchase intention and brand awareness are interdependent under the effect of the factors discussed above (Zahid & Khan, 2020). Consumer buying behaviour is highly influenced by technology, culture, society, lifestyle, etc. The Internet helps organizations and marketers build relationships with customers. The study says that attributes such as website design, customer service features, brand reliability and privacy play an essential role in online buying decisions. New components, situations and substances create a kind of addiction at their introductory stage. This has been studied as situational behaviour, and a model of virtual shopping addiction was developed (Rose & Dhandayudham, 2014). Compulsive online buying is driven by specific behaviour, and its severity is controlled by how society perceives it (Thomas et al., 1989). The consumer behaviour literature recognizes that individuals can modify their responses to any action through self-regulation, although specific influences still operate (Rose & Dhandayudham, 2014). Cognitions, emotions and behaviours are the resources on which individuals rely to control their reactions to a specific action. Exercising self-control depletes these resources, and the overall capacity for self-control or regulation is reduced in the process (Rose & Dhandayudham, 2014; Kathleen Vohs, 2003).
Age is a highly predominant demographic factor. The benefits of shopping virtually (online) are best understood by the population aged 18–25 years, and this has a tremendous effect on online shopping behaviour on e-retailing sites (Rahulana et al., 2013). Teenagers accept new technologies readily, and their adaptability and flexibility make them more likely to take up a technology than Gen Y buyers, who take more time to reach a buying decision in virtual mode (Javadi et al., 2012). Innovators and early adopters have a comparatively higher risk appetite than any other class of buyers. According to the Innovation Adoption Lifecycle Theory of Smith and Mangold, they generally have better buying power. AC Nielsen conducted an interesting study covering 38 markets and thousands of respondents. It found that virtual shopping is growing exponentially in our country (India) and that average purchases in our territory exceed the global average.
Online buying and consumer behaviour
The critical reasons for the growth in online purchases are identified as the availability of reviews, which work as a form of recommendation; promotional deals and offers; the convenience of buying, which reduces the geographic constraints on owning a product; the cash-on-delivery feature; and the diversity of products, as in a comparatively deep assortment (Gupta, 2017). Youths and teenagers are engaging online, and online shopping is increasing, with factors such as free delivery, convenience, cashback and loyalty programmes, and the width of the product assortment playing a pivotal role as critical determinants in the virtual space (Krbova & Pavelek, 2015). The intention to purchase depends directly on brand awareness and fame, based on factors such as price, assortment, reliability and the transaction ease of the platform (Zahid & Khan, 2020).
Consumer behaviour in crisis
Concerns have been raised about the possibility of coronavirus transmission and passengers’ lack of confidence in public transportation, which is a significant obstacle to overcome. This is a source of concern for passengers and riders as well as for drivers and operators. Since the death of Detroit bus driver Jason Hargrove, unions representing transportation workers have called for mandatory mask policies for riders. Those fears have recently been given a more severe foundation by a paper titled “It Is Time to Address Airborne Transmission of COVID-19.” It is expected that SARS-CoV-2 will behave similarly and that transmission will occur via the airborne route. Overcrowding, particularly in public transportation and public buildings, is an important pathway, and the authors recommend avoiding it as much as possible (Shokouhyar et al., 2021). The consumer first identifies a need or desire.
The purchase is then made, which starts the consumption phase of the product. Price sensitivity in the market was found to be higher than brand loyalty during the global recession of 2008. Brand management therefore undergoes specific changes during such times. The financial market’s misconception regarding the brand bubble is disrupted, leading to major economic crisis, production slowdown, price wars, etc. Consumers fall out of love with the brands they used to love (Dainora, 2009). Because buying behaviour is a vital, continuous process that runs from the search for a product to its consumption or disposal, the influence of internal and external factors differs for every consumer (Valaskova & Kleistik, 2015). Social issues affect macro consumer behaviour, but individual elements also need to be studied (Soloman & Panda, 2020).
The approaches to understanding consumer behaviour can be classified into three groups: economic, sociological and psychical. The psychical approach is based on the relationship between the consumer’s behaviour and the psyche. In contrast, sociological approaches emphasize situations shaped by social influencers or leaders and by various social occasions. In the economic approach, consumers define their requirements on the basis of basic micro-economic knowledge (Valaskova & Kleistik, 2015). During a crisis, consumers have been seen to try to maximize the satisfaction, utility and joy of owning a consumer good (Flatters & Willmort, 2009). Adverse events such as economic crisis, viral outbreak and inflation are perceived differently by different consumers. Risk attitude and risk perception likewise differ from one consumer to another.
New trends emerge during critical times. The attitude towards risk reflects the consumer’s interpretation of the risk content and how much they like or dislike it. Risk perception describes the probability of the consumer being exposed to the risk content (Kautsara, 2012). Consumption behaviour can change along dimensions such as value consciousness, risk aversion and materialism. Some studies have shown significant changes in behaviour based on the individual’s utility pattern (Ang et al., 2001). Behaviour during crisis moments seems to be critical among those exposed to resources. They also emphasize consumption based on values, generally avoid excess consumption and display dissatisfaction with it. Children are taught traditional values, and the tendency during a crisis is often towards simplicity (Flatters & Willmort, 2009). All these alterations motivate this study to explore teenage behaviour and increased online engagement in this digital era of the pandemic, which has led to a global economic crisis.
Consumer data analysis and machine learning
The use of big data has transformed decision-making in a wide range of fields. Handling large amounts of data to analyze specific behaviours is helping to improve business processes (Marin-Marin et al., 2019). Attention is being paid to big-data analytics, which contributes significantly to determining business strategy while also providing valuable information for the design and development of service innovation. Because of technological advancements, online sellers can now make real-time price changes of significant magnitude and proximity (Thuethongchai et al., 2020). The integration of information and communication technologies into education makes it possible to collect information about the teaching and learning process, demonstrating the importance of forecasting consumer trends and then assessing the prognosis (Victor et al., 2018). Machine learning algorithms have been used in several studies to predict consumer behaviour. In this study, classifiers such as the Support Vector Machine (Benatti et al., 2017), K.N.N. (Paul et al., 2016), Gradient Boosting, Random Forest and Decision Tree have been applied to study the quality of prediction on the consumer dataset used, and their effectiveness has been analysed with the help of the measures in Table 1.
M.L.R. Test (Enhanced online buying vs. pre and post COVID-19 factors)
H
H
H
Methodology
This is an exploratory, quantitative study aimed at teenagers, intended to understand their online engagement and shopping behaviour during the pandemic. A probability technique, simple random sampling, has been used to collect the primary data on which the study is based. Since the data were collected during COVID-19, the mode was online. The survey questionnaire (Google form) was circulated via email and WhatsApp to collect data from teenagers. The questionnaire is formulated to capture three types of information: demographic data (gender, age, etc.), which have immense importance in studying consumer behaviour (Rahulana et al., 2013); online and offline activity, for comparison of engagement; and, most importantly, behavioural data about online buying. In total, 127 responses were collected. These 127 observations were pre-processed to detect anomalous values (outliers), which allowed different subsamples to be generated based on their distributions. All samples, including the original sample and each of the subsamples, went through the same procedure, which included the following steps: selection of the target variable (the variable to be predicted); identification of the variables most relevant to the selected target variable; and prediction of the target variable using the most relevant variables. In each case, predictive models are created to classify the target variable based on the most influential variables. Which variables can be predicted best depends on the objective, and the different predictive models were compared on the precision achieved. CRISP-DM is a data science methodology that is frequently used in classification problems (predictions of discrete variables).
It consists of several phases, including pre-processing (to clean and assess the suitability of a dataset), selection of the most relevant attributes (for model construction), and generation of classification models (with their corresponding details) (Chapman et al.). In most cases, after the models are evaluated, it is customary to allow them to return to the attribute selection phase. This scheme has been used in several studies, with slight variations, and has proven effective (Liu) (Rabasa & Heavin, 2020). An outline of this methodology, as adopted in this paper, is presented. To reach the objective of the study, the hypotheses are framed to (I) understand the online engagement of teenagers in the pandemic and (II) associate it with their online behaviour and study the sustainability of the growth in view of the identified determinants, i.e. Brand Awareness [BA] (Nizar & Janathanan, 2018) (Zahid & Khan, 2020), Product Assortment [PA], Promotional Activities [P] and Home Delivery [H.D.] (Krbova & Pavelek, 2015). Since the study is conducted during the pandemic, it also aims to understand the impact on buying behaviour of the pandemic-related determinants, i.e. Boredom (B.D.), Family’s Safety (F.S.), Own Safety (O.S.) and Government’s Restrictions (G.R.). To test the hypotheses, certain statistical tools are used to understand the data. With the help of Python, several machine learning algorithms are also run on the dataset, and their precision, accuracy, recall and F1 score are evaluated to understand how effectively the underlying dataset models teenage online behaviour. An earlier investigation found the Random Forest Classifier (RanForC) to be an accurate machine learning algorithm for finding the relationship between consumer behaviour and willingness to buy a product based on certain factors (Valecha et al., 2018).
Hence, algorithms such as the Support Vector Machine Classifier, Decision Tree Classifier, Gradient Boosting Classifier, K.N.N. Classifier and Random Forest Classifier have been trialled on the dataset to identify which of these classifiers fits the data best. From the block diagram of the classifiers given below, we can see that training is required before prediction, so 70% of the data has been used to train the classifiers and the remaining 30% has been used to test them and predict the values. Along with the classifier analysis, statistical tools such as the Kolmogorov-Smirnov test, along with descriptive statistics of the dataset,
Basic model of classifiers
Five machine learning classifiers have been used to predict the behavioural pattern of enhanced online buying and to assess the suitability and efficiency of the models for this specific behavioural dataset. To train the classifiers, 70% of the data has been used, as training is a crucial task before prediction, and the remaining 30% has been used to run the predictions. The algorithms used in the study are the Support Vector Machine, Decision Tree, Gradient Boosting, K.N.N. and Random Forest classifiers. Their prediction efficiency is reported using the measures in Table 1.
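The 70/30 procedure described above can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard library, not the study's own code: the helper names `split_70_30` and `knn_predict`, the value k = 3, the random seeds and the synthetic responses are all assumptions made for the example, and K.N.N. is shown only because it is the simplest of the five classifiers to write out by hand.

```python
import random
from collections import Counter


def split_70_30(rows, labels, train_fraction=0.7, seed=0):
    """Shuffle the row indices once, then cut them into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(indices)))
    train_idx, test_idx = indices[:cut], indices[cut:]
    return ([rows[i] for i in train_idx], [labels[i] for i in train_idx],
            [rows[i] for i in test_idx], [labels[i] for i in test_idx])


def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k training rows closest to the query row."""
    by_distance = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Synthetic illustration data, NOT the survey responses: five 1-5 scores per
# respondent (standing in for BA, PA1, PA2, RD, HD) and a 0/1 target.
rng = random.Random(42)
rows, labels = [], []
for _ in range(127):
    keeps_buying_online = rng.random() < 0.5
    low, high = (3, 5) if keeps_buying_online else (1, 3)
    rows.append([rng.randint(low, high) for _ in range(5)])
    labels.append(1 if keeps_buying_online else 0)

x_train, y_train, x_test, y_test = split_70_30(rows, labels)
predictions = [knn_predict(x_train, y_train, row) for row in x_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(len(x_train), len(x_test), round(accuracy, 2))
```

With 127 responses, a 70% cut leaves 89 rows for training and 38 for testing. In practice a library such as scikit-learn would normally supply both the split and the classifiers.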
Efficiency measures for classifier algorithms
TP: True Positive; TN: True Negative; FP: False Positive; FN: False Negative.
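The four counts in this legend are the inputs to the efficiency measures named in the methodology (accuracy, precision, recall and F1 score). The sketch below uses the standard textbook definitions of those measures; the function name `classifier_scores` and the example counts are hypothetical and are not taken from the study's results.

```python
def classifier_scores(tp, tn, fp, fn):
    """Standard efficiency measures computed from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for a 38-row test set (30% of 127 responses).
scores = classifier_scores(tp=20, tn=10, fp=5, fn=3)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```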
Among the statistical tools, the Kolmogorov-Smirnov (K-S) test has been used to assess the significance of the difference in consumers’ perception of the identified determinants. The test works on the null hypothesis that the responses follow the assumed distribution, i.e. that there is no significant difference in consumers’ perception of the studied event. The decision is based on comparing the D value with the D-critical value, both of which are computed from the test.
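The paper's own formula is not reproduced in this text. As a hedged illustration, a common one-sample form for scaled survey responses takes D as the largest absolute gap between the observed and expected cumulative proportions, and uses the large-sample critical value 1.36/√n at the 5% level. The sketch below assumes that form, a uniform reference distribution and a 5-point response scale; the counts are made up, and `ks_statistic` and `ks_critical` are hypothetical helper names.

```python
import math
from itertools import accumulate


def ks_statistic(observed_counts, expected_proportions):
    """D: largest absolute gap between observed and expected cumulative shares."""
    n = sum(observed_counts)
    observed_cdf = [c / n for c in accumulate(observed_counts)]
    expected_cdf = list(accumulate(expected_proportions))
    return max(abs(o - e) for o, e in zip(observed_cdf, expected_cdf))


def ks_critical(n, coefficient=1.36):
    """Large-sample critical value; 1.36 corresponds to the 5% level."""
    return coefficient / math.sqrt(n)


# Made-up response counts for one determinant on a 5-point scale (n = 127).
observed = [5, 10, 20, 52, 40]
expected = [0.2] * 5  # assumption: uniform reference distribution

d = ks_statistic(observed, expected)
d_crit = ks_critical(sum(observed))
significant = d > d_crit  # True means the responses differ from the reference
print(round(d, 4), round(d_crit, 4), significant)
```

Under this reading, a determinant is treated as significant when D exceeds D-critical.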
Multiple linear regression is carried out on the behavioural data received about the buying pattern. It models the likelihood of enhanced online buying behaviour among youngsters as a function of perceived consumer behaviour towards the determinants considered when making buying decisions, namely Brand Awareness [BA] (Nizar & Janathanan, 2018; Zahid & Khan, 2020), Product Assortment [PA1], Promotional Activities [PA2], Recommendations [R.D.] and Home Delivery [H.D.] (Krbova & Pavelek, 2015), along with the additional factors due to the pandemic, i.e. Boredom [B.D.], Family’s Safety [F.S.], Own Safety [O.S.] and Government’s Restrictions [G.R.].
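For orientation, the combined regression can be written as the linear model below. The dependent variable EOSP (enhanced online spending in the pandemic) and the determinant abbreviations are the paper's own; the coefficient symbols and the error term are notation assumed here for illustration, since the fitted equation itself is not given in the text.

```latex
\mathrm{EOSP} = \beta_0
  + \beta_1\,\mathrm{BA} + \beta_2\,\mathrm{PA1} + \beta_3\,\mathrm{PA2}
  + \beta_4\,\mathrm{RD} + \beta_5\,\mathrm{HD}
  + \beta_6\,\mathrm{BD} + \beta_7\,\mathrm{FS} + \beta_8\,\mathrm{OS}
  + \beta_9\,\mathrm{GR} + \varepsilon
```

Under this notation, a regression restricted to the pre-COVID-19 determinants keeps only the terms with coefficients 1 to 5, and one restricted to the pandemic factors keeps only the terms with coefficients 6 to 9.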
The Pearson correlation is used to understand the strength of interrelationship between the studied factors. It helps to understand the power of the model and the feasibility of the dataset in terms of the study undertaken.
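As a brief illustration of the Pearson coefficient, the sketch below computes r for two determinants directly from its definition (covariance divided by the product of the standard deviations). The scores are invented for the example and are not survey data; `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Invented 1-5 scores for eight respondents (not the survey responses).
brand_awareness = [4, 5, 3, 4, 2, 5, 1, 3]
home_delivery = [5, 5, 3, 4, 2, 4, 2, 3]

r = pearson_r(brand_awareness, home_delivery)
print(round(r, 3))
```

Values of r near +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.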
To analyze the data, the data was first pre-processed, after which the correlation between the events was studied. The K.M.O. test was conducted to assess the suitability of the data for factor analysis. The data was found fit for factor analysis, after which multiple linear regressions were conducted to understand the specific effect of the determinants (pre and post COVID-19) and to develop a prediction model that tests whether the data frame constructed from the consumers’ responses is strong enough to predict enhanced online buying behaviour. The demographic distribution by gender, age group and educational qualification has been studied, ensuring the normal distribution of data among the age groups. Engagement data has also been collected to ensure proper digital exposure, online engagement and diversity. Some of the most used online spaces are social media, online shopping and web series viewership.
Online engagement and outdoor games
Although many teenagers are still inclined towards outdoor games, their online engagement is comparatively higher. Social media, online shopping and the emerging trend of web series dominate the space. The collected data further show that teenagers are mostly interested in educational buying (books, stationery, etc.), fashion buying (clothes, accessories, shoes, cosmetics) and technological buying (electronic gadgets). The correlation among these buying patterns has been studied. It can be seen that there is a 30% positive correlation between the buying patterns for fashion and educational items.
In contrast, tech buying and fashion buying are negatively correlated. Analysis of teenagers’ virtual engagement patterns shows that social media usage is significantly related to online buying, and that emerging O.T.T. platforms such as Netflix, Amazon Prime, Hotstar and Zee5 are attracting teenagers, which digital marketers need to take into account to reach their target consumers. Data mining and the use of A.I. have increased considerably. E-retailers are using efficient SEO tools to attract consumers, and online shopping is evidently positively correlated with website surfing. Teenagers’ engagement on YouTube is correlated with educational and fashion buying at approximately 25%. Teenagers are also more inclined to buy fashion items such as clothes, shoes and cosmetics virtually, as these factors are positively correlated.
Classifier efficiency score
Classifier Results on Consumer Dataset-1 and Dataset-2.
Next, the determinants (pre and post COVID-19) that play a pivotal role in engaging youngsters are quantified. To understand their interdependence and their relation to enhanced online spending during the pandemic (EOSP), Pearson correlation and multiple linear regression (M.L.R.) are performed after the K-S test, which studies the significance of the determinants based on consumer perception. The K-S test shows that the determinants are highly significant, except Boredom (B.D.) and Government Restrictions (G.R.), which are insignificant for enhanced online spending in the pandemic. The correlation study shows a high correlation between brand awareness and factors such as the convenience of home delivery [H.D.], product assortment [PA1], promotional activities [PA2], own safety [O.S.] and family safety [F.S.]. Overall, the pre-COVID-19 buying determinants are highly correlated.
In contrast, it is also evident that own safety and family safety are highly correlated with each other. They are also highly correlated with the convenience of home delivery, recommendations and government restrictions. From the K-S test and Pearson correlation above, it can be said that the study has high significance in the current market scenario.
M.L.R. test
The M.L.R. test is conducted thrice to understand the overall model’s significance and the dependency of the dependent variable on the independent variables (pre-COVID-19 factors, post-COVID-19 factors, and both combined). Multiple linear regression has been carried out for EOSP vs. BA, PA1, PA2, RD and HD, the studied pre-COVID-19 determinants (Zahid & Khan, 2020; Krbova & Pavelek, 2015). The ANOVA summary clearly states that the F value is 0.0000000153
Bar graph (efficiency of algorithms on consumer datasets). Efficiency meter of the consumer dataset 1 and dataset 2 used in the research in predicting the behaviour
Confusion matrix from consumer datasets.
In dataset 1, the studied parameter [dependent (Y)] is the recorded response, processed in binary form (No = 0, Yes = 1), on whether the respondents would continue to engage in enhanced online buying even after the pandemic is over. It is predicted from the underlying behavioural responses on purchasing intention [independent (X)], i.e. Brand Awareness, Product Assortment, Sales Promotional Activities, Recommendations and Convenience of Home Delivery. The support vector machine classifier (Svm C) is the most accepted algorithm for studying this kind of consumer behaviour dataset (Fig. 1). In dataset 2, the Y variable is the same as in dataset 1: the binary decision about enhanced online shopping after the pandemic. It is predicted from the post COVID-19 determinants used in the study, i.e. boredom due to lockdown, family safety, own safety and government restrictions. On this dataset, Svm C, Knn C and the Gradient Boosting Classifier are equally efficient in prediction (Fig. 2). The confusion matrix (Fig. 2) produced from both consumer datasets indicates a positive enhancement in online buying engagement in the given data model.
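The binary coding of the responses and the tabulation behind a confusion matrix can be sketched as follows. The answers and predictions are invented for illustration and are not the survey data or the study's outputs; `encode_yes_no` and `confusion_counts` are hypothetical helper names.

```python
def encode_yes_no(answers):
    """Map survey answers to the binary coding No = 0, Yes = 1."""
    mapping = {"no": 0, "yes": 1}
    return [mapping[answer.strip().lower()] for answer in answers]


def confusion_counts(actual, predicted):
    """Tally true/false positives and negatives for a binary target."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1
        elif a == 0 and p == 0:
            counts["TN"] += 1
        elif a == 0 and p == 1:
            counts["FP"] += 1
        else:  # a == 1 and p == 0
            counts["FN"] += 1
    return counts


# Invented answers and classifier outputs for six respondents.
actual = encode_yes_no(["Yes", "No", "Yes", "Yes", "No", "Yes"])
predicted = [1, 0, 0, 1, 1, 1]

counts = confusion_counts(actual, predicted)
print(counts)
```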
Conclusion
There is no debate that online buying increased considerably throughout the country during the lockdown phase caused by the fatal coronavirus, and the impact on the economy is immense. The subject of the study is the associated social and market forces and their role in sustaining this growth. From the statistical data, it has been concluded that there is a significant effect of the post COVID-19 factors, the safety concerns, overexposure to the online world, and government restrictions that have been imposed and continue to be there with an uncertainty of 3
Limitations
There are certain limitations to the study, as the volume of data is small, but the idea can be developed in further research. Since the survey was conducted during the pandemic, a longitudinal study might also be carried out to uncover other determinants, reclassify the effects of the studied determinants on enhanced buying behaviour and, hence, re-evaluate the prediction model.
