What Makes People Read an Online Review? The Relative Effects of Posting Time and Helpfulness on Review Readership

Abstract

This study explores the factors that lead online customers to select which reviews to read among the many available on the Web. While most of the literature on online consumer reviews has conveniently assumed that more helpful reviews are read by more customers, no empirical study has tested whether the helpfulness assessment actually increases readership. Hence, this study explores various factors affecting consumer review readership and proposes that although helpfulness assessment promotes the readership of a review, the most dominant factor contributing to readership is the time of posting. A review posted late loses a significant chance of being read by consumers even if it is assessed as helpful by other readers. The hypotheses are tested using data collected from Amazon.com, and the results advise practitioners to display reviews in a manner that lessens the impact of posting time while enhancing the helpfulness voting systems.

Introduction

As the number of online consumer reviews increases, deciding which reviews to read among the thousands posted on the Web has become a strategic issue for online customers. For example, as of July 2012, more than 1,300 reviews had been posted for the Apple iPod Nano 16 GB 6th Generation media player on Amazon.com. A customer who is considering buying this iPod has to decide which reviews to read among the more than one thousand posted because reading all of them is practically impossible.

To assist those consumers, major online shopping malls have adopted a number of review display policies, such as helpfulness voting systems. Online shopping malls encourage customers to assess the helpfulness of online reviews, and then the reviews are displayed based on assessment results. This process is a democratic evaluation method of product information quality that brings new value to online shopping malls. For example, an additional revenue of US$2.7 billion was generated from the helpfulness voting system of Amazon.com.¹ Nowadays, it is rare to find a major online shopping mall that does not use this system.

However, the question remains as to whether reviews assessed as helpful are actually being read by more consumers. Do reviews assessed as helpful attract more customers? Do customers read reviews based on the results of the voting system? Do customers significantly benefit from this voting system? Studies on review helpfulness are significantly increasing.² However, most of these studies focus on the contextual features of helpful reviews³ and not on the effectiveness of helpfulness voting systems on customer behavior.

The present study, therefore, examines the impact of review helpfulness on review readership. Specifically, this study (a) explores factors that influence consumers to read a specific review, and (b) compares the impact of review helpfulness on review readership with that of other factors, such as review rating and posting time. This study begins with a literature review on review helpfulness. Then, factors that influence the reading of reviews are proposed. The hypotheses are tested using data collected from Amazon.com, and the results are discussed. Finally, the implications and contributions of the study are presented.

Literature Review

Previous studies on review helpfulness mainly focused on the quality of contents or on contextual patterns.⁴ For example, Ghose and Ipeirotis⁵ analyzed various aspects of review text, such as subjectivity levels, informativeness, measures of readability, and extent of spelling errors, to examine the influences of these text-based features on the perceived usefulness of the review. Mudambi and Schuff⁶ also identified factors that contribute to review helpfulness, such as review extremity and review depth. Cao et al.³ investigated basic, stylistic, and semantic characteristics of online user reviews, and examined their influence on the level of helpfulness. In that study, reviews that provided extreme opinions obtained more helpfulness votes than reviews with mixed or neutral opinions.

The approaches used in these studies were explorative; thus, the methodologies employed were data driven, such as data mining and econometric analysis. Large-scale data of real reviews collected from major Web sites, such as Amazon.com, contain extensive information about a particular product and reflect consumer perceptions. Such advantages of empirical data analyses have supported information system researchers in their examination of helpfulness factors at an exploratory level.

Despite the relatively short history of the field, studies on review helpfulness are significantly increasing.⁷,⁸ However, most of these studies have assumed, without empirical verification, that customers prefer reading more helpful reviews and that review helpfulness voting systems encourage customers to read the reviews with the highest votes. Therefore, the present study establishes hypotheses to evaluate the aforementioned fundamental assumptions by identifying factors that influence customers to read reviews.

Hypotheses Development

Review helpfulness is viewed as the perceived value measure or an information quality assessment of a certain review.⁴,⁹ In several cases, review helpfulness is perceived as the truthfulness or readability of reviews, which largely embodies subjective judgment on the assessment of customers.⁵ In spite of all these opinions and views, classifying which kinds of review are perceived as more helpful remains complicated, probably because of the highly context-dependent nature of the perceived diagnosticity level.¹⁰

As such, this study adopts a technical method for defining review helpfulness, that is, the proportion of the consumers who vote that a particular review was helpful for their purchase decision making. For example, if 72 out of 90 consumers agree and vote that a certain review was helpful, the helpfulness of that review would be measured as 0.8. This technical definition is effective in quantifying review helpfulness¹¹ because, nowadays, most major online shopping malls implement this review voting system.
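The proportion described above can be written as a small function. This is only an illustrative sketch: the function and argument names are hypothetical, and treating a review with no votes as undefined is an assumption rather than something the study specifies.

```python
def helpfulness(helpful_votes: int, total_votes: int) -> float:
    """Share of voters who marked a review as helpful (between 0 and 1)."""
    if total_votes <= 0:
        # Assumption: the measure is undefined when nobody has voted.
        raise ValueError("helpfulness is undefined for a review with no votes")
    return helpful_votes / total_votes


# The example from the text: 72 helpful votes out of 90 in total.
print(helpfulness(72, 90))
```

With the numbers from the text, 72 helpful votes out of 90 give a helpfulness of 0.8.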

With this system, when a review is assessed as helpful, it will influence customers to make purchase decisions quickly and confidently. A review voted as helpful is acknowledged as having quality information and is likely to be read by many customers. Reading a helpful review first can reduce the time and effort exerted by a consumer when shopping. As such, the following hypothesis is formulated:

H1: The helpfulness of a review has a positive impact on the readership of the review.

Another factor that influences consumers to read online reviews is criticism of a particular product. When a consumer posts a review, he/she also rates the product, using, for example, a certain number of stars on Amazon.com. A consumer may give five stars if he/she is very satisfied with the product, or two stars if he/she is not very happy. However, the number of stars he/she deducted to show his/her criticism represents an important feature of the product, which is quantified as the negativity of the rating. In this study, negative rating is defined as the degree (i.e., the number of stars) that a consumer deducted from five. So, if a four-star review is compared with a three-star review, the latter would be considered as more negative.
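As a minimal sketch of this definition, negativity can be computed as the number of stars deducted from five. The names `MAX_STARS` and `negativity` are hypothetical and used only for illustration.

```python
MAX_STARS = 5  # Amazon.com ratings range from one to five stars


def negativity(stars: int) -> int:
    """Number of stars a reviewer deducted from the maximum of five."""
    if not 1 <= stars <= MAX_STARS:
        raise ValueError("a rating must be between 1 and 5 stars")
    return MAX_STARS - stars


# A three-star review is more negative than a four-star review.
print(negativity(3), negativity(4))
```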

Reading critical reviews is important in consumer decision making because obtaining opinions from the minority broadens perspectives and increases variety of information.¹² Critical reviews describe product features that most reviewers might overlook.¹³ Reviews with negative ratings often contain rare information and show perspectives that differ from those of the majority of reviewers. For example, as of October 2012, among the 288 reviewers of the Canon EOS Rebel T3 digital SLR camera on Amazon.com, 240 rated the product with five stars, and only 12 gave fewer than four stars. In this case, prospective customers will tend to read the negatively rated reviews as well as the positive ones for information variety.

Therefore, the unit value of information in critical reviews can be considered higher in terms of criticality and rarity. Numerous previous studies¹⁴,¹⁵ have also emphasized the importance of negative reviews in providing different views and in increasing information variety, thereby enriching content. Amazon.com displays both favorable and critical reviews so that consumers can read balanced views. Hence, the second hypothesis is as follows:

H2: A negative rating in a review has a positive impact on the readership of the review.

If a customer posts his/her review earlier than other reviewers, his/her review will have a higher chance of being read by consumers because of its longer exposure. While major online shopping malls have their own preferred display rules, such as recency and content quality, a review posted earlier still apparently has a higher chance of being read than those posted later. This study captures this advantageous aspect of early posted reviews and calls it “posting time.” Specifically, posting time is defined as the number of months that the review has been posted online. It plays an important role in online review readership.
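One way to compute this measure is sketched below. The study does not state how partial months are handled, so counting only whole elapsed months is an assumption, and the function name and the example dates are hypothetical.

```python
from datetime import date


def months_online(posted: date, collected: date) -> int:
    """Whole months elapsed between posting and data collection."""
    if collected < posted:
        raise ValueError("collection date precedes posting date")
    months = (collected.year - posted.year) * 12 + (collected.month - posted.month)
    # Assumption: a month that has not fully elapsed is not counted.
    if collected.day < posted.day:
        months -= 1
    return months


# Hypothetical review posted in late July 2011, data collected in early February 2012.
print(months_online(date(2011, 7, 30), date(2012, 2, 3)))
```

Under this convention, the hypothetical review above has a posting time of 6 months.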

Posting time of a review would exhibit a stronger influence on the readership of the review than helpfulness or negativity because of the absoluteness of its advantage. Helpfulness and negativity of a review are relative values. As more customers vote, the classification of helpful or negative changes over time. A review can be most helpful at one point in time but might not be the most helpful at the next moment.

A similar logic can be applied to the conceptualization of review negativity. When there is a slightly negative review and if all other reviews are strongly positive, the slightly negative review will be considered a negative review. However, if all other reviews are strongly negative, the slightly negative review might be considered rather positive. In other words, the negativity of a certain review is influenced by other reviews, which implies that it changes as more reviews are posted.

However, posting time (i.e., the fact that a review is posted earlier than others) does not change as more reviews are posted; instead, its advantage becomes more robust. Hence, posting time is expected to have a relatively stronger impact on review readership than helpfulness or negativity. Based on this assumption, the following hypothesis is proposed:

H3: Posting time of a review has a stronger impact on the readership of the review than helpfulness or negativity of the review.

Data Analysis

Data collection

Data were collected from the world's largest online shopping mall, Amazon.com, which generates sales of more than US$6 billion per month.¹⁶ Cameras and toys were selected as target categories because these items are the most popular items sold online.¹⁷ From each category, five items were selected based on sales rankings. However, an item was omitted if the number of product reviews was fewer than 100. Also, for similar products with different colors, only the highest ranked item was used. For example, among the three available Syma helicopters with different colors, only the red one was selected because it ranked the highest. A total of 10 items were selected, as described in Table 1.

Table 1.

Data Description

Category	Rank	Name of product	Number of total reviews	Average rating	Average review length in lines ^a	Date of first review posted	Date of data collection
Camera and photo	1	Canon EOS Rebel T3i 18 MP CMOS Digital SLR Camera	280	4.68	7.4	2011-Feb 27	2012-Jan 28
	3	Canon PowerShot ELPH 300 HS 12.1 MP Digital Camera (Black)	529	4.16	7.6	2011-Jan 1	2012-Jan 29
	6	Canon EOS Rebel T3 12.2 MP CMOS Digital SLR with 18-55mm IS II Lens	164	4.77	5.6	2011-Apr 08	2012-Feb 3
	9	Carson MM-200 Micromax LED 60X-100X	123	3.87	3.5	2009-Aug 9	2012-Feb 3
	10	Foscam FI8918W Wireless/Wired Pan & Tilt IP/Network Camera	537	3.50	7.9	2010-Oct 19	2012-Feb 4
Toys	4	Rory's Story Cubes	200	4.69	4.4	2010-Jun 13	2012-Jan 28
	5	Syma S107/S107G R/C Helicopter—Red	574	4.10	4.7	2010-Dec 09	2012-Feb 3
	6	Angry Birds: Knock On Wood Game	161	3.95	3.4	2011-May 23	2012-Feb 4
	11	Air Swimmer Remote Control Inflatable Flying Shark	291	3.38	4.7	2011-Aug 18	2012-Feb 3
	12	V for Vendetta Mask	224	4.11	2.7	2006-Nov 09	2012-Feb 4

^a One line contains approximately 22 words.

Hypotheses tests and results

A regression equation for hypotheses testing was formulated as follows:

\begin{align*}y = \beta_0 + \beta_1 \cdot x_1 + \beta_2 \cdot x_2 + \beta_3 \cdot x_3\end{align*}

The y value is assigned as a rank; thus, an increase in readership (i.e., the number of votes) results in a decrease of y. For example, if the number of votes increases from 10 to 30, and the vote ranking rises from 10th to 2nd, then the value of y decreases from 10 to 2 as a result. The detailed parameterization process of all variables is shown in Table 2.
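The rank assignment can be sketched as follows. The review identifiers and the two smallest vote counts are hypothetical. The threshold of five votes follows the exclusion rule given in Table 2. Breaking ties by input order is an assumption, since the study does not say how ties are ranked.

```python
def readership_ranks(total_votes: dict, min_votes: int = 5) -> dict:
    """Map each review id to its rank by total votes (1 = most votes)."""
    # Reviews with fewer than min_votes votes are excluded before ranking.
    kept = {rid: votes for rid, votes in total_votes.items() if votes >= min_votes}
    # sorted() is stable, so tied reviews keep their input order (an assumption).
    ordered = sorted(kept, key=kept.get, reverse=True)
    return {rid: rank for rank, rid in enumerate(ordered, start=1)}


votes = {"r1": 1649, "r2": 1438, "r3": 30, "r4": 3}  # r3 and r4 are made up
ranks = readership_ranks(votes)
print(ranks)
```

The review with the most votes receives y = 1, so a larger readership always corresponds to a smaller y.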

Table 2.

Variable Description

Variables	Description	Example in Figure 1
y	– Ranking of a review's total readership. The value is normalized from a ratio variable (i.e., total number of votes) to an ordinal variable (i.e., ranking of the votes) to reduce the variance effect. – Reviews with fewer than five votes are excluded to reduce the noise effect.	As shown in Figure 1, if a review received a total of 1,649 votes and it is the highest number among the reviews in a product category, the value assigned to y would be 1. If the second highest voted review received 1,438 votes, that review would be assigned 2. Accordingly, the more people read the review, the lower its y value becomes because the ranking of the review rises. For example, if a review's ranking rises from 10th to 2nd, the value of y would decrease from 10 to 2. This parameterization process explains why the signs of the β coefficients in the H1 and H3 tests are negative and not positive, whereas the sign in the H2 test is positive and not negative.
x ₁	Helpfulness level of the review	As shown in Figure 1, if a review received 1,598 helpfulness votes out of a total of 1,649 votes, the helpfulness level (i.e., the value assigned to x₁) would be 0.97.
x ₂	Rating of the review	The rating the review in Figure 1 received is 5 out of 5.
x ₃	Number of months that the review has been published online.	If the review was posted in July 2011, the value would be 6 (months) because the data were collected in February 2012, which is 6 months after it was posted.

FIG. 1.

A review from Amazon.com.

Because of the inverted normalization used in the variable y parameterization process, the signs of β₁, β₂, and β₃ in Table 3 are reversed compared with the direction of the hypotheses. For example, where positive impacts of review helpfulness (H1) and posting time (H3) were hypothesized, negative signs for β₁ and β₃ were found. By contrast, where a negative impact of rating (H2) was hypothesized, a positive sign for β₂ was found. These reversed signs of the β coefficients support the proposed hypotheses.
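The sign reversal can be illustrated with a small least-squares sketch. All data below are synthetic and hypothetical, not taken from Amazon.com: the response is a continuous, rank-like score built so that more helpful, older, and lower-rated reviews receive a smaller (better) value. The helper names are illustrative, and the study does not report which software was used for its regressions.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a variable to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def standardized_betas(y, predictors):
    """OLS on z-scored variables; the intercept is zero and is omitted."""
    zy = zscores(y)
    zx = [zscores(x) for x in predictors]
    k = len(zx)
    xtx = [[sum(p * q for p, q in zip(zx[i], zx[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(zx[i], zy)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic predictors: x1 = helpfulness, x2 = star rating, x3 = months online.
x1 = [0.9, 0.5, 0.8, 0.6, 0.7, 0.95]
x2 = [5, 4, 2, 5, 3, 1]
x3 = [10, 2, 6, 3, 8, 5]
# Hypothetical rank-like score: smaller values mean more readers.
y = [20 - 10 * h + 1 * r - 0.5 * m for h, r, m in zip(x1, x2, x3)]

betas = standardized_betas(y, [x1, x2, x3])
print(betas)
```

In this toy setting, the coefficients for helpfulness and posting time come out negative and the coefficient for rating comes out positive, the same sign pattern that supports H1 to H3 when y is a rank.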

Table 3.

Hypotheses Test Results

	Helpfulness (H1)			Rating negativity (H2)			Posting time (H3)
Product	Standardized β	t Value	Sig.	Standardized β	t Value	Sig.	Standardized β	t Value	Sig.	n used (total)
Canon EOS Rebel T3i 18 MP	−0.396	−4.213	0.000	0.979	9.423	0.000	−0.822	−9.638	0.000	71 (280)
Canon PowerShot	−0.191	−1.718	0.089	0.322	3.103	0.002	−0.391	−3.786	0.000	105 (529)
Canon EOS Rebel T3 12.2 MP	−0.307	−1.874	0.071	0.456	2.630	0.013	−0.623	−3.780	0.001	34 (164)
Carson MM-200	0.217	0.523	0.612	−0.140	−0.309	0.764	−0.419	−1.314	0.218	14 (123)
Foscam FI8918W	−0.316	−2.064	0.046	0.065	0.430	0.670	−0.581	−4.790	0.000	41 (537)
Rory's Story	−0.372	−1.504	0.146	0.706	2.814	0.010	−0.754	−4.943	0.000	27 (200)
Syma—Red	−0.012	−0.049	0.962	−0.526	−2.059	0.054	−0.154	−0.746	0.466	22 (574)
Angry Birds	−0.171	−0.848	0.405	0.066	0.301	0.766	−0.408	−1.909	0.068	29 (161)
Air Swimmer	0.027	0.193	0.848	0.253	1.842	0.073	−0.509	−3.597	0.001	46 (291)
V for Vendetta Mask	−0.342	−1.989	0.056	0.194	1.166	0.253	−0.418	−2.657	0.013	33 (224)

Bold indicates significance at the 0.1 level; underline indicates supported cases.

Table 4 summarizes the hypotheses test results. H1 is supported in 5 out of 10 cases. H2 is supported in 6 out of 10 cases. H3 is supported in 8 out of 10 cases, which is validated by the larger absolute value of β₃ compared with those of β₁ or β₂. Although not all hypotheses are supported, most of the cases clearly show the consistent impacts of review helpfulness, negativity, and posting time, as hypothesized.

Table 4.

Summary of Hypotheses Tests

	H1 (Helpfulness)	H2 (Rating negativity)	H3 (Posting time)
Canon EOS Rebel T3i 18 MP	Supported	Supported	Supported
Canon PowerShot	Supported	Supported	Supported
Canon EOS Rebel T3 12.2 MP	Supported	Supported	Supported
Carson MM-200	Not supported	Not supported	Not supported
Foscam FI8918W	Supported	Not supported	Supported
Rory's Story	Not supported	Supported	Supported
Syma—Red	Not supported	Supported	Not supported
Angry Birds	Not supported	Not supported	Supported
Air Swimmer	Not supported	Supported	Supported
V for Vendetta Mask	Supported	Not supported	Supported
Total	5 cases are supported	6 cases are supported	8 cases are supported

A post hoc analysis of review concentration rate

The concentration rate of review readership is further examined to support the implications of the study. This parameter investigates how many readers have been attracted by the highest ranking reviews. It also reflects the current environment of online consumer review readership from a relative perspective. For example, if a high concentration level is observed, it will advise practitioners to discuss the balanced use of the review assessment systems, and the possibility of reducing the concentration level in the future.

Table 5 shows a high level of overall concentration in most products. In Carson and Angry Birds, the single top review (i.e., CR1) captures more than 40% of the total readership, whereas the top four reviews (i.e., CR4) capture more than half of the total readership. Figure 2 shows that the concentration rate decreases rapidly after the first review. For example, in the camera category, the second highest ranking review has less than half the readership of the highest ranking review (i.e., if the most read review is read 100 times, then the second or third most read review is read fewer than 50 times). In the toy category, the third highest ranking review has less than half the readership of the second highest ranking review. Such a high concentration level shows that most people stop reading after only several reviews. This result reinforces the implication of the present study that displaying helpful reviews in an appropriate manner must be an important concern for current practitioners.
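The concentration rates CR1 and CR4 can be read as the percentage of all votes captured by the top one and top four reviews of a product. A minimal sketch under that reading is given below; the function name and the vote counts are hypothetical and do not correspond to any product in Table 5.

```python
def concentration_ratio(votes, k):
    """Percentage of all votes captured by the k most voted reviews."""
    total = sum(votes)
    if total == 0:
        raise ValueError("no votes recorded for this product")
    top_k = sorted(votes, reverse=True)[:k]
    return 100.0 * sum(top_k) / total


votes_per_review = [100, 40, 30, 20, 10]  # made-up vote counts for one product
cr1 = concentration_ratio(votes_per_review, 1)
cr4 = concentration_ratio(votes_per_review, 4)
print(cr1, cr4)
```

For these made-up counts, the top review holds 50% of all votes and the top four hold 95%.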

FIG. 2.

Vote distribution among highly voted reviews.

Table 5.

Voting Concentration

Product	Total number of reviews	Total votes	Average votes	Votes acquired by top 5% of reviews (%)	Reviews with at least one vote (%)	CR1 (%)	CR4 (%)
Canon EOS Rebel T3i 18 MP	280	6,140	21.93	79.79	86.07	35.39	62.28
Canon PowerShot	529	4,706	8.90	64.58	88.28	26.54	41.35
Canon EOS Rebel T3 12.2 MP	164	1,920	11.71	66.72	78.66	33.18	54.69
Carson MM-200	123	541	4.40	71.35	52.03	43.62	65.06
Foscam FI8918W	537	1,859	3.46	62.13	64.99	20.60	41.26
Rory's Story	200	1,365	6.83	77.44	55.50	28.64	64.84
Syma—Red	574	767	1.34	74.58	26.13	15.12	43.02
Angry Birds	161	947	5.88	70.12	44.10	41.82	60.19
Air Swimmer	291	2,185	7.51	80.59	37.11	25.03	63.71
V for Vendetta Mask	224	1,090	4.87	61.56	52.23	17.16	42.94

Discussion

Summary of findings

The following discussion points are derived from the results. First, a strong impact of posting time on review readership is observed, as in the H3 result. This finding implies that the most helpful review is actually the most helpful only among early posted reviews. Under the current display systems, if a review is posted late, then the review will hardly be recognized as helpful by customers because of the robust impact of posting time. To build more balanced and reasonable helpfulness voting systems, a method to reduce the strong impact of posting time should be discussed.

Second, negativity of a review shows a stronger impact than helpfulness (β₂>β₁) in most cases. This finding implies that consumers may be more influenced when a review contains negative or different opinions than when it only shows favorable but general opinions. The reason is that the helpfulness of a review is a value awarded by the majority of consumers, whereas negative reviews represent minority opinions that can offer readers new information in a different manner.

Third, in search goods such as cameras, variables show more consistent and robust associations than in experience goods such as toys. As shown in Table 3, 11 out of 15 associations are found to be significant in camera categories, whereas only eight are significant in toy categories. Furthermore, cases where all three hypotheses were supported are all found in camera categories (i.e., Canon 18 MP, Canon Powershot, and Canon 12.2 MP), whereas no product in the toy category demonstrates such a case. This is because the opinions of other consumers are more influential when buying cameras than when buying toys.¹⁸ Measures reflecting the opinions of others, such as rating and helpfulness, are found to be less important in experience goods than in search goods, according to our study.

Academic contribution

This study raises the question of whether more helpful reviews are read by more customers as an outcome of review voting systems. While most previous helpfulness studies have analyzed content, style, and wording of helpful reviews, few have discussed the effectiveness of the helpfulness voting systems, even though it is a fundamental question that should be answered with justification.⁵,¹⁹ This study deepens understanding of the effectiveness of review helpfulness voting systems by measuring and comparing how much customers are influenced by signals such as helpfulness and negativity in these reviews when they decide which reviews to read.

This study also extends the scope of IS researchers from the review assessment to overall online review readership. While the scope of previous online review studies has often been limited to the content of reviews, the present study provides an integrative view on online review readership and its assessment system by exploring and comparing the impact of helpfulness and other factors, such as posting time and negative ratings, on review readership.

Practical implication

This study calls the attention of practitioners to the fact that the current review display policy can be improved for effective utilization of the review assessment system. The display of online consumer reviews has always been of interest to practitioners because consumer perception and behavior are significantly influenced by how reviews are displayed and presented.⁹ Hence, most major online shopping malls now implement their own review display rules, such as sorting by recency or quality of the reviews,²⁰ so that customers can easily find the reviews they are looking for. However, the result of the present study is alarming because, under the current review display system, posting time, not helpfulness, remains the most influential factor in review readership. Practitioners are therefore advised to use helpfulness assessment systems in a manner that prioritizes the impact of helpfulness, not timing.

Another practical implication of the study is that it describes the influence of current review assessment systems through data collected from Amazon.com. Data used in this study reflect the current circumstance more realistically than other data types, such as surveys, and enhance the persuasiveness of the result interpreted from data. The data broaden the scope on online consumer review management, especially among practitioners.

Limitations and future study

Several limitations in this study can be resolved in future research. For example, the number of products could be increased to more than 10. Factors that affect consumer review readership, aside from timing and rating, could be explored more thoroughly. Future studies could also examine helpfulness voting systems applied to various topics and from different perspectives.

Footnotes

Author Disclosure Statement

No competing financial interests exist.

References

1. Spool. 2009. The magic behind Amazon's 2.7 billion dollar question. www.uie.com/articles/magicbehindamazon/2009. 2012 Jul. 1.

2. O'Mahony, Smyth. Learning to recommend helpful hotel reviews. Proceedings of the Third ACM Conference on Recommender Systems, 2009; 305–8.

3. Cao, Duan, Gan. Exploring determinants of voting for the “helpfulness” of online user reviews: a text mining approach. Decision Support Systems, 2011; 50:511–21.

4. Korfiatis, García-Bariocanal, et al. Evaluating content quality and helpfulness of online product reviews: the interplay of review helpfulness vs. review content. Electronic Commerce Research & Applications, 2012; 11:205–17.

5. Ghose, Ipeirotis. Estimating the helpfulness and economic impact of product reviews: mining text and reviewer characteristics. IEEE Transactions on Knowledge & Data Engineering, 2011; 23:1498–512.

6. Mudambi, Schuff. What makes a helpful online review? A study of customer reviews on Amazon.com. MIS Quarterly, 2010; 34:185–200.

7. Danescu-Niculescu-Mizil, Kossinets, Kleinberg, et al. How opinions are received by online communities: a case study on amazon.com helpfulness votes. Proceedings of the 18th International Conference on World Wide Web, 2009; 141–50.

8. …, Tsaparas, Ntoulas, et al. Exploiting social context for review quality prediction. Proceedings of the 19th International Conference on World Wide Web, 2010; 691–700.

9. Connors, Mudambi, Schuff. Is it the review or the reviewer? A multi-method approach to determine the antecedents of online review helpfulness. 44th Hawaii International Conference on System Sciences (HICSS), 2011; 1–10.

10. Ngo-Ye, Sinha. Analyzing online review helpfulness using a regressional relief F-enhanced text mining method. ACM Transactions on Management Information Systems, 2012; 3:10.

11. Dellarocas. The digitization of word of mouth: promise and challenges of online feedback mechanisms. Management Science, 2003; 49:1407–24.

12. Berger, Sorensen, Rasmussen. Positive effects of negative publicity: when negative reviews increase sales. Marketing Science, 2010; 29:815–27.

13. …, Van der Heijden, Korfiatis. The influences of negativity and review quality on the helpfulness of online reviews. Proceedings of the International Conference on Information Systems, 2011; Paper 1.

14. Forman, Ghose, Wiesenfeld. Examining the relationship between reviews and sales: the role of reviewer identity disclosure in electronic markets. Information Systems Research, 2008; 19:291–313.

15. Pavlou, Dimoka. The nature and role of feedback text comments in online marketplaces: implications for trust building, price premiums, and seller differentiation. Information Systems Research, 2006; 17:392–414.

16. Melanson. Amazon announces Q4 2011 results: sales jump to $17.43 billion, but profits drop 58 percent. www.engadget.com/2012/01/31/amazon-announces-q4-2011-results-sales-jump-to-17-43-billion/. 2012 Jul. 1.

17. Noni. Why buy toys online? www.streetarticles.com/babies-toddler/why-buy-toys-online. 2012 Jul. 1.

18. Lee, Lee, Shin. The long tail or the short tail? The category-specific impact of eWOM on sales distribution. Decision Support Systems, 2011; 51:466–79.

19. Kuo, Chen. A behavioral model of the elderly Internet consumer: a case study. International Journal of Innovative Computing, Information & Control, 2010; 6:3507–18.

20. Kuo, Chen. Application of quality function deployment to improve the quality of Internet shopping website interface design. International Journal of Innovative Computing, Information & Control, 2011; 7:253–68.