Abstract
Background
The general public is using the Internet to look for health information more and more often. Information found on the Internet can complement specialized resources; the Internet becomes a valuable tool for patients and their families, mainly because it is so easily accessible. 1 As it can be assumed that the number of patients who consult the Internet to find health information will continue to rise, the information they find has to be reliable as well as relevant.
Half of the adults in the United Kingdom and up to 75% of the adults in the United States accessed the Internet to source healthcare-related information in 2011. 2 –5 In 2006, 19% of the population of the European Union had gone online to find health information, and in 2010, this figure increased to 40%. In Spain, 41% of Internet users looked for health information on the Web in 2007, and this had increased to 54% in 2009. More women (59.2%) use it for this purpose than men (49%). The most common health topics looked for by Internet users are related to disease diagnosis (80%), followed by nutrition (57%) and medication (45%). 6
People turn to the Internet for this purpose because of the wide range of information available, because they can remain anonymous on the Web, and because it provides an alternative to seeing a medical professional. 5 A survey carried out in Spain found that most people who use the Internet to research health topics do not usually find all the information they need and do not trust what they read. 7 –10 Sixty-one percent of users feel that finding information on the Internet has helped them to cope with their illness, although 52% felt confused or anxious after their search. 7,8
The most searched diseases on the Internet are cancers, followed by muscular problems, dermatology, allergies, gynecology, and sexually transmitted diseases. 6 Some studies have been carried out among cancer patients to ascertain their information needs during the diagnosis and treatment stage and have found that most Internet users look for information about treatment or alternative treatments and review the information given to them by their doctor. 11 However, little is known about whether or not the information included on these Web sites adequately meets these needs.
There is growing concern in the public health sector about the reliability of information available on the Internet. This information is not regulated, and its quality, accuracy, and readability vary. 9 Few countries impose any restrictions on Internet content. This lack of regulation means that there is a lot of information available and that some of it is incorrect, which could be dangerous to patients. 12 This has led to the development of various tools to help Internet users to find reliable sites. These tools include quality seals, codes of conduct, accreditations, and recommendations. 13 In 1997, Hayward et al. 14 looked at various instruments for measuring healthcare quality on the Internet (the instruments considered authorship, attribution, and disclosure as part of their rating criteria) and identified 47 different ones. In 2002, Gagliardi and Jadad 15 revisited and updated the study, finding another 51 unvalidated instruments. However, the proliferation of tools for assessing quality continued, fueled by anxieties about patient harm. Many countries are now trying to develop national quality initiatives, which it is hoped will reduce the number of quality control initiatives in use.
With this in mind, the aim of this study is to assess the reliability, accessibility, readability, and popularity of cancer Web sites in Spanish using the tools available. It will also analyze the suitability of Web site content in accordance with the specific information needs of cancer patients. 11
The interest in analyzing this information in Spanish stems from the fact that it is the primary language of between 322 and 400 million people in the world, a population comparable to that of English in terms of use. 16 Although information provided by Web sites in English can be translated with one click, the translation is always inexact and is very likely not to make sense. Besides, it has been reported that people tend to search for information in their own language. 17
Materials and Methods
A two-phase, cross-sectional, descriptive study was designed. During the first phase of the study, data were gathered using online searches to find out which Web sites contained information in Spanish. Quantitative and qualitative data were then collected using the DISCERN 18,19 questionnaire as a reference. The Web sites were evaluated, overall, for quality of content by using the validated DISCERN rating instrument and the Journal of the American Medical Association (JAMA) benchmarks, 20 the Web Accessibility Test (TAW) (
The Google Advanced Search engine was used to identify Web sites about cancer aimed at patients. The searches were carried out in June 2012. This was the only search engine used as it is the most popular one (99% of searches in Spain are carried out using Google). 22 The search was carried out using the option to find Web sites with “one or more words,” and the search terms (in Spanish) were “cancer” OR “oncology” OR “neoplasia.” Only Spanish results were returned. The “related search option” was also used to find other pages similar to the ones found. The first 100 Google search results were sorted and categorized to eliminate any duplicates. In total, 64 Web sites were found.
Once the Web sites had been found, they were retained only if they met the following inclusion criteria: (1) information in Spanish; (2) aimed at patients; and (3) no password required. Sites were excluded if they (1) contained book reviews or summaries of publications; (2) were used only to sell a product; (3) only provided links to other sites or contained broken links; or (4) did not focus directly on cancer. Most of the sites excluded were wikis, news sites, videos, horoscopes, and personal blogs. In total, 29 Web sites were selected. All of the Web sites were saved for subsequent analysis to avoid any changes during the analysis period. Patient surveys were used to identify information needs by cancer type. Doctors invited all patients with prostate, kidney, bladder, and breast cancer who attended review and follow-up appointments to take part in the study.
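The screening steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: the `Site` record, its field names, and the rule that two hits are duplicates when they share a host name are all hypothetical, and the URLs are placeholders.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Site:
    # Hypothetical record; every flag would be filled in by a human reviewer.
    url: str
    in_spanish: bool
    aimed_at_patients: bool
    requires_password: bool
    only_reviews_or_summaries: bool = False
    only_sells_product: bool = False
    only_links_or_broken: bool = False
    focuses_on_cancer: bool = True


def site_key(url: str) -> str:
    """Reduce a URL to its host name (assumed definition of 'same site')."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def deduplicate(sites):
    """Keep the first hit for each host, preserving search-result order."""
    seen, unique = set(), []
    for site in sites:
        key = site_key(site.url)
        if key not in seen:
            seen.add(key)
            unique.append(site)
    return unique


def is_eligible(site: Site) -> bool:
    """Apply the three inclusion and four exclusion criteria from the text."""
    included = (site.in_spanish and site.aimed_at_patients
                and not site.requires_password)
    excluded = (site.only_reviews_or_summaries or site.only_sells_product
                or site.only_links_or_broken or not site.focuses_on_cancer)
    return included and not excluded


def screen(sites):
    return [s for s in deduplicate(sites) if is_eligible(s)]


# Placeholder hits, not sites from the study.
hits = [
    Site("http://www.example.org/cancer", True, True, False),
    Site("http://example.org/oncologia", True, True, False),
    Site("http://tienda.example.com/", True, True, False, only_sells_product=True),
    Site("http://example.net/", False, True, False),
]
selected = screen(hits)
```

In this toy run only the first hit survives: the second shares its host, the third only sells a product, and the fourth is not in Spanish.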
Variables That Were Analyzed
The following general characteristics were categorized:
• User sex: male/female
• User age: under 45 years/45 years or over
• Level of education (graduate school/school/no college/college graduate)
• Most frequent place of Internet access (home/school)
• Affiliation (commercial/nonprofit organization/university or medical center/government)
• Specialization (exclusively related to cancer/part of the site dedicated to cancer)
• Content type (medical facts/clinical trials/question and answer/human interest stories)
The following reliability standards were assessed:
• JAMA benchmarks. The quality of information of the selected Web sites was assessed using criteria known as the JAMA benchmarks. The JAMA benchmarks consist of the following four criteria: (1) authorship: the authors and contributors, their affiliations, and relevant credentials should be provided on the Web site; (2) attribution: references and sources for all content should be clearly listed, and all relevant copyright information should be noted; (3) disclosure: Web site “ownership” should be prominently and fully disclosed, as well as any sponsorship, advertising, underwriting, commercial funding arrangements or support, and any other potential conflicts of interest; and (4) currency: the dates when the content was posted or updated should be indicated.
• Authorship: yes (possible to identify authorship)/no (not possible to identify authorship)
• Attribution: yes (possible to identify references and sources)/no (not possible to identify references and sources)
• Currency: yes (possible to identify recent date of update)/no (not possible to identify date of update)
• Disclosure: yes (possible to identify ownership, sponsorship, and funding)/no (not possible to identify ownership, sponsorship, or funding)
• Quality seals: HON Foundation Code of Conduct (HonCode)/other seals/none
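As an illustration of how the four yes/no JAMA benchmark ratings listed above could be recorded and summarized, the following Python sketch may help. The class name, field names, and the four example ratings are hypothetical and are not data from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JamaAssessment:
    # One yes/no judgment per benchmark, as rated by the reviewer.
    authorship: bool
    attribution: bool
    disclosure: bool
    currency: bool

    def criteria_met(self) -> int:
        """Number of benchmarks satisfied (0 to 4)."""
        return sum((self.authorship, self.attribution,
                    self.disclosure, self.currency))

    def meets_all(self) -> bool:
        return self.criteria_met() == 4


def percent_meeting(assessments, criterion: str) -> float:
    """Percentage of sites rated 'yes' on one named benchmark."""
    if not assessments:
        raise ValueError("at least one assessment is required")
    met = sum(1 for a in assessments if getattr(a, criterion))
    return round(100.0 * met / len(assessments), 1)


# Made-up ratings for four imaginary sites.
ratings = [
    JamaAssessment(True, True, False, True),
    JamaAssessment(True, False, False, False),
    JamaAssessment(False, True, True, True),
    JamaAssessment(True, True, True, True),
]
authorship_pct = percent_meeting(ratings, "authorship")  # 75.0
disclosure_pct = percent_meeting(ratings, "disclosure")  # 50.0
```

Keeping the four judgments separate, rather than storing only a total, makes it possible to report both the per-benchmark percentages and the share of sites meeting all four.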
Degree of accessibility was defined:
• Priority 1 (A): Web developers must satisfy these requirements. Information should be accessible to most groups of disabled users.
• Priority 2 (AA): Web developers should satisfy these requirements, to ensure that there are no major obstacles to access to Web content.
• Priority 3 (AAA): Web developers may satisfy these requirements, which will facilitate access for certain groups.
• W3C mobileOK Basic Tests 1.0: mobile Web best practices test.
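The three priority levels above are cumulative: a site reaches level A only if every Priority 1 requirement is satisfied, AA only if Priority 1 and 2 are both satisfied, and AAA only if all three are. A minimal Python sketch of that rule follows; the function name and the idea of feeding it failure counts taken from a validator report are assumptions made for illustration.

```python
def conformance_level(p1_failures: int, p2_failures: int, p3_failures: int) -> str:
    """Map counts of failed requirements per priority to a conformance level.

    The counts are assumed to come from an accessibility validator report;
    how they are extracted is outside the scope of this sketch.
    """
    if min(p1_failures, p2_failures, p3_failures) < 0:
        raise ValueError("failure counts cannot be negative")
    if p1_failures > 0:
        return "none"   # basic (Priority 1) requirements not met
    if p2_failures > 0:
        return "A"      # Priority 1 met, Priority 2 not met
    if p3_failures > 0:
        return "AA"     # Priority 1 and 2 met, Priority 3 not met
    return "AAA"        # all three priorities met


level = conformance_level(0, 4, 12)  # "A"
```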
Readability was evaluated as follows:
• The Inflesz readability scale
• The Flesch index, adapted for Spanish texts by Fernández Huerta, was used to assess the Web sites, where <40 was very difficult, 40–55 was quite difficult, 55–65 was normal, 65–80 was quite easy, and >80 was very easy.
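The score bands quoted above can be expressed as a small lookup function. The sketch below covers only the classification step, not the calculation of the score itself. Because the text gives shared endpoints (55 and 65), it assumes that a boundary value belongs to the easier band; the endpoints 40 and 80 follow directly from "<40" and ">80".

```python
def readability_band(score: float) -> str:
    """Classify a readability score using the bands quoted in the text.

    Assumption: scores of exactly 55 and 65 fall in the easier band.
    """
    if score < 40:
        return "very difficult"
    if score < 55:
        return "quite difficult"
    if score < 65:
        return "normal"
    if score <= 80:
        return "quite easy"
    return "very easy"


band = readability_band(60.0)  # "normal"
```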
Impact or popularity was measured:
• Google PageRank
• The Alexa index (traffic rank) 23 is a rough measure of a Web site's popularity, compared with all the others out there on the Internet, taking into account both the number of visitors and the number of pages viewed on each visit.
• Number of bookmarks in the DELICIOUS bookmarking service
• Sites linking in
Information needs were evaluated:
• A questionnaire based on DISCERN was composed of 23 questions on patients' information needs. The DISCERN instrument, in turn, is a validated rating tool of 16 questions that is freely available online (
Data Gathering
All instruments were reviewed by the same investigator. The Alexa Web site was used to gather data about age, sex, and level of education. This site provides information about Web site hits and collects information about users who have installed its plug-in. The JAMA benchmark and quality seal information was assessed by an expert researcher. The accessibility information was assessed using two validation tools, TAW and HERA; the goal is to analyze the level of accessibility in the design and development of Web pages to allow access for all, regardless of their specific characteristics. The TAW is an online (
Statistical Analysis
A descriptive analysis was carried out for the variables studied, including the frequencies of the qualitative variables (absolute value and percentage) and the average and range for the quantitative variables, in order to gain an understanding of cancer patients' needs and to assess whether or not those needs are met by the Web sites studied.
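The descriptive summaries mentioned above (absolute frequency and percentage for qualitative variables, average and range for quantitative ones) can be sketched in plain Python as follows. The study itself used SPSS; the function names and the example values here are hypothetical.

```python
from collections import Counter


def frequency_table(values):
    """Return {category: (absolute count, percentage)} for a qualitative variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: (n, round(100.0 * n / total, 1))
            for category, n in counts.items()}


def mean_and_range(numbers):
    """Return (mean, (minimum, maximum)) for a quantitative variable."""
    numbers = list(numbers)
    if not numbers:
        raise ValueError("at least one value is required")
    return sum(numbers) / len(numbers), (min(numbers), max(numbers))


# Made-up values, not data from the study.
affiliations = ["nonprofit", "nonprofit", "nonprofit", "university"]
table = frequency_table(affiliations)
average, value_range = mean_and_range([50, 60, 70])
```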
Spearman's rank correlation coefficient (rho), a nonparametric measure, was used to assess the association or interdependence between two quantitative variables that were not normally distributed. The data were analyzed using SPSS quantitative analysis software (SPSS, Inc., Chicago, IL).
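For readers unfamiliar with Spearman's rho, the sketch below shows one standard way to compute it: replace each value by its rank (tied values share the average rank) and take the Pearson correlation of the two rank lists. It is a plain-Python illustration, not the SPSS procedure used in the study, and the two example lists are invented.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: percentage of patients needing each of five topics
# versus percentage of Web sites covering the same topics.
needs = [95, 90, 60, 40, 20]
coverage = [69, 41, 45, 30, 10]
rho = spearman_rho(needs, coverage)
```

In this invented example only two adjacent topics swap places between the two rankings, so rho comes out at 0.9.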
Results
Cancer Web Sites
The characteristics of the cancer Web sites evaluated are set out in Table 1, which provides details about the country of origin and Web master of each site. The sites found are from the United States (27.6%), Spain (62.0%), and Argentina (10.4%).
Description of the Web Sites Found
Thirty-eight percent of cancer Web sites in Spanish are more frequently visited by women, and 31% of sites are visited by people over 45 years of age. Most of the users of these sites have a high level of education (graduate school, 28%). With regard to the affiliation of the Web sites, 69% belong to nonprofit organizations, 14% to universities, and 10% to medical centers. Most of the Web sites focus exclusively on cancer (86%), and the remaining 14% only contain some sections on cancer. With regard to content type, 52% of the information is about medical facts, whereas 24% is on clinical trials or information about ongoing research (Table 2).
Web Site Characteristics: Affiliation, Specialization, Age, Sex, Content Type, and Level of Education
With regard to Web site quality, assessed using the JAMA benchmarks, 59% of sites provide information about their authors (affiliation, credentials, etc.), 62% provide references and sources for all their content (clearly listed, along with copyright information), 38% identify their sources of funding and advertisers, and 54% provide the date of the last update. Of the Web sites that meet the JAMA benchmark criteria, 17.24% also have some kind of quality seal. Thirty-one percent of the Web sites have a quality seal. Of these, 24% meet the HONCode requirements, and 28% meet other standardization criteria. Twenty-one percent have another seal or seals in addition to the HON seal, 7% have the HON seal only, and 3% have other seals only. Of the sites with a quality seal, 13.8% meet at least the Priority 1 accessibility requirement (Table 3).
Web Site Content: Journal of the American Medical Association Benchmarks, Quality Seals, Readability, and Accessibility
HONCode, Health on the Net Code of Conduct; JAMA, Journal of the American Medical Association.
Twenty-one percent of the Web sites analyzed do not meet the basic accessibility criteria (Priority 1). This means that elderly people and those with any kind of disability will not be able to see, understand, browse, or interact with the Web site effectively or to create or contribute content. The Priority 2 requirement, important for eliminating obstacles to Web site access, is only met by 3% of the cancer Web sites studied. Finally, not one of the Web sites meets the Priority 3 requirement. Forty-eight percent of the sites have a readability score of “normal,” 28% are “quite easy,” and 24% have scores of between 40 and 55 and are therefore “quite difficult” (Tables 3 and 4).
Readability, Impact, and Popularity
Table 4 shows the popularity of the different Web sites based on the Alexa ranking, which calculates Web traffic for the sites selected around the world, as well as their impact or importance calculated by the Google PageRank tool. The “Delicious” column shows the number of times that the sites have been added as favorites, and the “Site linking in” column shows the number of links to that site from other sites. The top sites based on these indicators are the ones developed by the National Library of Medicine, the American Cancer Society, the Centers for Disease Control and Prevention, and the National Cancer Institute. It is interesting to note that the El Mundo health portal, run by a Spanish newspaper, is the top site based on the Alexa rankings.
Patients' Information Needs Versus Web Site Content (Table 5)
People Who Have Needed Information About Each Topic Throughout the Course of Their Illness and Web Sites That Include That Information
The three topics patients needed the most have been highlighted in bold for each cancer type.
Patients' information needs vary depending on the type of cancer they have, although they all want to know about the likelihood of cure, survival rates, and the side effects and risks of treatment.
Based on the cancers studied here, patients with breast cancer are the ones with the greatest need for information about their illness. The topics that breast cancer patients access most frequently are related to the likelihood of cure and survival rates (95%), side effects and risks of treatment (95%), and the effects the treatment may have on their physical appearance (90.0%). A large proportion of Web sites (69.0%) include information about the likelihood of cure and survival rates, but less than half of them provide details about the side effects of treatment or the effects of the treatment on physical appearance (44.9% and 41.4%, respectively). The correlation coefficient between Web site content and patients' information requirements for breast cancer is not very high (0.663).
For patients with prostate cancer, the most frequently accessed topics are the effects of the disease on sex life (66.2%), risks and side effects of treatment (60.8%), and cure and survival rates (50.2%). Only a quarter of the Web sites studied (27.6%) provide information about the effects that the disease can have on the patient's sex life, whereas 69% of sites include sections on the side effects of treatment, and 44.8% include material on the likelihood of cure and survival rates. The correlation coefficient between Web site content and patients' information requirements for this type of cancer is quite high (0.803).
The topics required most frequently by patients with kidney cancer are related to self-care to aid their recovery (66.7%), food and nutrition (66.7%), the likelihood of cure and survival rates (60.0%), and keeping physically well and active (60.0%). More than half of the Web sites studied include information about self-care to aid recovery and food and nutrition (58.6%), whereas topics on the likelihood of cure and survival rates and keeping physically well and active are included in 48.3% and 44.8%, respectively. The correlation coefficient between Web site content and patients' information requirements for this type of cancer is not very high (0.644).
Finally, the topics accessed most frequently by patients with bladder cancer are related to the likelihood of cure and survival rates (62.9%), side effects and risks of treatment (48.6%), and self-care to aid recovery (45.7%). A large proportion of Web sites (69.0%) include information about the likelihood of cure and survival rates, but less than half of them (44.9%) provide details about the side effects of treatment, and just 41.4% provide information about self-care to aid recovery. The correlation coefficient between Web site content and patients' information requirements for this type of cancer is the highest of all the cancers studied here (0.850).
Discussion
There are a lot of tools available today that can be used to assess health Web sites, 12,13,15,27 but the levels of readability and accessibility of Web sites are still largely unknown. Proof of Web site quality would increase users' confidence in the information obtained and minimize exposure to unproven or false information, promoting user autonomy. 28
The study carried out here had certain limitations. For example, the search was carried out on one date (June 2012) and may not be representative of matches from other search engines at other times. Web sites change frequently in content, design, and even the services offered; they may even move to a different server or country. Because of this dynamic character, a Web site can change radically in very little time, and the Web sites studied may have changed over the course of the data-gathering period. Also, our study is limited by the fact that a single reviewer determined the eligibility for inclusion and the number of elements in each quality assessment instrument. Both of these may, in some cases, be subjective. Other limitations include the failure to identify other reputable Web sites; we studied the first 100 sites (top 100), but each search engine decides its own first 100 sites. Furthermore, there are no universally accepted benchmarks, standards, or quality control criteria. This means that the results of this study depend on the criteria selected.
Web site content and information needs are regularly updated and refreshed. The Internet offers rapid access and a wide dissemination of information. The assessment of patient needs is an essential part of Web site development. Health Web sites should be judged not only by the quality, accuracy, reliability, and readability of the information provided, but also by their match with the target audience.
The structure of the DISCERN instrument for assessing the sites was very helpful in orienting us to the criteria for assessing information needs. The aim of this article is to be the first to analyze the needs of patients with breast, prostate, bladder, and kidney cancer in relation to the reliability of information in Spanish found on cancer Web sites and thus to find out how much these sources can be trusted. It is important to note also that cancer is one of the most searched diseases on the Internet (65.6% of patients have looked for information on the disease at some point 29 ), and 8% of the world population of Internet users speak Spanish. 30 The study outlines the profile of breast, kidney, prostate, and bladder cancer patients who use the Internet as a source of health information. Internet use is related to the patient's attitude toward decision-making and level of education. Studies published on this topic so far have looked at Web sites in English from the United States and are therefore not applicable to Spain or Europe because of the digital divide (the Internet usage rate in Spain is currently similar to the U.S. rate in 2001 30 ). Another limitation is that the quantitative/qualitative approach using DISCERN and structured surveys does not mimic how the cancer patients selected for interview go about searching for information themselves in reality. The results could vary because the first-phase analysis is not based on cancer patients. This study provides information of relevance for healthcare professionals specializing in oncology, in that having an awareness of the factors related to use of the Internet as a source of health information will allow them to encourage that use effectively (e.g., by providing details of the best Web sites and explaining how to evaluate information).
Doctors could also make available healthcare information of proven quality, taking advantage of the potential benefits that it can offer patients and avoiding the potential adverse effects of incorrect use of the Internet. The results of this study show that Internet use is not linked to a lack of satisfaction with the information received. This suggests that patients who look for information do not do so because they are dissatisfied with the information given to them by healthcare staff, but that they have other motivations or worries that make them want to find out more about their illness.
Numerous studies have assessed the quality of Web sites focusing on a particular disease (breast cancer, depression, neurology, etc.). 31 –33
Some studies have found that the quality of health Web sites is generally high. 18,19,24 However, other authors have found the opposite, with findings similar to our own. Nevertheless, these studies have used a wide variety of different evaluation criteria, making it difficult to make comparisons. 8,27
In this study, the HON label was found to be linked to higher quality. This finding is consistent with those of some studies 34 but not with others. 35 –38 These varying results could be because very few sites have the HON label, because of the criteria used for the HON label (related more to ethical considerations than to quality), and because this label is only given if the site's Web master requests it. It is important to note that quality seals are still only used by a limited number of sites, and government sites do not use them. These seals are not used to assess the quality of the information provided by a Web site; they merely guarantee disclosure of the information sources used (not their quality) and the aims pursued by publishing that information. These seals are too much effort given their limited application. 29
The results of this study suggest that the health information on cancer available on the Internet in Spanish is not very reliable and often of variable quality. These findings are consistent with those of previous studies on cancer and other health topics for other languages. 34 –39 Generally speaking, the health information found on the Internet is not reliable enough, and it is important to highlight the absence of information provided on the Internet by government bodies. 29 Most of the sites evaluated had very limited reliability as a result of factors relating to responsibility for information and readability. The exactness, completeness, and consistency of information also varied a great deal. This in itself illustrates the limited usefulness of Google PageRank and DELICIOUS as a means of classifying Web sites based on their impact or quality, as already discussed by other authors. 26,33 The results of Table 4 are inherently unreliable, as page rankings are well known to be easily manipulated by Web masters and advertisers. DELICIOUS linking is inherently subjective ("I liked this site") and does not in any way measure quality of information because some homeopathy sites are highly linked, for example. Although Alexa, Google PageRank, and DELICIOUS may have met our very loose definition of technical quality criteria, our previous work 26 shows that popularity and quality as represented by domain-independent technical quality criteria are not related. Therefore, we do not consider Alexa to be a quality assessment tool that can benefit consumers.
With regard to gender differences, women look for the most health information on the Internet, and they do so more frequently than men. This finding is consistent with those of Red.es, 40 the Pew Internet & American Life Project, 41 and other studies. 42 –44 Other studies 34 –38,42 –44 did not find any differences between information searches by different age groups, but differences were detected here (over 45 years of age). With regard to accessibility, the Spanish Web sites studied were found to have limited accessibility for disabled users. These results are consistent with those of several other studies 33,45,46 that found that disabled people would find it difficult or impossible to access these sites.
The readability of the sites studied was just as poor as that of sites from the United States, the United Kingdom, and other countries, 25 suggesting that readability could limit patients' ability to understand Web content. There was no significant link between Web site size and readability, which suggests that difficulty reading these sites is not linked to word count: Web sites with a small number of words had a level of readability similar to that of sites with more words.
The Internet has improved access to information, but the quality and currency of content are not guaranteed. The accuracy of information is often not guaranteed by any independent and/or public institution. Furthermore, quality seals do not guarantee scientific rigor or the currency of contents, so they provide no solution either. Some authors have suggested the implementation of regulations so that only scientifically proven information is published on the Internet, 47 but this is a utopian idea, because there will always be countries with more liberal regulations where information can be published in Spanish and will be easily accessible to Spanish users. The best solution would be to create Web sites run by the healthcare authorities containing information for the general public, 10 like the Virtual Health Library, which provides information for healthcare professionals and is endorsed by the Carlos III Health Institute. 48
Patients' current preferred source of information is their doctor, followed by the Internet. However, because of the changes that have taken place in other countries, we can infer that the use of the Internet as a source of health information will increase, becoming the top source, although not necessarily the one that inspires the most confidence. 37 The Internet is thus set to become the public's main source of information, with content adapted to patients' needs, which would change patients' habits and their relationship with their doctor. As such, the health services must be aware of Web site content. Web sites should give patients the opportunity to discuss any questions or queries that may arise during their search, to ensure that they interpret the information correctly and to reduce any of the confusion or anxiety that often results. Web sites could also provide a good opportunity to strengthen the doctor–patient relationship. Furthermore, reliable Web sites should be actively recommended to patients.
Future research could study the effectiveness of interventions that aim to educate patients about effective use of the Internet as a source of health information. These studies should be designed taking into account the factors related to Internet use found in this study, as this would make it possible to consider the peculiarities of each patient and his or her own specific needs.
Acknowledgments
We would like to thank all the doctors and healthcare staff in the Urology Department at the Virgen de las Nieves Hospital in Granada. We would also like to thank the patients who took part in our research and completed the questionnaire. Thanks to everyone who helped with or supported our study in some way.
Disclosure Statement
No competing financial interests exist.
