Associations between diagnostic patterns and stages in ovarian cancer

Abstract

Ovarian cancer (OvCa) is the fifth leading cause of cancer deaths in women and remains the deadliest gynecological cancer. The goal of our study was to examine associations between diagnostic patterns and OvCa stages. We used data from a web-based survey in which more than 500 women diagnosed with OvCa provided both free text responses and staging information. We employed text mining and natural language processing (NLP) to extract information on clinical diagnostic characteristics, together with 21 dichotomous symptomatic variables, patient-centered advocacy, and polytomous disease severity, with internal validation. We conducted multivariate analyses and developed tree-based classification models, confirmed with Random Forest, to determine important factors in the relationships of the clinical diagnostic characteristics with OvCa stages. Models including the symptoms, patient advocacy tendency, disease severity and doctors’ responses as predictors had much better predictive power than those limited to doctors’ responses alone, indicating that OvCa stage at diagnosis depends on more than just doctors’ responses. Although effective early stage diagnosis and treatment remains a challenge, our analysis of patient-centered clinical diagnostic characteristics and symptoms shows that self-advocacy is essential for all women. The frontline physician is critically important in ensuring effective follow-up and timely treatment before diagnosis.

Keywords

Ovarian cancer, diagnosis, survey, follow-up, multivariate analysis, text mining, data mining, tree-based classification, random forest, structural and non-structural missing data, patient advocacy, symptoms, doctor’s responses

1. Introduction

Ovarian cancer (OvCa) is the fifth leading cause of cancer deaths in women and remains the most deadly gynecological cancer. The disease places a debilitating burden on the US population, in terms of mortality, morbidity, individual suffering, and loss of productivity for all women with OvCa. National expenditures for OvCa care were estimated at $5.12B in 2010 (http://costprojections.cancer.gov/expenditures.html). The high fatality of OvCa is attributed to the fact that most patients are diagnosed at a late stage, with 63% diagnosed at Stage III or above. Effective early-stage diagnosis is challenging because there are no approved screening procedures for the general population, which has led to OvCa being termed the “silent killer”.

We have previously shown that public awareness and knowledge about OvCa are poor among the general population (Carter et al., 2014). It has also been reported that ovarian masses have often been misdiagnosed (Pomeranz & Sabnis, 2004), although there was some association of pre-diagnostic symptoms with OvCa (Goff et al., 2007) and with OvCa diagnostic stages (Sun et al., 2015). The motivation for the current study was to examine the association of diagnostic patterns with OvCa stages. These patterns were determined by the responses from ‘frontline’ clinicians, specifically primary care physicians (PCPs) and emergency room (ER) doctors, together with follow-up by specialists.

2. Methods

Our data source was patterns of diagnosis extracted from reports by a large cohort of women in a web-based survey that we created (http://stat.case.edu/ovac). The database from this survey contains records for more than 940 women diagnosed with OvCa, of whom 516 provided both free text responses and staging information. The survey was conducted between 2009 and 2013 and includes over 1500 interactive fields grouped into 15 sections, including a free text field for symptoms that led to diagnosis and other fields for demographics, medical history, treatments and pre- and post-diagnosis lifestyle (Sun et al., 2015). The complete survey is available at the Ovarian Cancer Survivorship Survey archive (http://stat.case.edu/ovac). Our statistical comparisons of the distributions of the main clinical/demographic variables, specifically age at diagnosis, stage of ovarian cancer at diagnosis, and marital status, showed that the subset (516 cases) is a good representation of the full data set (p-values between 0.408 and 0.996). Comparison of our data with the most recent SEER Cancer Statistics (SEER Cancer Stat Facts: Ovarian Cancer, 2017; Howlader et al., 2017) showed no statistically significant differences in the stage distribution, but the age distributions differed. The age distribution of our data had a comparable shape to that of SEER’s national data, but with a larger percentage of women diagnosed younger than 64 years old, a smaller percentage of women between 64 and 74 years old, and a much smaller percentage of women older than 75. This is expected for web-based or crowdsourced data (Carter et al., 2014). Thus the association of diagnostic patterns with stages found in the current study applies to women less than 75 years old.

A primary predictor set of clinical diagnostic characteristics, including both the physicians’ initial contact and their follow-up efficiency, was defined as shown in the first three columns of Table 1.

Table 1
Primary predictor set of diagnostic characteristics – doctor’s responses

Predictor variable	Abbreviation	Possible values	Combined variable	Combined values
PCP’s response	PCP-R	Y, N, N/A
PCP’s efficiency	PCP-E	1, 2, 3, N/A	PCP	Y1, Y2, Y3, Y.NA, N, NA
ER visit	ER-V	Y, N, N/A
ER Dr discovery	ER-D	Y, N, N/A	ER	YY, YN, Y.NA, N, NA
Dr follow up efficiency	FWUP	1, 2, 3	FWUP	1, 2, 3

Key: Y $=$ yes, N $=$ No, N/A $=$ not applicable; 1 $=$ bad, 2 $=$ moderate, 3 $=$ excellent.

Here, we define a PCP as having responded, i.e., PCP-R $=$ Y, if a primary care physician (PCP) correctly suspected OvCa and/or the patient was sent down the correct path for diagnosis, specifically CA125 assessment, vaginal ultrasound and then gynecological oncology. A PCP was considered not to have responded to a patient’s symptoms if the patient was sent to a psychiatrist or dismissed, which in some cases happened repeatedly. Further, some responses were found to be relational, i.e., sequentially dependent; for example, an emergency room (ER) visit was a necessary precursor to an ER doctor’s discovery. Thus for a respondent who did not go to the ER, the variable for ER doctor’s discovery was not relevant and there was no response. Similarly, for respondents who did not see a PCP, the efficiency of diagnosis was not applicable. To ensure that sequentially dependent variables were not treated as missing in our modeling, they were combined and recoded for further analysis, as shown in columns 4 and 5 of Table 1, in which PCP-R and PCP-E are combined into ‘PCP’, and ER-V and ER-D are combined into ‘ER’.
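The recoding of sequentially dependent variables described above can be illustrated with a minimal Python sketch. The function names `combine_pcp` and `combine_er` and the exact mapping rules are assumptions made for illustration, chosen to reproduce the combined levels listed in Table 1; they are not the authors' code.

```python
from typing import Optional


def combine_pcp(pcp_r: Optional[str], pcp_e: Optional[str]) -> str:
    """Merge PCP-R (Y, N, N/A) and PCP-E (1, 2, 3, N/A) into one 'PCP' level.

    Assumed mapping: a structurally inapplicable efficiency value is folded
    into the combined level instead of being left as a missing value.
    """
    if pcp_r in (None, "N/A"):
        return "NA"            # no PCP seen, so nothing to rate
    if pcp_r == "N":
        return "N"             # PCP did not respond; efficiency is irrelevant
    if pcp_e in ("1", "2", "3"):
        return "Y" + pcp_e     # responded, with a rated efficiency
    return "Y.NA"              # responded, efficiency not reported


def combine_er(er_v: Optional[str], er_d: Optional[str]) -> str:
    """Merge ER-V (Y, N, N/A) and ER-D (Y, N, N/A) into one 'ER' level."""
    if er_v in (None, "N/A"):
        return "NA"            # ER visit unknown or not applicable
    if er_v == "N":
        return "N"             # no ER visit, so no ER discovery is possible
    if er_d in ("Y", "N"):
        return "Y" + er_d      # visited the ER, discovery outcome known
    return "Y.NA"              # visited the ER, discovery outcome not reported


print(combine_pcp("Y", "3"), combine_er("Y", "N"), combine_er("N", None))
```

Under this mapping, a respondent who never visited the ER receives the level N for the combined ER variable, so the tree model sees a legitimate category where it would otherwise see a missing ER-D value.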

The relationships of clinical diagnostic characteristics (the primary predictor set) with OvCa stage were investigated, in consideration of an additional predictor set (Table 2) consisting of 21 dichotomous symptomatic variables studied in Sun et al. (2015), one patient-centered variable of interest, Advocate (ADVO), and one polytomous variable, Severity (SEVTY), which was defined to have three classes (mild, moderate, severe). A patient’s symptoms were classed as mild if she had no symptoms or only minor/occasional symptoms that caused no sharp or lingering pain. In this case, the abnormality was often found in an annual checkup/pap smear, a study in which she participated, or a CT/ultrasound for unrelated problems. A patient’s symptoms were classed as moderate if she had some level of pain/discomfort, but not as bad as the severe symptoms. For example, a tumor/mass may exist but is usually discovered only by a doctor; symptoms exist consistently, but do not require immediate attention and may not be obviously linked to cancer. A patient’s symptoms were classed as severe if they required immediate attention. These include severe pain (usually leading the patient directly to the ER), heavy bleeding, fatigue, shortness of breath, extreme swelling, or an abdominal mass felt by the patient herself before she went to the doctor.

Table 2

Second set of predictor variables – clinical characteristics

Name	Definition	Abbreviation	Values
Anemia	Symptoms of anemia	ANEM	1, 0 for Yes, No
Back/thigh pain	Back/thigh pain	BACK	1, 0
Breath, fluids/ascites	Breathing/fluid problem	BTHEA	1, 0
Cancer (non-OvCa)	Breast/colon/uterine cancer	CA	1, 0
Chance/routine exam	Discovered in a routine or an unrelated surgery/exam	CHNC	1, 0
Constipation	Constipation	CNST	1, 0
Eating/tummy swell	Eating related problems	EAT	1, 0
Endometriosis	History of endometriosis	ENDO	1, 0
Family history/gene	Family history or having genes suspected of OvCa	FAMH	1, 0
Fatigue	Fatigue/tiredness	FATG	1, 0
Indigestion	Indigestion	INDG	1, 0
Infection	Signs of infection/fever	INFE	1, 0
Intercourse	Pain or difficulty in intercourse	INT	1, 0
Kidney	Suspected or confirmed kidney symptoms	KDNY	1, 0
Mass/lymph	A mass/lump/tumor	MASS	1, 0
Menstrual	Menstruation-related symptoms	MENS	1, 0
Other	Other uncategorized	OTH	1, 0
Pain (non-back/thigh)	Pain (non pelvis/back/thigh)	PAIN	1, 0
Patient self-advocacy	Self-advocate	ADVO	a, o for being advocate or other
Pelvis	Pelvis related symptoms	PLV	1, 0
Severity	Severity	SEVTY	mi, mo, s for mild, moderate, severe
Urination	Bladder problems	URIN	1, 0
Weight	Weight loss/gain	WGT	1, 0

The free text fields of the records in the database were examined using text mining. The review was carried out in two phases by six independent observers. In the first phase, three independent observers calibrated the data on 200 cases, with post-hoc cross-checking between observers to ensure consistency. Data extraction guidelines were then refined, and a second-phase review of all cases was carried out by three independent observers and cross-checked to minimize potential observer biases.

3. Data analysis

As with any complex multivariate study, descriptive exploratory statistics were first applied in order to provide an overview of the data. All data were self-reported. There was some missing data, which was characterized as N/A. Data validation and treatment of outliers followed our established methods as previously described (Sun et al., 2015).

The proportion trend test (Dalgaard, 2006) and the proportion difference test were also used to determine significant trends or differences in the data distribution.
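As an illustration of a trend test for proportions across ordered groups, the following is a minimal sketch of a Cochran-Armitage-style chi-squared statistic with one degree of freedom. It is assumed here to be comparable to the trend test cited above; the function name, the default scores 1..k, and the counts in the example are hypothetical and are not taken from the survey data.

```python
import math


def proportion_trend_test(events, totals, scores=None):
    """Chi-squared test (1 df) for a linear trend in proportions.

    events[i] / totals[i] is the observed proportion in ordered group i.
    Returns (chi2, p_value).
    """
    if scores is None:
        scores = list(range(1, len(events) + 1))    # equally spaced group scores
    n_total = sum(totals)
    p_bar = sum(events) / n_total                   # pooled proportion
    s_bar = sum(n * s for n, s in zip(totals, scores)) / n_total
    # Squared covariance-like term between event counts and centred scores
    numerator = sum(x * (s - s_bar) for x, s in zip(events, scores)) ** 2
    # Variance of that term under the null hypothesis of no trend
    denominator = p_bar * (1 - p_bar) * sum(
        n * (s - s_bar) ** 2 for n, s in zip(totals, scores)
    )
    chi2 = numerator / denominator
    # Upper tail of a chi-squared distribution with 1 df: P(Z^2 > chi2)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 10, 20 and 30 events out of 100 in three ordered groups
chi2, p_value = proportion_trend_test([10, 20, 30], [100, 100, 100])
print(round(chi2, 3), p_value)
```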

Further detailed analyses were carried out using tree-based modeling. This approach was selected because tree-based models use a non-parametric approach that relies on the evidence presented in the data and minimizes a priori model assumptions imposed on the data. Tree-based models also handle missing data more effectively than standard parametric models and allow generalized interactions between different types of predictor variables. This class of models is also well suited for use on many categorical variables, such as those presented in our dataset. The rpart package (from R) was selected for model development because it allows for ordinal categorical predictors and handles missing values by employing surrogate variables to treat the remaining missing variables after recombination.

When developing an adequate tree-based model, there remain several options to be investigated. In the current study, the options included disease staging, prior probability distribution of the possible outcomes, and missing variable surrogate criterion.

Disease staging: OvCa staging was characterized as either:

(D1)
Two stages: Early (Stage I and II) or late (Stage III and IV).
(D2)
Four stages: Stage I, II, III and IV.

It is typically harder and requires more/better data to build a good predictive model for a finer classification (D2) than for a coarser classification (D1).

Prior probability distribution:

(P1)
The default (natural) prior probability mimics the natural frequencies of the response variable from a dataset.
(P2)
The equal prior probability places no bias toward either stage for each record and is based on the symptoms and other predictors, rather than a-priori assumptions.
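The difference between the two prior options can be shown with a short sketch. The helper names and the toy labels are hypothetical; in practice, the chosen prior vector would be supplied to the tree-fitting routine.

```python
from collections import Counter


def natural_prior(labels):
    """P1: prior equal to the observed class frequencies in the data."""
    counts = Counter(labels)
    return {cls: counts[cls] / len(labels) for cls in sorted(counts)}


def equal_prior(labels):
    """P2: the same prior probability for every observed class."""
    classes = sorted(set(labels))
    return {cls: 1 / len(classes) for cls in classes}


# Hypothetical sample: 3 early-stage and 7 late-stage records
stages = ["early"] * 3 + ["late"] * 7
print(natural_prior(stages))   # frequencies follow the sample
print(equal_prior(stages))     # both classes weighted equally
```

With an imbalanced sample such as this one, the natural prior favors the majority class, while the equal prior forces the splits to be driven by the predictors alone.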

Missing variable surrogate criterion:

(M1)
‘Regular (or raw) accuracy’ selects surrogate variables by maximizing the total number of correct classifications for a potential surrogate variable. Thus both missing and non-missing variables contribute to the surrogate variable.
(M2)
‘Percent accuracy’ selects surrogate variables by maximizing the percent correct classification, calculated over the non-missing values of the surrogate at the current node. Thus all non-missing values contribute to the surrogate variable.

It was also considered valuable to compare models with both symptoms, i.e. variables in Table 2, and doctors’ responses (Table 1) included as predictors, against those with only doctors’ responses.

Symptoms inclusion criterion:

(S1)
Both symptoms and doctors’ responses are included in the predictors’ set.
(S2)
Only doctors’ responses are included in the predictors’ set.

This approach was based on the hypothesis that including both symptoms and doctor’s responses (S1) is necessary to improve the classification rate of the model using doctor’s responses alone.

Therefore, we conducted model building analyses to explore combinations from two modes of disease staging, two types of priors, two missing variable criteria and two types of predictors. This provided the possibility of 16 models. In building each of the 16 models, we conducted a tree-based analysis using the rpart() function in R with appropriate settings. The prediction error of each tree-based model was assessed by averaging the results of 10 independently drawn 10-fold cross-validations (CV) for the model. Specifically, in a 10-fold cross-validation, each model was developed by 1) randomly partitioning the complete sample into 10 subsamples of equal size, 2) using each of the 10 subsamples in turn for testing and validating the model built using the remaining 90% of all records as the training set, and 3) averaging the 10 results to produce a single estimate. Since a 10-fold cross-validation is based on random partitioning, this random effect was reduced by repeating the partitioning 10 times independently to provide a stable average model and estimated cross-validated error for our model. The important predictors for modeling the diagnostic stages were obtained by examining the final trees (Fig. 3a–e) and computing the variable importance using random forest to confirm the result of each tree model. The important predictors were also cross-checked by fitting a logistic model to the early/late stage response data and running a variable selection procedure, with the available complete data.
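The model grid and the repeated cross-validation scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code: the option labels follow the text, while `model_grid`, `repeated_kfold`, `repeated_cv_error` and the `fit_and_score` callback are hypothetical names, and the actual model fitting is left to the caller.

```python
import itertools
import random

# The four two-level options described in the text
OPTIONS = {
    "staging": ["D1", "D2"],
    "prior": ["P1", "P2"],
    "surrogate": ["M1", "M2"],
    "predictors": ["S1", "S2"],
}


def model_grid():
    """Enumerate every combination of the four options (2^4 = 16 models)."""
    keys = list(OPTIONS)
    return [dict(zip(keys, combo)) for combo in itertools.product(*OPTIONS.values())]


def repeated_kfold(n_records, k=10, repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for `repeats` independent k-fold partitions."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_records))
        rng.shuffle(idx)                           # new random partition each repeat
        folds = [idx[i::k] for i in range(k)]      # k folds of near-equal size
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            yield train_idx, test_idx


def repeated_cv_error(n_records, fit_and_score, k=10, repeats=10, seed=0):
    """Average the test error returned by `fit_and_score(train_idx, test_idx)`."""
    errors = [fit_and_score(tr, te) for tr, te in repeated_kfold(n_records, k, repeats, seed)]
    return sum(errors) / len(errors)


print(len(model_grid()))                           # number of candidate models
print(sum(1 for _ in repeated_kfold(516)))         # number of train/test splits
```

Here `fit_and_score` stands in for fitting a tree on the training indices and returning its misclassification rate on the held-out fold; with 516 records the folds contain 51 or 52 records each.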
4. Results

As shown in Fig. 1 the percentage of severity increases significantly as Stage increases from I to IV ( $p=$ 0.012, by the proportion trend test).

Figure 1. Severity of symptoms by stages.

Figure 2. PCP factors across stages.

Figure 3a. D1, P1, M1, S1: D $=$ D1, Disease Staging is 2; P $=$ P1, prior probability is natural frequency; M $=$ M1, Missing variable treatment using ‘regular accuracy’ in choosing surrogates; and S $=$ S1, Symptoms included. Misclassification rate $=$ 32%.

Figure 3b. D1, P1, M2, S1: D $=$ D1, Disease Staging is 2; P $=$ P1, prior probability is natural frequency; M $=$ M2, Missing variable treatment using ‘percent accuracy’ in choosing surrogates; and S $=$ S1, Symptoms included. Misclassification rate $=$ 33%.

As shown in Fig. 2, the PCPs’ response (PCP-R) varied across stages with an overall slight decrease ( $p=$ 0.084) as disease stage increased. PCP-R at Stage I was significantly greater than at Stage IV ( $p=$ 0.04), indicating that an early response may lead to an earlier stage diagnosis.

While PCPs’ efficiency (PCP-E) for the excellent rating varied across all four stages ( $p=$ 0.044, by the proportion difference test), it did not appear to have a monotone trend. We were particularly interested in the differences between early stages (Stages I and II) and late stages (Stages III and IV), and between Stage IV and each of the other stages. The pairwise proportion tests showed that PCP-E for those diagnosed at early stages was higher than for those diagnosed at late stages ( $p=$ 0.049). PCP-E was poorer for women who were diagnosed at Stage IV. The statistical significance of the reduced PCP-E for Stage IV vs. Stage II was $p=$ 0.004, and vs. Stage III was $p=$ 0.046. The comparison vs. Stage I was slightly weaker at $p=$ 0.051. This indicates that Stage I symptoms may be less obvious than those of other stages.

The percentages of missing data for each variable were similar across all four stages, and missing values were excluded from the above analyses. FWUP at late stages (Stages III and IV) was significantly lower than at early stages ( $p=$ 0.036).

Figure 3c. D1, P2, M1, S1: D $=$ D1, Disease Staging is 2; P $=$ P2, prior probability is equal frequency; M $=$ M1, Missing variable treatment using ‘regular accuracy’ in choosing surrogates; and S $=$ S1, Symptoms included. Misclassification rate $=$ 40%.

Figure 3d. D1, P2, M2, S1: D $=$ D1, Disease Staging is 2; P $=$ P2, prior probability is equal frequency; M $=$ M2, Missing variable treatment using ‘percent accuracy’ in choosing surrogates; and S $=$ S1, Symptoms included. Misclassification rate $=$ 39%.

Figure 3e. D2, P2, M2, S1: D $=$ D2, Disease Staging is 4; P $=$ P2, prior probability is equal frequency; M $=$ M2, Missing variable treatment using ‘percent accuracy’ in choosing surrogates; and S $=$ S1, Symptoms included. Misclassification rate $=$ 20% by nearest 1-neighbor rule. The small bar plots at the bottom give the probabilities of being in each of the four stages (I, II, III and IV) at each terminal node. It appears that an ER visit with a known outcome together with the symptom PAIN improved the odds of early diagnosis, as did an efficient FWUP of 3 (versus 1 or 2) and a good PCP response (Y3), even without PAIN but with a BREATH problem.

Figure 4. Variable importance plot by random forest for the five trees (Tree-A to Tree-E) in Fig. 3a–e.

From the 16 potential trees examined during model development, five trees were found to produce valid outcomes and have reasonable classification rates (Fig. 3a–e). The classification rates for trees with a dichotomous outcome ranged from 60% to 68%. Among these models, Tree-A (Fig. 3a) is the best. The classification rates for Tree-E (Fig. 3e) with a 4-stage outcome were poor, only slightly better than the equal probability guess. However, if we relax the definition of misclassification to allow for the ‘nearest 1-neighbor correctness’, i.e. the decision is counted as correct if the classified stage differs from the actual diagnostic stage by at most 1, then the classification rate for Tree-E is 80%. These classification rates are acceptable because there are other factors, such as genetics and tumor subtypes, which also help to determine the diagnostic stages. Using the ‘nearest 1-neighbor correctness’ rule also allowed us to account for the transition from one stage to another in diagnosis. The models that included only doctors’ responses as predictors did not have good classification rates, reflecting that diagnostic stages depend on more than the doctors’ responses alone.
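The ‘nearest 1-neighbor correctness’ rule can be made concrete with a small sketch. The function name and the example stage vectors are hypothetical and serve only to show how relaxing the tolerance changes the classification rate.

```python
def classification_rate(actual, predicted, tolerance=0):
    """Fraction of records whose predicted stage is within `tolerance` of the actual stage.

    Stages are coded 1-4 for Stage I-IV. tolerance=0 is the strict rule;
    tolerance=1 is the nearest 1-neighbor rule.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    hits = sum(abs(a - p) <= tolerance for a, p in zip(actual, predicted))
    return hits / len(actual)


# Hypothetical stages for five records
actual = [1, 2, 3, 4, 3]
predicted = [1, 3, 3, 2, 4]
print(classification_rate(actual, predicted))               # strict rule
print(classification_rate(actual, predicted, tolerance=1))  # nearest 1-neighbor rule
```

In this toy example the strict rule counts two of five records as correct, while the relaxed rule counts four, because two predictions miss the actual stage by exactly one.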

Doctors’ follow up efficiency (FWUP in the top node) was an important factor in changing the odds of an early diagnosis in all dichotomous outcome models (Fig. 3a–d). If the PCP or ER doctor did not send a patient to an adequate follow-up path, follow up efficiency could be compromised in patients’ reports. Indeed, the PCP and ER doctors’ responses were shown more clearly as top factors (in addition to FWUP) in the 4-staging outcome model (Fig. 3e). The variable importance chart given in Fig. 4 also showed the importance of the doctors’ responses in association with diagnostic stages, given the disease symptoms, severity and patient’s self-advocacy tendency about her health.

As a secondary check for the association of doctors’ responses with diagnostic stages, we also fitted a logistic model to the early/late stage response data and ran a variable selection procedure on the subset of data with complete records. The variables found to be important in this secondary check also included FWUP, as in both Tree-A (Fig. 3a) and Tree-B (Fig. 3b). The fitted classification rate for the full logistic model was 73% and the predicted classification rate based on leave-one-out cross-validation was 68%. However, it is important to note that standard logistic modeling does not accommodate missing data automatically, as the tree-based models do. Thus, without re-programming and running an imputation that would need to be carefully planned and justified for missing categorical variables, the logistic regression was run with the missing values excluded. This is in contrast to the tree-based models, which took the missing information into account.

5. Discussion

The goal of our analysis was to identify diagnostic patterns from a complex multivariate data source and investigate their association with OvCa stages, given disease symptoms, severity and patient’s self-advocacy. In addition, we considered the possibility of complex interactions and patterns within our data. Our analytic approach therefore used tree-based models, which provide a versatile, robust approach without requiring restrictive parametric model assumptions that may not fit the data well or that make it difficult to deal with missing categorical values. This method is being increasingly used in the analysis of complex biomedical data, particularly in the field of oncology (Fenton et al., 2013; Barlin et al., 2013).

We have found that the models including only doctors’ responses as predictors did not have a good discriminating power, indicating that OvCa stage at diagnosis depends on more than just doctors’ responses.

Tree-building uses a recursive-partitioning algorithm to carry out an exhaustive search of all options for splitting variables to maximize the accuracy in classification and prediction of outcomes. This process produces terminal nodes (or leaves), at which point the nodes cannot be divided anymore and need to be pruned to avoid over-fitting and to optimize efficiency in prediction.

In our analysis we were interested in the differences between diagnostic patterns for women whose OvCa was identified at Stage I or II (early) compared to those whose OvCa was diagnosed at Stage III or IV (late). We also investigated the potential advantages of classifying using a four-stage outcome. However, we found that the misclassification rates were poor for these trees, only slightly better than the equal probability guess. If we relax the definition of misclassification to allow for the nearest 1-neighbor correctness, i.e. the decision is counted as correct if $|\text{classified stage} - \text{actual diagnostic stage}| \leqslant 1$, then the misclassification rate improves to 20%.

It is known that OvCa is symptomatic; however, until recently symptoms were considered non-specific. This has limited the reliability of primary diagnosis and/or appropriate referral, specifically by PCPs and emergency room (ER) doctors, which has led to the majority of women being diagnosed at a late stage of OvCa. Goff et al. proposed a scoring system, based on six symptoms, for women who had already been identified as being at risk (Goff et al., 2007). A study by Hamilton et al. in a UK population identified seven symptoms independently associated with OvCa at all stages (Hamilton et al., 2009). In a secondary analysis of these data, Grewal et al. reported on the development of a scoring system for primary care physicians (Grewal et al., 2013).

These studies provided some insight regarding symptoms that were retrospectively identified from the clinical record as being related to a diagnosis of OvCa. However, previous studies did not extend to consideration of patient-centered criteria, such as self-advocacy, nor did they examine the effects of physician awareness and doctors’ responses. Our study investigated the association of OvCa stages at diagnosis with doctors’ responses, together with 21 relevant symptoms, a patient-centered self-advocacy response, and disease severity. Misclassification rates for the trees in Fig. 3a–e were 20%–40%, indicating that the factors examined were important but cannot fully explain the diagnostic pathway for all women with OvCa; further studies are needed.

As Fig. 4 shows, in all the valid trees from our model development, the efficiency of the PCP response (PCP-E) and/or follow-up (FWUP) is always among the most important variables for effective diagnosis at all stages. Similarly, self-advocacy is a critically important variable, affecting effective diagnosis most significantly in the two optimal trees (Fig. 3a for the 2-stage and Fig. 3e for the 4-stage classification). It is of note that this variation in importance is less than that of efficient follow-up. The consistent findings in the association of the doctors’ responses with diagnostic stages validate the importance of health providers’ awareness about OvCa and prompt responses to patients’ reports. To date there have been some small qualitative studies of these factors in OvCa, all with study populations of fewer than 20 (Long Roche et al., 2016; Hagan & Medberry, 2016; Hagan & Donovan, 2013; Stewart, 2016). Our study highlights the ongoing need for patient-centered care, specifically listening to patient concerns and continuing improvement in the education of primary care providers.

6. Conclusion

Effective diagnosis and treatment of early stage OvCa remain a challenge. Our analysis of patient-centered clinical diagnostic characteristics and symptoms shows that self-advocacy is essential for all women. Furthermore, the role of the frontline physician is critically important in ensuring effective follow-up and timely treatment. There is an increasing movement toward patient-centered care, which includes both self-advocacy and recognition of the important role of frontline physicians. Our study is the first to statistically demonstrate the importance of self-advocacy and the role of frontline physicians, based on a robust cohort of more than 500 cases.

References

1. http://costprojections.cancer.gov/expenditures.html. Accessed 05/20/14 and 06/12/17.

2. Carter R. R., DiFeo, Bogie, Zhang G. Q., & Sun (2014). Crowdsourcing awareness: Exploration of the ovarian cancer knowledge gap through Amazon Mechanical Turk. PLoS One, 9(1), e85508. doi: 10.1371/journal.pone.0085508. eCollection 2014. PubMed PMID: 24465580; PubMed Central PMCID: PMC3899016.

3. Pomeranz A. J., & Sabnis (2004). Misdiagnoses of ovarian masses in children and adolescents. Pediatr Emerg Care, 20(3), 172-4. Review. PubMed PMID: 15094575.

4. Goff B. A., Mandel L. S., Drescher C. W., Urban, Gough, Schurman K. M., Patras, Mahony B. S., & Andersen M. R. (2007). Development of an ovarian cancer symptom index: Possibilities for earlier detection. Cancer, 109(2), 221-7. PMID: 17154394.

5. Sun, Bogie K. M., Teagno, Sun, Carter R. R., Cui, & Zhang G. Q. (2015). Design and implementation of a comprehensive web-based survey for ovarian cancer survivorship with analysis of pre-diagnostic symptoms via text mining. Journal of Cancer Informatics, 13(Suppl 3), 113-23. doi: 10.4137/CIN.S14034. eCollection 2014. PMID: 25861211. PMCID: PMC4373720.

6. http://stat.case.edu/ovac. Accessed 07/08/15 and 06/12/17.

7. SEER Cancer Stat Facts: Ovarian Cancer. National Cancer Institute. Bethesda, MD, https://seer.cancer.gov/statfacts/html/ovary.html. Accessed 06/10/17.

8. Howlader, Noone A. M., Krapcho, Miller, Bishop, Kosary C. L., Ruhl, Tatalovich, Mariotto, Lewis D. R., Chen H. S., Feuer E. J., Cronin K. A. (eds). SEER Cancer Statistics Review, 1975–2014, National Cancer Institute. Bethesda, MD, http://seer.cancer.gov/csr/1975_2014/, based on November 2016 SEER data submission, posted to the SEER web site, April 2017.

9. Dalgaard (2006). Introductory Statistics with R, Springer.

10. Fenton J. J., Onega, Zhu, Balch, Smith-Bindman, Henderson, Sprague B. L., Kerlikowske, & Hubbard R. A. (2016). Validation of a medicare claims-based algorithm for identifying breast cancers detected at screening mammography. Med Care, 54(3), e15-e22.

11. Barlin J. N., Zhou, St Clair C. M., Iasonos, Soslow R. A., Alektiar K. M., Hensley M. L., Leitao M. M. Jr, Barakat R. R., & Abu-Rustum N. R. (2013). Classification and regression tree (CART) analysis of endometrial carcinoma: Seeing the forest for the trees. Gynecol Oncol, 130(3), 452-456.

12. Hamilton, Peters T. J., Bankhead, & Sharp (2009). Risk of ovarian cancer in women with symptoms in primary care: Population based case-control study. BMJ, 339, b2998.

13. Grewal, Hamilton, & Sharp (2013). Ovarian cancer prediction: Development of a scoring system for primary care. BJOG, 120(8), 1016-9. doi: 10.1111/1471-0528.12200. Epub 2013 Mar 21. PMID: 23759087.

14. Long Roche, Angarita A. M., Cristello, Lippitt, Haider A. H., Bowie J. V., Fader A. N., & Tergas A. I. (2016). “Little big things”: A qualitative study of ovarian cancer survivors and their experiences with the health care system. J Oncol Pract, 12(12), e974-e980. Epub 2016 Oct 31. PMID: 27601509.

15. Hagan T. L., & Medberry (2016). Patient education vs. patient experiences of self-advocacy: Changing the discourse to support cancer survivors. J Cancer Educ, 31(2), 375-81. doi: 10.1007/s13187-015-0828-x. PMID: 25846573; PMCID: PMC4598253.

16. Hagan T. L., & Donovan H. S. (2013). Ovarian cancer survivors’ experiences of self-advocacy: A focus group study. Oncol Nurs Forum, 40(2), 140-7. doi: 10.1188/13.ONF.A12-A19. PMID: 23454476; PMCID: PMC4021021.

17. Stewart S. L., Townsend J. S., Puckett M. C., & Rim S. H. (2016). Adherence of primary care physicians to evidence-based recommendations to reduce ovarian cancer mortality. J Womens Health (Larchmt), 25(3), 235-41. doi: 10.1089/jwh.2015.5735. PMID: 26978124; PMCID: PMC5289707.
