Abstract
BACKGROUND:
Digital phenotyping has been defined as the moment-by-moment assessment of an illness state through digital means. It promises objective, quantifiable data on psychiatric patients’ conditions and could improve the diagnosis and management of mental illness. Because the field is growing rapidly, new literature is published frequently.
OBJECTIVE:
We conducted this scoping review to assess the current state of literature on digital phenotyping and offer some discussion on the current trends and future direction of this area of research.
METHODS:
We searched four databases (PubMed, Ovid MEDLINE, PsycINFO and Web of Science) from inception to August 25, 2021.
RESULTS:
Of 10506 unique records identified, we included a total of 107 articles. The number of published studies increased over tenfold from 2 in 2014 to 28 in 2020, illustrating the field’s rapid growth. However, a significant proportion of these (49% of all studies and 87% of primary studies) were proof-of-concept, pilot or correlational studies examining digital phenotyping’s potential. Most (62%) of the primary studies evaluated individuals with depression (21%), bipolar disorder (BD; 18%) or schizophrenia (SZ; 23%) (Appendix 1).
CONCLUSION:
Certain domains of data show promise, although their clinical relevance has yet to be fully elucidated. A consensus has yet to be reached on the best methods of data collection and processing. More multidisciplinary collaboration between physicians and specialists in other fields is needed to unlock the full potential of digital phenotyping and to enable statistically powerful clinical trials that can establish clinical utility.
Keywords
Introduction
Digital phenotyping refers to the moment-by-moment assessment of an illness state through digital means, namely mobile devices with their accompanying applications and wearable sensors [1, 2, 3]. Such assessment is important because it can be used to measure human behavior, and this information is valuable for clinical assessment: it could be used to predict the clinical status of individuals and could help to facilitate the delivery of on-demand interventions. The data collected for digital phenotyping can be classified into two categories, active and passive. “Active data” refers to data that requires individuals’ action or input, whereas “passive data” refers to data that is collected without individuals’ action or input, usually by means of sensors like the global positioning system (GPS), biosensors such as heart rate monitors, and event data like patients’ call logs or their social media usage [4, 5, 6].
With smartphone penetration increasing worldwide (up from 49.35% in 2016 to 78.05% in 2020) [7] and smartphone technology advancing, more data can be collected from more individuals. Digital phenotyping has been well explored in other fields of medicine: data collected in this way has been used to quantify ALS progression [8, 9] and to assess mobility and quality of life in patients with spine disease [10]. In psychiatry, studies have already reported findings from digital phenotyping in various conditions. Examples include mobile phone detection of semantic locations visited and its relationship to depression and anxiety [11], prediction of mood disturbance severity from mobile phone keystroke metadata in patients with Bipolar Disorder (BD) [12], prediction of relapse risk in Schizophrenia (SZ) through smartphone-derived estimates of mobility, sociability, screen time and sleep [13], and detection of recovery problems through linguistic analysis in an online substance abuse forum [14]. This is an exciting and fast-growing field in psychiatry, which promises to shift practice toward “precision psychiatry”, that is, more personalized, targeted treatment based on more objective data on patients’ symptoms [15]. The use of passive data also reduces two common confounders in psychiatric research and evaluation, namely patient recall bias and the need for self-reporting. In general, studies have also reported that patients are mostly willing to use digital phenotyping and share the data with their healthcare providers if privacy and data security are assured [16, 17, 18].
To our knowledge, there are currently no reviews that have broadly assessed the state of digital phenotyping research in psychiatry. Because the field is growing rapidly and new literature is published frequently, we conducted this scoping review to assess the current state of the literature on digital phenotyping and to discuss current trends and the future direction of this area of research.
Methods
Information sources and search strategy
A comprehensive literature search was conducted in four bibliographic databases: PubMed, Ovid MEDLINE, PsycINFO and Web of Science. The search strategy was crafted in consultation with a librarian at the Lee Kong Chian School of Medicine, Nanyang Technological University Singapore. The target population was individuals with psychiatric disorders, and the intervention was digital phenotyping. The search strategy was developed on PubMed and subsequently adapted to the remaining databases. MeSH headings were used where available, and related terms were also included (the complete search strategy is listed in Appendix 2). All searches covered the period from database inception to August 25, 2021.
Inclusion and exclusion criteria
Reviewers independently screened articles for eligibility and obtained full texts of all potentially relevant papers. We included studies written in English that 1) investigated or applied their findings to diagnosed psychiatric disorders and 2) utilized digital phenotyping for management or diagnosis. Protocols were excluded.
Data extraction
Search results were exported into EndNote X9 and subsequently uploaded into Covidence, where duplicates were removed, screening was conducted, and data were extracted.
The following were extracted from included studies: 1) publication information (author, year of publication), 2) the country in which the study was conducted, 3) study design, 4) population description (cohort sizes, psychiatric diagnoses of interest), 5) types of sensing used (active/passive) and data collected, and 6) outcomes.
Analysis
A narrative synthesis approach was used, and processing of extracted data and creation of figures were handled in Excel.
Fig. 1. PRISMA flowchart of the study selection process.
As shown in Fig. 1, 10506 records remained after removal of 9082 duplicates. Titles and abstracts were screened to determine eligibility, and 150 full-text records were obtained and assessed against the inclusion and exclusion criteria. A final 107 were included. The number of published studies per year increased over tenfold from 2 in 2014 to 28 in 2020, illustrating the field’s rapid growth (graph available in Appendix 3).
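The selection counts above can be checked with simple arithmetic. The following is a minimal sketch that uses only the figures reported in this section. The derived quantities assume that every record not taken forward to full-text review was excluded at title and abstract screening, which the text implies but does not state as a number.

```python
# Counts reported in the text.
unique_records = 10506      # records remaining after deduplication
duplicates_removed = 9082
full_text_assessed = 150
included = 107

# Derived quantities (assumption: records not taken to full text were
# excluded at title/abstract screening).
total_retrieved = unique_records + duplicates_removed
excluded_at_screening = unique_records - full_text_assessed
excluded_at_full_text = full_text_assessed - included

# Growth in yearly publications between the two years quoted in the text.
studies_2014, studies_2020 = 2, 28
growth_factor = studies_2020 / studies_2014

print(total_retrieved, excluded_at_screening, excluded_at_full_text, growth_factor)
# prints: 19588 10356 43 14.0
```

The fourteenfold increase is consistent with the "over tenfold" growth described above.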
Study characteristics
Of the included studies, a significant proportion (49% of all studies and 87% of primary studies) were proof-of-concept, pilot or correlational studies examining digital phenotyping’s potential (Table 1). As such, they typically had small cohort sizes, with a mean of 60.2 and a median of 50. This highlights that the field is still developing and attempting to determine the feasibility of various methods and interventions. Consequently, relatively few studies validate methods of digital phenotyping or their clinical applications [19, 20].
Number of studies included and type of study
When analyzing the country in which studies were conducted, we found that the majority (
Distribution of research among various psychiatric conditions
Most (62%) of the primary studies evaluated individuals with depression (21%), BD (18%) or SZ (23%).
Types of data collected
Most correlational or proof-of-concept studies included used a combination of both active and passive sensing (
Most studies (
Parameters measured varied greatly, including but not limited to geolocation, accelerometer (and derived contextual cues), call/text/communication app logs, microphone audio, video, ambient light, heart rate, electrodermal activity, screen status and app usage
Discussion
The three most studied disorders were SZ, BD and depression, as shown in Fig. 2. According to Rehm et al. [75], these diseases constitute an estimated 41% of the global mental health disease burden as measured in disability-adjusted life years (DALYs), so it is unsurprising that these conditions are heavily studied. There are a considerable number of narrative reviews (
Fig. 2. Analysis of studies by psychiatric condition.
Our first key finding is a lack of consensus on which data streams are relevant, at what resolution, and for which psychiatric conditions. Most primary studies (30/51) utilized a combination of passive sensing and active sensing, typically Ecological Momentary Assessments (EMAs). The active component was usually used to deliver standardized clinical measures such as the PHQ-9 or GAD-7 at a spatiotemporal resolution comparable to that of the passive data, so that the clinical utility of passively sensed data could be assessed more appropriately. As described by Cohen et al. [76, 77, 78], validation against “gold standard” measures is problematic if the digital phenotyping data and the “gold standard” criterion are not matched in time and space. Thus, considering spatiotemporal resolution is of utmost importance when validating correlations and drawing conclusions. There is great diversity in the data collected, as shown in the non-exhaustive list in Section 3.5. The most commonly gathered data was geolocation (
Secondly, there is a lack of consensus on the best approach to data processing. Each of the data streams listed in Section 3.5 needs to be correlated individually as well as in combination with other collected data, which makes data processing a very large problem. Fortunately, advances in machine learning offer hope for less labor-intensive yet more effective ways of identifying clinically relevant patterns in the data collected. A recent review by Antosik-Wojcinska et al. [82] found that “State-of-the-art machine learning solutions have approximately accuracy ranging from 67% to 97%”, supporting the feasibility of using machine learning. This is especially relevant because some studies show that correlations between collected data and standardized clinical measures vary significantly at the individual level, weakening or even becoming non-significant at the cohort level [37, 67, 68, 73], which contributes to the heterogeneity of results. A shift toward individualized methods of data analysis and validation of passively sensed data should therefore be explored, as demonstrated by Wisniewski et al. [73]. Machine learning can be leveraged to train individual-specific models, possibly conferring further improvements in the clinical utility of digital phenotyping.
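To illustrate how strong individual-level correlations can vanish at cohort level, the following minimal sketch compares per-participant and pooled Pearson correlations. The data are entirely synthetic and chosen to make the effect extreme. The participant labels, the "mobility" feature and the "symptom score" are hypothetical names used only for illustration, and the sketch does not reproduce the method of any cited study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Synthetic example (not real data): a passively sensed mobility feature
# and a self-reported symptom score over five days for two participants.
data = {
    "A": {"mobility": [1, 2, 3, 4, 5], "symptoms": [1, 2, 3, 4, 5]},
    "B": {"mobility": [1, 2, 3, 4, 5], "symptoms": [5, 4, 3, 2, 1]},
}

# Individual-level correlation: one coefficient per participant.
per_person = {
    pid: pearson(d["mobility"], d["symptoms"]) for pid, d in data.items()
}

# Cohort-level correlation: all observations pooled together.
pooled_x = [x for d in data.values() for x in d["mobility"]]
pooled_y = [y for d in data.values() for y in d["symptoms"]]
pooled = pearson(pooled_x, pooled_y)

print(per_person)  # {'A': 1.0, 'B': -1.0}
print(pooled)      # 0.0
```

In this constructed case each participant shows a perfect relationship, but the two relationships have opposite signs and cancel when pooled. Real cohorts are unlikely to be this clean, yet the same mechanism is one reason individual-specific models may be worth exploring.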
Although this study was intended to be a scoping review, PRISMA guidelines [84] were followed, with the exception of quality assessments. Our search strategy was also broad and comprehensive. However, as this review was meant to be a broad overview of the state of the literature, and given the large variety of articles included, we did not elaborate in depth on current findings for each type of data or psychiatric illness. The project time frame was also limited to 6 weeks, and we did not hand-search the references of identified articles, so some relevant studies may not have been captured. We also did not search other databases such as CINAHL and Embase.
There are several research and clinical implications arising from this study. With regard to research implications, there seems to be a consensus forming on which data are more relevant than others, and which data processing techniques are promising. These established relationships should now be validated with more statistically powerful methods. However, due to possible poor generalizability and high inter-individual variance [11], there is a need for randomized controlled trials (RCTs) that are specific and targeted in scope. Further research should investigate the feasibility of creating individual correlation models to maximize monitoring accuracy and thus clinical utility. Multidisciplinary collaboration between physicians and specialists in other fields, such as software engineers and computer scientists, must continue in order to refine methods of analysis and gain a deeper understanding of gathered data so that they can be applied to the clinical context [83]. We believe that with better subcategorization of studies and more data on specific correlations in defined spatiotemporal contexts, the field can begin to move toward validating these correlations against standardized measures and thus improve the clinical utility of digital phenotyping.
Most primary studies show promise in certain domains and relationships. These have yet to be fully elucidated and require more investigation into sub-categories and the reasons behind the observed relationships.
Data availability
All available data have been included in the manuscript.
Author contributions
AC and MZ jointly conceptualized the study. AC and MZ reviewed the literature and obtained the primary data. AC performed the analysis with guidance from MZ. MZ verified the data analysis. AC wrote the first draft of the manuscript, to which MZ provided critical input. All authors read and approved the manuscript prior to submission.
Funding
MWZ is supported by a grant under the Singapore Ministry of Health’s National Medical Research Council (grant number NMRC/Fellowship/0048/2017) for PhD training. The funding source was not involved in any part of this project.
Supplementary data
The appendices are available from https://dx-doi-org-s.web.bisu.edu.cn/10.3233/THC-213648.
Footnotes
Conflict of interest
None of the authors have any competing interests.
