An umbrella review of effectiveness and efficacy trials for app-based health interventions

Study selection

The study selection process according to PRISMA requirements¹⁴ is summarized in Fig. 1. The database search yielded a total of 1895 records, with an additional 2513 records identified through forward and backward citation searching of records from the initial search that were deemed eligible after full-text screening by the first author. After de-duplication, 4253 articles were screened by title and abstract. Of these, 3892 records were excluded, and 361 records were included for full-text screening. The final number of included articles was 48. Inter-rater reliability (IRR) for title/abstract screening and full-text screening was κ = 0.3469 and κ = 0.9326, respectively. A list of the 313 studies excluded after full-text screening, with exclusion reasons for each study, can be found in Supplementary Table 1.

**Fig. 1: PRISMA flow chart of retrieved, screened and included articles.**
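The counts reported for the study selection process can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: the dictionary keys and the function name `check_flow` are hypothetical, and the number of duplicates is derived here under the assumption that de-duplication was the only step between identification and title/abstract screening.

```python
# Counts as reported in the study selection paragraph.
reported = {
    "database_records": 1895,
    "citation_search_records": 2513,
    "screened_title_abstract": 4253,
    "excluded_title_abstract": 3892,
    "full_text_screened": 361,
    "included": 48,
    "excluded_full_text": 313,
}


def check_flow(c):
    """Return derived counts; raise ValueError if reported stages disagree."""
    identified = c["database_records"] + c["citation_search_records"]
    # Derived value (assumption): not stated in the text, only implied by
    # the difference between identified and screened records.
    duplicates_removed = identified - c["screened_title_abstract"]

    remaining = c["screened_title_abstract"] - c["excluded_title_abstract"]
    if remaining != c["full_text_screened"]:
        raise ValueError("title/abstract stage does not add up")

    if c["full_text_screened"] - c["included"] != c["excluded_full_text"]:
        raise ValueError("full-text stage does not add up")

    return {"identified": identified, "duplicates_removed": duplicates_removed}


derived = check_flow(reported)
print(derived)
```

Under these assumptions the reported stages are internally consistent: 361 records remain after title/abstract exclusion, and 361 minus 48 included articles matches the 313 full-text exclusions.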

Review characteristics

Included reviews were published between 2013 and 2023, with the highest number of reviews published in 2020 (n = 10) and the first three quarters of 2023 (n = 9) (see Fig. 2).

**Fig. 2: Number of included reviews by publication year.**

All included reviews considered articles without geographic restrictions, except one focusing on China¹⁵. The number of RCTs included in a review ranged from two to 36. Of the 48 included reviews, 35^{15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49} conducted data pooling and meta-analyses, whereas 13 reviews^{50,51,52,53,54,55,56,57,58,59,60,61,62} provided a narrative synthesis without meta-analysis. Median follow-up periods ranged from 1 to 10 months; six reviews^{15,31,32,33,50,53} reported no follow-up information. A summary of review characteristics is shown in Supplementary Table 2.

Methodological quality

Figure 3 summarizes the frequency of each AMSTAR2 rating for each domain across reviews. Supplementary Fig. 1 additionally presents the domain-specific methodological quality ratings for each review.

**Fig. 3: Frequency of risk of bias for each domain.**

Sixteen reviews stated that they had registered or otherwise published a review protocol^{17,19,25,34,35,38,40,41,42,47,48,51,54,56,60,62}. After checking these protocols, thirteen were rated as incomplete because they lacked information on the search terms defining the search strategy (item 2)^{17,19,34,38,40,41,47,48,51,54,56,60,62}. All reviews searched at least two databases and provided their full search strategy in the final report, but 25 reviews^{16,17,19,21,22,27,28,29,33,38,39,41,43,45,47,48,49,50,52,53,54,55,58,59,61} failed to justify publication restrictions, for example regarding language, entailing a “no” on item 4. Six reviews provided a list of studies excluded at the full-text screening stage (item 7)^{26,37,42,48,56,57}. Overall, a satisfactory assessment tool for risk of bias was used (item 9). Three reviews reported conflicts of interest (item 16)^{16,42,49}. We rated one review as moderate quality⁵⁶. IRR for quality assessment across all items and reviews was κ = 0.6671. Item-specific IRRs can be found in Supplementary Table 3.
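For readers unfamiliar with the κ statistic used for the IRR values, the sketch below shows how an unweighted Cohen's kappa for two raters can be computed from paired categorical ratings. It is an illustration only: the text does not specify which kappa variant was used, the function name `cohens_kappa` is hypothetical, and the example ratings are invented placeholders rather than data from the review.

```python
import math
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters over the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: share of items rated identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal proportions,
    # summed over categories.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[cat] * count_b[cat] for cat in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return math.nan
    return (observed - expected) / (1.0 - expected)


# Invented placeholder ratings for four items (not data from the review).
a = ["yes", "yes", "no", "no"]
b = ["yes", "no", "no", "no"]
kappa = cohens_kappa(a, b)
```

In this toy case the raters agree on three of four items (observed agreement 0.75) while chance agreement is 0.5, giving κ = 0.5. The statistic works the same way for ratings with more than two categories, such as "yes", "partial yes" and "no".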

Extraction results

Included RCTs covered populations from all continents, with a majority of studies conducted in high- or middle-income countries such as the United States, China, Australia, United Kingdom, Spain, Norway and Japan. Seven reviews^{33,38,45,46,48,49,50} did not report countries of included studies.

An overview of covered health indications is displayed in Supplementary Fig. 2 and, in more aggregated disease groups, Fig. 4. Most reviews targeted specific indications, including type 2 diabetes (T2DM) (n = 5)^{19,20,22,23,26}, hypertension (n = 4)^{15,27,31,38}, depression (n = 3)^{33,53,61}, overweight/obesity (n = 3)^{40,41,52}, chronic obstructive pulmonary disease (COPD) (n = 2)^{35,39}, urinary incontinence (n = 2)^{56,62}, asthma (n = 1)⁵⁷, autism spectrum disorders (n = 1)³², post-traumatic stress disorder (PTSD) (n = 1)⁵⁹, type 1 diabetes (n = 1)⁴⁷, Parkinson’s disease (n = 1)⁴⁵, knee arthroplasty (n = 1)⁴⁶ and lower back pain (n = 1)⁵¹. Twenty-two reviews covered multiple conditions within their scope, such as diabetes of various types (n = 7)^{18,21,24,25,36,37,50}, chronic non-communicable diseases (n = 2)^{55,58}, anxiety and depression (n = 2)^{43,49}, conditions requiring rehabilitation (n = 2)^{42,44}, pediatric diseases (n = 1)⁵⁴, diseases requiring medication (n = 2)^{17,34}, cardiovascular diseases (n = 2)^{16,30}, pain conditions (n = 2)^{48,60}, mental illnesses (n = 1)²⁸, or a combination of diabetes and hypertension (n = 1)²⁹.

**Fig. 4: Frequency of aggregated disease indications addressed in the included systematic reviews.**

Information on pooled sample size was provided by all except three reviews^{31,45,46} and ranged from 282 to 7669 patients. Further information on extracted population characteristics can be found in Supplementary Tables 4 and 5.

The health apps performed a wide array of functions, including symptom monitoring and assessments, medication reminders, real-time biofeedback, personalized programs and education, tailor-made motivational messages or cues and feedback, social support, communication with healthcare professionals, goal setting, data storage, and visualization.

A summary of reported app characteristics and functionalities is documented in Supplementary Table 4.

Comparator conditions were described in 43 out of the 48 reviews. Some reviews included usual care comparators only; in others, comparators ranged from usual care or other control apps to lighter technological features, text messages, paper-based monitoring diaries, in-person and standard education, and no treatment. A summary of reported comparators is shown in Supplementary Table 4.

Eleven reviews reported results on T2DM patients. Five focused on T2DM alone^{19,20,22,23,26}, while six included broader populations but conducted (subgroup) analyses specifically on T2DM^{21,24,25,36,37,58}. All eleven reviews except one¹⁹ assessed glycemic control, operationalized as change in glycated hemoglobin (HbA1c), as a main or secondary outcome. Further outcomes comprised changes in body weight, waist circumference or body mass index^{19,20,22,23}, fat mass or percentage of body fat¹⁹, lipids, blood pressure, lifestyle changes, medication use^{20,22,23}, psychological symptoms and quality of life (QoL)²³. All reviews that focused on other types of diabetes (e.g., type 1 diabetes, mixed types, prediabetes, gestational diabetes)^{18,36,37,47,50} assessed HbA1c changes as the main outcome, while only a few included adverse events^{37,54} and QoL⁵⁴. Another outcome reported for diabetic populations was medication adherence, but it was reported in samples that did not exclusively include diabetes patients (patients with prescription drugs¹⁶, chronic disease patients^{34,54}).

Reviews including patients with hypertension focused on evaluating the impact of health app interventions on medication adherence^{27,31,38}, systolic and diastolic blood pressure^{15,27,38}, and health behaviors^{27,38}. Three reviews^{16,17,34} reported on medication adherence, and two reviews^{16,29} on systolic and diastolic blood pressure, lipids and anthropometric outcomes in samples that did not exclusively include hypertensive patients.

Reviews focusing on patients with depression measured improvements of depressive symptoms^{33,53,61}, and self-esteem and QoL^{53,61}. Two reviews additionally reported results for medication adherence^{17,61}, and one⁶¹ reported on psychiatric admissions, medication adherence and side effects, resilience, attitudes, sleep disturbances and further psychological and behavioral outcomes. Further reviews reported on depressive symptoms^{28,43,49}, mania and psychotic symptoms as well as adverse events²⁸, and anxiety symptoms^{43,49} in samples that included, but were not limited to, patients with depression. Outcomes evaluated in other mental health indications included symptoms related to PTSD⁵⁹, positive and negative psychotic symptoms including hallucinations or delusions and absence of experience (in schizophrenia), mania and depression symptoms (bipolar disorder)²⁸, and autism-related outcomes based on the Mullen Scales of Early Learning, MacArthur-Bates Communication Development Inventory and Communication and Symbolic Behavior Scales³².

Reviews focusing on overweight and obesity used the following outcomes: weight loss^{40,41,52}, waist circumference, blood pressure, lipids, HbA1c, energy intake^{40,41}, physical activity, body fat, BMI⁴⁰, motivation and adherence⁵².

Outcomes reported in other indications can be found in Supplementary Tables 4 and 6.

Figure 5 illustrates the types of outcomes reported in the systematic reviews by aggregated groups of investigated health conditions. More details on the uncategorized outcomes can be found in Supplementary Tables 4 and 6.

**Fig. 5: Distribution of outcome types reported by categorized disease indications.**

Twenty-three out of 35 meta-analyses conducted subgroup analyses^{18,19,20,21,23,24,25,26,27,28,29,33,34,36,37,40,41,43,47,48,49,53,57}. Investigated subgroups were defined by number, types and intensities of app features, differentiation between standalone or integrated interventions, baseline demographic or disease-related participant characteristics, follow-up duration, intervention duration, study quality, type of comparator, sample size, attrition, analytic approaches, and outcome assessment methods. Summaries of the subgroups investigated are in Supplementary Table 7.

Overall, 41 out of the 48 reviews concluded that app-based health interventions were effective in improving health outcomes. The seven systematic reviews that did not reach this conclusion either reported inconclusive results, as some studies showed effectiveness and others did not^{35,51,53,54,57,61}, or reported clinically irrelevant improvements⁴¹. Reported synthesized outcomes, types of effect estimates, and number of underlying individual studies were heterogeneous. A complete overview of extracted results and summaries of authors’ conclusions is shown in Supplementary Table 6.

For example, for medication adherence, meta-analysed effect estimates reported in 6 systematic reviews ranged between 0.38 and 0.8 standardized mean difference, with 2−14 studies summarized, 6 out of 6 meta-analysed point estimates suggesting an increase in medication adherence, and 6 out of 6 meta-analytic results suggesting statistically significant effects. Three reviews additionally expressed effect estimates for medication adherence in terms of odds ratios or mean differences.

For HbA1c, meta-analysed effect estimates from 13 systematic reviews ranged between 0.06% and −0.6% (weighted) mean difference, with 2−24 studies summarized, 27 out of 28 meta-analysed point estimates suggesting a reduction in % HbA1c, and 18 out of 28 meta-analytic results suggesting statistically significant effects.

For systolic blood pressure (SBP), meta-analysed effect estimates from 9 systematic reviews ranged between 0.1 and −8.12 mmHg (weighted) mean difference, with 2−13 studies summarized, 8 out of 10 meta-analysed point estimates suggesting a reduction in SBP, and 4 out of 10 meta-analytic results suggesting statistically significant effects. Two reviews additionally expressed effect estimates for SBP in terms of odds ratios or mean differences. In two reviews with meta-analysed results on SBP, the outcome unit was unclear.
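The per-outcome summaries above (range of estimates, number of point estimates in the favourable direction, number of statistically significant results) amount to a simple tally over pooled estimates. The sketch below illustrates one way such a tally could be produced. It is not the authors' procedure: the class `PooledEstimate` and function `summarize` are hypothetical names, the three example estimates are invented placeholders, and significance is approximated here by a confidence interval that excludes zero, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PooledEstimate:
    """One meta-analytic result expressed as a mean difference."""
    mean_difference: float
    ci_lower: float
    ci_upper: float


def summarize(estimates):
    """Tally range, direction and significance for a list of estimates."""
    points = [e.mean_difference for e in estimates]
    return {
        "n": len(estimates),
        "min": min(points),
        "max": max(points),
        # A negative mean difference is read as a reduction in the outcome.
        "n_reduction": sum(e.mean_difference < 0 for e in estimates),
        # Assumption: "significant" means the interval excludes zero.
        "n_significant": sum(
            e.ci_upper < 0 or e.ci_lower > 0 for e in estimates
        ),
    }


# Invented placeholder values, NOT extracted from the included reviews.
example = [
    PooledEstimate(-0.40, -0.60, -0.20),
    PooledEstimate(-0.10, -0.30, 0.10),
    PooledEstimate(0.05, -0.10, 0.20),
]
summary = summarize(example)
```

For outcomes where an increase is the desired direction, such as medication adherence, the direction test would need to be reversed.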
