Characterising user engagement with mHealth for chronic disease self-management and impact on machine learning performance
App usage and engagement
App usage was quantified as the fraction of days on which the user was active (i.e., registered a symptom score) out of the 70 days prior to an exacerbation. A 70-day window was chosen empirically to be long enough to define the user’s typical engagement with the app while still demonstrating trends linked to exacerbation. In myCOPD, a symptom score must be registered before accessing further app functionality (on the first opening per day). A registered symptom score therefore corresponds one-to-one with app use on a given day. Figure 1 (left) shows the distribution of app usage prior to the 727 registered exacerbations. App usage is divided into three groups: frequent users (green, N = 438), who register app activity on ≥66% of possible days; intermediate users (orange, N = 156), who use the app on between 33% and 66% of possible days; and infrequent users (red, N = 132), who are active on <33% of possible days.
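The usage measure and grouping described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and assigning a fraction of exactly 33% to the intermediate group is an assumption, since the text does not state how that boundary is handled.

```python
from datetime import date, timedelta

WINDOW_DAYS = 70  # window length stated in the text


def usage_fraction(active_days, exacerbation_day, window_days=WINDOW_DAYS):
    """Fraction of days with a registered symptom score in the window
    immediately before (and excluding) the exacerbation day."""
    start = exacerbation_day - timedelta(days=window_days)
    # A set removes duplicate dates: one score per day is all that counts.
    in_window = {d for d in active_days if start <= d < exacerbation_day}
    return len(in_window) / window_days


def engagement_group(fraction):
    """Map a usage fraction to the three groups used in Fig. 1.
    Assumption: exactly 0.33 falls in the intermediate group."""
    if fraction >= 0.66:
        return "frequent"
    if fraction >= 0.33:
        return "intermediate"
    return "infrequent"


if __name__ == "__main__":
    exacerbation = date(2022, 3, 12)  # hypothetical date
    # Hypothetical user who was active on each of the last 35 days.
    active = [exacerbation - timedelta(days=k) for k in range(1, 36)]
    frac = usage_fraction(active, exacerbation)
    print(frac, engagement_group(frac))
```

Excluding the exacerbation day itself from the window is also an assumption; shifting the window by one day would not change the grouping logic.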
Reasons for engagement were explored in semi-structured interviews. Despite some participants noting limited use of the app, most found it helpful for logging their medication use and acting as a reminder to take medications regularly. Participants also noted that the app was a source of education around self-management, which motivated engagement.
“I used to take my medicine at all different times, and now I use it at the same time every day. And the breathing exercises and how to clear your chest and that, I didn’t know any of that before I started using the app so that’s been a great help.” [P7–male]
A further motivator was the opportunity the app offers to monitor symptoms, which provides reassurance that they are not deteriorating.
“I do like to look back when I’ve done the COPD assessment test, am I getting worse, am I getting better, and the answer is usually ‘no, you’re just the same’. It’s a bit of a comfort thing to have around.” [P4–male]
This is also evidenced by in-app data: over 60% of in-app interactions relate to medication or symptom monitoring.
Self-reported data quality and transitional engagement
Figure 2 provides a schematic of user groups divided by engagement (as in Fig. 1) and self-reported data quality prior to an exacerbation. The size of each vertical segment is proportional to the size of the group. Engagement and data quality are characterised by self-reported symptom scores.
Frequent users provide self-reports with a range of data quality (i.e., usefulness for predictive models). ‘Reporting with Signal’ corresponds to users who show sufficient variability in their self-reports that the deterioration in condition is clear leading up to the exacerbation (i.e., gradually reporting higher scores). Conversely, ‘Fixed Reporting’ corresponds to users who register consistently low or high scores (i.e., only 1s or 3s) prior to the exacerbation. Similar proportions of reporting with signal are found for intermediate and infrequent user groups.
Intermediate and infrequent user groups can ‘transition’ to become more engaged closer to exacerbation. We find 21.8% of intermediate and 14.4% of infrequent users (classification based on 70 days prior) transition to increased engagement groups in the 21 days immediately prior to the exacerbation (i.e., ‘Engaged Near Exacerbation’). We note that most infrequent users (69.7%) are ‘Retrospective Reporting’ a rescue pack, registering the medications in-app more than 10 days after the event and providing minimal self-reported symptom scores around the actual exacerbation.
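One way to operationalise the transition described above is to compare a user's group over the full 70-day window with their group over the final 21 days. The sketch below assumes the same 33% and 66% thresholds apply to the 21-day window and that any move to a higher group counts as a transition; neither detail is specified in the text, and all names are hypothetical.

```python
GROUP_ORDER = {"infrequent": 0, "intermediate": 1, "frequent": 2}


def group_from_fraction(fraction):
    """Same thresholds as the 70-day grouping (assumed to carry over)."""
    if fraction >= 0.66:
        return "frequent"
    if fraction >= 0.33:
        return "intermediate"
    return "infrequent"


def classify_transition(days_before, long_window=70, short_window=21):
    """days_before: integers giving how many days before the exacerbation
    a symptom score was registered (1 = the day before)."""
    days = set(days_before)
    long_frac = sum(1 for d in days if 1 <= d <= long_window) / long_window
    short_frac = sum(1 for d in days if 1 <= d <= short_window) / short_window
    base = group_from_fraction(long_frac)
    recent = group_from_fraction(short_frac)
    if base == "frequent":
        return "Frequent"
    # A user transitions if the recent window puts them in a higher group.
    moved_up = GROUP_ORDER[recent] > GROUP_ORDER[base]
    label = "Engaged Near Exacerbation" if moved_up else "Consistent"
    return f"{base.capitalize()} ({label})"


if __name__ == "__main__":
    # Hypothetical user who only started reporting 15 days before the event.
    print(classify_transition(range(1, 16)))
    # Hypothetical user who reports roughly every fourth day throughout.
    print(classify_transition(range(1, 71, 4)))
```

The labels mirror the group names used for the model comparison (e.g., Intermediate (Consistent)), but the rule that produces them here is illustrative only.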
Transitions in behaviour immediately before exacerbations were also reported in semi-structured interviews. Notably, participants reported increased app use when their symptoms were worse, as a way of refreshing their memory on self-management techniques such as breathing or relaxation exercises. This was also true of those who had more mild symptoms and had yet to experience an exacerbation, who believed they would use the app more when necessary.
“when I do get worse I’ll use it a lot more I think.” [P6-male]
Conversely, others instead said that they use the app less when they are particularly unwell, as they do not have the capacity to engage with it.
“if I need my salamol I don’t even think about it. It doesn’t, it doesn’t even occur to me to write that down or record it” [P1-male]
Despite several users becoming more engaged immediately prior to an exacerbation, there is no strong evidence that this increased engagement persists in the short term after the exacerbation. Figure 1 (right) shows the distribution of app use in a 70-day window post exacerbation. The shading represents the original groupings (i.e., in the 70 days prior), with the histogram stacked so that the overall area matches the left panel. We note a slight increase in infrequent users (16.7%) post exacerbation. For 9% of exacerbations there is a notable gap in self-reports directly after exacerbation and/or a registered symptom score of 4 (i.e., needed to seek emergency care), highlighting possible disengagement due to a deteriorated condition.
Users’ confidence in recognising risk
A key theme from user interviews was a lack of confidence around exacerbations and how to identify one. In particular, participants discussed the difficulty of differentiating between an exacerbation, a heavy cold, a chest infection, or another condition.
“if that’s what an exacerbation is, i.e., it’s just a chest infection. Or does it mean that, I don’t know, it’s difficulty breathing and you need to take the inhaler? So I don’t know what it is no” [P1-male]
This was especially true for those who also suffer from other health complications, such as asthma or bronchiectasis. Participants noted that sputum changes are not always a reliable indicator.
“I had two exacerbations, late last year, both hospitalised and I didn’t have the normal triggers that you’d have with changes, like increased volume, coughing and things like that”
A key barrier identified was a lack of explanation from health care professionals (HCPs), with most asserting that they had never had it explained to them.
“That is all you hear is an exacerbation. You’re not actually told what it is. Well, they haven’t in my circumstances. Yes, it, you know, the nurses said ‘Ohh, it’s an exacerbation’ but it doesn’t explain what it actually is.” [P3–female]
Moreover, issues accessing HCPs mean that myCOPD users had minimal opportunities to clarify or ask questions. Issues accessing HCPs also led to hesitancy about medication adherence (Supplementary Note 4).
“trying to contact your GP is, well I can’t think of a similarity but I could probably get in contact with Madonna better or more easily” [P4–male]
Confidence in identifying risk was also reflected in self-reported data. Figure 3 compares self-reported symptom scores and salbutamol use for those registering their first exacerbation in-app relative to those reporting exacerbations having experienced one before (i.e., ‘Subsequent’). Those registering their first exacerbation consistently report lower average symptom scores (top panel; chi-square statistic = 726.9, P = 3.05 × 10^−157), demonstrating that users with previous experience of an exacerbation are more likely to be aware of their symptoms and report them in future events. As users gain confidence in recognising their symptoms, they also engage more frequently with the app in the longer term (bottom panel). Users experiencing their first exacerbation also typically report lower salbutamol usage (middle panel). Salbutamol (classified as a SABA) is commonly used for immediate relief of symptoms including coughing, wheezing, and breathlessness. Increased usage reflects that the individual has experienced more breathlessness through a given day and may be indicative of a more acute condition. Regardless of experience, peak salbutamol use occurs on the first day of the exacerbation, whereas average symptom scores peak days later. This indicates that users self-report a deterioration through medication before typically self-recognising the deterioration in symptom scores.
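For readers unfamiliar with the test statistic quoted above, the sketch below computes a Pearson chi-square statistic from a contingency table. The exact table the authors tested is not given in this section; the layout assumed here (rows for first versus subsequent exacerbations, columns for symptom-score levels) and the counts in the example are hypothetical and serve only to illustrate the calculation.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of equal-length rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


if __name__ == "__main__":
    # Hypothetical counts, NOT study data:
    # rows = (first, subsequent), columns = symptom scores (1, 2, 3).
    counts = [[120, 60, 20], [80, 70, 50]]
    stat, dof = chi_square_statistic(counts)
    print(f"chi-square = {stat:.2f}, dof = {dof}")
```

A P value would then follow from the upper tail of the chi-square distribution with the returned degrees of freedom, for example via `scipy.stats.chi2.sf(stat, dof)` if SciPy is available.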
Higher engagement for more experienced users is also found when considering the proportion of frequent, intermediate, and infrequent users by GOLD group (Supplementary Fig. 7). The proportion of users with a history of exacerbations (C and D) increases with engagement, reflecting that users are more likely to engage as their condition becomes more of a burden to self-manage and confidence to identify risk increases with experience of previous exacerbations.
Disease acuity and engagement
Figure 4 shows the proportion of frequent, intermediate, and infrequent users by GOLD group. The GOLD 2022 guidelines use a combined COPD assessment approach to group patients according to exacerbation history and symptoms (Fig. 6B). Overall, the majority of users reporting exacerbations are in higher risk groups, predominantly represented by group D. The proportion of users with a history of exacerbations (C and D) increases with engagement, reflecting that users are more likely to engage as their condition becomes more of a burden to self-manage and confidence to identify risk increases with experience of previous exacerbations.
How engagement impacts machine learning
Figure 5 compares the performance of our XGBoost model predicting exacerbation up to three days in advance. Model performance, measured by AUROC and AUPR, has been computed from the hold-out test sets of simulated exacerbations for each of the following user groups (darker shaded in Fig. 2): Frequent, Intermediate (Consistent), Intermediate (Engaged Near Exacerbation), Infrequent (Consistent), Infrequent (Engaged Near Exacerbation). Predictions are made daily per user (from 55 days before to 70 days post exacerbation) and exacerbation is the positive class. Performance should only be used for contrastive purposes due to simulation of self-reported symptom scores (see Methods).
Both AUROC and average precision improve with 70-day engagement (i.e., from infrequent to frequent). However, for transitional users (Engaged Near Exacerbation), the drop in performance relative to frequent users is minimal. This demonstrates that transitional engagement is more important for the safety of ML models than increasing overall engagement (i.e., regardless of current condition).
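The per-group comparison described above can be sketched as follows: daily predictions are pooled within each user group, and AUROC and average precision (used here as the AUPR estimate) are computed separately per group. This is a dependency-free illustration under stated assumptions, not the authors' evaluation pipeline; the record layout `(group, label, score)` and all function names are hypothetical, and tied scores in the average-precision calculation are simply ranked in input order.

```python
from collections import defaultdict


def auroc(labels, scores):
    """Probability that a random positive is scored above a random
    negative (ties count one half). None if a class is missing."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive."""
    n_pos = sum(labels)
    if n_pos == 0:
        return None
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


def metrics_by_group(records):
    """records: iterable of (group, label, score); label 1 = exacerbation."""
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    return {
        g: {"auroc": auroc(y, s), "aupr": average_precision(y, s)}
        for g, (y, s) in grouped.items()
    }


if __name__ == "__main__":
    # Hypothetical daily predictions, NOT study data.
    demo = [
        ("Frequent", 0, 0.2), ("Frequent", 1, 0.9),
        ("Infrequent (Consistent)", 0, 0.1), ("Infrequent (Consistent)", 0, 0.4),
        ("Infrequent (Consistent)", 1, 0.35), ("Infrequent (Consistent)", 1, 0.8),
    ]
    for group, m in metrics_by_group(demo).items():
        print(group, m)
```

Because exacerbation days are rare relative to non-exacerbation days, average precision is sensitive to class balance within each group, which is one reason the comparison is best read contrastively across groups.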