In the developmental phase of adherence measurement in our clinic, we constructed a 5-item instrument whose individual items were selected from the 51-item ACTG adherence battery [2] on the basis of factor structure and internal consistency reliability. In the manuscript presenting this developmental work, we showed that responses on the 5-item adherence index, administered on one occasion 30 days after initiating a new antiretroviral regimen, were moderately correlated (Spearman rho 0.40–0.48) with measures of electronic drug monitoring (EDM) and were predictive of HIV viral load responses at 3 and 6 months after the start of treatment in models controlling for baseline viral load and prior antiretroviral experience. We also showed that a cut point of 5 or more on the index (higher scores indicating poorer adherence) distinguished those with viral load suppression (≤ 400 copies/ml) at 3 and 6 months from those failing to suppress at the same time points [1]. The analyses reported here were conducted to evaluate whether the same 5-item index, when administered repeatedly under longitudinal follow-up, predicted initial viral suppression and maintenance of suppression while patients continued the index regimen. We found, conditional on the study eligibility criteria and analytic methods, that the self-report adherence index scores were predictive of both outcomes in models controlling for prior antiretroviral treatment experience and baseline plasma viral load. For the time to initial viral suppression outcome, adherence scores ≥ 5 were associated with an approximately 60% reduction in the hazard of achieving a plasma viral load ≤ 400 copies/ml. For the maintenance of viral suppression outcome, adherence scores ≥ 5 predicted approximately 80% lower odds of maintaining viral suppression relative to scores less than 5.
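To make the form of the time-to-suppression model concrete, the following is a minimal sketch in Python using the lifelines package. It fits a simplified, time-fixed version of the analysis, with the adherence score dichotomized at ≥ 5 and adjustment for prior antiretroviral experience and baseline log10 viral load; the file name and all column names are hypothetical placeholders, and the study's actual models used repeatedly measured adherence rather than a single baseline score.

```python
# Sketch of a Cox proportional hazards model for time to initial viral
# suppression (<= 400 copies/ml), with a dichotomized adherence score as
# the exposure. Column names are hypothetical placeholders; this is a
# simplified, time-fixed illustration rather than the study's exact model.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.read_csv("cohort.csv")  # hypothetical analysis file
df["adherence_ge_5"] = (df["adherence_score"] >= 5).astype(int)

cph = CoxPHFitter()
cph.fit(
    df[["days_to_suppression", "suppressed",
        "adherence_ge_5", "prior_arv_experience", "baseline_log10_vl"]],
    duration_col="days_to_suppression",  # follow-up time on the index regimen
    event_col="suppressed",              # 1 = first viral load <= 400 copies/ml
)
cph.print_summary()  # a hazard ratio below 1 for adherence_ge_5 corresponds to a
                     # reduced hazard of achieving suppression with poorer adherence
```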
These findings are not directly comparable to the effects demonstrated in our earlier study for several reasons, including: (1) period effects (1998–1999 vs. 2003–2006) associated with changes in the potency and simplicity of antiretroviral regimens; (2) differences in prior treatment experience (22% vs. 60% antiretroviral naïve, comparing the earlier to the current study); (3) conditions of adherence measurement (written completion [earlier study] vs. computer-assisted [current study]); and (4) differences in analytic approach (outcomes analyzed cross-sectionally at fixed time points [earlier study] vs. longitudinally in continuous time [current study]). Nonetheless, the current results contribute to the predictive validation of the instrument as it has been used in routine clinical care of patients on antiretroviral therapy.
In a recent review of the status of HIV adherence measurement, Chesney presented a conceptual model of adherence assessment and intervention, distinguishing research from clinical applications and resource-rich from resource-poor settings. In discussing the "elusive gold standard" of adherence measurement, she emphasized that "efforts should continue to develop a portfolio of different valid and reliable self-report measures with varying strengths and weaknesses that can be optimally applied, depending on the situation" [3]. In that spirit, we discuss a number of challenges that emerged in exploring the relationship between routine longitudinal adherence measurement using the Owen Clinic instrument and viral suppression.
First, adherence score distributions in the current (Figure 2) and previous study were highly skewed, with most observations clustered in a range reflecting good adherence and the remainder distributed in the long tail of the distribution reflecting poorer adherence. The clustering of observations toward the excellent-adherence end of the distribution creates ceiling effects [4]. Others have noted the same phenomenon for other self-report measures [5–8]. The clustering of scores toward excellent adherence likely represents a mixture of responses from truly adherent patients and from others exhibiting social desirability bias [9]. Simoni et al. have commented on approaches to minimize both ceiling effects and social desirability bias in adherence assessment [10]. Comparison of self-report scores with independent and presumably more objective measures of adherence (e.g., pharmacy refill data, pill counts, EDM) offers an opportunity to assess the effect of social desirability bias. In other contexts, scales designed to measure social desirability as a construct have been used as covariates to account for such response bias in self-reported health behaviors [11, 12]. With regard to ceiling effects not contaminated by social desirability bias, designing items to capture more challenging aspects of adherence behavior, such as the timing of doses or dose taking at inconvenient times (e.g., at work, on weekends, or in the presence of persons unaware of the patient's diagnosis), has been recommended to mitigate the strict ceiling commonly observed in self-reported adherence. It should be noted, however, that our instrument included three items (Figure 1: items 2–4) addressing such recommended approaches.
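For readers who wish to examine these distributional properties in their own data, the degree of clustering at the good-adherence end of the scale and the agreement between self-report and an external measure such as EDM can be summarized in a few lines. The sketch below assumes hypothetical column names (adherence_score, edm_percent_adherence) and is illustrative only.

```python
# Sketch: quantify clustering at the good-adherence end of the index
# (lower scores indicate better adherence on this instrument) and
# correlate self-report scores with an external measure such as EDM.
# File and column names are hypothetical placeholders.
import pandas as pd
from scipy.stats import spearmanr, skew

df = pd.read_csv("adherence_scores.csv")

prop_best = (df["adherence_score"] == 0).mean()    # proportion at the best possible score
prop_below_5 = (df["adherence_score"] < 5).mean()  # proportion below the cut point
print(f"Skewness: {skew(df['adherence_score']):.2f}")
print(f"At best score: {prop_best:.1%}, below cut point: {prop_below_5:.1%}")

# Agreement with electronic drug monitoring (EDM), as in the developmental study
rho, p = spearmanr(df["adherence_score"], df["edm_percent_adherence"])
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```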
Second, the modeling of the adherence score is not straightforward. As constructed, the index is a discrete numerical measure with a possible range of 0–25, and its skewness is not amenable to a normalizing transformation. Although cut point selection for an underlying numerical measure may introduce bias in effect measurement [13] and may reduce power to detect effects in comparison with use of the numerical measure [14], cut point models are often preferred for their simplicity of data summarization and interpretation. Post hoc cut point selection, as pointed out by the authors of the STARD initiative [15] (Item 9), may not be replicable in other datasets. In modeling the effect of adherence score, we employed an approach adapted from Williams et al. [16], first exploring the functional form of the relationship between adherence score, treated as a numerical measure, and each outcome using smoothing regression splines as implemented by Royston and Sauerbrei in Stata, followed by cut point examination adjusted for multiple comparisons [17]. Cut points other than the lowest detectable cut points we have described could be recommended if alternative methods of correction for multiple comparisons were employed (e.g., cross-validation, split-sample approaches, or examination in independent data sets). It is of interest that in our earlier study, a similar cut point on the same instrument (≥ 5/< 5) was judged to be the most discriminating [1]. After examining the regression spline plots for both outcome metrics (Figures 3 and 4) in the current study, we judged that a cut point near 5 identified a region above which a monotonic relationship between adherence score and the outcome metrics was suggested. Based on these data, we believe that clinicians in our care setting should be alert to clinically significant adherence problems when scores are at or above 5.
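The two-step approach, flexible exploration of functional form followed by cut point examination with a multiplicity correction, can be illustrated with the sketch below. It is a Python analogue of, not a reproduction of, the Stata-based spline routines used in the analysis: it fits a logistic model with a B-spline term and then scans candidate cut points with a Bonferroni adjustment, ignoring within-patient correlation for simplicity. All column names and the candidate range are hypothetical.

```python
# Sketch of the two-step approach: (1) explore the functional form of the
# adherence score-outcome relationship with a flexible spline term, then
# (2) scan candidate cut points with a Bonferroni adjustment.
# Illustrative analogue only; column names are hypothetical placeholders,
# and within-patient correlation is ignored for simplicity.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("visits.csv")

# Step 1: spline fit to visualize the shape of the relationship
spline_fit = smf.glm(
    "suppressed ~ bs(adherence_score, df=4) + prior_arv_experience + baseline_log10_vl",
    data=df, family=sm.families.Binomial()).fit()

# Step 2: scan candidate cut points, Bonferroni-adjusting the p-values
candidates = range(1, 11)
for c in candidates:
    df["above_cut"] = (df["adherence_score"] >= c).astype(int)
    fit = smf.glm(
        "suppressed ~ above_cut + prior_arv_experience + baseline_log10_vl",
        data=df, family=sm.families.Binomial()).fit()
    p_adj = min(1.0, fit.pvalues["above_cut"] * len(candidates))
    print(f"cut >= {c}: OR = {np.exp(fit.params['above_cut']):.2f}, "
          f"adjusted p = {p_adj:.4f}")
```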
Third, because of the observational nature of the data, measurements of adherence and HIV plasma viral load were not scheduled to occur simultaneously. Typically, clinicians order viral load measurements every 3–6 months depending on clinical factors; adherence, in contrast, is measured in our clinic at all routine visits. Conceptually, adherence is a construct representing a daily health behavior for which various self-report indicators have been developed and mapped to estimates of percentage adherence over a defined period or, as in the case of the Owen Clinic instrument, given interpretability primarily through demonstrated association with viral suppression. Because of the staggered nature of data accrual in the clinic, decisions must be made regarding how to line up sequential viral load and adherence measures. At a conceptual level, it is a non-trivial question to decide over how long a period an adherence measure based on a limited recall period (4 days in the case of our instrument) can be extrapolated to preceding and future adherence behaviors, for which the self-report data represent an imperfect indicator. In our primary analysis, we assumed that a given adherence assessment could be carried forward no longer than 90 days. Whether the observations that are not temporally matched represent truly missing observations is debatable, since the nature of the data accrual process in clinical care did not require temporal matching of adherence and viral load measurement.

Because the LOCF principle has been criticized in recent years [18], we explored alternative analyses to evaluate the robustness of our findings. First, to determine whether the frequency of adherence measurement was related to adherence scores, such that longer intervals between measurements were associated with better or poorer adherence, we calculated rates of adherence measurement per 100 days of follow-up. We then divided the adherence measurement rate distribution into quartiles and used analysis of variance to test for equality of mean adherence scores across the quartiles, finding no significant difference (p = 0.89). This provided limited evidence that, in our data set, adherence scores were not systematically related to frequency of measurement, although others have found that missing adherence values were associated with nonadherence [19]. Second, we restructured the data set by grouping follow-up time into 6-month intervals, taking the median adherence score for the interval as representative and the last viral load in the interval as the outcome, and repeated the panel regression for longitudinal viral suppression. In a model comparable to that shown in Table 3, controlling for prior treatment experience and baseline log10 HIV viral load, the adjusted odds ratio for viral suppression was 0.14 (95% CI: 0.06–0.33, p < 0.0001) for a 6-month median adherence score greater than 5.
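Both data-restructuring steps described above, the 90-day carry-forward matching used in the primary analysis and the 6-month interval summarization used in the sensitivity analysis, can be sketched with pandas as follows. The file names, column names (including days_on_regimen), and the 183-day approximation of 6 months are hypothetical assumptions for illustration.

```python
# Sketch of the two data-restructuring steps described above.
# All file and column names are hypothetical placeholders.
import pandas as pd

vl = pd.read_csv("viral_loads.csv", parse_dates=["date"])
adh = pd.read_csv("adherence_scores.csv", parse_dates=["date"])

# (1) Primary analysis: pair each viral load with the most recent preceding
#     adherence score, carried forward no more than 90 days.
matched = pd.merge_asof(
    vl.sort_values("date"),
    adh.sort_values("date"),
    on="date",
    by="patient_id",
    direction="backward",               # most recent preceding adherence score
    tolerance=pd.Timedelta(days=90),    # carried forward no more than 90 days
)
n_unmatched = matched["adherence_score"].isna().sum()
print(f"{n_unmatched} viral loads had no adherence score within the prior 90 days")

# (2) Sensitivity analysis: group follow-up time into 6-month intervals,
#     taking the median adherence score and the last viral load per interval.
adh["interval"] = (adh["days_on_regimen"] // 183).astype(int)
vl["interval"] = (vl["days_on_regimen"] // 183).astype(int)
median_adherence = adh.groupby(["patient_id", "interval"])["adherence_score"].median()
last_vl = (vl.sort_values("days_on_regimen")
             .groupby(["patient_id", "interval"])["viral_load"].last())
panel = pd.concat([median_adherence, last_vl], axis=1).reset_index()
panel["suppressed"] = (panel["viral_load"] <= 400).astype(int)
```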
Finally, in a third analysis of maintenance of longitudinal viral suppression, mean adherence scores were calculated for the period immediately preceding each viral load measurement, yielding a score for each interval between viral load measurements. This operationalization of adherence was then fit in a GEE logit model for maintenance of viral suppression, again controlling for prior treatment experience and baseline log10 HIV viral load. The adjusted odds ratio for maintaining viral suppression for a mean interval adherence score greater than 5 was 0.28 (95% CI: 0.14–0.57, p < 0.0001). Therefore, although the adherence effect estimates were model dependent, the direction of effect was consistent and significant across models.
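For readers interested in the form of these longitudinal models, a minimal sketch of a GEE logit specification of the kind used for the maintenance-of-suppression analyses is given below, using statsmodels. The variable names and the exchangeable working correlation are illustrative assumptions rather than the exact specification fitted in the study.

```python
# Sketch of a GEE logit model for maintenance of viral suppression, with the
# mean interval adherence score dichotomized at > 5 as the exposure and an
# exchangeable working correlation for repeated measures within patient.
# Variable names and the working correlation are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

panel = pd.read_csv("suppression_intervals.csv")
panel["adherence_gt_5"] = (panel["mean_interval_adherence"] > 5).astype(int)

model = smf.gee(
    "maintained_suppression ~ adherence_gt_5 + prior_arv_experience + baseline_log10_vl",
    groups="patient_id",
    data=panel,
    family=sm.families.Binomial(),
    cov_struct=sm.cov_struct.Exchangeable(),
)
result = model.fit()

# Odds ratios with 95% confidence intervals
ors = np.exp(pd.concat([result.params, result.conf_int()], axis=1))
ors.columns = ["OR", "2.5%", "97.5%"]
print(ors)
```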