Measuring Hypotheticals: "Agent Starling, do you think you can dissect me with this blunt little tool?"

February 2, 2010

Everyone agrees that outcome measures in behavioral healthcare are desirable, long overdue, and ultimately the only way to produce the evidence that validates Evidence-Based Practices. Funders are eager to put them into practice to improve the quality of services and, more cynically, to find a justifiable method to deny payment or eligibility.

Realistically, when we are paying for something, we always want the most effective version possible. It’s like Jerry Seinfeld said in regard to over-the-counter medications: we only want the maximum dose possible. Jerry says, "I want you to find a dose that will kill me and then back off a little bit."

To measure treatment effectiveness, we have devised all kinds of multidimensional functional assessment scales. These instruments tend to measure phenomena that can be observed (or, less reliably, reported). Our state would like to use such scales as a managed care instrument to determine who should qualify for services. The problem is that, for people who are already receiving treatment, we then have to measure how they would function without it. The question is, how do you tease out the difference without stopping treatment?

To address this we have been told, "If services are in place that mask a need, the ratings should reflect the need, not the fact that the service is masking it. The purpose is to rate the needs of an individual and caretakers, not how they are functioning with services in place." So we are asking people to rate what the score would have been without services. Unless the clinician has actually seen the client without services, they must rely on reports, records, or, more likely, guesses. And of course, since the client may have actually improved or regressed while receiving services, no one can really tell what they would be like without services. This is a "total hypothetical." What sort of inter-rater reliability can you expect from that sort of rating? I wonder what a scale developer would say to this tweaking. It sort of reminds me of the old adage for literary reviewers: "One should always review the book that is there, not the one that the author should have written instead."

And to add to the challenge, for clients to continue to be authorized for services, clinicians must show progress using the same rating scales. In this case I think you are trying to measure how much the client has progressed from the hypothetical untreated need level. Basically you now have to figure out how to show a new and improved imaginary basal need.

Unfortunately, the services not only mask need, they also confound progress. So in theory a person could be functioning exactly the same at two points in time. However, at Point 1 the need level was high and at Point 2 the need level was moderate, but they looked identical due to the masking effect of the service. Theoretically, at Point 3 they could even be disqualified from services because their imaginary need has been met.
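To make the masking problem concrete, here is a toy sketch. Everything in it is an assumption made up for illustration: the function name, the 0-to-10 style numbers, and above all the idea that observed impairment is simply need minus support, floored at zero. No real rating scale is claimed to work this way.

```python
def observed_impairment(need: float, support: float) -> float:
    """Toy model (an assumption, not any real scale): services offset
    need, and observed impairment cannot drop below zero."""
    return max(0.0, need - support)


SUPPORT = 9.0  # hypothetical service intensity, held constant at both points

point_1 = observed_impairment(need=8.0, support=SUPPORT)  # high underlying need
point_2 = observed_impairment(need=5.0, support=SUPPORT)  # moderate underlying need

# Both points show the same observed functioning despite different needs.
print(point_1, point_2)    # 0.0 0.0
print(point_1 == point_2)  # True
```

In this sketch the rater sees the same score at both points, so nothing in the observation distinguishes a high need from a moderate one. Only the unobserved case with support set to zero would separate them, and that is exactly the case no one gets to see.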

The simplest solution would be to withdraw all services and just see what happens. The experimental analysis of behavior has provided "the reversal design," the most powerful of the single-subject research designs, sort of like the old "drug holidays." Unfortunately, service interventions cannot be reversed, because of those darn ethical reasons.
