within study conditions and at the individual level were similar and led to the conclusion that differences could be attributed to inherent difficulty of the claim, complexity of the sequential evaluation process, or the medical standards and guidelines themselves. Analyses further explored these sources of disagreement.
An analysis examined the agreement between study conditions when difficult claims were eliminated and found that the proportion of agreement between study conditions was 0.96 (kappa of 0.78), suggesting that problems in evaluation can be attributed largely to the inherent difficulty of claims. If the standards and guidelines led to markedly different disability judgments than the statute, then a far greater degree of disagreement would have been found between the two study conditions. Nonetheless, the findings do not rule out the possibility that improvements could be made to the standards and guidelines that would result in less difficulty and disagreement about disability status in the determination process.
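The gap between the reported proportion of agreement (0.96) and kappa (0.78) can be illustrated with a short calculation. The sketch below assumes the statistic is Cohen's kappa for two conditions and a binary allow/deny decision, which this section does not state explicitly. The 2x2 counts are hypothetical, chosen only so that the summary figures match those reported; they are not the study's data. The function names are illustrative as well.

```python
def cohen_kappa(table):
    """Return (observed agreement, kappa) for a square agreement table.

    table[i][j] is the number of claims placed in category i under one
    study condition and category j under the other.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of claims on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the marginal proportions, summed.
    row_share = [sum(row) / n for row in table]
    col_share = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_share, col_share))
    return p_o, (p_o - p_e) / (1 - p_e)


def implied_chance_agreement(p_o, kappa):
    """Solve kappa = (p_o - p_e) / (1 - p_e) for p_e."""
    return (p_o - kappa) / (1 - kappa)


# HYPOTHETICAL counts (not from the study): rows are one condition,
# columns the other; category order is [deny, allow].
example = [[88, 2],
           [2, 8]]

p_o, kappa = cohen_kappa(example)
print(round(p_o, 2), round(kappa, 2))  # 0.96 0.78

# Chance agreement implied by the reported figures, under the
# Cohen's kappa assumption.
p_e = implied_chance_agreement(0.96, 0.78)
print(round(p_e, 2))  # 0.82
```

Under these assumptions, roughly 0.82 of the agreement would be expected by chance alone, which happens when most claims fall into a single decision category. This is why kappa sits well below the raw proportion of agreement while still indicating substantial agreement.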
The next series of analyses examined the agreement between study conditions for each of the seven categories of mental impairment of the Listings as reviewed using the PRTF. Three of the Listings categories (Organic Mental Disorders; Schizophrenia, Paranoid and Other Psychotic Disorders; and Anxiety Disorders) were found to work well. Affective Disorders showed a statistically significantly greater chance of disagreement about disability decisions than other disorders. Personality Disorders and Mental Retardation and Autism9 had low agreement rates, but the differences were not statistically significant; the sample sizes for these two categories were small. The sample size for Somatoform Disorders was too small to interpret its high disagreement rate properly.
Within the Listings, analyses explored agreement in the use of the A, B, and C criteria. Moderately high rates of agreement on the selection of the A criteria under which to adjudicate a claim were found for Organic; Schizophrenic, Paranoid and Other Psychotic; Affective; Mental Retardation and Autism; and Anxiety Disorders. There was less concurrence in panelists’ selection of Somatoform Disorders and Personality Disorders as the Listing categories under which to adjudicate a claim. Ratings of agreement for the B criteria were reasonably high in the aggregate, but less reliable for the individual B criteria. Additional analysis found the first three B criteria (activities of daily living, social functioning and deficiencies in concentra-