Appendix D | Coverage Matters: Insurance and Health Care | Committee on the Consequences of Uninsurance | Board on Health Care Services | Institute of Medicine







D

Multivariate Analyses



In the two tables that follow, the Committee reports estimates of how much uninsured rates may be influenced by specific socioeconomic, demographic, and geographic characteristics alone. These estimates were prepared for comparison with the uninsured rates presented in the body of this report. They were derived by means of multivariate statistical analysis, using data from the 2000 Current Population Survey (CPS) in the form of a derived variable file made available to the Committee by Paul Fronstin and the Employee Benefit Research Institute.1 Four sets of analyses were performed to estimate and predict differences in uninsured rates by:

  1. socioeconomic characteristics;
  2. race and ethnicity;
  3. immigrant and nativity status, both alone and specifically by race and ethnicity; and
  4. geographic areas.

Tables D.1 and D.2 each present two sets of results for these analyses. The first column of each table reports comparisons of the likelihood of being uninsured between a group of interest and a reference group. For example, Hispanics are compared with non-Hispanic whites, with the difference in uninsured rate between the two groups reported in terms of percentage points. The second column reports a comparison between the same two groups as in the first column but taking into consideration, or adjusting for, population characteristics that are known to affect the likelihood of being uninsured and which often are closely related to, or highly correlated with, the group's identifying characteristic, for example, race and ethnicity.

For all four sets of comparisons, a series of logistic regression equations was prepared to estimate and predict uninsured rates, with the method of adjusting for population characteristics differing for each of the four sets of comparisons.2 Except for the analysis of immigrant and nativity status, no interaction terms were used, on the assumption that change in the value of a measured characteristic (or covariate) is unlikely to lead to change in the values of other covariates in unique ways.

For the analyses by race and ethnicity, to arrive at the estimated difference reported in the first column of Table D.1, a logistic regression model was created to estimate the likelihood of being uninsured for the reference group and all comparison groups, taking into consideration or adjusting for all measured characteristics other than race and ethnicity.3 This regression model also yielded a set of values, or coefficients, each of which describes the relative influence on the uninsured rate of a characteristic included in the model, for the reference group (e.g., non-Hispanic whites). To estimate the predicted differences reported in the second column of Table D.1, the same logistic regression model was used, combining the coefficients generated by the regression model for the reference group and the values (or covariate data) of the population characteristics (e.g., age, gender, health status) that describe the comparison group (e.g., Hispanics). The difference between this predicted likelihood of being uninsured (reported in column 2) and the reference group's estimated likelihood of being uninsured reflects differences between the comparison and reference group other than those reflected in the values for each population's measured characteristics.4 Linear regression was used to evaluate the size and statistical significance of the difference (reported in column 2) between the predicted likelihood and the comparison group's estimated likelihood.5 Because the results are presented in terms of differences between comparison and reference groups, a reported value of minus 1.1, for example, indicates an uninsured rate that is 1.1 percentage points below the uninsured rate for the reference group.
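The prediction step described above, in which reference-group coefficients are combined with comparison-group covariate data, can be sketched roughly as follows. This is a minimal illustration and not the Committee's code: the coefficient values, the covariate names (`low_income`, `young_adult`, `fair_poor_health`), the person records, and the weights are all hypothetical, and the function names are invented for the example.

```python
import math

# Hypothetical reference-group coefficients (illustrative values only,
# not estimates from the CPS analysis).
REF_INTERCEPT = -2.0
REF_COEFS = {"low_income": 1.2, "young_adult": 0.6, "fair_poor_health": 0.3}


def predicted_probability(intercept, coefs, person):
    """Logistic prediction of the probability of being uninsured for one person."""
    z = intercept + sum(beta * person[name] for name, beta in coefs.items())
    return 1.0 / (1.0 + math.exp(-z))


def weighted_mean_prediction(intercept, coefs, people):
    """Survey-weighted average of the predicted probabilities in a group."""
    total_weight = sum(p["weight"] for p in people)
    weighted_sum = sum(
        p["weight"] * predicted_probability(intercept, coefs, p) for p in people
    )
    return weighted_sum / total_weight


# Made-up covariate data for a reference group and a comparison group.
reference_group = [
    {"low_income": 0, "young_adult": 0, "fair_poor_health": 0, "weight": 3.0},
    {"low_income": 0, "young_adult": 1, "fair_poor_health": 0, "weight": 2.0},
    {"low_income": 1, "young_adult": 0, "fair_poor_health": 1, "weight": 1.0},
]
comparison_group = [
    {"low_income": 1, "young_adult": 1, "fair_poor_health": 0, "weight": 2.0},
    {"low_income": 1, "young_adult": 0, "fair_poor_health": 1, "weight": 2.0},
    {"low_income": 0, "young_adult": 1, "fair_poor_health": 0, "weight": 1.0},
]

# Reference-group coefficients applied to the reference group's own covariates.
reference_rate = weighted_mean_prediction(REF_INTERCEPT, REF_COEFS, reference_group)

# The same coefficients applied to the comparison group's covariates:
# the rate the comparison group would be predicted to have if its
# characteristics influenced coverage as they do in the reference group.
predicted_rate = weighted_mean_prediction(REF_INTERCEPT, REF_COEFS, comparison_group)

print(f"reference group rate:        {100 * reference_rate:.1f}%")
print(f"comparison group, predicted: {100 * predicted_rate:.1f}%")
```

In the appendix, predictions of this kind are then compared with the groups' estimated uninsured rates by means of a weighted linear regression; the sketch stops before that step.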

For example, the logistic regression model based on population characteristics in our CPS data set gives an estimated uninsured rate for Hispanics that is 22.2 percentage points higher than the estimated uninsured rate for non-Hispanic whites, a difference that is statistically significant. If the differences in each population's measured characteristics influenced the likelihood of being uninsured in identical ways for both groups, and if there were no other influences on uninsured rates, the predicted difference between the uninsured rates for Hispanics and non-Hispanic whites should be zero. Instead, the predicted difference is 7.2 percentage points, a difference that is both statistically significantly different from zero and 15 percentage points smaller than the unadjusted difference between the two groups. Therefore, approximately two-thirds of the difference in estimated uninsured rates between Hispanics and non-Hispanic whites reflects differences in the values of measured population characteristics between these two groups, while approximately one-third of the difference reflects other factors that were not measured by the CPS data set or modeled in the multivariate analysis.
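The arithmetic behind the two-thirds and one-third shares can be checked with a short calculation. The sketch below uses only the two figures given in the text (22.2 and 7.2 percentage points); the function name `decompose_gap` and its dictionary keys are invented for the illustration.

```python
def decompose_gap(unadjusted_points, adjusted_points):
    """Split an unadjusted gap into the part associated with measured
    characteristics and the part that remains after adjustment."""
    explained_points = unadjusted_points - adjusted_points
    return {
        "explained_points": explained_points,
        "explained_share": explained_points / unadjusted_points,
        "remaining_share": adjusted_points / unadjusted_points,
    }


# Hispanics compared with non-Hispanic whites, figures from the text.
result = decompose_gap(unadjusted_points=22.2, adjusted_points=7.2)

print(f"reduction after adjustment: {result['explained_points']:.1f} points")  # 15.0 points
print(f"share associated with measured characteristics: {result['explained_share']:.1%}")  # 67.6%
print(f"share reflecting other factors: {result['remaining_share']:.1%}")  # 32.4%
```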

The analyses by immigrant and nativity status were similar to the analysis by race and ethnicity. For each group identified by race and ethnicity (e.g., Hispanic, non-Hispanic African American, and other), a logistic regression model was prepared to estimate an uninsured rate and a set of coefficients for a reference group of U.S.-born citizens. The difference in estimated uninsured rates between each comparison group (e.g., foreign born, short-term resident, long-term resident) and the reference group is reported in column 1, stratified by race and ethnicity. To estimate the predicted differences in estimated uninsured rates reported in column 2, logistic regression models were prepared for each racial and ethnic group in which the coefficients for the reference group were combined with covariate data for each comparison group.6 A preliminary analysis of the data suggested that stratifying the multivariate analysis by race and ethnicity would allow for the observation of important differences among populations of immigrants and naturalized citizens, which is especially useful for understanding uninsured rates within the Hispanic population.

The analyses by poverty level and education level of primary wage earner and the analysis by state were conducted using an approach that differed only slightly from the analysis by race and ethnicity. To obtain the estimated differences reported in the second column, a logistic regression model was created to estimate the likelihood of being uninsured for both the reference groups (e.g., families earning greater than 200 percent of the federal poverty level, and primary wage earner with postcollege education) and all comparison groups. This model took into account or adjusted for all the measured characteristics save for the characteristics of poverty level and education or state (in Table D.2). Linear regression analysis was used to evaluate the size and statistical significance of the difference (reported in column 2) between the adjusted likelihood of being uninsured for each comparison group and that of each reference group.7
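The weighted comparison described above can be illustrated with a small sketch. It rests on an assumption not stated in the appendix: that the linear regression regresses each person's adjusted likelihood on a 0/1 indicator for membership in the comparison group, using sampling weights. Under that assumption the slope equals the weighted difference in group means. The data, weights, and function names are hypothetical, and the sketch does not compute standard errors or significance tests.

```python
def weighted_mean(values, weights):
    """Weighted average of a list of values."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_slope(y, x, w):
    """Slope from a weighted least-squares regression of y on one regressor x."""
    x_bar = weighted_mean(x, w)
    y_bar = weighted_mean(y, w)
    covariance = sum(wi * (xi - x_bar) * (yi - y_bar) for yi, xi, wi in zip(y, x, w))
    variance = sum(wi * (xi - x_bar) ** 2 for xi, wi in zip(x, w))
    return covariance / variance


# Hypothetical adjusted likelihoods of being uninsured, one per person.
adjusted_likelihood = [0.10, 0.14, 0.12, 0.30, 0.34, 0.26]
# 0 = reference group, 1 = comparison group.
in_comparison_group = [0, 0, 0, 1, 1, 1]
# Hypothetical sampling weights.
sampling_weight = [2.0, 1.0, 1.0, 1.0, 3.0, 2.0]

slope = weighted_slope(adjusted_likelihood, in_comparison_group, sampling_weight)

# Direct check: weighted mean in each group, then the difference.
ref_values = [v for v, g in zip(adjusted_likelihood, in_comparison_group) if g == 0]
ref_weights = [w for w, g in zip(sampling_weight, in_comparison_group) if g == 0]
cmp_values = [v for v, g in zip(adjusted_likelihood, in_comparison_group) if g == 1]
cmp_weights = [w for w, g in zip(sampling_weight, in_comparison_group) if g == 1]
direct_difference = weighted_mean(cmp_values, cmp_weights) - weighted_mean(ref_values, ref_weights)

print(f"regression slope:  {100 * slope:.1f} percentage points")
print(f"direct difference: {100 * direct_difference:.1f} percentage points")
```

Framing the comparison as a regression rather than a simple subtraction makes it straightforward to attach weights and, in a full analysis, to obtain a standard error for the difference.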

The estimates reported in Tables D.1 and D.2 indicate that there is considerable variation, both in how much specific characteristics may influence a group's uninsured rate, independently of other measured characteristics, and in how much variation between uninsured rates is not accounted for by the measured characteristics used in the models. For example, the average uninsured rate for members of families earning less than 100 percent of FPL is estimated to be 24.2 percentage points higher than the average uninsured rate for members of families earning at least 200 percent of FPL. If members of families earning less than 100 percent of FPL as a group resembled members of families earning at least 200 percent of FPL in their other measured characteristics, the uninsured rate for members of families earning less than 100 percent of FPL would be predicted to be 15.3 percentage points higher than that of the reference group, a 9 percentage-point or 37 percent diminution in the difference between uninsured rates. The 63 percent of the difference that remains cannot be attributed to differences in the measured characteristics (other than poverty level and education) and is not addressed by the models in this specific analysis. One would expect fairly large proportions of the differences in uninsured rates to remain unaccounted for by or associated with the specific characteristics evaluated, because there are many aspects of socioeconomic status, demographic characteristics, health status, and geography that are not measured in this analysis.

In every case, controlling for other correlated factors that influence insurance status reduces the estimated effect of a factor examined in the simple bivariate comparisons. In no case, however, were the effects of those factors completely accounted for by the additional covariates.


Notes

1 The Committee's analysis considers family units defined in terms of kin relationships, which may give different estimates than other analyses that are cited in this report and based on CPS data, in which family units are defined in terms of insurance eligibility.

2 The Committee's analysis follows the method used by Ku and Matani, 2001, for using logistic regression models to estimate the probability of being uninsured, with comparisons between reference groups and comparison groups reported in terms of percentage point differences (in the case of Ku and Matani, in estimated mean change in the probability of having a specified source of coverage or being uninsured).

3 The first step of the adjustment process included state fixed effects to control for state policy and other differences that would generate intra-state cluster effects.

4 An alternative approach would be to prepare a single logistic regression with covariates for the characteristic of race and ethnicity and all other characteristics, plus interaction terms to describe the relationships between the characteristic of race and ethnicity and all of the other characteristics. Our adjusted comparison would consist of the difference in the probability predictions between what happens for the reference group and each of the comparison groups. This approach would require that the full model (including all covariates) be estimated for each subgroup, which is difficult given the size of some subgroups. The decision was made to limit the number of terms in the full model, because the main concern of the analysis is to evaluate the overall effect, or differences between estimated uninsured rates, rather than values of specific coefficients.

5 The linear regression includes weights to account for differential sampling at each stage of the analysis.

6 Linear regression was used to evaluate the adjusted comparison, with correction for oversampling and robust standard errors.

7 For the linear regression analysis of the multivariate analyses by poverty and education and by state, weights were included to account for differential sampling.




Copyright 2001 by the National Academy of Sciences