PSID FAQ


Sample weights

1.	Why does the PSID provide weights for analysis?

The PSID sample combines the SRC (Survey Research Center) and SEO (Survey of Economic Opportunity) samples. Both samples are probability samples (i.e., samples for which every element in the population has a known nonzero chance of selection). Their combination is also a probability sample. The combination, however, is a sample with unequal selection probabilities, and as a result, compensatory weighting is needed in estimation, at least for descriptive statistics. Weight adjustments are also needed to attempt to compensate for differential nonresponse in 1968 and subsequent waves. Weights supplied on PSID data files are designed to compensate for both unequal selection probabilities and differential attrition.
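As an illustration of why compensatory weighting matters for descriptive statistics, here is a minimal sketch that compares an unweighted mean with a mean weighted by inverse selection probabilities. The incomes and selection probabilities are made-up toy values, not PSID data, and the PSID weights themselves also include nonresponse adjustments that this sketch does not attempt to reproduce.

```python
def weighted_mean(values, weights):
    """Return the weighted mean of values; weights are assumed non-negative."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical toy data: the first two families come from a group that was
# sampled at twice the rate of the last two (an oversample).
incomes = [20000.0, 22000.0, 60000.0, 80000.0]
selection_probs = [0.002, 0.002, 0.001, 0.001]  # made-up probabilities

# Base weight = inverse of the selection probability.
weights = [1.0 / p for p in selection_probs]

unweighted = sum(incomes) / len(incomes)
weighted = weighted_mean(incomes, weights)

print("unweighted mean:", unweighted)
print("weighted mean:  ", weighted)
```

Because the oversampled families are given smaller weights, the weighted mean is pulled toward the families that were sampled at the lower rate.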

In 1997, the Panel Study of Income Dynamics (PSID) underwent several important design changes that affect weighting. Leading these changes was a reduction of roughly one third in the number of PSID Core families eligible for continued longitudinal data collection. A second important change was the addition of a nationally representative sample of immigrant households and individuals who would not have been eligible for PSID under the original 1968 sample recruitment and sample family "following rules". The 1997 data collection year also began the transition to biennial (every second year) data collection for PSID. Finally, the 1997 PSID data collection included a special supplemental study of children aged 0-12 in PSID Core and Immigrant Supplement families. Additional documentation describing the weights is provided on the documentation page.

2.	What variables should I use for complex sample survey variance estimation?

Variables ER31996 and ER31997 are used for computing complex sample design corrected standard errors and variance estimates via the Taylor Series Linearization or Repeated Replication methods. These variables may be used with a variety of software programs that incorporate the complex sample design into variance estimation, including Stata, SAS, SUDAAN, SPSS and others. The Sampling Error Stratum variable (ER31996) may be specified as the "Stratum variable" in the design specification, and the Sampling Error Cluster variable (ER31997) may be specified as the "Cluster variable". A discussion of sampling error estimation in design-based analysis of the PSID data is available in the PSID documentation.
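To make the role of the stratum and cluster variables concrete, the following is a minimal sketch of a Taylor series linearization standard error for a weighted mean. It assumes clusters are sampled with replacement within strata and that every stratum contains at least two clusters. The function name, argument names, and toy data are hypothetical; in a real analysis the stratum and cluster arguments would hold the values of ER31996 and ER31997, and established survey software is the safer choice for production work.

```python
from collections import defaultdict
from math import sqrt


def linearized_se_of_mean(y, weight, stratum, cluster):
    """Weighted mean and its linearized standard error.

    Assumes with-replacement sampling of clusters within strata.
    stratum would hold ER31996 values, cluster would hold ER31997 values.
    """
    total_weight = sum(weight)
    mean = sum(yi * wi for yi, wi in zip(y, weight)) / total_weight

    # Sum the linearized values w * (y - mean) / W within each
    # stratum-by-cluster cell.
    totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, c in zip(y, weight, stratum, cluster):
        totals[h][c] += wi * (yi - mean) / total_weight

    variance = 0.0
    for h, clusters in totals.items():
        n_h = len(clusters)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} has fewer than two clusters")
        z = list(clusters.values())
        z_bar = sum(z) / n_h
        # Between-cluster variation of the cluster totals in stratum h.
        variance += n_h / (n_h - 1) * sum((zc - z_bar) ** 2 for zc in z)

    return mean, sqrt(variance)


# Hypothetical toy data: two strata, two clusters per stratum.
y = [10.0, 12.0, 20.0, 22.0, 30.0, 34.0, 40.0, 44.0]
weight = [1.0, 2.0, 1.0, 2.0, 3.0, 1.0, 3.0, 1.0]
stratum = [1, 1, 1, 1, 2, 2, 2, 2]
cluster = [1, 1, 2, 2, 1, 1, 2, 2]

mean, se = linearized_se_of_mean(y, weight, stratum, cluster)
print("weighted mean:", mean)
print("linearized SE:", se)
```

Cluster codes only need to be unique within a stratum here, since the totals are keyed by stratum first and cluster second.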

3.	Why do some cases exist where there is data available for certain variables, but the family weight is equal to zero?

These are families that contained no sample members. They are not mistakes in the data; rather, they are cases where information was gathered about individuals not directly linked to a sample member. The PSID purposely followed some nonsample individuals, e.g., the nonsample elderly (1990-1996) and nonsample parents (1994-2003). In some cases, responding families contain only followable but nonsample individuals, so all the individual weights, and therefore the family weight, are zero for these cases.
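A short sketch of how zero-weight families behave in a weighted analysis may help. The records and field names below (family_id, weight, income) are hypothetical and do not correspond to actual PSID variable names. The point is only that a zero weight removes a family from weighted estimates, so such families can be identified and set aside explicitly.

```python
# Hypothetical family records; family 2 contains no sample members,
# so its family weight is zero even though income was collected.
families = [
    {"family_id": 1, "weight": 12.5, "income": 40000.0},
    {"family_id": 2, "weight": 0.0, "income": 55000.0},
    {"family_id": 3, "weight": 7.5, "income": 30000.0},
]


def weighted_mean_income(records):
    """Weighted mean of income over the given family records."""
    total_weight = sum(r["weight"] for r in records)
    return sum(r["weight"] * r["income"] for r in records) / total_weight


# Families with data but a zero weight.
zero_weight_ids = [r["family_id"] for r in families if r["weight"] == 0]

# Dropping zero-weight families leaves the weighted estimate unchanged.
positive = [r for r in families if r["weight"] > 0]
weighted_all = weighted_mean_income(families)
weighted_positive = weighted_mean_income(positive)

print("zero-weight families:", zero_weight_ids)
print("weighted mean (all):", weighted_all)
print("weighted mean (positive weights only):", weighted_positive)
```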