
PSID Frequently Asked Questions

  (see also Data Center FAQ, CDS FAQ)
   
 
  1. How has the occupation-industry code classification system changed?

  2. How is the PSID sponsored and funded?

  3. The data files that are posted for each new wave are called Public Release. What does Public Release mean?

  4. In some older documents there is reference made to Public Release II (formerly Final Release) and Public Release I (formerly Early Release) data -- what does this mean?

  5. How can I get started analyzing PSID data?

  6. How do I go about ordering a CD-ROM?

  7. How do I merge family- and individual-level files?

  8. How does the amount of data collected in each wave vary by family unit members?

  9. What is the definition of a main or reinterview family?

  10. What is the difference between a family unit (FU), a household unit (HU), and a family unit member?

  11. What data are available in the area of housing?

  12. Who is a Sample Member and what is Follow Status?

  13. What is the difference between response and nonresponse family unit members?

  14. How can I identify families from year to year?

  15. Do family ID numbers vary from year to year?

  16. How is Head defined in the PSID?

  17. When is a new Head selected?

  18. How can I tell the current Head and Wife from mover-out Head and Wife? Why is this important?

  19. Who are the Husbands of Heads?

  20. How do I assemble a Head/Wife file from an individual file?

  21. How do I bring forward background items such as education, race, occupation and industry, etc. for Heads and Wives/"Wives"?

  22. What is the definition of a splitoff family?

  23. How can I identify splitoffs from the main family?

  24. How is an individual uniquely identified?

  25. How can I determine if data will be collected about an individual who is not present in an interviewed family?

  26. How can I determine which variables are comparable across years?

  27. Where can I obtain information regarding release dates for files?

  28. How can I tell if a variable value is actual or imputed?

  29. How can I identify the SEO (Survey of Economic Opportunity) sample and the SRC (Survey Research Center) sample?

  30. For what years are Latino data available? How do Latino data differ from immigrant data?

  31. Where can I find information on the 1997-1999 Immigrant Sample?

  32. Are cohabitors treated differently from legally married couples?

  33. Why does the PSID provide weights for analysis?

  34. What variables should I use for complex sample survey variance estimation?

  35. Why are the values for Heads and Wives/"Wives" employment status "0" in the 1994-2007 Individual Files? Where can I find non-missing data for employment status for Heads and Wives/"Wives" 1994-2007?

  36. How does the PSID distinguish between main and secondary jobs in the data files?

  37. What information about physical and mental health is collected by the PSID?

  38. How has the occupation-industry code classification system changed?

  39. When is data collected for each wave of the study?

  40. How often are data collected for the PSID study?

  41. How does the geographic information in the public release files differ from the information available in the restrictive Geocode Match files?

  42. How do I go about obtaining special permission to use the Geocode files?

  43. Where can I get information on the kinds of research that has been done using the PSID data?


  1.   How has the occupation-industry code classification system changed?

    The PSID used a one-digit occupation code, and later a two-digit code, until 1981, when the three-digit 1970 Census code became standard for the main jobs of employed Heads and Wives. The three-digit code was also used for the most recent jobs held by Heads and Wives who were currently unemployed and looking for work, and for any job held in 1980 by a Head or Wife who was currently retired or no longer in the labor force. Starting in 2003, all occupation-industry data have been coded using the three-digit 2000 Census code. A retrospective coding project was also completed that applied the 2000 Census code to the beginning occupation and industry of all Heads and Wives as of 2003 and to those of their fathers and mothers; these variables are included in the background portion of the family data file.


    [Top]

  2.   How is the PSID sponsored and funded?

    Over the life of the project, funding for the Panel Study of Income Dynamics has been provided by a number of government agencies, foundations, and other organizations. While the PSID's original funding agency was the Office of Economic Opportunity of the United States Department of Commerce, the study's major funding sources are now the National Science Foundation and the National Institute on Aging. Substantial additional funding has been provided by the National Institute of Child Health and Human Development, the Office of the Assistant Secretary for Planning and Evaluation of the United States Department of Health and Human Services, the Economic Research Service of the United States Department of Agriculture, the United States Department of Housing and Urban Development, and the United States Department of Labor and private foundations.

    The long-term viability of the Panel Study of Income Dynamics and other publicly-supported data resources depends in large part upon recognition provided to funding agencies. Therefore, PSID study staff encourage researchers to acknowledge the PSID's major funders as well as the agencies responsible for funding question sequences relevant to their research domains.

    This material is based upon work supported by the National Science Foundation under Grant No. 9515005 and the National Institute on Aging under Grant Nos. R01AG6671 and Y1AG9188.

    Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation, the National Institute on Aging, or any other sponsoring agency.



    [Top]

  3.   The data files that are posted for each new wave are called Public Release. What does Public Release mean?

    Public Release Data
    All Public Release data files have been processed and edited, and should meet the research needs of all users. Data are subject to revision based on the most recent information received from individuals and families, and contain many computed variables. Most of the computed variables (e.g., Income Plus and Wealth) can be easily merged by using the Data Center.


    Over the past several years the PSID staff, using Computer Assisted Telephone Interview (CATI) technology and companion processing software, have significantly improved the quality, reliability, and timeliness of released data files. We now refer to the files posted for each new wave as Public Release Data. Note that:

    1. Longitudinal data are subject to revision based on the most recent information received from individuals and families (i.e., family composition and economic editing). As additional data are collected over time on our two-year collection cycle, prior files may be edited in light of the new information. Both the values of the variables themselves and the relationships of individuals to the families to which they are connected may be edited. These edits normally affect a small number of cases.
    2. An extensive set of computed or generated variables is included in the Public Release Data. As time and resources allow, we occasionally add selected new generated variables for later release.

    The PSID data files, like the data files from any complex longitudinal study, are subject to minor changes and subsequent updated releases, due primarily to economic and family composition editing. Users are therefore strongly advised to save all data files that they download from this site and on which their research analysis depends. Only the most current data files can be retained by PSID staff for distribution.



    [Top]

  4.   In some older documents there is reference made to Public Release II (formerly Final Release) and Public Release I (formerly Early Release) data -- what does this mean?

    The term 'Public Release I' was previously used to refer to files released for general public use after they had been checked for data quality and for consistency in the reported family listing and relationships among family members (family composition editing).

    The term 'Public Release II' was previously used to refer to files which had undergone additional data checks to correct a very small number of cases and had been formatted in a more convenient form.

    Because of successive improvements in our Computer Assisted Telephone Interviewing (CATI) software, the quality of the Public Release I files improved dramatically in recent waves, allowing the use of these data with confidence. The differentiation between Public Release I and Public Release II has now been dropped altogether.

    In addition, we are making available as supplemental files for recent years a set of key economic variables. These variables require extensive effort to construct and were therefore traditionally released as part of the Public Release II files. They are: Family Income and its components and Family Needs Standards (see the Family 'Income Plus' Files); the Wealth Files; Annual Work (Unemployment, and other weeks), Market Work Hours of the Head and Wife, and Average Annual Hourly Earnings of the Head and Wife (Work Hours and Wages Files); and Family and Individual Weights (1994-2001 Weights). As these supplemental files become available, they are posted with the necessary IDs.

    Another improvement in these Public Release files from 1999 forward is the editing for inconsistencies in family composition using our new graphical-user-interface PSID Editing System, which allows editing of the complex relationship files in the PSID. This is known as "Family Composition Editing." The PSID follows a bloodline, which means that essentially individuals and their lineal descendants are followed. This gives rise to potentially great power for intergenerational analysis, but only if these relationships are recorded accurately with IDs and the right persons are followed. In addition, the CATI application collects data for generic entities, such as "Head of household" or "Wife of Head of household." Yet an individual's status as Head or Wife can change from year to year through events such as divorce, marriage, or the death of a spouse. For this reason, data quality depends very heavily on the family composition editing we have done for both the Public Release I and II files.



    [Top]

  5.   How can I get started analyzing PSID data?

    Reading an overview of the PSID is a great place to start learning about the study. The PSID User Guide provides additional information and historical context. Five tutorials provide step-by-step instructions on downloading and analyzing the data in a variety of ways: cross-sectional analyses, tracking longitudinal changes, creating and analyzing a balanced panel, creating and analyzing a customized subset of CDS modules, and merging PSID and CDS data together for intergenerational analysis.

    The Data Center is the most popular means for obtaining PSID data, and it delivers about 10,000 customized data files to researchers and quantitative social science students each year.  The Data Center is fully automated and allows for user-specified subsetting criteria when downloading and merging main family and main individual data files.  ASCII or SAS data files can be generated, along with OSIRIS, SAS, SPSS, and Stata data definition statements. 

    In addition to the Data Center, data and documentation can also be obtained from the PSID Web site in the form of prepackaged files. Data files and SAS and SPSS data definition statements are available via the Questionnaires and Packaged Data and Documentation page.

    Data from the Geocode files, Death files, and Medicare files are available by special contract.  See the PSID Overview page or contact PSID Help for more information.  



    [Top]

  6.   How do I go about ordering a CD-ROM?

    Almost all of the PSID data can be downloaded directly from our website via the Data Center. CD-ROMs are no longer available.



    [Top]

  7.   How do I merge family- and individual-level files?

    The Data Center provides automatic and customized merges of family- and individual-level files. For the analyst who prefers to write their own programming code to merge data downloaded from our zip packages, sample SAS and SPSS programs have also been prepared to assist users with creating cross-year analysis files.

    The Data Center automatically adds appropriate identification variables to your data cart, based on the types of data you have selected.



    [Top]

  8.   How does the amount of data collected in each wave vary by family unit members?

    In general:

    • The most detail is collected for the Head.
    • Substantial detail is collected for the Wife/"Wife".
    • Some detail is collected for other family unit members (OFUMs).


    [Top]

  9.   What is the definition of a main or reinterview family?

    A main family is a family unit that was interviewed in the prior wave. In a divorce or separation, the main family could be the family unit of either spouse, depending on who was found first for the interview. In the case of children leaving home, the main family is almost always the parental family.



    [Top]

  10.   What is the difference between a family unit (FU), a household unit (HU), and a family unit member?

    In the PSID study, we are attempting to learn about people who are living within a family unit (FU). The FU is defined as a group of people living together as a family. They are generally related by blood, marriage, or adoption, but unrelated persons can be part of a FU if they are permanently living together and share both income and expenses. Families change from year to year. The household unit (HU) is defined as the physical boundary, such as a house or apartment, where members of the PSID FU reside. Not everyone living in a HU is automatically part of the FU.

      The PSID survey is about FU members only.



    [Top]

  11.   What data are available in the area of housing?

    The PSID collects a wide range of data about housing, including housing type, characteristics, ownership, taxes, and insurance. For a complete list, see this PDF document.



    [Top]

  12.   Who is a Sample Member and what is Follow Status?

    Sample Members are individuals who were living in the original FU at the time of the very first interview, or their offspring born since then. For subsequent samples, such as the immigrants, the year of the first interview serves as the base for determining who is an original sample member, and all individuals present in the family at that time qualify. Original Sample Members receive Person Numbers in the range of 001-019. A second group of original Sample Members are children of the Head/Wife who were under age 25 and who were in institutions the first year of interviewing. These persons receive Person Numbers of 021-029. Similarly, the Head's spouse, if in an institution the first year, is considered an original sample member and receives a Person Number of 020. Individuals who were born into a sample family after the first interviewing year and have a sample parent are considered born-in Sample Members and receive Person Numbers in the range of 030-169. All other people who have ever lived in a PSID family are not sample individuals and receive Person Numbers of 170 or greater.

    Follow status indicates whether we are interested in continuing to interview an individual. In general, sample members are always considered Followable. Non-Sample Members can be Followable too, if they represent a population of current interest. For example, we have in the past followed Non-Sample parents of sample children who were aged 25 or younger.
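
    The Person Number ranges described above can be turned into a small lookup. The following is a minimal sketch in Python; the function name and the group labels are illustrative assumptions, not PSID terminology, and the ranges are taken directly from the text of this answer.

```python
def classify_person_number(person_number: int) -> str:
    """Map a PSID Person Number to the group described in this FAQ answer.

    The labels returned here are illustrative; only the numeric ranges
    come from the FAQ text.
    """
    if 1 <= person_number <= 19:
        return "original sample member"
    if person_number == 20:
        # Head's spouse who was in an institution in the first year
        return "original sample member (institutionalized spouse)"
    if 21 <= person_number <= 29:
        # Children of Head/Wife under 25 who were in institutions in the first year
        return "original sample member (institutionalized child)"
    if 30 <= person_number <= 169:
        return "born-in sample member"
    if person_number >= 170:
        return "nonsample"
    raise ValueError(f"Person Number must be positive, got {person_number}")
```

    For example, classify_person_number(31) returns "born-in sample member" and classify_person_number(170) returns "nonsample".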



    [Top]

  13.   What is the difference between response and nonresponse family unit members?

    Response family unit members are those residing in an interviewed family at the time of interview. Nonresponse family unit members are those not residing in an interviewed family at the time of interview; they may have attrited, not yet appeared in the study, or not yet been born by a particular wave.

    The phrase "main family nonresponse" means that both the individual and his or her family have at that time become lost to our study, although either or both may reappear in the study in subsequent waves. In the wave just prior to becoming nonresponse, the individual was connected with a family interviewed by our study; thus, both family and individual data are available for that prior year, and the individual's Sequence Number at that time was 01-59. However, data were collected for neither the individual nor his or her family in the nonresponse wave. The data for the wave in which nonresponse occurs (and for all subsequent waves, unless and until the individual reappears as a member of a responding family unit, including a recontact family) are zeroes, except for the variables for type of individual record and reason for nonresponse and, if an individual was selected for recontact, follow status and reason for following the individual.

    In contrast, mover-out nonresponse individuals have left a family that was still in the study. Since such individuals were usually present in that family for at least part of the calendar year preceding nonresponse, they have some additional nonzero data for the wave in which they became nonresponse, such as part-year income information. In later waves, mover-out nonresponse individuals are treated in two ways, depending on why they left the family. Those who moved out to institutions have several variables (Sequence Number, age, sex, Relationship to Head, type of individual and reason for nonresponse) with nonzero values, although income, housework, and other individual-level variables are filled with zeroes. Eventually, such an individual may (a) become response by moving into a family or by becoming a splitoff, (b) move from the institution and remain mover-out nonresponse (shown when Sequence Number=71-89), or (c) become main family nonresponse because the family itself became nonresponse. (See the preceding paragraph for an explanation of main family nonresponse data records.) The other type of mover-out nonresponse individual has either moved out, but not to an institution, or died. Later waves of data contain zeroes, as described above for main family nonresponse, unless they subsequently rejoined a responding family or were selected for recontact.

    The data are released as one file, which includes not only those individuals with nonzero data records in the current data collection year (i.e., current response plus mover-out nonresponse), but also all other individuals: those who have zero data records for the current year (i.e., current-year main family nonresponse and all nonresponse of either kind from earlier waves).



    [Top]

  14.   How can I identify families from year to year?

    Each family unit in a specific wave is assigned a unique family ID (e.g., the 1980 family ID, also called the interview number). The most critical family ID is the one assigned in 1968 to families in the original sample. According to our following rules, we follow "split-offs" as children and others set up their own households. One family in 1968 can become three, four, or more families by 1985. All of these families will have the same 1968 ID, since they originated from the same family in 1968, but will have different family IDs in 1985, since they are separate family units in 1985. The largest 1968 family had more than 50 persons in it by 1990!
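
    To illustrate how one 1968 family fans out into several later family units, the following sketch groups wave-specific family IDs under the 1968 ID they share. It is written in Python with made-up ID values; the function name and the input layout (one pair per individual record) are assumptions for illustration only.

```python
from collections import defaultdict

def families_by_1968_id(records):
    """Group wave-specific family IDs under their shared 1968 family ID.

    `records` is an iterable of (id_1968, wave_family_id) pairs, for
    example one pair per individual present in the wave of interest.
    """
    groups = defaultdict(set)
    for id_1968, wave_family_id in records:
        groups[id_1968].add(wave_family_id)
    # Sort so the output is stable and easy to read
    return {id_1968: sorted(ids) for id_1968, ids in groups.items()}

# Made-up example: 1968 family 1234 has become three family units by the later wave.
sample = [(1234, 17), (1234, 17), (1234, 802), (1234, 3051), (2001, 45)]
result = families_by_1968_id(sample)
```

    Here result maps 1234 to the three later family IDs and 2001 to a single one, which mirrors the point above that descendants of one 1968 family keep the same 1968 ID but carry different family IDs in later waves.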



    [Top]

  15.   Do family ID numbers vary from year to year?

    For each family, the family ID number will almost certainly vary from year to year. For example, a family with a 1968 family ID of 1234 is unlikely to have the ID 1234 in 1969 or in any other year. Yearly IDs are assigned based on receipt of the interview: the first interview in from the field is numbered 1, the second is numbered 2, and so on. For information about linking families across years, see the file structure web page.



    [Top]

  16.   How is Head defined in the PSID?

    Within each wave of data, each FU (family unit) has one and only one current Head. Originally, if the family contained a husband-wife pair, the husband was arbitrarily designated the Head to conform with Census Bureau definitions in effect at the time the study began. The person designated as Head may change over time as a result of other changes affecting the family. When a new Head must be chosen (see conditions for selecting a new Head below), the following rules apply:

    The Head of the FU must be at least 16 years old and the person with the most financial responsibility for the FU. If this person is female and she has a husband in the FU, then he is designated as Head. If she has a boyfriend with whom she has been living for at least one year, then he is Head. However, if the husband or boyfriend is incapacitated and unable to fulfill the functions of Head, then the FU will have a female Head.



    [Top]

  17.   When is a new Head selected?

    A new Head is selected if any of the following conditions apply:

    • last year's Head moved out of the HU (household unit), died, or became incapacitated; or
    • a female Head has gotten married; or
    • the family is a splitoff family.


    [Top]

  18.   How can I tell the current Head and Wife from mover-out Head and Wife? Why is this important?

    To tell the current Head and Wife from mover-out Head and Wife, use the Sequence Number from the individual file to identify an individual's status with regard to the family unit and determine family composition change. It's important to understand family composition change to avoid spurious correlations in a longitudinal analysis where you are looking at variables pertinent to the same person(s) over time.



    [Top]

  19.   Who are the Husbands of Heads?

    Husbands of Heads are usually living in the family unit, although they can be living in institutions as well. These husbands are usually disabled, although in a few cases the female half of the pair insists on being the Head. The Relationship to Head (RTH) variable on the individual file indicates whether a respondent was a Husband of Head (codes 9 and 90).



    [Top]

  20.   How do I assemble a Head/Wife file from an individual file?

    The easiest way to do this is by visiting the PSID Data Center, which will create a customized dataset for you automatically.

    Instructions for creating a Head/Wife file from an individual file by writing your own programming code:

    To create a single-year Head/Wife file: Select individuals with Relationship to Head of "Head" (a code value of 1 for 1968-1982; code 10 from 1983 onward) and with values for Sequence Number in the range 1-20. The reason for using the Sequence Number variable is that nonresponse movers-out have relationships to the PREVIOUS YEAR's Head, so two individuals within one family may have relationships of Head. One, however, is the real, current Head; the other is a mover-out. (The type of mover-out can be determined from the value for Sequence Number. Refer to the individual file codebook for details.)

    To illustrate the importance of Sequence Number, assume that in the last wave we have an elderly married couple. He is the Head and she is the Wife: Sequence Number=1 and Relationship to Head=10 for him, Sequence Number=2 and Relationship to Head=20 for her. When we find them for the new interview, he has died and she has become the new Head: his Sequence Number=81 and Relationship to Head=10, her Sequence Number=1 and Relationship to Head=10. All the family data items about Head in the current wave refer to HER, not to him. Information about his income, etc. is located in OFUM (other family unit members) variables only.

    Similarly, to subset Wives or "Wives" in a current wave, select Relationship to Head=20 or 22 and Sequence Number=1-20.

    To create a cross-year Head/Wife file: These concepts can be expanded to subset persons who have been Heads over a period of years. The yearly values for Sequence Number must be 1-20, and the yearly values for Relationship to Head must be 1 or 10. As a corollary, to select individuals who have been either Heads or Wives/"Wives", yearly Sequence Numbers must be in the range 1-20 and yearly Relationships to Head must be one of 1, 2, 10, 20, or 22. Once that subset is made and family data are merged, information about an individual can be found in Head variables (Head's work hours, Head's labor income, etc.) when his or her Relationship to Head=1 or 10. When Relationship to Head is 2, 20, or 22, her information is found in variables about the Wife/"Wife".
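
    The selection rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, each record is reduced to a (Sequence Number, Relationship to Head) pair, and the code sets combine the pre-1983 and post-1983 values exactly as the cross-year instructions do.

```python
HEAD_CODES = {1, 10}      # 1 for 1968-1982, 10 from 1983 onward
WIFE_CODES = {2, 20, 22}  # Wife/"Wife" codes listed in the cross-year rule

def role(sequence_number, rth):
    """Return "Head", "Wife", or None for one individual in one wave.

    Sequence Numbers outside 1-20 (for example 81 for a mover-out) are
    rejected even if the Relationship to Head code still says Head.
    """
    if not 1 <= sequence_number <= 20:
        return None
    if rth in HEAD_CODES:
        return "Head"
    if rth in WIFE_CODES:
        return "Wife"
    return None

def head_or_wife_all_years(yearly):
    """True if the person is a current Head or Wife/"Wife" in every wave given.

    `yearly` is a sequence of (sequence_number, rth) pairs, one per wave.
    """
    yearly = list(yearly)
    return bool(yearly) and all(role(sn, rth) is not None for sn, rth in yearly)
```

    Applied to the elderly couple in the example, role(81, 10) returns None for the deceased husband, while role(1, 10) returns "Head" for his widow. The widow's two-wave history [(2, 20), (1, 10)] passes head_or_wife_all_years, and the husband's history [(1, 10), (81, 10)] does not.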



    [Top]

  21.   How do I bring forward background items such as education, race, occupation and industry, etc. for Heads and Wives/"Wives"?

    Background information is traditionally asked of new Heads and Wives/"Wives" only. These variables include the education series, location where Head's and Wife's/"Wife's" parents grew up, their occupations and industries, the number of Head's/Wife's/"Wife's" siblings, his or her first occupation and industry, and race and ethnicity questions, among others. The background series is not reasked in each and every wave because presumably most responses to these background questions do not change over time, and respondent burden is reduced by asking them only when a new Head or Wife/"Wife" enters the family unit.

    During processing, we have traditionally "brought forward" background information from previous waves for Heads or Wives/"Wives" who are the same persons as in the prior year. Background information is complete for 1993 and for 2003 and succeeding waves on the family files. For 1994 through 1996, information is present only for new Heads or Wives/"Wives". In 1997, most but not all of the background questions, including some new or altered items, were asked of all Heads and Wives/"Wives". The main omission was the education sequence, although all 1997 variable values were brought forward for every wave beginning in 1999. That is to say, each item that was asked in 1997 has been brought forward, but many Heads and Wives/"Wives"--all those who have stably remained Heads or Wives/"Wives"--do not have actual values for education questions, the siblings sequence, and a few other variables. In order to recoup the missing information for these persons, you must search 1996, 1995, 1994 or 1993 family data.

    Carefully compare the background variables item for item and code for code in the 1993 family file and the 1994-1996 family files before you attempt to bring forward prior-wave background information. You should be aware that the 1997-2001 background questions are not completely identical to their 1994-1996 counterparts.

    For more detailed instructions about "bringing forward" background variables, see the 1997 documentation, Section II, Part D.



    [Top]

  22.   What is the definition of a splitoff family?

    A split-off family consists of a person or group of people (at least one of whom is a "follow" person of any age) who moved out from a main family since the prior wave's interview to form a new, economically independent family unit. Several criteria must be met for a split-off to occur. In addition to having moved out since the prior wave and being 'followable' (i.e., having an original 1968 family identification), the person or group of people in general may not have moved to an institution such as college or prison or to another family unit within the panel study. Moreover, the person or group of people who have moved out and formed their own family unit must be economically independent (i.e., they must be paying their own living expenses) from the family unit from which they split off. It should be noted that these are general rules, and that sometimes unique situations arise that determine whether a person or group of persons becomes a split-off. For example, while moving to an institution such as college does not generally meet the criteria for becoming a split-off, a person who is working, paying their own living expenses, and paying their own educational expenses in addition to attending school would be considered a split-off. The living situation and interview data for every possible split-off case are reviewed before split-off status is granted.



    [Top]

  23.   How can I identify splitoffs from the main family?

    Select only current Heads (Sequence Number=1 and Relationship to Head=10) from the individual file for the wave in question. Then, if the Head's moved in/out indicator=1 and month moved in/out=0, the family is a splitoff. Otherwise it is a main family.
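
    The rule above can be expressed as a short function. The following Python sketch uses hypothetical parameter names in place of the wave-specific variable names, which must be looked up in the individual file codebook for the wave in question.

```python
def family_type(sequence_number, rth, moved_indicator, month_moved):
    """Classify a current Head's family as "splitoff" or "main".

    Parameter names are placeholders for the wave-specific variables:
    Sequence Number, Relationship to Head, moved in/out indicator, and
    month moved in/out.
    """
    if not (sequence_number == 1 and rth == 10):
        raise ValueError("record is not a current Head (expected sn=1 and rth=10)")
    if moved_indicator == 1 and month_moved == 0:
        return "splitoff"
    return "main"
```

    For instance, family_type(1, 10, 1, 0) returns "splitoff", while family_type(1, 10, 0, 0) returns "main". Records that are not current Heads raise an error rather than being silently classified, since the rule applies only after the Head subset has been made.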



    [Top]

  24.   How is an individual uniquely identified?

    For personal identification, use the 1968 ID (V30001 or ER30001) and the person number (V30002 or ER30002) as the unique identifiers of each individual.
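
    In practice the two identifiers are often combined into a single key. The sketch below, in Python, shows one way to do so; multiplying the 1968 ID by 1000 is a convention assumed here for illustration (it relies on Person Numbers having at most three digits) and is not prescribed by this FAQ. The function names are hypothetical.

```python
def person_key(id_1968, person_number):
    """Combine the 1968 ID and the Person Number into one integer key.

    Assumes Person Numbers have at most three digits, so that
    id_1968 * 1000 + person_number is unique and reversible.
    """
    if not 1 <= person_number <= 999:
        raise ValueError("Person Number is expected to be between 1 and 999")
    return id_1968 * 1000 + person_number

def split_person_key(key):
    """Recover (id_1968, person_number) from a combined key."""
    return divmod(key, 1000)
```

    For example, person_key(1234, 30) gives 1234030, and split_person_key(1234030) gives back (1234, 30). Keeping the pair as a two-element tuple works equally well as a merge key.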



    [Top]

  25.   How can I determine if data will be collected about an individual who is not present in an interviewed family?

    It depends on the situation.

    • For persons who are mover out deceased, some OFUM (other family unit member) information is collected for the wave they are reported to have died.
    • For persons moving out to an institution, some OFUM information is collected during the wave they are reported to have moved out from the family.
    • For persons moving out to another household where no interview is conducted with the new family unit, some OFUM information is collected during the wave they are reported to have moved out.
    • For persons already in institutions, no new information is collected.
    • For persons who attrited from the study, no new information is collected.  However, a large recontact effort was initiated in 1992.
    • For persons not yet born or not yet appearing in the study, no information is collected that wave.


    [Top]

  26.   How can I determine which variables are comparable across years?

    The family and individual alphabetic and numeric indices can be accessed via the Questionnaires and Packaged Data and Documentation page. You can use these indices to determine how variable names have changed across years and also to determine if specific topics were asked of respondents in a given year. Additionally, the Data Center provides a cross-year variables index (based on the alphabetic index) for some files.



    [Top]

  27.   Where can I obtain information regarding release dates for files?

    File release information is available through the Data News section of our website.



    [Top]

  28.   How can I tell if a variable value is actual or imputed?

    A missing data value is either identified as such (value=9) or an imputed value is assigned in lieu of a missing data code. If an imputed value is assigned, an associated "accuracy code" variable describes the nature of the assignment. For more specifics, examine the Questionnaires, Data, and Documentation for the year(s) and file(s) of interest.



    [Top]

  29.   How can I identify the SEO (Survey of Economic Opportunity) sample and the SRC (Survey Research Center) sample?

    You will need to look at the 1968 family interview number available in the individual-level files (V30001 or ER30001). Families from the SEO sample have values greater than 5000 and less than 7000, while those from the SRC sample have values less than 3000. The new Immigrant sample, added in 1997 and 1999, has values greater than 3000 but less than 5000.
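
    These ranges translate directly into a lookup on the 1968 family interview number. The following is a minimal Python sketch; the function name and the "other" label are assumptions, and any value outside the ranges stated above (including the exact boundary values 3000 and 5000, which the text does not assign) is deliberately left unclassified.

```python
def sample_from_1968_id(id_1968):
    """Return the sample implied by the 1968 family interview number.

    Only the ranges stated in this FAQ answer are classified; anything
    else is returned as "other" rather than guessed.
    """
    if 0 < id_1968 < 3000:
        return "SRC"
    if 3000 < id_1968 < 5000:
        return "Immigrant"
    if 5000 < id_1968 < 7000:
        return "SEO"
    return "other"
```

    For example, sample_from_1968_id(1234) returns "SRC", sample_from_1968_id(3456) returns "Immigrant", and sample_from_1968_id(5678) returns "SEO".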



    [Top]

  30.   For what years are Latino data available? How do Latino data differ from immigrant data?

    In 1990 the PSID added 2,000 Latino households consisting of families originally from Mexico, Puerto Rico, and Cuba. But while this sample did represent three major groups of immigrants, it missed the full range of post-1968 immigrants, Asians in particular. Because of this crucial shortcoming, and a lack of sufficient funding, the Latino sample was dropped after 1995, and a sample of 441 immigrant families was added in 1997. In 1999, an additional 70 families were added, for a total of 511 immigrant families as of 1999. These families are included on the files along with the core PSID families.



    [Top]

  31.   Where can I find information on the 1997-1999 Immigrant Sample?

    Information on the Immigrant Sample is available in the 1997 and 1999 documentation on the Questionnaires and Packaged Data and Documentation page.


    [Top]

  32.   Are cohabitors treated differently from legally married couples?

    In the PSID, a cohabitor is labeled a boyfriend or girlfriend (code 88) the first wave he or she appears in the study. If that cohabitor is still in that same family unit at the time of the subsequent interview, the cohabitor's label switches to "Wife" (code 22 in 1983 and onward) if the cohabitor is female; if the circumstances are otherwise the same and the cohabitor is male, his label switches to Head and his female partner (who has been Head) becomes a "Wife." Boyfriends and girlfriends are treated like family members who are not Heads or Wives/"Wives," and some information is obtained about them. In waves since the late 1970s, information typically gathered for Wives has been gathered as well about "Wives."

    Starting in 1983, the Relationship to Head (RTH) code allowed for differentiation between legal Wives and long-term female cohabitors. However, first year cohabitors can be detected prior to 1983 with a little bit of work. For example, their RTH would be 8 (nonrelative), their gender would be opposite that of Heads, and in subsequent years they may become Wives or Heads, while the Head would stay as Head or become a Wife. Anyone fitting this pattern can be decisively identified as a cohabitor.
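
    The pre-1983 pattern described above can be sketched as a simple check across two adjacent waves. This is an illustration only: the function and argument names are hypothetical, and the Head and Wife codes used here (1 and 2) are assumptions that should be confirmed against the codebook for the years in question. Only the nonrelative code (8) comes from the text above.

```python
# Relationship-to-Head codes for pre-1983 waves.
RTH_HEAD = 1         # assumption: verify in the codebook
RTH_WIFE = 2         # assumption: verify in the codebook
RTH_NONRELATIVE = 8  # stated in the FAQ text


def fits_cohabitor_pattern(rth_first_wave, sex, head_sex,
                           rth_next_wave, head_rth_next_wave):
    """Return True if a person matches the first-year cohabitor pattern.

    The person is a nonrelative of the opposite sex from the Head in the
    first wave, and in the next wave both partners are Head or Wife.
    """
    partners = (RTH_HEAD, RTH_WIFE)
    return (
        rth_first_wave == RTH_NONRELATIVE
        and sex != head_sex
        and rth_next_wave in partners
        and head_rth_next_wave in partners
    )


# A female nonrelative living with a male Head who is coded Wife next wave.
print(fits_cohabitor_pattern(8, sex=2, head_sex=1,
                             rth_next_wave=2, head_rth_next_wave=1))
```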



    [Top]

  33.   Why does the PSID provide weights for analysis?

    The PSID sample combines the SRC (Survey Research Center) and SEO (Survey of Economic Opportunity) samples. Both samples are probability samples (i.e., samples for which every element in the population has a known nonzero chance of selection). Their combination is also a probability sample. The combination, however, is a sample with unequal selection probabilities, and as a result, compensatory weighting is needed in estimation, at least for descriptive statistics. Weight adjustments are also needed to attempt to compensate for differential nonresponse in 1968 and subsequent waves. Weights supplied on PSID data files are designed to compensate for both unequal selection probabilities and differential attrition.

    In 1997, the Panel Study of Income Dynamics (PSID) underwent several important design changes that would affect weighting. Leading these changes was a roughly 1/3 reduction in the number of PSID Core families that will be eligible for continuous longitudinal data collection. A second important change to the 1997 PSID was the addition of a nationally representative sample of immigrant households and individuals that would not be eligible for PSID under the original 1968 sample recruitment and sample family "following rules". The 1997 data collection year also began the transition to every second year data collection for PSID. Finally, the 1997 PSID data collection included a special supplemental study of children age 0-12 in PSID Core and Immigrant Supplement families. Weighting procedures are available through Documentation for Panel Study of Income Dynamics Analysis Weights for Sample Families and Individuals.



    [Top]

  34.   What variables should I use for complex sample survey variance estimation?

    Variables ER31996 and ER31997 are used for computing sampling errors via balanced repeated replications. The same variables may be used with programs based on the Taylor Series linearization method (STATA SVY commands, SAS PROC SURVEY commands). BRR Stratum (ER31996) may be specified as the "Stratum variable" in the design specification. The BRR SECU variable (ER31997) may be specified as the "Cluster Variable".

    Note that despite the "BRR" prefix on these variables, they do constitute a correct specification of the stratum and cluster codes for an analysis with software that uses the Taylor Series linearization method.
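
    To make the roles of the two variables concrete, here is a minimal sketch of a Taylor Series linearization standard error for a weighted mean, using ER31996 as the stratum and ER31997 as the cluster. It is not PSID-supplied code: the function name, the "income" and "weight" field names, and the toy records are all hypothetical, and it uses the common with-replacement approximation (no finite population correction). Production work would normally rely on the survey procedures in a statistical package.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_linearized_se(records, y, weight,
                                stratum="ER31996", cluster="ER31997"):
    """Weighted mean of `y` and its linearized standard error."""
    total_w = sum(r[weight] for r in records)
    mean = sum(r[weight] * r[y] for r in records) / total_w

    # Sum the linearized scores within each (stratum, cluster) cell.
    cell_totals = defaultdict(lambda: defaultdict(float))
    for r in records:
        score = r[weight] * (r[y] - mean) / total_w
        cell_totals[r[stratum]][r[cluster]] += score

    # Between-cluster variance within each stratum, summed over strata.
    variance = 0.0
    for clusters in cell_totals.values():
        n = len(clusters)
        if n < 2:
            raise ValueError("each stratum needs at least two clusters")
        center = sum(clusters.values()) / n
        variance += n / (n - 1) * sum((z - center) ** 2
                                      for z in clusters.values())
    return mean, sqrt(variance)


# Toy records with made-up values, for illustration only.
toy_records = [
    {"ER31996": 1, "ER31997": 1, "income": 1.0, "weight": 1.0},
    {"ER31996": 1, "ER31997": 1, "income": 3.0, "weight": 1.0},
    {"ER31996": 1, "ER31997": 2, "income": 5.0, "weight": 2.0},
    {"ER31996": 2, "ER31997": 1, "income": 2.0, "weight": 1.0},
    {"ER31996": 2, "ER31997": 2, "income": 4.0, "weight": 1.0},
]
mean, se = weighted_mean_linearized_se(toy_records, "income", "weight")
print(mean, se)
```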



    [Top]

  35.   Why are the values for Heads and Wives/"Wives" employment status "0" in the 1994-2007 Individual Files? Where can I find non-missing data for employment status for Heads and Wives/"Wives" 1994-2007?

    Please note that the current versions of the 1994-2007 Individual Public Release I files have incomplete information for the employment status variables listed below. Employment status for Heads/Wives/"Wives" will be generated for the final version of the file, but in the meantime we want you to be informed of the following known issue with the data. From 1968 through 1993, Heads and Wives/"Wives" have values for the equivalent employment status variables, but for 1994 through 2007 they were not generated and therefore are filled with zeros.

    Employment Status on the Individual file- where values for Heads/Wives/"Wives" are zero:

    2007-ER33913
    2005-ER33813
    2003-ER33712
    2001-ER33612
    1999-ER33512
    1997-ER33411
    1996-ER33311
    1995-ER33211
    1994-ER33111

    Employment status for Head/Wife/"Wife" for these years is found on the corresponding years' Family file. The Family file may contain up to three mentions each for both Head and Wife/"Wife"; these are found in the following variables:

    Employment Status variables for Heads/Wives/"Wives" on the Family file

    2007-Head's employment status ER36109, ER36110, ER36111
    2007-Wife/"Wife"'s employment status ER36367, ER36368, ER36369

    2005-Head's employment status ER25104, ER25105, ER25106
    2005-Wife/"Wife"'s employment status ER25362, ER25363, ER25364

    2003-Head's employment status ER21123, ER21124, ER21125
    2003-Wife/"Wife"'s employment status ER21373, ER21374, ER21375

    2001-Head's employment status ER17216, ER17217, ER17218
    2001-Wife/"Wife"'s employment status ER17786, ER17787, ER17788

    1999-Head's employment status ER13205, ER13206, ER13207
    1999-Wife/"Wife"'s employment status ER13717, ER13718, ER13719

    1997-Head's employment status ER10081, ER10082, ER10083
    1997-Wife/"Wife"'s employment status ER10563, ER10564, ER10565

    1996-Head's employment status ER7164, ER7165, ER7166
    1996-Wife/"Wife"'s employment status ER7657, ER7658, ER7659

    1995-Head's employment status ER5068, ER5069, ER5070
    1995-Wife/"Wife"'s employment status ER5561, ER5562, ER5563

    1994-Head's employment status ER2069, ER2070, ER2071
    1994-Wife/"Wife"'s employment status ER2562, ER2563, ER2564

    In order to create an equivalent employment status variable for Heads/Wives/"Wives" using the above mentions, a user must consider the code values of all three mentions and prioritize them in order as follows: 2,1,3,4,5,7,6,8, 9. For example, if Head had two mentions for employment status, and the first mention was code 1 with the second mention being code 2, then Head's overall employment status would become code 2. If Head had three mentions for employment status with codes of 5, 6, 3, then Head's overall employment status would become code 3.
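
    The prioritization rule above can be sketched as follows. The function name and the sample row are illustrative. Treating any value outside codes 1-9 (for example, 0) as "no mention", and returning 0 when there are no valid mentions, are assumptions rather than documented PSID behavior.

```python
# Priority order from the FAQ text: earlier codes win over later ones.
PRIORITY = (2, 1, 3, 4, 5, 7, 6, 8, 9)


def overall_employment_status(mentions):
    """Collapse up to three employment status mentions into one code."""
    # Assumption: values outside 1-9 (such as 0) are not mentions.
    valid = [m for m in mentions if m in PRIORITY]
    if not valid:
        return 0  # assumption: no usable mention
    return min(valid, key=PRIORITY.index)


# The two examples from the text.
print(overall_employment_status([1, 2]))     # 2
print(overall_employment_status([5, 6, 3]))  # 3

# Made-up 2007 Family file row using the Head's three mention variables.
row = {"ER36109": 1, "ER36110": 2, "ER36111": 0}
head_status = overall_employment_status(
    [row["ER36109"], row["ER36110"], row["ER36111"]]
)
```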



    [Top]

  36.   How does the PSID distinguish between main and secondary jobs in the data files?

    Main vs. Extra Jobs: Once the existence of real work has been established, the next basic rule for economic editing involves the definitions of main and extra jobs. A quick definition of main vs. extra jobs: very simply, someone cannot have an extra job unless he/she holds a main job during the same time period. We make this distinction between main and extra jobs throughout. If two (or more) employers overlapped, the interviewer was supposed to ask which was the main one during that time and note in an open-ended question the overlap and the hours and earnings of both jobs. This overlap period was then to be included in the extra job sequences (BD82-BD106/CE74-CE98).

    To reiterate, someone cannot have an extra job unless he/she has a main job during the same time period. The extra job must be held simultaneously with the main job. Those who are only temporarily laid off are still employed at a main job and, therefore, could have an extra job during that time period. However, those who are unemployed, whether looking or not, have no main job employer during the time in question. Hence, any small job they may have is considered a main job--since it's the ONLY job. Use the month strings and dates of beginning and ending employment in the work history to tell whether time at B/D72-74a or C/E64-66a is temporary layoff or unemployment.



    [Top]

  37.   What information about physical and mental health is collected by the PSID?

    The PSID contains a wealth of information that can be used to study the health of Americans and their family members at one point in time, or over a long time span.

    The National Institute on Aging funded the most comprehensive set of health measures on the full U.S. adult population in 1986 and 1999-2003. The 2001 and 2003 series of questions (Section H), asked of all respondents regardless of age, included items on: self-perceived general health (current and childhood), limitations due to health conditions, lifetime occurrence of 12 health conditions, age of onset of each condition, interference in roles due to health conditions, exercise frequency, smoking status, age of onset of smoking, number of cigarettes smoked, alcohol consumption, height, weight, poor health of family members, dietary habits (eating fiber, grain, fats, etc.), knowledge of nutrition, health insurance coverage, and health insurance expenditures. For respondents over age 55, a series of questions on activities of daily living is asked. In 2003, this series will be asked of all respondents regardless of age.

    The 2001 data collection also included the addition of a 30-day emotional distress scale, and questions on role interference and severity due to emotional distress. These questions will be included again for 2003. Since this scale was also asked about some of the children of PSID respondents in the 1997 Child Development Supplement (CDS) to the PSID, and is being asked again in the 2002 CDS, it is possible to examine family linkages in current emotional functioning. The National Health Interview Survey (funded by the National Center for Health Statistics) has used this scale since 1999. In 2003, two stem questions for 12-month major depression from the Composite International Diagnostic Interview (CIDI) were added.

    With the exception of the 30-day emotional distress scale and the addition in 2003 of the CIDI 12-month major depression stem questions, the 1999 PSID collected the same health series as was collected in 2001 and 2003 (Section H).

    A series of health supplements funded by the National Institute on Aging in 1990, 1991, and 1993-1995 also contains a rich set of questions regarding the health of family members aged 55 and above: general health status, activities of daily living, nursing home stays, home-based care episodes, and major health expenditures. This set of questions, combined with the 1990 RAND Health supplement, provides extensive coverage over a six-year period of the health status of older PSID family unit members. In 1993-1995, the annual Health Care Burden Supplement focused on health care expenditures of the elderly and the extent to which family members spent either time or money taking care of their parents. Please see our supplemental files page for detailed information on the Health Care Burden File, 1993; the Parent Health Supplement, 1991; the Self-Administered Health Supplement, 1990; and the Telephone Health Supplement, 1990.

    In 1986, the National Institute on Aging funded the first comprehensive health supplement, containing much of the same information that was later restored in 1999 (see the Questionnaires, Data, and Documentation for more information). Respondents were asked questions about their medical condition, physical and work limitations, exercise, smoking, and financial assistance.



    [Top]

  38.   How has the occupation-industry code classification system changed?

    The PSID used a one-digit occupation code, and later a two-digit code, until 1981, when the three-digit 1970 Census code became standard for the main jobs of employed Heads and Wives. It was also used for the most recent jobs held by Heads and Wives who were currently unemployed and looking for work and for any job held in 1980 by a Head or Wife who was currently retired or no longer in the labor force. Starting in 2003, all occupation-industry data have been coded using the three-digit 2000 Census code. A retrospective coding project used the 2000 Census code to classify the first occupation and industry of all Heads and Wives as of 2003 and that of their fathers and mothers.


    [Top]

  39.   When is data collected for each wave of the study?

    The interview period (field season) is roughly between March and November, with 1993 and 1994 being exceptions and going into December. If a user is interested in when a specific interview was conducted, there is a variable in the dataset (i.e., Date of Interview) which indicates month and day of interview.



    [Top]

  40.   How often are data collected for the PSID study?

    Between 1968 and 1997, data were collected every year. Starting in 1999, the PSID collected data biennially (i.e., every other year). The PSID Overview provides additional information about data collection. All waves of data 1968-2003 are available on the website. The 2005 data collection will end by the close of 2005, and data will be released by December 31, 2006.



    [Top]

  41.   How does the geographic information in the public release files differ from the information available in the restrictive Geocode Match files?

    The public release files, which can be downloaded directly from the PSID website, contain geographic information of a more generalized nature such as region, state of residence, size of largest city in the county of residence, and the Beale rural-urban code. These data will meet the needs of most users. Users in need of more specialized geographic information may want to request use of the restrictive PSID Geocode Match files. These files include the identification codes necessary to link data from the PSID annual family files to Census data. This linkage allows the addition of information regarding the characteristics of the geographic area (e.g., the neighborhood and/or the labor market area) in which individuals and families lived to the PSID individual- or family-level data. This should in turn allow investigation of the effects of non-family "context" variables on family and individual outcomes.

    In the past, we provided selected variables from the Census in aggregated forms (i.e., Census Extract Files); however, we no longer support these files. In recent years, there has been a rapid growth of external sources that provide an increasing variety of measures of the social environment. Rather than investing our resources in duplicating this effort, we are expecting users to seek out these sources to match them with the PSID files.



    [Top]

  42.   How do I go about obtaining special permission to use the Geocode files?

    Due to our desire and obligation to protect respondent anonymity to the fullest extent allowable by law, the Geocode files are not available in general public release at the PSID Website or through the ICPSR. Rather, special contractual arrangements must be made to ensure that analysts maintain respondent anonymity. Persons interested in such contractual arrangements should contact Donna Nordquist by U.S. mail at Panel Study of Income Dynamics, 3200 ISR, Box 1248, Ann Arbor, MI 48106-1248. You can also inquire via email through PSID Help.

    The process is somewhat lengthy and typically takes a couple of months. The timeframe is dependent on contract language issues and the responsiveness of the requesting institution. The analyst must submit a CV, a research plan, a sensitive data protection plan, a human subjects review clearance/waiver, and a completed signed contract. In addition, there is a non-refundable administrative fee due at the time the contract is submitted.



    [Top]

  43.   Where can I get information on the kinds of research that have been done using the PSID data?

    Check our bibliography web page for a list of citations that use the PSID data. You can search for papers by author, keyword, and/or date for journal articles, books and book chapters, conference proceedings, dissertations, and working papers.



    [Top]





Institute for Social Research | University of Michigan | Privacy | Conditions of Use