The collection of wealth data for the PSID in the years 1984, 1989, 1994, 1999 and 2001 is sponsored by the National Institute on Aging (NIA).
1. Files. The wealth data are provided in five separate data files, one data file for each of the years 1984, 1989, 1994, 1999 and 2001. Each supplemental wealth file is derived from the data contained in the respective family files. A working paper, “Five Years Older: Much Richer or Deeper in Debt?”, makes use of wealth changes 1994-1999 and is illustrative of the uses of the wealth data.
2. Files. The five wealth data files are named "WLTHyyyy.DAT" where the "yyyy" represents the year. The data are in raw ASCII form and uncompressed.
| File name | Records | LRECL | KBytes |
|---|---|---|---|
| WLTH1984.TXT | 6,918 | 93 | 642 |
| WLTH1989.TXT | 7,114 | 93 | 660 |
| WLTH1994.TXT | 8,628 | 93 | 801 |
| WLTH1999.TXT | 6,997 | 103 | 718 |
| WLTH2001.TXT | 7,406 | 103 | 760 |
NOTE: The byte count is that reported by DOS which, for ASCII files, includes two characters (CR/LF) at the end of each record; byte counts may differ slightly in other operating systems.
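The KBytes column can be cross-checked against the record counts and record lengths. The following is only an illustrative Python sketch, not part of the PSID distribution; it assumes that one KByte is 1,024 bytes and that sizes are rounded up, neither of which is stated in this documentation. The function name `expected_kbytes` is hypothetical.

```python
import math

# (records, LRECL) pairs copied from the file table.
FILES = {
    "WLTH1984.TXT": (6918, 93),
    "WLTH1989.TXT": (7114, 93),
    "WLTH1994.TXT": (8628, 93),
    "WLTH1999.TXT": (6997, 103),
    "WLTH2001.TXT": (7406, 103),
}

def expected_kbytes(records, lrecl, eol_bytes=2):
    """File size in KBytes: each record carries LRECL data bytes plus CR/LF.

    Assumptions (not from the documentation): 1 KByte = 1024 bytes, rounded up.
    """
    return math.ceil(records * (lrecl + eol_bytes) / 1024)

sizes = {name: expected_kbytes(n, lrecl) for name, (n, lrecl) in FILES.items()}
for name, kb in sizes.items():
    print(name, kb)
```

On a system that ends lines with a single LF, passing `eol_bytes=1` gives the correspondingly smaller size.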
The components of wealth were asked as family level variables. The variables included in the respective data files are noted below.
· For 1999 the only difference is that some question wording was changed to accommodate those wishing to study pension holdings. [Section P has much additional detail on pensions of the head and wife.] The redefined variables arise because in 1999 the questions on stocks (W15) and bonds (W27) were reworded to exclude ‘assets held in employer-based pensions or IRAs.’ Family holdings of private annuities and IRAs are asked about separately (W21). For this file ER 418 is an indicator of whether the family reports holding funds in private annuities or IRAs. Later we will make available the information obtained on whether the IRA is ‘mostly stocks, mostly interest earning, or split’ (W21a). Based on the answer, .75, .25 and .50 of the value of the IRA, respectively, may be assigned to stocks and the balance may be assigned to bonds.
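The suggested split of an IRA balance between stocks and bonds can be written out as a small Python sketch. The function name `allocate_ira` and the answer strings used as dictionary keys are illustrative assumptions; only the three shares come from the text above.

```python
# Share of the IRA value assigned to stocks for each W21a answer.
# The shares are from the text; the key strings are illustrative labels.
STOCK_SHARE = {
    "mostly stocks": 0.75,
    "mostly interest earning": 0.25,
    "split": 0.50,
}

def allocate_ira(ira_value, mix):
    """Return (stock_part, bond_part) of an IRA balance for a W21a answer."""
    stocks = ira_value * STOCK_SHARE[mix]
    bonds = ira_value - stocks  # the balance is assigned to bonds
    return stocks, bonds

print(allocate_ira(10000, "mostly stocks"))
```

An analyst wishing to compare 1999 stock holdings with earlier years could add the stock part back to the reported value of stocks.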
Frequencies of the categorical variables and univariate statistics of the continuous variables are provided in Section V. The variables used in the SAS imputation program to construct the wealth variables are listed in Section VI.
Also included are five corresponding files named "WLTHyyyy.SAS", "WLTHyyyy.SPS", and "WLTHyyyy.DO" which contain, respectively, SAS, SPSS, and Stata statements. The statements provide variable names and locations and variable labels. Statements providing missing data codes are not included because all missing data has been imputed.
The statements in these files are not intended to represent completed and full setups for the respective statistical program packages to run extracts, analysis, etc. You must provide all other SPSS, SAS, or Stata commands needed to complete a setup and modify directory and file specifications as appropriate for your computing environment. You may also opt for dBase output using the Data Center (http://simba.isr.umich.edu), which may be easily imported into numerous formats, including Microsoft ®, and Stata®.
3. Weights. In working with these data we recommend that the user apply family weights, which may be obtained directly from the Data Center here: http://simba.isr.umich.edu/. The 1994-2001 weights can also be downloaded as a zip file from http://simba.isr.umich.edu/Zips/ZipMain.aspx#xyr.
4. Merging Other Information from Annual Family Files. Most analysts will want to merge additional information available in the annual family files. The best way to achieve this is through the user friendly PSID data center, here: http://simba.isr.umich.edu/, where it is easy to merge data from multiple collection years and across different types of data files, such as individual, family, supplemental, as well as the Child Development Supplement data. The table below lists the number of cases in the wealth files and in the corresponding annual family files as well as the matching ID variables.
| Wealth file name | Wealth file ID | Wealth file N | Annual family file name | Annual family file ID | Annual family file N |
|---|---|---|---|---|---|
| WLTH1984.TXT | S101 | 6,918 | 84FAM.DAT | V10002 | 6,918 |
| WLTH1989.TXT | S201 | 7,114 | 89FAM.DAT | V16302 | 7,114 |
| WLTH1994.TXT | S301 | 8,628 | ER94FAM.DAT | ER2002 | 10,771 |
| WLTH1999.TXT | ER401 | 6,997 | ER99FAM.DAT | ER13002 | 6,997 |
| WLTH2001.TXT | S501 | 7,406 | ER01F.DAT | ER17002 | 7,406 |
Note that in 1994 the Latino sample did not receive the wealth questions. Therefore the 1994 wealth file contains fewer cases than the 1994 family file (8,628 vs. 10,771). There is no way to identify Latino families from information included in the 1994 family file. When merging the 1994 wealth file with the 1994 family file, be sure not to include the Latino families found in the 1994 family file in the merged output file. Your merged output file should contain 8,628 cases. Using SAS, your code might look something like this:
PROC SORT DATA=wlth.wlth94 OUT=t1; BY S301; RUN;
PROC SORT DATA=wlth.er94fam(RENAME=(ER2002=S301)) OUT=t2; BY S301; RUN;
DATA wlth.wlth94; MERGE t1(IN=W94) t2; BY S301; IF W94; RUN;
The specification of "IN=W94" in the MERGE statement and the subsetting IF will restrict the cases in the output dataset to just those occurring in the 1994 wealth dataset.
And using SPSS your code might include statements such as the following:
GET FILE="er94fam.sys".
SORT CASES BY ER2002.
GET FILE="94wlth.sys".
SORT CASES BY S301.
MATCH FILES FILE=* / TABLE="er94fam.sys" / RENAME=(ER2002=S301) / BY=S301.
FILE specifies the file that supplies the cases. TABLE specifies a table lookup file. A lookup file contributes variables but not cases to the output file. Variables from the table file are added to the cases from the other file that have matching values for the key variable (the BY variable).
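The logic shared by the SAS and SPSS examples (keep every wealth-file case, attach family-file variables by ID, drop family-file cases with no wealth record) can also be sketched in plain Python. This is an illustration under stated assumptions, not a PSID-supplied program: the function name, `WEALTHVAR` and `FAMVAR` are hypothetical, and rows are represented as dictionaries rather than read from the actual files. Where a variable name occurs in both rows, this sketch keeps the wealth-file value.

```python
def merge_keep_wealth_cases(wealth_rows, family_rows,
                            wealth_id="S301", family_id="ER2002"):
    """Attach family-file variables to wealth-file cases.

    Family-file cases without a matching wealth record are dropped,
    so the output has exactly one row per wealth-file case.
    """
    family_by_id = {row[family_id]: row for row in family_rows}
    merged = []
    for wealth_row in wealth_rows:
        family_row = family_by_id.get(wealth_row[wealth_id], {})
        # Drop the family ID itself; it duplicates the wealth ID.
        extra = {k: v for k, v in family_row.items() if k != family_id}
        merged.append({**extra, **wealth_row})
    return merged

# Toy data: family 3 stands in for a case absent from the wealth file.
wealth = [{"S301": 1, "WEALTHVAR": 50000}, {"S301": 2, "WEALTHVAR": 0}]
family = [{"ER2002": 1, "FAMVAR": "a"},
          {"ER2002": 2, "FAMVAR": "b"},
          {"ER2002": 3, "FAMVAR": "c"}]
merged = merge_keep_wealth_cases(wealth, family)
print(len(merged))
```

With the real 1994 files the merged output should contain 8,628 rows, which makes the row count a convenient check.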
The above procedures will give you three cross-sectional files. To create a file that allows you to look at families across time, see our 1998 Brookings Paper and suggestions noted above.
There is a one-to-one match of the case counts for the wealth file and family file for the years 1984, 1989, 1999 and 2001, making the merge straightforward.
Items marked with an * in the list above plus value of owner occupied real estate comprise our wealth definition for 1984, 1989, 1994, 1999 and 2001.
A. Imputation for Home Equity
Prior to creating the generated variables for wealth, an imputation procedure was followed for cases with missing values on housing value and mortgage amounts in order to generate a value for home equity. There were 307 cases in the 1999 family data and 347 cases in the 2001 family data that received an imputation because of missing information on housing value or mortgage balance.
For homeowners, the housing value for 2001 was imputed using an inflation-adjusted 1999 housing value. If the 1999 value was missing, an inflation-adjusted value from 1997 was used for the 2001 housing value. If the 1997 value was missing, then the 2001 housing value was set to the mean value after categorizing by family income and age of head in the group having non-missing values. For non-homeowners, the 2001 housing value was set equal to zero.
An identical procedure was followed to impute remaining mortgage. This procedure is detailed below along with case counts for each step of the imputation:
1. 2001 Housing value
Step 1. If housing value of 2001 = ‘9999999’ or ‘9999998’, then refer to the previous year’s housing value or whether own/rent (year 1999).
If own (year 1999), then housing value of 1999 * 1.076266 substituted for housing value of 2001. Housing value category=1
If not own (year 1999), then housing value of 2001 was set to zero. Housing value category=2
Step 2. If housing value of 2001 still has ‘9999999’ or ‘9999998’, then refer to the housing value or whether own/rent (year 1997).
If own (year 1997), then housing value of 1997 * 1.076266 * 1.04528 substituted for housing value of 2001. Housing value category=3
If not own (year 1997), then housing value of 2001 was set to zero. Housing value category=4
Step 3. If housing value of 2001 still has ‘0’, ‘9999999’ or ‘9999998’, then imputation of housing values 2001 was from the mean after categorizing by family income and age of head in the group having proper values (N=4136). Housing value category=5
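The three-step cascade for the 2001 housing value can be sketched in Python as follows. This is a reading of the steps above, not the SAS program actually used. Assumptions: unknown tenure is represented as `None`; the constant names reflect our reading of the two multipliers as 1999-to-2001 and 1997-to-1999 adjustments; the cell mean for Step 3 is passed in ready-made; and the ‘0’ condition mentioned in Step 3 is not handled, because the text does not spell out how it interacts with categories 2 and 4.

```python
MISSING = {9999998, 9999999}        # missing-data codes named in the steps
ADJ_1999_TO_2001 = 1.076266         # multiplier applied to 1999 values
ADJ_1997_TO_1999 = 1.04528          # extra multiplier applied to 1997 values

def usable(value):
    """True when a reported housing value is present and not a missing code."""
    return value is not None and value not in MISSING

def impute_housing_2001(value_2001, own_1999, value_1999,
                        own_1997, value_1997, cell_mean):
    """Return (housing value, imputation category); category None = reported."""
    if usable(value_2001):
        return value_2001, None
    # Step 1: fall back on 1999 tenure and value.
    if own_1999 is False:
        return 0, 2
    if own_1999 and usable(value_1999):
        return value_1999 * ADJ_1999_TO_2001, 1
    # Step 2: fall back on 1997 tenure and value.
    if own_1997 is False:
        return 0, 4
    if own_1997 and usable(value_1997):
        return value_1997 * ADJ_1999_TO_2001 * ADJ_1997_TO_1999, 3
    # Step 3: mean of the matching family-income by age-of-head cell.
    return cell_mean, 5

print(impute_housing_2001(9999999, True, 100000, None, None, 99000))
```

The returned category corresponds to the "Housing value category" codes listed in the steps.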
2. 2001 remaining mortgage 1 or 2
Step 1. If remaining mortgage 1 or 2 of 2001 have “9999998” or “9999999”, then mortgage of 1999 was substituted for 2001’s. Mortgage category=1
Step 2. If remaining mortgage 1 or 2 of 2001 still have missing, “9999998” or “9999999”, then mortgage of 1997 was substituted for 2001’s. Mortgage category=2
Step 3. If remaining mortgage 1 or 2 of 2001 have missing, “9999998” or “9999999”, then imputation of mortgage 2001 was from the mean after categorizing by family income and age of head in the group having proper values (N=4136). Mortgage category=3
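Step 3 of both procedures replaces a missing amount with the mean of the valid values in the same family-income by age-of-head cell. A minimal Python sketch of computing such cell means is shown below. The income and age groupings actually used are not given in this documentation, so the group labels here are purely hypothetical, as is the function name.

```python
from collections import defaultdict

def cell_means(donors):
    """Mean of valid values within each (income group, age group) cell.

    donors: iterable of (income_group, age_group, value) for cases
    that reported a proper (non-missing) value.
    """
    totals = defaultdict(lambda: [0.0, 0])   # cell -> [sum, count]
    for income_group, age_group, value in donors:
        cell = totals[(income_group, age_group)]
        cell[0] += value
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Toy donors with hypothetical group labels.
means = cell_means([
    ("low", "under 40", 50000),
    ("low", "under 40", 70000),
    ("high", "40 or over", 200000),
])
print(means[("low", "under 40")])
```

A case still missing after Steps 1 and 2 would then receive the mean of its own cell.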
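3. The number of cases in each category, 2001 year
Rows: imputation of housing value. Columns: imputation of mortgage.

| Housing value | Mortgage N/A | Mortgage Cat 1 | Mortgage Cat 2 | Mortgage Cat 3 | Sum |
|---|---|---|---|---|---|
| N/A | 0 | 143 | 23 | 47 | 213 |
| Cat 1 | 42 | 8 | 0 | 2 | 52 |
| Cat 3 | 14 | 1 | 2 | 2 | 19 |
| Cat 5 | 43 | 9 | 2 | 9 | 63 |
| Sum | 99 | 161 | 27 | 60 | 347 |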
4. The number of cases in each category, 1999 year
Rows: imputation of housing value. Columns: imputation of mortgage.

| Housing value | Mortgage N/A | Mortgage Cat 1 | Mortgage Cat 2 | Mortgage Cat 3 | Sum |
|---|---|---|---|---|---|
| N/A | 0 | 112 | 25 | 47 | 184 |
| Cat 1 | 38 | 8 | 6 | 0 | 52 |
| Cat 3 | 8 | 2 | 3 | 0 | 13 |
| Cat 5 | 20 | 7 | 1 | 10 | 38 |
| Sum | 66 | 129 | 35 | 57 | 287 |

* Home equity was set equal to 0 for an additional 20 cases where there was no basis for knowing if they owned their house in 1999, for a total of 307 cases.
To work with the original wealth data it was necessary to process the values which were not provided in a dollar amount but rather as bracket ranges (i.e., the 'unfolding brackets') in a consistent fashion for the wealth components across all years. This was accomplished with a SAS program. The extent of item non-response was surprisingly low in the PSID, and this helps provide an extra measure of quality; we believe this is because the PSID respondents have confidence in the interviewers and have been interviewed on numerous prior occasions. Research by Robert Ferber (Collecting Financial Data by Consumer Panel Techniques, Urbana, Bureau of Economic and Business Research, University of Illinois, 1959) has underscored the importance of reinterviews in gathering family wealth measures.
A standard series of unfolding bracket questions is as follows, using "real estate other than own home" as an example. The complete text of these questions appears in the questionnaires, available here: http://simba.isr.umich.edu/Zips/ZipMain.aspx. Questions from the 1989 questionnaire are provided in Section VII below.
(1) "Do you (or your family living there) have any real estate other than your main home... ?"
(2) If Yes in (1), "If you sold all that and paid off any debts on it, how much would you realize on it?"
If a dollar value is given in (2), then the questions skip to the next wealth component (value of property 'on wheels' - cars, trucks, motor home, trailer, boat - net of any debt on these assets).
Respondents who did not report an exact amount in (2) were asked a series of three or four questions (starting with "would it amount to $50,000 or more?"). These questions ultimately yielded the following categories:
Some respondents gave only partial brackets such as below $25,000, above $25,000, etc.; other respondents would not give any bracket value at all.
Assuming that respondents who could not or would not provide an amount have holdings distributed in the same way as respondents giving an exact amount, a hot deck method was used to impute the missing values. This imputation process consists of three levels.
First, respondents who answered "Don't know" or "Refusal", or whose answer is simply missing, in question (1) are assigned to "Yes" or "No", with those who are imputed as "Yes" being treated as not having bracket information at the next level.
Second, respondents who only gave a partial bracket or no bracket are randomly assigned to one of the four brackets ($1-$999, $1,000-$24,999, $25,000-$99,999, and over $100,000) with probability in accordance with the distribution of brackets realized from respondents who gave an exact bracket.
Note that this does not include people who gave an exact amount despite the fact that they fall into a bracket. The assumption is that people who give partial bracket information or no bracket information at all are more like individuals who gave exact bracket information than those that gave exact amounts. Juster and Smith (Journal of the American Statistical Association, 1997) argue that the distribution of bracketed individuals is different from those who give exact amount.
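The second level can be illustrated with a short Python sketch: a bracket is drawn at random with probabilities proportional to how often each bracket was reported by respondents who gave an exact bracket. This is not the SAS code referred to in this documentation; the function name is hypothetical, and the sketch does not use any partial bracket information a respondent may have given, since the text does not describe how that information enters.

```python
import random
from collections import Counter

BRACKETS = ["$1-$999", "$1,000-$24,999", "$25,000-$99,999", "over $100,000"]

def assign_bracket(exact_bracket_reports, rng):
    """Draw a bracket in proportion to brackets reported by exact-bracket donors.

    exact_bracket_reports: brackets from respondents who gave an exact bracket
    (respondents who gave exact dollar amounts are deliberately excluded).
    """
    counts = Counter(exact_bracket_reports)
    candidates = [b for b in BRACKETS if counts[b] > 0]
    weights = [counts[b] for b in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(2001)  # fixed seed so the draw is reproducible
donors = ["$1-$999"] * 2 + ["$1,000-$24,999"] * 5 + ["$25,000-$99,999"] * 3
print(assign_bracket(donors, rng))
```

Because no donor in the toy data reported the top bracket, that bracket can never be drawn here; with real data all four brackets would normally be represented.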
After the second level, every respondent has an index of the exact bracket to which the respondent belongs. (Note: the dollar values defining the bracket ranges in the questions differ somewhat between 1984 and 1989, but are identical in 1989, 1994, 1999 and 2001. The choice of these bracket values was informed by the empirical distribution of the values of the assets in question in combination with ‘round’ numbers and restricting the questions to a parsimonious set of bracket categories.)
Third, respondents who did not give an exact amount (including those with reported exact brackets and imputed brackets) are assigned a dollar value with a probability derived from the distribution of amounts from respondents who reported exact values that fell within the range of the same bracket.
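The third level amounts to a hot deck draw within a bracket: a respondent with a bracket but no amount receives the exact amount reported by a randomly chosen donor whose amount lies in that bracket. The Python sketch below illustrates the idea only; the function name and the representation of a bracket as a (low, high) pair are assumptions, and the actual SAS procedure may differ in detail.

```python
import random

def impute_amount(bracket, exact_amounts, rng):
    """Draw a dollar value from exact reporters whose amounts fall in the bracket.

    bracket: (low, high) dollar bounds, inclusive.
    exact_amounts: amounts from respondents who reported an exact value.
    """
    low, high = bracket
    donors = [amount for amount in exact_amounts if low <= amount <= high]
    if not donors:
        raise ValueError("no exact-amount donors in this bracket")
    # Each donor is equally likely, so the imputed values follow the
    # empirical distribution of exact amounts within the bracket.
    return rng.choice(donors)

rng = random.Random(1999)  # fixed seed so the draw is reproducible
exact = [500, 5000, 12000, 60000, 150000]
print(impute_amount((1000, 24999), exact, rng))
```

Drawing a donor value rather than assigning the bracket midpoint preserves the spread of amounts within each bracket.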
The specific procedures are detailed in sample SAS code, which can be provided upon request.
Frequencies of the categorical variables and univariate statistics of the continuous variables included in each supplemental wealth file (1984, 1989, 1994, 1999, and 2001) are provided below. For the categorical variables a value of "1" means the designated asset was owned by the respondent or family members living there and "0" means it was not. Due to data collection and processing conventions, in 1984 and 1994 negative values for component wealth variables were coded as zero. The modest number and amount of negative values reported for component wealth variables in 1989 lead us to believe that the 1984 and 1994 net wealth values would not change significantly had negative values been recorded.
In the variable labels, the notations VBUSIyr, VDEBTyr, ..., WEALTH1yr, WEALTH2yr refer to the dollar values coming from the imputation program (no inflation adjustments - the CPI-U with 1982-84 as a base is 1984=103.9, 1989=124.9, 1994=148.2 and April 1996=156.2). Recall that WEALTH1yr does not include home equity whereas WEALTH2yr includes home equity (market value of the owner-occupied housing less outstanding mortgage balances).
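Because the dollar values are not inflation-adjusted, comparisons across years usually require rescaling by the CPI-U values quoted above. The Python sketch below shows the arithmetic, together with the relationship between the two wealth measures as we read it from the description (WEALTH2 equals WEALTH1 plus home equity). The function names are hypothetical and the figures in the example call are made up for illustration.

```python
# CPI-U (1982-84 = 100), annual values quoted in this documentation.
CPI_U = {1984: 103.9, 1989: 124.9, 1994: 148.2}

def to_dollars_of(amount, from_year, to_year):
    """Convert an amount from from_year dollars to to_year dollars."""
    return amount * CPI_U[to_year] / CPI_U[from_year]

def wealth2_from_parts(wealth1, home_value, mortgage_balance):
    """WEALTH2 as WEALTH1 plus home equity (value less outstanding mortgages)."""
    return wealth1 + (home_value - mortgage_balance)

# Illustrative figures only: 10,000 in 1984 dollars expressed in 1994 dollars.
print(round(to_dollars_of(10000, 1984, 1994), 2))
print(wealth2_from_parts(50000, 120000, 80000))
```

The April 1996 value (156.2) could be added to the dictionary under a key of the analyst's choosing if 1996 dollars are wanted.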
RELEASE NUMBER
Cumulative Cumulative
S100 Frequency Percent Frequency Percent
1 6918 100.0 6918 100.0
BUSI84 WHETHER FARM/ BUSINESS (QK122)
Cumulative Cumulative
S102 Frequency Percent Frequency Percent
1 656 9.5 656 9.5
0 6262 90.5 6918 100.0
CHEC84 WHETHER CHECKING/SAVINGS (QK132)
Cumulative Cumulative
S104 Frequency Percent Frequency Percent
1 4716 68.2 4716 68.2
0 2202 31.8 6918 100.0
DEBT84 WHETHER CREDIT CARD OTHER (QK145)
Cumulative Cumulative
S106 Frequency Percent Frequency Percent
1 3237 46.8 3237 46.8
0 3681 53.2 6918 100.0
REAL84 WHETHER OTHER REAL ESTATE (QK113)
Cumulative Cumulative
S108 Frequency Percent Frequency Percent
1 1064 15.4 1064 15.4
0 5854 84.6 6918 100.0
STOC84 WHETHER STOCK/MF/IRA (QK127)
Cumulative Cumulative
S110 Frequency Percent Frequency Percent
1 1229 17.8 1229 17.8
0 5689 82.2 6918 100.0
VALU84 WHETHER OTHER SAV/ASSETS (QK137)
Cumulative Cumulative
S114 Frequency Percent Frequency Percent
1 1302 18.8 1302 18.8
0 5616 81.2 6918 100.0
Variable Label Minimum Maximum
-----------------------------------------------------------------------------
S101 FAMILY ID 1984 (V10002) 1 6918
S103 VBUSI84 (1984$) 0 5000000
S105 VCHEC84 (1984$) 0 999996
S107 VDEBT84 (1984$) 0 500000
S109 VREAL84 (1984$) 0 1750000
S111 VSTOC84 (1984$) 0 1500000
S113 VTRAN84 (1984$) 0 250000
S115 VVALU84 (1984$) 0 9000000
S116 WEALTH184 (1984$ NO MAIN HOME EQUITY) -497990 9360000
S117 WEALTH284 (1984$ MAIN HOME EQUITY INCL) -497990 9560000
-----------------------------------------------------------------------------
Variable Label Mean Std Dev
-----------------------------------------------------------------------------
S101 FAMILY ID 1984 (V10002) 3460 1997
S103 VBUSI84 (1984$) 10958 116054
S105 VCHEC84 (1984$) 7842 27739
S107 VDEBT84 (1984$) 1692 10745
S109 VREAL84 (1984$) 9072 56387
S111 VSTOC84 (1984$) 4457 32908
S113 VTRAN84 (1984$) 4722 8174
S115 VVALU84 (1984$) 11571 286613
S116 WEALTH184 (1984$ NO MAIN HOME EQUITY) 46931 329404
S117 WEALTH284 (1984$ MAIN HOME EQUITY INCL) 67594 339898
-----------------------------------------------------------------------------
Variable Label N
-------------------------------------------------------
S101 FAMILY ID 1984 (V10002) 6918
S103 VBUSI84 (1984$) 6918
S105 VCHEC84 (1984$) 6918
S107 VDEBT84 (1984$) 6918
S109 VREAL84 (1984$) 6918
S111 VSTOC84 (1984$) 6918
S113 VTRAN84 (1984$) 6918
S115 VVALU84 (1984$) 6918
S116 WEALTH184 (1984$ NO MAIN HOME EQUITY) 6918
S117 WEALTH284 (1984$ MAIN HOME EQUITY INCL) 6918
-------------------------------------------------------
RELEASE NUMBER
Cumulative Cumulative
S200 Frequency Percent Frequency Percent
1 7114 100.0 7114 100.0
BUSI89 WHETHER OWN FARM/ BUSINESS (QG124)
Cumulative Cumulative
S202 Frequency Percent Frequency Percent
1 805 11.3 805 11.3
0 6309 88.7 7114 100.0
CHEC89 WHETHER CHECKING/SAVINGS (QG135)
Cumulative Cumulative
S204 Frequency Percent Frequency Percent
1 4997 70.2 4997 70.2
0 2117 29.8 7114 100.0
DEBT89 WHETHER CREDIT CARD OTHER (QG146)
Cumulative Cumulative
S206 Frequency Percent Frequency Percent
1 3591 50.5 3591 50.5
0 3523 49.5 7114 100.0
REAL89 WHETHER OTHER REAL ESTATE (QG115)
Cumulative Cumulative
S208 Frequency Percent Frequency Percent
1 1110 15.6 1110 15.6
0 6004 84.4 7114 100.0
STOC89 WHETHER STOCK/MF/IRA (QG129)
Cumulative Cumulative
S210 Frequency Percent Frequency Percent
1 1500 21.1 1500 21.1
0 5614 78.9 7114 100.0
VALU89 WHETHER OTHER SAV/ASSETS (QG141)
Cumulative Cumulative
S214 Frequency Percent Frequency Percent
1 1592 22.4 1592 22.4
0 5522 77.6 7114 100.0
Variable Label Minimum Maximum
------------------------------------------------------------------------------
S201 FAMILY ID 1989 (V16302) 1 7114
S203 VBUSI89 (1989$) -2000 9999999
S205 VCHEC89 (1989$) 0 1000000
S207 VDEBT89 (1989$) 0 780000
S209 VREAL89 (1989$) -10000 9999999
S211 VSTOC89 (1989$) -1000 5000000
S213 VTRAN89 (1989$) -4000 200000
S215 VVALU89 (1989$) 0 1100000
S216 WEALTH1 89 (1989$ NO MAIN HOME EQUITY) -779892 14359999
S217 WEALTH2 89 (1989$ MAIN HOME EQUITY INCL) -779892 14609999
------------------------------------------------------------------------------
Variable Label Mean Std Dev
------------------------------------------------------------------------------
S201 FAMILY ID 1989 (V16302) 3558 2054
S203 VBUSI89 (1989$) 16436 192432
S205 VCHEC89 (1989$) 11813 38668
S207 VDEBT89 (1989$) 2714 12743
S209 VREAL89 (1989$) 15767 156651
S211 VSTOC89 (1989$) 8023 73036
S213 VTRAN89 (1989$) 6931 11518
S215 VVALU89 (1989$) 4592 32875
S216 WEALTH1 89 (1989$ NO MAIN HOME EQUITY) 60849 322936
S217 WEALTH2 89 (1989$ MAIN HOME EQUITY INCL) 90784 345835
------------------------------------------------------------------------------
Variable Label N
--------------------------------------------------------
S201 FAMILY ID 1989 (V16302) 7114
S203 VBUSI89 (1989$) 7114
S205 VCHEC89 (1989$) 7114
S207 VDEBT89 (1989$) 7114
S209 VREAL89 (1989$) 7114
S211 VSTOC89 (1989$) 7114
S213 VTRAN89 (1989$) 7114
S215 VVALU89 (1989$) 7114
S216 WEALTH1 89 (1989$ NO MAIN HOME EQUITY) 7114
S217 WEALTH2 89 (1989$ MAIN HOME EQUITY INCL) 7114
--------------------------------------------------------
RELEASE NUMBER
Cumulative Cumulative
S300 Frequency Percent Frequency Percent
1 8628 100.0 8628 100.0
BUSI94 WHETHER FARM/ BUSINESS (QG124)
Cumulative Cumulative
S302 Frequency Percent Frequency Percent
1 899 10.4 899 10.4
0 7729 89.6 8628 100.0
CHEC94 WHETHER CHECKING/SAVINGS (QG135)
Cumulative Cumulative
S304 Frequency Percent Frequency Percent
1 5688 65.9 5688 65.9
0 2940 34.1 8628 100.0
DEBT94 WHETHER CREDIT CARD OTHER (QG146)
Cumulative Cumulative
S306 Frequency Percent Frequency Percent
1 4065 47.1 4065 47.1
0 4563 52.9 8628 100.0