![]() |
| ||||
User Guide Tutorial #1Cross Sectional Files
Revised: April 2008 |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
YEAR |
All Women |
No Children |
1 Child |
2 Children |
3 Children |
4 or More |
|
1969 |
1802 |
1396 |
1706 |
2003 |
2113 |
2379 |
|
1979 |
1492 |
1279 |
1492 |
1579 |
1880 |
1917 |
|
1989 |
1206 |
1075 |
1213 |
1332 |
1441 |
1581 |
|
1999 |
? |
? |
? |
? |
? |
? |
As is evident from the table, there has been a downward trend in housework hours of adult women, 1969-1989. This downward trend can not simply be the consequence of a drift toward fewer children in the family, since the decline is observed conditional on a given number of children. That is, in each of the number of children categories, estimated annual housework hours declines through time. Another interesting aspect of the table is that the dependence of housework hours on number of children is weaker in each of the three successive decades. Consider a simple ratio of housework hours of those with three children to those with none in the household. In 1969 this ratio was, on average, 1.51 (2113 hours/1396 hours). In 1979 it was 1.47 (1880/1279) and in 1989 it fell still further to 1.34 (1441/1075).
Going forward to 1999, did the overall housework hours fall further and within each number of child category? Now we have to get to work.
B. Getting Started
First is to note that housework hours are measured from responses to the 1999 CATI question F2 at the following URL (for later years there will be either CATI documentation of a PDF file which portrays the paper questionnaire which would be the equivalent of the CATI application.):
http://psidonline.isr.umich.edu/Data/Documentation/cai_doc/1999_Interview_Year/Section_F___Housework__Child_Care__Food_Costs.htm
The question asked is:
F2. About how much time (does your wife/do you) spend on housework in an average week? I mean time spent cooking, cleaning and doing other activities around the house? [IF QUALIFIED ANSWER] What is your best estimate?
HOUSEWORK HOURS PER WEEK
0-111 ACTUAL HOURS
112 HOURS OR MORE
If a respondent claims to not know (don't know or D.K.), the numeric code is 998. If the respondent refuses to answer or the number of hours was not ascertained (N.A.), the numeric code is 999. In working with the data we will see below the need to decide what to do with these D.K. and N.A. values. If you mistakenly use them when you calculate arithmetic averages or other statistics, these two codes will present problems for your analysis.
A few brief remarks. This question asks a respondent for a quite complicated variable as a simple estimate, including potentially complex averaging to get to a value for an 'average' week. Hence, the answer is likely to be less than ideal. Notably, respondents generally give somewhat upward biased answers to such questions and occasionally may not have even thought about the fact that there are only 168 hours in a week! So any answer above 112 (two thirds of 168! -and one needs sleep) is automatically truncated at 112 as part of the interviewing process. In processing the data prior to 1994 a further restriction has been that for reported housework hours in excess of 84 (half of 168) the 'weekly average' is set to 84. Then the resulting housework hours are multiplied by 52 (weeks per year) to get the data reported on our table above. Those reporting 'zero' or no housework are assumed to be right. However, another reason for a value of 'zero' is that the question is inapplicable (INAP.) because the family has a single male or single female head. So, again one needs to be careful with the values of 'zero' on the wife's housework hours.
Methodology research using alternative time diary methods shows zero to be plausible in many cases and not predominated by misreporting. For a overview of these methods in PSID (but applied to children age 0-12, who do some housework) visit this PSID website at http://psidonline.isr.umich.edu/CDS/timediary.html
The last thing we will mention here is what to do about a respondent who gives a report of 'I don't know?' or 'refuses'. Such answers get codes of '998' and '999' as noted above. One may want to either exclude such cases or make some simple 'imputation' for these missing data. For example one can assign the median for those who do give answers in the range of 0-84. Or assign at random from quartile values. Or assign at random from quartile values conditional on basic variable values such as number of children. Or, .... We guess you get the picture. One last point (whew!). These housework hours are referred to as a numeric variable. This is distinct from a categorical variable. For example, if male (a category) is coded as '1' and female is coded as '2' or if 'yes' is coded as '1' and 'no' is coded as '5' (a PSID tradition of mysterious origin but alleged to avoid keystroke errors) one would not want to say that 'no' has a five times greater value than 'yes' or that women are (necessarily) twice men. So here the '1' the '2' and the '5' are simply index categories.
C. Using the Data Center
If you are a new PSID user, please note that the PSID requires users to register prior to downloading data in order to impart the Conditions of Use. Through your registration, it also becomes possible to convey the size of the user community to sponsors, allowing the PSID to continue to collect these important data. Registration is very easy, and it is required only once. If you do not already have a PSID login and password, you will want to register at the Data Center before you begin the next few sections of the tutorial. To do such, click on the word "login" that appears under today's date in the far right upper corner of the "Welcome to the Data Center" screen (screenshot 1). Once you have registered, you can return to the Data Center to get going on the tutorial.
Now you need to go to the PSID Data Center Web site (http://simba.isr.umich.edu) and its Set Theoretic Subsetting System. You need to select data for 1999 from the Public Release. The screen shot below in this document is not active - it's just there for convenience in this tutorial. This tutorial is designed to show you how to use the "by file" option for selecting variables. Accordingly, you need to click on that option. Next, note that we will be working with both family-level and individual-level data; therefore you want to click on the "+" sign next to the phrases "PSID family-level" and "PSID individual-level" to expand the list of options under these two categories. This will take you to a screen like the one illustrated below in screenshot 2. Here you have the option of expanding your choices even further! Here you have the option of expanding your choices even further, since both the "PSID Main Family Data" and the "PSID Individual Data by Years" options allow the user to select data from specific years (that is to say, from different waves of the PSID). We are studying the 1999 cross-sectional housework pattern, so you want to click on the "+" sign next to these phrases, and to then click on the + sign next to 1999 in each case. In each case, a box should pop up. These boxes list the specific variables that are available for the year 1999 (the first for the variables contained in the 1999 family file, and the second for the variables contained in the 1999 individual file). Leave these boxes visible, and scroll down a bit on your screen until you see the phrase "Summary Variables. You want to click on the "+ sign next to the phrase too, because we will want variables from this file as well.
Screen Shot 1: Selecting Data Groups

Screen Shot 2: Expanding the Family- and Individual-level file choices

Now where are those variables needed for completing the entries in Table 1? After getting to this point, you need to scroll through the the Family Data variables, and then Individual Data variables to make your needed selections. First though, the term 'Family' may benefit from some discussion right here. Fundamentally, the PSID interviewer asks the respondent questions that apply to the whole family of one or more persons, including children. The study goes way back (designed in 1966-1967) and it was common then for such household or family economic surveys to ask financial and income questions about the household head or 'breadwinner' who, back then, was commonly assumed to be the male partner in a married or cohabiting couple. Of course if no husband (or significant male other) was around, the household head or breadwinner would be the economically active adult female. This seemingly restrictive structure was carried forward even though the economic role of women has broadened and become far more influential, especially since about 1975, not long after the study began! And through time far more information about the wife or 'wife' in a cohabiting couple was asked. By now the same information is obtained for both adults in a married couple family. But the basic head-wife structure was necessarily maintained in terms of the data consistency. If not, one could not have the consistency over time to complete Table 1 and it turns out this is a key aspect of the relational data structure of the PSID. What happened to trends in household time allocation and intra-family life would be solely the province of speculation! But we can complete Table 1 through relational data base structures, and not be forced to rely on speculation.
Housework hours and many other PSID family level variables have this head-wife structure, giving rise to two sets of identical variables - one for the head and one for the wife (if present). So, here, there is a single variable in each set: housework of the head, housework of the wife. In other content areas there are numerous questions (and corresponding variables) in each set. To illustrate, on health conditions there are (for 2001) 108 health conditions asked of the head (ER19612 based on question H1, overall health status, to ER19719 based on question H23, inches of height) followed by the same 108 health questions asked about the wife (ER19720 based on H25, overall health status, to ER19827 based on question H47, inches of height). For market work activity there are entire CATI sections with one set (B, C) for the head and others (D, E) for the wife.
For housework hours consider a world with only two simple cases for the PSID to handle. Married couples and single adults 'heading' a household - which could be just that single person (or possibly with dependents, such as young children). The precise definition of 'heading' a household is known to the PSID staff but cannot be normally explained to others in a finite lifetime. So, let's just pretend that someone in Ann Arbor has done this for you. If you are really compulsive you can immediately turn to a discussion at the FAQ (Frequently Asked Questions) web page.
Given what we just said, how do you think one would get a good national sample of adult males as of 1999? Surprisingly simple. Just limit the subset to the families with a male head. Some will have a wife, but those who don't will be the single male family heads. No problem in concept, given the traditional male-head-by-default assumption. But, given what we just said, how do you think one would get a good national sample of adult females, married or otherwise, as of 1999? Still pretty simple in concept, but relational data structures are needed, which means ID links. That is, adult women are either single female heads or the wives of the married/cohabiting men.
As we have said, the PSID primarily collects data that are measures at the family level. What does that mean? Well, for some variables it is straightforward. The family (married couple or single adult with children) shares a residence and economic resources, by definition. So the respondent (commonly the head but not always) reports things such as the housing section information. Does the family rent or own? If it owns does the family have a mortgage on that home? In the household wealth section, does the family own equities? Such family level variables apply to the labor income or work hours of the head of the family or the wife of the head of the family. There are parallel sections with extensive detail for the head and the wife on these topics. Traditionally, and today, this has been questionnaire Section B (employment of those heads active in the labor market - primarily working now or actively looking for work - as of the interview date), and Section C - followed by Sections D and E - parallel to B and C, but for the wife/'wife'.
To create our file we want to scroll though the boxes identified above to make the following selections: From the box that came up when you expanded the"summary variables" options, you need to select SEX OF INDIVIDUAL[1 = male, 2 = female] ER32000. As we know from Table 1, we want the number of children in the family. This is ER13013 # CHILDREN IN FU from the variable list box containing the 1999 family-level variables. Continuing with choices from the 1999 family file, we also want to be able to separate out the housework hours of married women from those of all women so we need ER13021 HEAD MARITAL STATUS, and we need the following additional family level variables: ER13011 SEX OF HEAD[1 = male, 2 = female] ER14229 F2 HOUSEWORK HRS-WIFE [hours code 1-112 and 998 or possibly 999 from the CATI application] and ER14230 F3 HOUSEWORK HOURS-HEAD. [Technical note: to select non-adjacent = non-contiguous variables of the variable list box, hold down the 'control' key and skip those variables you do not want to select by scrolling to the desired variable and highlight (dark blue) with the mouse/cursor. This is mentioned in the text preceding the Selecting Variables box, but this reminder could be helpful.]
The other variables needed are from the individual data so they will come from the 1999 individual file (the box that popped up when you clicked on the 1999 under the "PSID Individual Data by Years" option). Select the sequence number, ER33502 SEQUENCE NUMBER 99 (code 1-20 = individuals who lived in the family at the time of the 1999 interview) and the relation to head, ER33503 RELATION TO HEAD 99 (code value 10 = head and code value 20 = wife). {Note that individuals living in a consensual relationship have a code of '22' for the 'wife'. We are analyzing only the cases with ER33503 = 10 and 20 here.} And since Table 1 is for those under 65 you will need the individual level variable, ER33504, AGE OF INDIVIDUAL 1999 (years of age). These relationship to head codes are in the Data Center codebook. Optional: to see them search for variable name ER33503 in the PSID Individuals data at the Data Center search page. Then, on the results page click [View]. This will familiarize you with the Data Center's search facility. What you will see is the full codebook detail about ER33503 "RELATION TO HEAD 99." Codes 1-20 for ER33502 are those 'individuals in the family at the time [date] of the interview', those 51-59 are those 'individuals in institutions at the time of the 1993 interview', and so on. We want to work with those in the code ranges of 1-20. One last item: weights. The PSID sample is representative of the U.S., but because of the long evolution, weights are needed to allow for the fact that certain groups of individuals are over or underrepresented relative to the U.S. population. For 1999 these are ER33546 INDIVIDUAL WEIGHT 99. More information about PSID weights, view the Frequently Asked Question about weights. (Optional: while we are in the data selection mode, maybe we would want to select a few other variables for later use. Maybe you want to go beyond Table 1 and look at housework hours for those who are actively in the labor market. If so, select from the individual variable list, ER33512 EMPLOYMENT STATUS 99. For the purposes of this tutorial, we assume that you have not selected any additional variables. And the Data Center allows you to save your cart selections and return to add or delete from the prior variable selection session.)
Screen Shot 3: Selecting Variables

After you have made your selections you should scroll back up to the top of the page and hit "add to cart." This instructs the Data Center to make a shopping cart for you that contains the variables you have identified in the steps above (a "data cart" in PSID parlance). This step is hugely important! If you forget to hit "add to cart" then there won't be any variables in your data cart. Completing the "add to cart" step takes you to a new screen, as shown below (Screenshot 4). As you can see, this screen allows you to check to make sure that your variables have been added to your cart. Hit "variables added to your cart" and then expand the list to see the full list of variables. (See screenshot 5.) Note that at this point it is possible to review the documentation for each individual variable (in case you've forgotten why you chose it). For example, if you click on the words "ER32000: Sex of Individual," you get an explanation of the variable and the coding for it (the values it can take on and what they mean). Next click on "ER13021: Head marital status" and you will see that documentation providing the text of the question asked to get this variable is displayed. Note that even if you had not selected them you will get 1999 interview number, 1968 interview number and person number for 1968. These are needed for the PSID record link syem based on the long-term genealogical nature of the PSID.
A quick note in case you find that the list does not include all the variables you want: If you are missing something, you can go back at this point to retrieve whatever variable you are missing. You would do this by moving your cursor to the upper left portion of the screen to the phrase "Data Center." Doing such you will see that the four options for selecting variables are displayed (by file, by index, by search and by cart). If you click on "by file" you will call up a new copy of the screen at which we started when we first wanted to select variables from the different PSID files (data groups). You would then call up the relevant files for year 1999 that contain the variable(s) that you are missing. When you select the missing variables they will be added to the existing data cart. This means you do not need to re-select all the variables you want (just the ones that were missing).
Screen Shot 4: Variable list

Now--once you are satisfied that your list contains all the variables you want--hit "check out" (which appears toward the top of your screen). Doing such will take you to a screen where you will be asked to login. Here you need to enter your login name and password. After doing that the Data Center will take you to a screen that allows you to select your output options. (Why does the Data Center ask for your login information? The Data Center will store your data cart for you in case you want to retrieve it at a later date. To do such, and so that only you can access it in the future, it needs some record of which user is creating the cart.) Once you log in, you are taken to a screen that allows you to stipulate the format in which you want your data and the accompanying codebook delivered to you. This screen is depicted below. (See screenshot 5.)
At this point you also want to focus on the group needed for Table 1. A nice feature of this output options screen (Screen 5) is that it allows you to restrict your dataset to a limited number of cases of interest. For example, because we want to look at housework done by women, it makes sense to tell the Data Center that we only want observations that are female here (instead of having the Data Center send us information about both men and women). To include (subset) only cases where the female individual resides with the family, you need to type into the subsetting box illustrated below in the 'Output Options' box: (ER32000=2) and (ER33502<21). (If you prefer to handle the gender restriction as part of analysis after data exploration however, you need not make that restriction here.) The format on these subsetting statements is important and there is on-line help if needed. A typo in this box will return the message an error when you go to create your analysis file. Your final subsetting criteria should be:
(ER32000=2) and (ER33502<21) and (ER33503 in (10,20))
In line with our discussion of housework of females, you want those whose relation to head is either ((ER33503=10) or (ER33503=20)) [10= head, 20= wife]. You may want to have a customized codebook, even for this simple set of variables. Here we illustrate the choice of an 'HTML codebook'. You want 'all individuals' not just the family heads. Then you need to select Output Options. Since the balance of this exercise will be based on Excel, you should select the Excel choice, as illustrated below. You also want to click on the box instructing the Data Center to send you your data via e-mail (well, a link to the location where you can retrieve your data really). Note also that you may want to check the box instructing the Data Center to compress your files. Then, you can use a program like Winzip to unzip them once you receive them. (Should you make your data public, and should you name your datacart? That depends. These are simply options that can make one's life easier if one envisages using the Data Center on repeated occasions. For example, if you make your dataset public, then you will not need to login to retrieve your datacart if you want to work with it on a future date. Similarly, if you name the datacart then it will be easy to remember what kinds of data it includes if you want to use it again in the future.) IMPORTANT STEP: Hit the submit box when you have finished stipulating your output options.
Screen Shot 5: Censoring and Output Data File Options

You will see a notice indicating that the Data Center is creating your dataset, and when the task is finished a note appears indicating that your dataset has been sent to your e-mail address. When retrieving the dataset, PC users should right click (other users may be required to use alternatives to right click) on the blue text of Excel Data File and select open; the file will be transferred to you. You should do the same for the Variable Labels.
To understand how the effective sample size varies depending on the subset of cases selected in the Excel file we have prepared Table 2. Of the 5,725 rows of data, selecting those equal to or under age 65 produces 4,994 rows. Of the 5,725 rows of data, selecting those with housework hours of 0 -112 produces 5,655 rows. Limiting the analysis to those with positive weights, under age 65, and with valid housework hours (0 - 112 hours) produces 2,754 rows. These 2,754 rows are the basis for the analysis below.
Table 2. Case Counts from Specific Data Partitions
|
N=5725: All females (head or wife only) and living in the FU at the time of 1999 interview. |
|||||||||||||
|
Age <= 65 |
|
||||||||||||
|
0 <= Hours <= 112 |
|
||||||||||||
|
Weight > 0 |
|
||||||||||||
II. Using Excel (2003) on your Output Subset*
* Note: For other uses of the tutorial, keep in mind that Microsoft Excel has limitations on the number of rows and columns that are displayed on a spreadsheet. Please consult the documentation for your version of Excel for more details.
Instructions for using Excel come next!
You should have an Excel file with the variable numbers arrayed across the top row and the variable values running from row 2 to row 5726 (there should be 5725 observations in this subset from the PSID Data Center.) Steps to take:
1. In order to make it easier to identify the variables represented in each column, you may want to add the name associated with each number. Instructions for doing this can be found in the tutorial section of the PSID web site. If you do not choose to add labels, you should add a blank row 1.
2. Our dataset contains 5725 observations, and we will have occasion to scroll through it. A convenient Excel option that makes it efficient to scroll through the data is "freeze panes." To use this option highlight the far left non-data 'row button' at row 3. If you then click on the 'Window' menu and then 'freeze panes,' then the top two rows of labels will remain visible as you scroll down through the data rather than disappearing from sight.
3. Weights! PSID weights are aligned to national totals. Many statistical packages take such weights and normalize them (for you) to 1 in statistical work. Here we will do our own normalizing, so the average weight = 1. Each time we create a new sub-sample in this exercise, it will be necessary to renormalize these weights. In order to save time (and especially to avoid the clutter of many new weight columns), we will incorporate the normalizing step into our steps for finding the average annual housework hours of each sub-sample. This will be done using the following equation: mean = {sum(Vi * Wi) / sum(Wi)}, where V i=housework hours, and Wi=weight.
4. First, you can see the selected variables to be like this
|
A |
B |
C |
D |
E |
F |
G |
H |
I |
J |
K |
L |
M |
|
ER30001 |
ER30002 |
ER32000 |
ER13011 |
ER13013 |
ER13021 |
ER14229 |
ER14230 |
ER33501 |
ER33502 |
ER33503 |
ER33504 |
ER33546 |
|
1968 Interview Number |
Person Number 1968 |
Sex of Individual |
Sex of Head |
Number of Children in Family Unit |
Head Marital Status |
Housework Hours Wife |
Housework Hours Head |
1999 Interview Number |
Sequence Number 1999 |
Relation To Head 1999 |
Age of Individual 1999 |
Individual Weight # 1999 |
Next, let's think about housework of all women in our sample of 5725, not just the wives. To do this we need to use some Excel logic steps. First, we need to create a new column to list the housework hours of all women. So label column N, 'HOUSEWORK HOURS-WOMEN.' In column K, we know the values of '10' mean that the relation to head = head. In cellN3enterthecommand[=IF(K3=10,H3,G3)]. Then 'fill down' by placing the cursor on the lower right hand corner of cell N3 and highlighting downward by pulling the cursor down and stopping at N5727. This should automatically 'fill down', copying the formula from N3 into the rest of the cells in the column (although the cell number in each formula will change to mirror the row number). If not, and the cells are only highlighted and not filled in, click on 'edit, 'fill' and 'down'. This logic command puts into the N3 cell and later N cells the value of the head's housework hours (from H3) if the selected woman is head, the value of the wife's housework hours (from G3) if the selected woman is wife. Get it? OK, 'yes' or 'no', moving on,
5. Now
you want to focus on the group needed for Table 1. Remember Table 1 for those
age 65 and under, so you need to create a dummy variable which can function as
choosing just the cases equal to and under 65. To do this label column O, 'DUMMY
1. AGE' and enter the command [=IF(L3<=65,1,"")] in cell O3 then 'fill down'.
Also these are cases where housework hours are less than 113 (avoiding the
imputation or treatment of extreme cases). To do this, you should create a dummy
variable. Label column P "DUMMY 2.HOURS"and in the cell P3 enter the command
[=IF(AND(N3>=0,N3<=112),1,"")]. Additionally, you should add a dummy for the
individual weight "DUMMY 3. WEIGHT" in column Q and enter [=IF(M3>0,1,"")] in
cell Q3, so you only include those with positive individual weights.
If you would like to see how many cases are selected
withtherestrictionofage,hoursandweightsimultaneously,incellR3enterthecommand[=IF(AND(O3=1,P3=1,Q3=1),1,"")].
This is the dummy variable ‘DUMMY – COMBINED’. In the last row(R5728) of column
R 'DUMMY-COMBINED', you can see the number of observation chosen by restriction,
using the formula [=SUM(R3:R5727)]. This should be in line with the entries in
Table 2, namely, 2,754.
6. Now we can create a restricted sample of only women by listing their weights in column S. To do this, enter the formula[=IF(AND(O3=1,P3=1,Q3=1),M3,"")] in cell S3 and 'fill down.' This command puts in cell S3 and later S cells the weight value from column M but only if there is a female under 65 with housework hours under 112 otherwise, the cell will be left blank. Label column S 'WEIGHTS-WOMEN.' So, how many housework hours do adult women put in per year as of 1999? To find the answer, enter [=52*(SUMPRODUCT(N3:N5727,S3:S5727))/(SUM(S3:S5727))] in an open cell (such as S5730). This command takes each row and multiplies the value in column N (weekly housework hours) by the value in column S (weight) and sums all the products. Then it divides this number by the sum of all the weights in column S which normalizes the weights. Multiplied by 52 this gives the weighted annual housework hours of all women. This should return a value of 863.51. To confirm this, why not open our ‘answer’ spreadsheet (Note: Netscape users must right-click on this link and select 'save link as' in order to view the answer spreadsheet). (Note: It will make it easier to check your answers against our 'answer' spreadsheet if you keep them together in a table with labels similar to ours.)
7. To get housework hours of married women, go to column T and label it 'WEIGHTS-MARRIEDWOMEN'. IncellT3enterthecommand[=IF(AND(O3=1,P3=1,Q3=1,F3=1),S3,"")] and 'fill down.' These will be the weights of married (F3=1) women. By entering [=52*(SUMPRODUCT(N3:N5727,T3:T5727))/(SUM(T3:T5727))] in cell S5731 you will get 1013.22, a modest decline from the 1989 value (1206) in Table 1.
8. To what extent do housework hours of married women depend on the number of children living in the family? Let’s make a few more column labels. Column U can be ‘WEIGHTS-WIVES K=0’ (Kids=0), Column V as ‘WEIGHTS- WIVES K=1’, W as ‘WEIGHTS- WIVES K= 2’, X as ‘WEIGHTS- WIVES K= 3’, and Y as ‘WEIGHTS-WIVES K>3’.To fill in these columns, start with U3. Then reenter the Excel command[=IF(E3=0,T3,"")]. Fill down, you’re expert by now. This will select the weights only for those married women with no children in the family.E3= 0 means no children. Otherwise, a ‘blank’ will be placed in the cell. In cell S5732, enter the command [= 52*(SUMPRODUCT(N3:N5727,U3:U5727))/(SUM(U3:U5727))]. This will return the weighted average annual housework hours for those women with no children in the family (878.28). Continue with Column V. InV3,entertheExcelcommand[=IF(E3=1,T3,"")]. Similar to Column U, E3=1 means one child. Similarly, proceed to complete columns W-Y. You can get the weighted annual averages, by number of children (E=1, E=2,…), by mirroring the 'sumproduct' command in cells S5733 – S5736. Now the impact of more children on housework hours can be seen and you can complete the entries in Table 1.
9. That table was for married women. How about hours of adult men? You can retrace the steps in this tutorial with different subsetting criteria, SEX OF INDIVIDUAL (ER32000=1). To confirm this, why not open our ‘answer’ spreadsheet. The second sheet shows the housework hours for men.
10. Further analysis. Excel has numerous statistical functions. One could see if there are significant differences across groups defined by numbers of children. Maybe the number of families with more than 3 children present is so small that the difference in housework hours of those with >3 children is not statistically different from those where there are 3 children or 2 children. Or one could look at the regression of married women’s housework hours as a function of number of children. On average, by how many hours does weekly housework rise as there is one additional child?
Institute for Social Research | University of Michigan | Privacy | Conditions of Use