User Guide Tutorial #4

 
   

Linking Children and Caregivers from the Child Development Supplements

Ngina Chiteji, Mohammad Mushtaq, and Frank Stafford

Revised: August 2007

I. Introduction

The Child Development Supplement (CDS) of the Panel Study of Income Dynamics (PSID) offers researchers a wealth of information concerning children. The CDS is a comprehensive archive of information about children’s experiences and development, their daily time use (http://psidonline.isr.umich.edu/CDS/timediary.html), parenting, schools, daycare, and the ways that families function in the world around us. The data come from questions posed across different modules to several individuals who are involved in children’s lives in the many different locales in which children have their childhood experiences. In line with our discussion in Tutorial 3, the CDS may be thought of as a 'partly balanced panel'. Only those in CDS I were followed subsequently to form CDS II, so if there is a CDS child in CDS II there will necessarily be data on the child from CDS I, but not conversely.

While the CDS can be used as a self-contained "file" or domain for analysis, most researchers find it essential and productive to link data from the CDS to that from the PSID core. This allows the information about children to be supplemented with detailed information about their families, and their parents in particular. For example, some researchers might seek measures of family wealth in 1994 or beyond for inclusion in a file of CDS children. Or others may want to know a family's income-to-needs ratio, or still others the current labor hours or multi-year work histories of the parents of the child. Another theme is connections across generations. To illustrate, research shows there to be a correlation of the Body Mass Index (BMI) of the CDS children and their mothers and also with their grandparents. This tutorial shows the user how to access the needed files to do analyses using the CDS via the PSID Data Center at http://simba.isr.umich.edu. Since so much of the interest in child outcomes is related to the connection to older and contemporary generations, the CDS files are central in the study of intergenerational connections - both those at a point in time or for different generations at different time (year) or life course (age) points.

Files from the PSID Main ("Core") Data, 1968 and forward, can be linked automatically to CDS through the Data Center. In addition, more complex intergenerational links can be accomplished through the Family Identification Mapping System (FIMS) http://simba.isr.umich.edu/FIMS/. The CDS itself has many different modules, and the Data Center can be used conveniently to link variables from the different corresponding files, with or without matching to data selected from the 'core'. Before moving into a demonstration of how the Data Center can be used for these purposes however, some background about the CDS is provided. Users who are familiar with the general structure of the CDS can skip ahead to Section III. Or, for more complete documentation of the CDS design and data a good source is the CDS User Guides http://psidonline.isr.umich.edu/CDS/wavesdoc.html

II. Overview of the CDS

When looking at the family setting we often observe that one parent or other adult has primary responsibility for the child (for organizing children’s activities and coordinating care for the child, for example). The CDS data collection process involves the collection of data about children from such an identified “primary caregiver” (PCG) and the collection of data about this PCG as well. Defined as the person who knows the most about the child’s activities, the PCG is typically the child’s mother, but in some instances the father or some other family member performs this role.

CDS data collection does not end with the primary caregiver, however. To obtain a complete picture of the lives of children, the CDS has information from separate instruments or modules applied to other caregivers, and from modules on schoolteachers/daycare providers and school administrators, each module having a different response rate - or percent of successful completions. Common Core Data (CCD) from the U.S. Department of Education provide measures of the schools and school districts of the CDS children. A sampling of these variables is in the Data Center, and a more extensive set can be obtained via a confidential data user's contract. Information from all these different sources is in separate files and gives rise to a relational data structure. In such a structure the information about a given child in one module is linked to information about the child from another module via an underlying I.D. or 'key' structure. And while there may be valid data about a given child from one module and on its resulting file, the differing response rates mean that the companion information may not have been obtained from another module. If your desired analysis file spans measures across CDS modules, the CDS Data Center (based on an Oracle design) currently allows for this by treating the absent module information as if it were 'missing data' - analogous to traditional individual item non-response or missing data within a module. Such CDS modules give rise to what may be termed an 'absent data' structure, as will be described below. This way users can decide whether, in merging across CDS modules, they want an extensive file of all cases for variables coming from different modules or just the cases in which there is valid data coming from all of the modules. By default, the Data Center provides all cases for all variables selected from CDS modules. In principle this can lead to a great deal of absent data.

To follow our research theme, suppose you hypothesize that if separated parents can arrange for an absent father to spend some days with the child, this will connect to a joint care pattern that will reduce behavior problems. In 1997 there was a module for data collected directly from absent fathers. Specifically there was the module, Father's Outside of the Home - Child Booklet, Module Q7 (http://psidonline.isr.umich.edu/CDS/questionnaires.html), or 'Absent Dads'. But the module was completed for only 283 cases. If variables from Module Q7 and from the Primary Care Giver, Module Q1 are selected, the resulting file will include the union of the cases across the modules. Users not wanting cases with absent data have to exclude them from their subset file delivered from the Data Center prior to statistical analysis. (In 2002/03 the approach was to collect information about absent father from the PCG, so measures on the absent dad .)

This non-traditional type of 'missing data' we will refer to as 'absent data,' since the questions needed to obtain it were not asked nor possibly even intended to be asked. Note that we can further distinguish types of absent data: absent because the module or instrument was to be asked but did not get asked (e.g. the schoolteacher could not be contacted), absent because the module or instrument was not attempted by design (e.g. third siblings, age 0-12 as of 1997, were not included), and absent because no such observation was available in the sample concept. To illustrate the latter, for a year in question (say, 1994) suppose the child's mother was an unrelated individual living outside the PSID sample. Measures of her market work would appear as absent data. Data from the so-called 'Absent Dad' or Q7 file is also helpful in defining absent data. Data for the question Q7A6, 'In 1996 how many days did (CHILD - name) actually spend with you?' may be absent for several reasons. First, there may be no absent dad. There is a dad, we can be sure, but the dad is a current member of the family with a 'relationship to head code,' so by sample concept there is not an absent dad. Or the dad may be out of the family (home) but could not be contacted or refused to participate in Module Q7. A file consisting of variables from answers to PCG Q1 and from Q7 will very likely have a great deal of absent data. Unless you take steps to exclude them, all absent data cases will be provided.

Starting from a relational archive 'data locus' (for CDS, the individual child), the extent of absent data will normally increase, and sometimes very rapidly, as one goes out from such a data locus or observational starting point for a given individual (or family) in a given year to other individuals and other years. In this tutorial the assumed data locus is the child (as reported by the PCG) and the constructed file can extend out to not only other 2002 (and planned 2007/08) CDS modules but also to other individuals and other years or family level variables in the corresponding and other years. This extension from the data locus to other files will give rise to absent data. To exclude absent data cases from a file you can specify exclusions in the data subsetting statement, which is illustrated in Screenshot 13 below. Specifically, you can type in (Q7A6 ne .) in the Data Center subsetting box for SAS or STATA files ((Q7A6 ne .;) in a SAS program outside the Data Center). In Excel the statement is (Q7A6 .ne. ' ').
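For users who prefer to drop absent data cases after downloading a file, the same idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the child IDs, the values, the variable name Q1_ITEM, and the helper names merge_union and drop_absent are all hypothetical, and absent data is represented here by None rather than by the Data Center's own missing value.

```python
# Hypothetical extracts from two CDS modules, keyed by a made-up child ID.
# Q1_ITEM is an invented placeholder for any PCG (Q1) variable; Q7A6 is the
# 'days spent with child' item from the Absent Dads module.
pcg_q1 = {101: {"Q1_ITEM": 3}, 102: {"Q1_ITEM": 1}, 103: {"Q1_ITEM": 2}}
absent_dad_q7 = {102: {"Q7A6": 40}}


def merge_union(left, left_vars, right, right_vars):
    """Keep the union of cases; a variable from a module with no record is None."""
    merged = {}
    for child_id in sorted(set(left) | set(right)):
        row = {}
        for var in left_vars:
            row[var] = left.get(child_id, {}).get(var)
        for var in right_vars:
            row[var] = right.get(child_id, {}).get(var)
        merged[child_id] = row
    return merged


def drop_absent(merged, var):
    """Keep only cases with a value for var (the analogue of 'Q7A6 ne .')."""
    return {cid: row for cid, row in merged.items() if row[var] is not None}


merged = merge_union(pcg_q1, ["Q1_ITEM"], absent_dad_q7, ["Q7A6"])
complete = drop_absent(merged, "Q7A6")
print(len(merged), "cases in the union;", len(complete), "with Q7 data")
```

The union keeps every child seen in either module, which mirrors the default file described above; the filter then leaves only the children for whom Module Q7 was completed.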

The CDS measures are collected via multiple modes. Face-to-face, telephone, and mail out data collection, and combination mail out and telephone modes were used to obtain the different respondents' reports on a variety of items ranging from the child's health, to the PCG's school enrollment expectations for the child, to behavior problems used to construct a Behavior Problems Index, and much more. A comprehensive look at the measures included in the PCG instrument and other CDS modules is available at http://psidonline.isr.umich.edu/CDS/questionnaires.html . These modules were collected at different time points since the data collection period extended over many months. As a result there are some specialized age variables - such as the age at which the specific time diary was reported (AGEWD_02 or the age at which the diary for the week day was reported in 2002/03), or age (in months) at which the child module was reported in 1997 (ADEATCH).

III. Using the PSID/CDS Data Center

The first goal of this tutorial is to allow users to familiarize themselves with the Child Development Supplement and the process of selecting variables and observations from within the CDS files. To actually use this tutorial to get data after you go through the steps outlined, you must be a registered user. To become a registered user please consult the directions given in Tutorial 1.

As noted previously, there are a number of files or modules in each wave of the CDS--such as (i) the Demographic file, which contains limited background information about the family in which the child resides; and 'result codes' including whether time diary data or Common Core Data are available; (ii) the Elementary/Middle School Teacher File, which contains data collected from teachers about the child; (iii) the Primary Caregiver Child file, which records responses to a series of questions about the child that are posed to the person who is primarily responsible for the child's care (the "PCG"); and (iv) the time diaries, which record ways that children spend their time during weekdays and weekends (and who they spend their time with). This tutorial shows users how to select variables from some of these different modules. A second goal of this tutorial is to give users the tools needed to link CDS data to that in the PSID core. These skills will be useful to many users.

A. The Nature of Our Exercise

To accomplish our goals, we pose a simple research question: Is there a relationship between moms working in the labor market and children's behavior? Why ask this question? Raising children requires parents to make decisions about how young children will be cared for, and to make choices about time use and family finances, as working outside the home and caring for children are tasks that often compete for parents' time. In particular, one expects a restructuring of parental time, especially among parents with children who are very young, and this issue is likely to be most pressing for the PCG. In fact, one of the ways the challenge of restructuring time often manifests itself is in the PCG facing a tradeoff between working outside the home and serving as a stay-at-home parent (Hallberg and Klevmarken, 2003). While different families ultimately may make different choices about labor force participation, all face a delicate balancing act, as theories of child development suggest that financial and time resources are both crucial to child development (Hofferth, 1995).

B. Identifying measures of the phenomenon of interest

What kind of information does the CDS have about child behavior? As authors of this tutorial, we know that the CDS collects data on children's scores on the behavioral problem index (BPI), a widely used measure of behavior problems in psychology, along with data covering the different elements of behavior on which the BPI score is based. A user who is new to the CDS might not know this however. There are several ways to determine what type of information the CDS contains if one is new to the dataset.

1. Determining what data are available in the CDS

One can always consult the comprehensive user guide to determine what data are available in the CDS. Second, a user can take advantage of the numerous search features that the Data Center offers. As shown in the screenshot below (Screenshot 1), it is possible to search the CDS by keyword. To do this, you would choose "By search" and then type in whatever term best represents the phenomenon you are interested in. For example, suppose you typed "BPI" because you wanted to know if the CDS contained any information related to this index. Since you want to search the CDS for the information, you would want to de-select the other file options (by clicking solely on "CDS--including Time Diary Aggregates") under "Data File Type," as shown in Screenshot 2 (below). After you type in the word "BPI" you would then hit "search codebooks." This will prompt the Data Center to begin a search of the CDS to determine whether there is any information about the BPI contained in the dataset. Your search should result in output that takes the form of a chart indicating that the CDS not only collects the composite BPI score, but also separate indices for internalizing and externalizing behavior. Now suppose you were not even aware that psychologists had developed an index called the BPI, yet you wanted to obtain information about problem behavior among children. Instead of typing "BPI" in the search box, you could type in the phrase "problem behavior". This will result in more output as there are a number of variables in the CDS that relate to different dimensions of behavior problems (such as the variable Q1A24Y--"emot problem yr 97," which indicates whether the child was seen by a psychiatrist, psychologist, doctor or counselor because of emotional, mental or behavioral problems in the year 1997). Among the variables listed in the output, however, you also will see the BPI.

Screenshot 1: Search Options and Ways to Identify Variables

Screenshot 2: Executing the "By File" search option -- using the BPI as an example

How can you tell whether a variable really captures the phenomenon that you are interested in? If you click on the purple box next to a variable name, for variables corresponding to data 'as collected' (as distinct from generated variables such as the Income-to-Needs ratio in the core) you can see the full question text for the variable. (See screenshot 3.) For example, click on the box alongside BPI_T02 ("BPI-Total Score 02"). A new box will pop up to show you that this variable comes from "Question items measuring the behavior problem index..." This explanation would help you understand what the BPI is if you did not know what it was. Moreover, the explanation contains a reference to a specific question number (on which the variable is based), which allows you to go to the CDS Questionnaire to see exactly what was asked. This feature allows users to make informed decisions about the usefulness of the CDS for their purposes, and about what variables they want to select from the dataset.

Screenshot 3: Obtaining the documentation for a single variable--The role of the purple box

A second way to familiarize yourself with the CDS, and the wealth of data it contains, is to select the "By index" option in Screenshot 1. Doing so takes you to a screen where you can select the CDS' data index. (Click on the "+" sign next to that term.) Selecting this index takes you to a screen that lists a number of different types of issues that a researcher might be interested in, as shown in Screenshot 4 (below). Each represents a type of data collected in the CDS. For example, you can see there is information related to (i) child care, (ii) education, (iii) parenting and (iv) problem behavior. (This list is not exhaustive.) To get more detail on the exact variables containing information about your topic of interest, you can click on the "+" sign next to the topic. For example, clicking on the "+" sign next to "problem behavior" expands the list of data collected related to this topic, and in the sublist you will see the behavioral problem index listed. (You can also see that there is information about drug use and other risky behavior.) For the purpose of this tutorial, we will focus on the BPI as our measure of problem behavior. More specifically, we will analyze the total BPI score data from the 2002 wave of the CDS.

Screenshot 4: Identifying variables using the "By Index" option

2. Determining what data are available in the PSID

Recalling that our goal is to explore connections between children's behavior and moms' labor force participation, it is now appropriate to begin a discussion of different ways we might measure moms' work. The PSID has collected data about the hours worked by heads and wives of families since 1968, and because the PSID is a longitudinal survey it contains repeated observations on the families that it surveys. Accordingly, it is possible to construct measures of work that combine information from more than one year for a given individual. In this tutorial we opt for a two-year measure of mom's work so we can examine the effect that moms working persistently has on behavior (if there is any such effect). Of course, there are different ways that one might conceptualize persistent work. For example, one could ask whether a mom worked at all in both of the years of interest. This could be the basis of a set of categorical variables, such as worked both years (1-1), worked neither year (0-0), and worked intermittently (0-1 and 1-0). Or, one might define persistent work as having one's average work hours in both years exceed the average for all moms. Alternatively, one might opt to examine work hours over a longer interval of time than just two years. We leave it to the interested user to pursue different possibilities. We will proceed assuming that one wants to conduct an exploratory look to see whether a higher average of work hours across both years is associated with an increase or a decrease in behavior problems.
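The two-year measures just described can be sketched in Python as follows. This is a minimal illustration, not the procedure used later in the tutorial: the function names work_pattern and average_hours are hypothetical, the hours are assumed to be average weekly hours for the same mom in the two survey years, and any missing-data codes are assumed to have been removed beforehand.

```python
def work_pattern(hours_first_year, hours_second_year):
    """Classify a mom by whether she worked at all in each of the two years."""
    worked = (hours_first_year > 0, hours_second_year > 0)
    if worked == (True, True):
        return "worked both years (1-1)"
    if worked == (False, False):
        return "worked neither year (0-0)"
    return "worked intermittently (0-1 or 1-0)"


def average_hours(hours_first_year, hours_second_year):
    """Average weekly work hours across the two years."""
    return (hours_first_year + hours_second_year) / 2


# Made-up example: 40 hours per week in the first year, none in the second.
print(work_pattern(40, 0), "| two-year average:", average_hours(40, 0))
```

The categorical version throws away the intensity of work, while the average keeps it but cannot distinguish steady part-time work from one full-time year and one year out of the labor force, so the two measures answer slightly different questions.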

To construct our measure, we will need information about the average hours worked by heads and wives in two years--1997 and 2003. The relevant PSID variables are HDAVG97, WFAVG97, ER24078, and ER24089. The first two variables are the variables measuring heads' and wives' average work hours per week from the 1997 wave of the PSID, and the latter two represent the average hours worked per week by heads and wives (respectively) from the 2003 PSID. What if you did not know the exact names for these variables a priori? How might you determine what variables to look at if you were interested in work hours? As noted above for the CDS, the Data Center has a search option that allows a user to search the PSID and CDS files by keyword. So, one could go to that page and type "work hours". In doing this search, one would want to highlight "PSID family-level" in the "Data File Type" box. Additionally, one could limit the search to the two years of interest by highlighting the years 1997 and 2003 in the "Data Year" box. (You do this by de-selecting the default option of all years--by clicking on the box--and then re-highlighting your two years of interest by clicking on the mouse at these two years while holding the control key down.)

C. Getting Started on the Analysis

Now that you have a basic understanding of how to determine what information (and variables) the PSID and CDS contain, it is time to discuss the specific steps you will need to take to get the data that you need to do your analysis for this tutorial. To explore possible connections between child care-giving, work decisions, and child well-being, information is needed about the PCG's employment, children’s ages, and child behavior. We will mention a few other variables too, however, as we proceed because enthusiastic users of this tutorial may be motivated to do some follow-up analyses to the exercise that we have designed.

The full list of variables that we want is given in Table 1. You may want to print it out so that you can check variables off the list when we get to the step where we select our variables. The table is followed by a brief discussion of the motivation for inclusion of each variable. In line with our later reference to Data Carts, we call this your 'Shopping List' - which you may write down and mull over before selecting the actual data.

Table 1: 'Shopping' list of key variables of interest and their location

Measure | Variable name(s) | CDS module or PSID file containing the variable(s)
PCG's report of problem behavior: BPI score | BPI_T02 | CDS Primary Caregiver-Child file
Child's age | ER33406; Q21IWAGE | PSID Individual file (1997 survey year); PCG Child File (2002)
Average number of hours worked per week of PCG | HDAVG97; WFAVG97; ER24078; ER24089 | PSID Family files (1997 and 2003 survey years)
Employment status of PCG | ER10081; ER10563; ER21123; ER21373 | PSID Family files (1997 and 2003 survey years)
Educational level of PCG | UPEDU97H; UPEDU97W | PSID Family file (for 1997 survey year)
Child's relationship to PCG | RELPCG97; RELPCG02 | CDS Demographic files
Child's relationship to the head of the household in which he or she resides | ER33403; ER33703 | PSID Individual files (1997 and 2003 survey years)
Sex of the household head | ER10010; ER21018 | PSID Family files (1997 and 2003 survey years)
Head's marital status | ER10016; ER21023 | PSID Family files (1997 and 2003 survey years)
Weight | CH02PRWT | CDS Demographic file (2002 survey year)

Characterizing child behavior: We have noted that we are interested in determining whether there appears to be any connection between PCG labor force participation and children's behavior (we want to know whether children whose PCGs work outside the home behave differently from children whose PCGs do not). The CDS actually collects information about children's behavior from both parents and schools. In the first instance, the CDS follows standard practice in developmental psychology by administering a series of questions to PCGs using a question sequence originally developed by James Peterson and Nicholas Zill (Peterson and Zill, 1986), along with additional questions that were added to the Peterson and Zill scale by the National Longitudinal Survey of Youth. The responses to these questions are then used to construct a composite index of child behavior called the BPI. A high score on the index indicates a high level of behavior problems while a low score indicates little problem behavior on the part of the child. Users who are interested in learning more about all 30 questions that underlie the BPI, or in examining individual types of behavior, should refer to Table 6 in Chapter 6 of the CDS User Guide. Recall that the User Guide is available online at http://psidonline.isr.umich.edu/CDS/userguide.html . Note that because some of the questions that parents are asked are also asked of teachers, it is possible to construct a BPI score for each child "as reported by teachers". Using both PCG and teacher based measures for analysis would give rise to the 'absent data' issues we covered earlier. We will use the BPI as reported by PCGs for our analysis, however.

In addition to its usefulness in characterizing child behavior, the BPI provides an interesting example of a generated variable for users of the tutorial to contemplate. Some of the variables contained in the Panel Study of Income Dynamics and the CDS are variables that the PSID/CDS staff constructs based upon information provided by survey respondents. Why construct such aggregated measures rather than simply providing the underlying data? In certain instances, it is known that researchers prefer to work with variables that really represent an aggregation of different pieces of information; and, in these situations, the PSID and CDS present the variables in the form that researchers want to use them. A traditional PSID example of this (besides the BPI) occurs with the case of family income. The research community often is interested in the effects that income has on a number of individual and family outcomes. And, while most researchers think of annual income when they devise their studies, income data actually are collected from an extensive set of specialized questions because in practice people are NOT paid annually, but receive their earnings in a variety of forms and differing periodicity - either weekly, bi-weekly or monthly instead. Accordingly, as discussed in the documentation for the PSID income variable, available at http://psidonline.isr.umich.edu/data/Documentation/Income/ , the PSID asks about the hourly wage, the number of hours worked per week, the weekly paycheck, an individual's monthly salary, and other income-related questions (bonus, overtime pay, professional practice or business income and the component related to own or spouse's time in the business, income from assets, Social Security, for example) and then uses such diverse pieces of information to compute the aggregate annual income variable or its components preferred by the researchers.

Determining the age of CDS children: Information about the child’s age will be useful because many PCGs temporarily remove themselves from the labor force during the period in which their child is an infant or a toddler, and then re-enter the labor force as the child approaches school or pre-school age, particularly in the United States (Gustafsson and Stafford, 1994). Accordingly, one might be interested in subdividing the PCG population into those who have children age 0-2, those who have children of pre-school or kindergarten age (3-5), and those with older children who can be expected to be in school full-time. One approach would be to obtain age by subtracting the birth year from 2002. This is definitely not recommended: the CDS data modules were collected over a wide interval, from October of 2002 to spring of 2003, so the age of the child at the time the measures were reported does not refer to a single time point. A second option is to take the birth year of the child (BIRTHYR) and then to compare it to the date of the PCG module. The age at time of interview variable--Q21IWAGE--was constructed this way, and it is available in the PCG Child module. We also have a third and much simpler option: the BPI information is only collected for children who are age 3 and older. This is standard practice in developmental psychology. So by restricting the BPI information to be non-missing in our data selection box below, one can select data for the children who were age 3 and older and have a BPI score. As can be seen from the viewable codes, a value of '99' means, as the codebook says, 'not ascertained'. Such a case could be a child under age 3 at the time of the PCG Child interview or a child for whom the index could not be constructed even though they were 3 or older.
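As a rough illustration of the third option, the selection and the age grouping can be sketched in Python. The records below are made up, the helper names has_bpi and age_group are hypothetical, and Q21IWAGE is assumed here to be expressed in years (check the codebook for its actual units); the only coding taken from the text above is that a BPI value of 99 means 'not ascertained'.

```python
NOT_ASCERTAINED = 99  # BPI_T02 code described above as 'not ascertained'

# Made-up records; Q21IWAGE is assumed (not confirmed) to be age in years.
children = [
    {"id": 1, "BPI_T02": 7, "Q21IWAGE": 5.5},
    {"id": 2, "BPI_T02": 99, "Q21IWAGE": 2.4},
    {"id": 3, "BPI_T02": 12, "Q21IWAGE": 9.0},
]


def has_bpi(record):
    """True when a BPI score was ascertained for the child."""
    return record["BPI_T02"] != NOT_ASCERTAINED


def age_group(age_years):
    """Assign the three age bands discussed in the text."""
    if age_years < 3:
        return "0-2"
    if age_years < 6:
        return "3-5"
    return "6 and older"


with_bpi = [record for record in children if has_bpi(record)]
for record in with_bpi:
    print(record["id"], record["BPI_T02"], age_group(record["Q21IWAGE"]))
```

Filtering on the BPI code rather than on a computed age avoids the timing problem noted above, at the cost of also dropping children aged 3 or older whose index could not be constructed.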

Measuring labor force activity: The tutorial has already discussed the relevance of HDAVG97, WFAVG97, ER24078, and ER24089. Here we simply note that there will be preliminary work required to determine whether the primary caregiver corresponds to the head or wife of the family. We will discuss the simple procedures needed to make this determination later. Once this is understood, the user should be empowered to access a vastly wider range of measures from the PSID core files, via the Data Center. (For example, we figure some users may want to look for connections between moms being employed and children's behavior. In a different direction, one could see if the behavior of the PCG in such areas as smoking and exercise has a connection to the child's BPI. The smoking behavior of the PCG as head or wife in 1986, 1999 and on forward can be merged into an analysis file. If one is interested in a dichotomous labor market indicator, such as whether the mom is employed or not, one can use the PSID's variables for heads' and wives' employment status, rather than looking at connections between behavior and hours that are worked.)

Gauging the PCG's level of education: Educational attainment can factor into a PCG’s decision to re-enter the labor force after the birth of a child, so some users may want a measure of PCG education. Measures of the number of years of schooling completed by the head and the number of years completed by the wife of any given family are contained in the PSID core. Theory would suggest that those with high levels of human capital would stand to benefit greatly from an earnings growth and rebound or career signaling to employers from returning to the labor force (Kim and Polachek, 1994; Stafford and Sundström, 1996). Yet, it also is theoretically possible that highly-educated parents may be particularly concerned about the development of their children (Becker, 1981), and may view their own time as a critical developmental input. An enthusiastic (and energetic!) user might want to determine whether there is an association between years of schooling and working outside the home when one has young children (in addition to examining the work-child behavior nexus), so that he or she can determine whether there appear to be differences between the education levels of those who work outside the home and those who choose to remain at home full-time. Accordingly, we include education variables among the list of data that we will select.

The relationship variables: The task that a researcher confronts in any attempt to determine the employment situation and educational level of the PCG is that of linking information contained in variables characterizing the head and wife of the family unit in which the child resides to the child's PCG. That is to say, there are no variables called "PCG employment status" or "PCG hours worked per week" or "PCG education." Instead, the PSID provides the most complete data covering the employment status and work hours of the family head and wife, and, via relational data structures, it is up to the researcher to determine which of this information applies to the PCG. (In some instances, none of it will apply to the PCG, a type of absent data.) In the few instances in which the PCG is someone other than the child's mother or father, while the head and wife represent the child's parents, having employment information about heads and wives will not provide any information about the primary caregiver. (Instead, one would have to resort to more complex procedures of matching children with their PCG characteristics, though the task is not necessarily impossible.)

To match the data in the PSID to the data in the CDS properly, one has to first determine who the child's PCG is, and then devise a way to assign employment and education information to this person using the head/wife employment and education data from the core. This can be done using a variable indicating the PCG's relationship to the child. When choosing this route, the matching is most easily done if one restricts the analysis to instances in which (1) the PCG is a parent of the child and (2) the head/wife of the family unit in which the child resides is also a parent of the child. To illustrate this principle, we focus on PCGs who are moms in this tutorial. Doing so will be sufficient to help one understand how to attach employment and educational characteristics to PCGs, and it will give us a sample of children and PCGs that is quite adequate to work with. (Over 95% of all observations in the CDS are ones in which the mother is the primary caregiver, so we will not be losing much information - or many observations - by ignoring the few instances in which dads or others are primary caregivers.) Again, because of the way that the information on education levels and employment is reported in the PSID, it is easiest for our purposes to ignore the instances in which the PCG is someone other than the mom (such as a grandparent or some other relative) because we have to indirectly gather the information about the PCG characteristics of interest from the variables reporting information about (female) heads and wives of the family in which the CDS child resides. If we restrict ourselves to cases in which the PCG is the mom it becomes easy to ascribe information about family heads or wives to the PCG using the child's relationship to PCG variable available in the CDS in combination with the child's relationship to head variable provided in the PSID core. 
Here it is probably worth noting that the CDS contains two "types" of variables describing a child's relationship to his or her PCG. There is a variable denoting the "role" relationship to the PCG, and another providing detailed information about the actual relationship of the PCG to the child (such as whether the PCG is the biological mom, an adoptive mom, a step-mom, et cetera). The role relationship variable (named "Role Relation PCG") contains less information than the second version of the relationship to PCG variable (named "RELPCGxx"--where the "xx" stands in for the last two digits of the relevant CDS survey year). We mention the first "type" of relationship to PCG variable only because we want you to be aware that there are two different variables capturing a given child's relationship to his PCG (so that you don't mistakenly pick the wrong one when we begin selecting our variables).

Note that a second route to take when trying to link the data in the PSID with the CDS data is to use something called a "Map" file. The Map file is a CDS file that the Data Center automatically sends you that reports, for each CDS year, the 1968 id number and Person Number for the child, for the PCG and for the OCG. The PCG's 1968 id number and person number can be used to construct a variable that uniquely identifies each PCG. This variable can then be used to merge characteristics about the PCG into a CDS dataset. We will talk more about this process later in the tutorial (at the end of section II D).

The sex of the head variable and the marital status variable: The first piece of data (sex of the household head) can be used to help assign the head and wife information taken from the core. The marital status variables can be used as a check on your work.

CH02PRWT: Why do we need to select weights as instructed above from the CDS demographic file? Like the PSID, the CDS sample is representative of the United States. However, because of the sample design and the long evolution of the PSID (from which the CDS families are taken), as well as the CDS sample selection, weights are needed to allow for the fact that certain groups of individuals are over- or underrepresented relative to the U.S. population. (See the following website for a discussion of the sample design: http://psidonline.isr.umich.edu/Guide/Overview.html ) The sample weights enable researchers to conduct nationally representative analyses of children and their conditions. Furthermore, because the CDS only interviewed a maximum of two children per family (by random assignment if more than two children under the age of 12 were present in the family), while actual family sizes could be smaller or larger, some adjustment needed to be made to ensure that children from certain types of families were not over- or underrepresented relative to the population at large. The CDS demographic file offers a number of weight variables, partly because the unit of analysis can be the child or the child's family (requiring individual level weights in the first instance and family level weights in the second), and because there are some differences across the CDS modules in response rates. Detailed information about CDS sampling and weights can be found at http://psidonline.isr.umich.edu/CDS/weightsdoc.html

The weight variable that we want to work with is CH02PRWT because our unit of analysis is the child, and more specifically his or her behavior in 2002. This means we will be posing our research questions from the perspective of a child--offering information on the percentage of children whose PCGs are in the labor force, for example, rather than structuring the study of employment in terms of the overall percent of moms who are employed. Both approaches can provide interesting insights. The focus of this tutorial is on children, however, and to keep things simple we will treat the child as the unit of analysis throughout. Happily, this offers us the advantage of only having to work with one set of weights (and it leaves something for the really interested user to do on his/her own!).
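To make the child-as-unit-of-analysis idea concrete, here is a minimal SAS sketch of a child-weighted percentage. Only CH02PRWT comes from the tutorial's variable list; the dataset name, the child_id variable, the 0/1 flag pcg_works, and all of the values are invented for illustration.

```sas
/* Toy data: three hypothetical children.
   pcg_works is a made-up flag (1 = PCG in the labor force);
   the CH02PRWT values are invented, not real CDS weights. */
data kids;
  input child_id pcg_works CH02PRWT;
  datalines;
1 1 10.5
2 0 4.0
3 1 5.5
;
run;

/* The WEIGHT statement makes each child count in proportion to
   CH02PRWT, so the percentages describe children, not moms. */
proc freq data=kids;
  tables pcg_works;
  weight CH02PRWT;
run;
```

With these invented numbers the unweighted share of children with a working PCG is 2 out of 3, while the weighted share is 16/20, which is why the choice of weight matters.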

Onward!: Having reminded ourselves of the variables we want to examine, we now want to remind ourselves of our goal. We want to create a subset of data for a file to be used for studying the following question: How is moms' work, particularly persistent work, associated with or related to child well-being, as measured by the Behavior Problems Index? Admittedly, the full process through which these connections could occur is complex and in this tutorial we are really in exploratory mode. For example, if those working long hours in the market are found to have fewer (or more) behavior problems with their young children, a more in-depth study using other CDS and Core data would be warranted. Those who see problems arising with their children may reduce their work hours to compensate for the deficit (Griliches, 1979). This response could include the paths and timing of work and childcare, or possible connections to welfare benefit eligibility, analysis of which we leave for a more ambitious research project. Similarly, an interest in the reasons for variation in labor force participation among those having young children may prompt one to devise a more sophisticated research project: one that models the labor force participation decision more precisely, for example, or one that examines the bargaining processes involved in household decision making when spouses are required to make coordinated decisions about labor force activity.

D. Using the Data Center to Construct a Customized Dataset

How will we get the data that we need to conduct our exercise? The process is mechanically easy but requires some thought. Let's start with an example in which we restrict the focus to cases in which the PCG is a biological mom. To do so we will have to examine the values that the RELPCG97 and RELPCG02 variables take on, and include only the CDS child records for which these variables take on a value of 1. The variable "RELPCGxx" reports responses to a question about who the child's primary caregiver is. The responses are coded as follows: 1 for biological mother, 2 for step-mother, 3 for adoptive mother, 4 for biological father, 5 for stepdad, 6 for adoptive dad, 7 for grandmother, 8 for grandfather, and 9 for female partner of the child's other caregiver ("OCG"). This variable presents an example of a categorical variable. Categorical variables are those that take on numerical values, but where the values simply represent index categories. Accordingly, one would not want to say that an adoptive mother has 3 times greater value than a biological mom. The numbers assigned to categorical variables are not to be interpreted in the same way that the numbers assigned to a numeric variable are. (For an income variable, for example, three dollars of income is indeed 3 times as great as 1 dollar of income.)
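One way to keep the categorical nature of RELPCGxx in view is to attach labels to the codes rather than treating them as quantities. The sketch below uses PROC FORMAT with the nine codes listed above; the format name relpcg, the toy dataset, and the catch-all "other" label are our own assumptions (the codebook remains the authority on any additional codes).

```sas
/* Labels for the RELPCGxx codes described in the text.
   The format name and the OTHER category are illustrative choices. */
proc format;
  value relpcg
    1 = "Biological mother"
    2 = "Step-mother"
    3 = "Adoptive mother"
    4 = "Biological father"
    5 = "Stepdad"
    6 = "Adoptive dad"
    7 = "Grandmother"
    8 = "Grandfather"
    9 = "Female partner of OCG"
    other = "Other or missing (see codebook)";
run;

/* Toy data with invented values */
data kids;
  input child_id RELPCG02;
  format RELPCG02 relpcg.;
  datalines;
1 1
2 7
3 1
;
run;

/* Frequencies now print category labels instead of bare numbers */
proc freq data=kids;
  tables RELPCG02;
run;
```

Tabulating the variable this way, rather than computing a mean of it, respects the fact that the codes are only category indexes.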

We also need to think about the coding of the relationship to head variables (ER33703 in 2003) in order to assign information to the PCGs. This variable is also a categorical variable. It takes on a range of values, each denoting a specific relationship between the child and the head of his or her family. In addition to limiting ourselves to cases in which the PCG variable takes on a value of 1, we need to limit ER33703 to cases in which it takes on a value of 30 (child of head, natural), 33 (stepchild of head), 35 (stepchild of 'wife'/cohabitant), or 83 (child of boy (male) friend/girl (female) friend), only. For our tutorial these represent instances in which the head is a parent to the child (rather than being a grandparent or an uncle or some undisclosed relative). This means that these are instances in which the information that is presented about the head’s educational and employment characteristics, or the wife’s educational and employment characteristics, can legitimately be used for information about the PCG. (This implies that we will be ignoring the few instances in which heads are grandparents, aunts, uncles, and other relatives.) For additional information on the other values that this variable could take on, see the documentation for the demographic file. It is available at http://psidonline.isr.umich.edu/CDS/Codebooks/Default.aspx

What other considerations are relevant? We also will use the information about the sex of head of the family in which the child resided in 2003. Why? If the sex of head is male, then the characteristics of the PCG will have to come from information covering the wife of the family (since we know that all of our PCGs are moms and that moms are necessarily female). This means we will find out about the PCG’s hours worked, educational level, and her employment status from the WFAVG97, ER24089, UPEDU97W, ER10563 and ER21373 variables. Alternatively, if the sex of the head is female, then we know to take the employment and educational information from the variables describing the household head (HDAVG97, ER24078, UPEDU97H, ER10081 and ER21123) in order to get the information that we want about primary caregivers. The issue of the head/wife data structure is covered more fully in Tutorial 1 http://psidonline.isr.umich.edu/Guide/tutorials/Default.aspx .

The need to adhere to these requirements presents a perfect opportunity to employ the Data Center’s subsetting capacity. When we retrieve the data from the Data Center, we will not only list the variables that we are interested in, but also limit the sample selection to cases where several variables are in specific value ranges to define the subset of observations we have just discussed.

One caveat: There is one weakness to using the approach that relies on the child relationship variables to match data from the CDS to the core. In instances in which the PSID and the CDS data were not collected in the same year (as is the case for CDS II for example), it is (theoretically) possible for a male head of household to have divorced and re-married within the year. If this male head has custody of his child, then we could get a situation in which the child's case satisfies the biological mom requirement for 2002 (in CDS II), with the child also being the child of the head in 2003 (so that his relationship to head code = 30), but the PSID wife information (from 2003 PSID family file) will not actually pertain to the child's mom. Such a quandary does not occur for CDSI since the survey was conducted in the same year as the PSID. Later we will explain how the Map file helps if one is facing this type of issue.

E. Selecting Your Variables at the Data Center

The Data Center is the place to select the variables that you need. In total, we want to select 22 variables from the Data Center, drawing from two different CDS modules (the Demographic module and the Primary Caregiver-Child module), along with data from the PSID's Family-level files and its Individual-level files. As noted above, because of different participation rates in these modules there will be 'absent data'. Remember, you can find the Data Center at the following web address: http://simba.isr.umich.edu.

1. Selecting our CDS variables

Armed with new knowledge and our 'shopping list' of variables, we can (FINALLY!) start by highlighting and thereby selecting the modules from which our data will be taken. To begin, click on the "By File" option (shown in Screenshot 1 in Part II-B above). This takes you to a screen listing the different broad categories of PSID and CDS files. As noted in previous tutorials (such as Tutorial 1 and Tutorial 3), the PSID collects mostly family-level data, but there is also limited data stored in individual-level files so we have both "PSID Family-level" and "PSID Individual-level" listed as options here (as shown in Screenshot 5 below). The screen also presents the option of selecting CDS files--either exclusively from the Time Diaries, or from the CDS inclusive of them. If you click on the "+" next to any of the options, you will get an expanded list of files that you can search from. For example, expanding the list under CDS reveals the Demographic file, the Primary Caregiver Child File, and a number of other CDS modules (as shown in Screenshot 6.) Clicking again expands your choice list even further to reveal the years for which there are data. (See screenshot 7 for example.) If you click on the "+" sign next to your year of interest, the full list of variables available appears.

Screenshot 5: Selecting variables--"By file" options

Screenshot 6: Selecting CDS and PSID sub-files or modules

Screenshot 7: Selecting Data categories by year

Since we need the BPI index for 2002, let's click on the "+" next to 2002 under the Primary Caregiver Child File. Next scroll down through the list until you locate our variable of interest (BPI_T02). Then select it by highlighting it by clicking on the mouse. Screenshot 8 (below) shows what you should see at this point. Next you want to select Q21IWAGE from this same module. (Remember that in order to select non-contiguous variables you must hold down on the ctrl key as you highlight your variables of interest.) After you have highlighted both variables, click on the "+" sign next to the Demographic File, and select 2002 here as well. From this variable list we want to select the CH02PRWT and RELPCG02 variables. Highlight them both. (To highlight non-contiguous variables, hold down on the control key after having selected an initial file, and then use your mouse to move down to the next variable of interest and click on it.) Then expand the box for 1997 so that you can select the relationship to PCG variable for 1997 (RELPCG97). NOW A VERY IMPORTANT STEP: We want to instruct the Data Center to put these 'shopping list' and companion variables in a data cart for us ("data cart" is PSID parlance for a shopping cart that stores our selected variables), so you need to click "Add to cart" in order to complete the process. (Simply highlighting the variables you want is not sufficient to actually obtain them.)

Screenshot 8: Selecting Variables from a CDS Module

Before moving on to select additional variables, you may want to verify that the variables you selected are actually in your data cart. To do this, click on the box mentioning variables added to your cart. This will allow you to see the list of variables that you presently have selected to confirm that you have the variables we wanted (BPI_T02, Q21IWAGE, RELPCG97, RELPCG02, and CH02PRWT). If it turns out that you are missing one, you can always go back and add variables to your existing data cart. We will do so in a minute. (For the curious, a brief word about some of the variables that appear in your shopping list or data cart that you did not select. The Data Center will automatically give you some key variables that are important for identification and linking purposes in the PSID. This is the reason you see the 1968 family id number and person number on your list for example. We will not get into a lengthy discussion of those types of variables here. We mention them only so that you do not panic when you see that you actually have more than the 5 variables that you manually selected.)

2. Adding variables from the PSID core

Now we need to select the variables from our shopping list (in Table 1) that come from the PSID core. It is simple to add new variables to the data cart that we already have started to create. All you have to do is to drag your cursor to the upper-leftmost portion of the screen to the phrase "Data Center." A list of options will materialize and you want to position your cursor next to the phrase "variable selection," and to then choose "by file." This will take you back to the screen that lists the different PSID files that one can choose from (the one in screenshot 5). Now, because we want to select variables from the PSID, we start by clicking on the "+" sign next to "PSID Family-level." Next click on the "+" sign next to "Main Family-level Data." You will then see an expanded list of options as shown below in Screenshot 9. Now all you need to do is to take steps similar to what you did when selecting the CDS variables from the different CDS modules. You click on the "+" sign next to the year of interest and the full list of variables available for that year comes up. Then you select your variables by highlighting them. After you have finished, you hit "Add to cart" to make sure the variables get added to your data cart. Let's do this for the following variables on our list from the 1997 survey year: ER10010 (sex of household head), ER10016 (head's marital status), HDAVG97, WFAVG97, UPEDU97H, UPEDU97W. Let's also do it to select the following variables that our list (Table 1) says that we need for 2003: ER21018 (sex of household head), ER21023 (head's marital status), ER24078 (head's work hours per week), ER24089 (wife's work hours per week). (Reminder: To select non-contiguous variables from a given file you have to hold down on the control key while you select the variables.)

Screenshot 9: Selecting Data Categories by Year from the PSID Family files

After you have hit the "Add to cart" button you are returned to the screen that lists the different files in the PSID. But, wait! We're not done selecting variables. We now need to choose some variables from the PSID Individual files, so click on the "+" sign next to the phrase "PSID Individual-level." This will expand the list of options available. Click on the "+" sign next to "PSID Individual Data by Years" and the full range of years from which you can select data appears. At this stage you need to select the following variables from the 1997 and 2003 individual files: (a) ER33403 and ER33406 from 1997, and (b) ER33703 from 2003. Remember to hit "Add to cart" when you are done.

Now you are almost done. There are only 4 more variables to select before you can begin your analysis. What's left? The variables revealing the employment status of the head and wife of the household for 1997 and 2003. Technically, you can select these using the same steps you used above to get the other variables of interest from the PSID family files. However, the tutorial is going to use these last 4 variables to show you how you can select variables using the PSID's cross year index. So, what you need to do now is to move your cursor up to the phrase "Data Center" in the upper-left portion of the screen, then move it down to the phrase "variable selection" that appears, and over to "By Index." This should lead you to a screen like the one shown below (Screenshot 10).

Screenshot 10: Selecting Variables using the Cross Year Index

We are going to use the Family Data Index--which is the cross-year index for family-level data (as noted in Tutorial 3) to select our 4 variables. So, click on the "+" sign next to that phrase in order to expand the list of options. You should see a screen such as the one below (Screenshot 11).

Screenshot 11: Cross-Year Data Index---Family Data Index example

We are looking for data covering employment, so scroll down until you see that word. If you click on the "+" sign next to the word "employment," you will see that there are many employment-related variables in the PSID. We already know that we want the variables recording heads' and wives' employment status so scroll down until you see "employment status--current" (which is the next to last option). Clicking on the "+" sign next to this phrase reveals yet another list of options. What's important about this list is that it not only tells you what variables are available, but the years in which they are available. This is handy because it allows you to select the employment data for as many years as you want in one step (without having to call up the separate family files for different years). For example, we want information about the head's employment status--and for 1997 and 2003 we see we can obtain this information in the form of a "Head 1st mention" variable. Let's click on the boxes for the years 1997 and 2003 to the far right of the variable name. Do the same for "Wife 1st mention" and then scroll back up to the top of the page and click "Add to Cart." Congratulations! You have now added the last 4 variables that you need for your analysis to your data cart.

To double-check, and to confirm that you have all the variables from Table 1, you probably want to click on the phrase "variables added to your cart" at this point, so that the full list of variables that you have chosen appears. You can expand the list to see each individual variable. (Screenshot 12 gives you a glimpse of what your screen should look like.) Then (if the list is complete), hit "checkout" and you will be taken to a screen where you can enter your login information so that you can go to a final screen to instruct the Data Center to compile your dataset for you.

Screenshot 12: "Variables added to your cart"--the expanded list

Screenshot 13 (below) depicts the box that appears after you login. This is the screen that allows you to tell the Data Center exactly how you want your data delivered to you, and whether you want a codebook (and what format it should be in). While many of the other tutorials have used Excel for calculations, here we recommend using SAS. Accordingly, you would want to tell the Data Center that you would like ASCII data with SAS statements. The Data Center will then send you two files--one with the data, and another that represents a sample SAS program that will allow SAS to read your dataset. Note that we have also entered a subsetting command in the subsetting box. This allows us to restrict the observations that are delivered to us. We want CDS kids, but because we are interested in connecting the information about children to information about heads and wives from the PSID core, we instruct the Data Center to send us only the records for children whose PCG is their biological mom (and children who reside in their parents' household as indicated by the relationship to head variable--an issue we discussed earlier in the section about how we would match PCG information to data from the PSID core, section II-C). Also we want children for whom a BPI score is available. The relevant subsetting command is,

RELPCG02 =1 and (ER33703 = 30 or ER33703 = 33 or ER33703 = 35 or ER33703 = 83) and RELPCG97 = 1 and BPI_T02 ne 99
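If you prefer to download the full set of CDS children and apply the same restriction yourself, the subsetting command translates directly into a SAS WHERE statement. This is only a sketch: the dataset names cds_all and cds_subset, the child_id variable, and every data value below are invented, and the value 99 is screened out of BPI_T02 simply because the subsetting command above does so.

```sas
/* Toy data with invented values */
data cds_all;
  input child_id RELPCG97 RELPCG02 ER33703 BPI_T02;
  datalines;
1 1 1 30 8
2 1 7 60 5
3 1 1 33 99
4 1 1 35 12
;
run;

/* Same conditions as the Data Center subsetting command */
data cds_subset;
  set cds_all;
  where RELPCG02 = 1
    and ER33703 in (30, 33, 35, 83)
    and RELPCG97 = 1
    and BPI_T02 ne 99;
run;

proc print data=cds_subset;
run;
```

In this toy example only children 1 and 4 survive: child 2 has a grandmother as PCG in 2002, and child 3 has no usable BPI score.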

Finally, we have highlighted the circle next to "CDS kids only" (because we don't want records for individuals in the PSID who are not kids). Additionally, we have told the Data Center to send our data via an e-mail message and attachment. You may also want to instruct the Data Center to compress the files to speed up delivery. You can then use WinZip or a similar program to unzip them. (And, you may want to enter a name for your data cart. Doing this allows you to easily identify the data cart if you decide you want to go back and re-do this analysis at a later date. The Data Center will store the data cart for you and you can log on to retrieve another copy any time. In our example, we named the data cart "my_tutorial4_data," but you could make up any name you like!)

Screenshot 13: Output Options

III. Using SAS with Your Customized Dataset

Here we offer some comments about steps you can take to work with your data, so that you can answer the research question posed at the outset of the tutorial.

(1) First note that the Data Center will send you a SAS starter program that will read the ASCII data into SAS (after you specify the location of the data: Did you download it to your c:\ drive, d: drive, e: drive? What folder did you put it in?) You will need to supplement this program with SAS code that creates a composite PCG work hours variable from the head and wife work hours information. You also will have to write code that instructs SAS to analyze the relationship between work hours and BPI scores (by computing mean PCG work hours for kids with high BPI scores and the mean for those with low BPI scores). You also may need to write SAS code to re-code some of the variables. (Here the codebook that the Data Center sends you will come in handy as you work to decide if and when this needs to be done.)
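As a rough illustration of the comparison just described, the sketch below computes weighted mean work hours for children with high and low BPI scores. Several things here are assumptions rather than part of the tutorial's files: mom_wkhrs02 stands for the composite PCG hours variable that you construct yourself (see step 3), the bpi_high flag and its cutoff of 10 are arbitrary, and all data values are invented.

```sas
/* Toy data with invented values */
data kids;
  input child_id BPI_T02 mom_wkhrs02 CH02PRWT;
  /* Hypothetical grouping: the cutoff of 10 is arbitrary.
     Note that a missing BPI_T02 would be classed as 0 here. */
  bpi_high = (BPI_T02 >= 10);
  datalines;
1 4 40 10.5
2 15 20 4.0
3 12 35 5.5
4 6 0 8.0
;
run;

/* Weighted mean PCG hours, separately for low- and high-BPI children */
proc means data=kids mean n;
  class bpi_high;
  var mom_wkhrs02;
  weight CH02PRWT;
run;
```

In practice you would choose the cutoff from the BPI documentation or from the distribution in your own sample, and check the codebook for any special codes in the hours variable before averaging.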

(2) The Appendix at the very end of this tutorial provides a sense of what your data set actually looks like. Appendix-A presents a SAS-Assist® display of the first seven CDS observations, i.e., data for the first few children in your dataset. You can see that the PCGs are all biological moms as required in our subsetting command (the code for the child's relationship to the PCG in each year = 1). Note that our dataset also contains information about "other" caregivers (called the "OCG"). In these first few rows from your dataset you can see that there are children whose OCG information changed from 1997 (CDS-I) to 2002 (CDS-II). The first observation in the dataset, for example, represents a child whose OCG in 1997 was a code 7 (a grandmother), but the OCG in 2002 was code 0 (no OCG).

(3) To create your PCG work hours variable(s) you have to be able to assign the information about head and wife work hours properly, so that your PCG work hours variable captures the work hours of the right person in the household. Simple code such as the following should help.

Table 2. Sample SAS code that can be used to construct the PCG work hours variable

/* now define pcg wk hrs variables for 02 */

data test2;
set test;

length mom_wkhrs02 mom_job02 8;
label mom_wkhrs02 = "mom wk hrs per wk in 02";
label mom_job02 = "mom empl status 02";

if ER33703 = 30 and er21018 = 1 then mom_wkhrs02 = er24089;
else if ER33703 = 30 and er21018 = 2 then mom_wkhrs02 = er24078;
/* male head and child of the wife/"wife": mom is the wife, so use the wife's hours (er24089) */
else if (ER33703 = 33 or ER33703 = 35 or ER33703 = 83) and er21018 = 1 then mom_wkhrs02 = er24089;
else mom_wkhrs02 = .;

if ER33703 = 30 and er21018 = 1 then mom_job02 = er21373;
else if ER33703 = 30 and er21018 = 2 then mom_job02 = er21123;
else if (ER33703 = 33 or ER33703 = 35 or ER33703 = 83) and er21018 = 1 then mom_job02 = er21373;
else mom_job02 = .;

run;

Remember, as discussed in earlier sections of this tutorial, the logic that we are using here is that if the head of the household in which the child resides is female then the head must be the mom, since we have restricted ourselves to CDS cases in which the child's relationship to head is child of the head or wife/"wife" (a theme discussed in Tutorial 1). If the household head is male, then the wife/"wife" must be the mom, which means that the PCG is the wife.
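The same head/wife logic can be applied to the 1997 variables. The sketch below is ours, not part of the Data Center output, and it rests on two assumptions that you should confirm in your codebook: that ER33403 is the 1997 relationship-to-head variable with the same codes as ER33703, and that ER10010 is coded 1 for a male head and 2 for a female head, as ER21018 is in Table 2. The dataset names, child_id, and all values are invented.

```sas
/* Toy data with invented values */
data test97;
  input child_id ER33403 ER10010 HDAVG97 WFAVG97 UPEDU97H UPEDU97W;
  datalines;
1 30 1 45 20 12 14
2 30 2 38 0 16 0
3 33 1 50 30 12 13
;
run;

data test97b;
  set test97;
  length mom_wkhrs97 mom_edu97 8;
  label mom_wkhrs97 = "mom wk hrs per wk in 97"
        mom_edu97   = "mom yrs of schooling 97";

  /* male head: the wife/"wife" is the mom */
  if ER33403 in (30, 33, 35, 83) and ER10010 = 1 then do;
    mom_wkhrs97 = WFAVG97;
    mom_edu97   = UPEDU97W;
  end;
  /* female head with her own child: the head is the mom */
  else if ER33403 = 30 and ER10010 = 2 then do;
    mom_wkhrs97 = HDAVG97;
    mom_edu97   = UPEDU97H;
  end;
  /* anything else cannot be assigned from head/wife data */
  else do;
    mom_wkhrs97 = .;
    mom_edu97   = .;
  end;
run;
```

Grouping the assignments in DO blocks keeps the hours and education variables tied to the same person, which makes it harder for the two to drift out of step.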

(4) The Map file:

You will notice that, in addition to sending you the customized dataset that you created, the Data Center sends you two items not chosen when you selected your data cart - a SAS file with an "M" prefix before the job number, and an ASCII (.txt) file with a similar "M" prefix. This is the famous CDS "Map" file. The Map file contains variables that can be used to construct a unique identifier for the PCG and for the OCG of every CDS child. Generally, it is possible to create a single variable or i.d. (or 'primary key' if you're thinking in relational data terms) that uniquely identifies any individual in the PSID by combining something called the individual's 1968 Interview Number ("68 ID") with his or her "person number" (PN). The map file delivers this information for the PCGs and OCGs of the CDS children. (See Appendix C for an illustration.) To create a single variable that uniquely identifies each PCG, you want to multiply the "PCGID_xx" variable by 1000 and to add "PCGPN_xx" to it. (Make sure you use the same year's worth of info when you do this--so PCGID_02 goes with PCGPN_02 for example.)
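In SAS the identifier takes one assignment statement. The sketch below uses the PCGID_02 and PCGPN_02 names given above; the dataset names, the new variable name pcg_key02, and the data values are invented for illustration.

```sas
/* Toy Map-file rows with invented values;
   the third row mimics a child with no PCG record in 2002 */
data map;
  input PCGID_02 PCGPN_02;
  datalines;
4 190
1234 3
. .
;
run;

data map_ids;
  set map;
  /* 68 ID * 1000 + person number; left missing if either part is missing */
  if PCGID_02 ne . and PCGPN_02 ne . then
    pcg_key02 = PCGID_02 * 1000 + PCGPN_02;
run;

proc print data=map_ids;
run;
```

The first row yields 4190 and the second 1234003, while the third stays missing rather than producing a misleading key.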

How can you think about the unique identifier variable that results when you take the aforementioned steps? As noted earlier, the PSID collects information for families, and many of the individuals within the families. The "68 id" variable is like a person's last name while the "PN" number is like a first name. So, if we have someone who is either from or a descendent of the hypothetical 'Hurst' family that was interviewed when the PSID started back in 1968, the '68 ID used in conjunction with the PN number tells us whether the person is Sally Hurst, or William Hurst, or some other Hurst person as defined by birth or adoption.

How might this unique identifier variable be helpful? You can create a similar variable for each CDS child, by using the ER30001 and ER30002 variables contained in the Map file. This would enable you to merge the single PCG identifier into your customized dataset of CDS children (merging on a child identifier variable that you create in a similar way), and to then use the PCG identifier variable to match information taken from the PSID core to the PCG.

So, for example, we might go to the Data Center and retrieve all our desired variables from both waves of the CDS along with our data from the PSID 1997 family file, using the restriction that the PCG be the mom. Then we could return to the Data Center to put together a dataset of females and their work hours in 1997 (similar to what was done in Tutorial 1). And, then we would merge the info about the women's work hours onto our CDS-based dataset using the newly created PCG identifier variable to do the merge. That would match up information about mom's work hours precisely: Any biological mom who was a PCG in CDS-II (in 2002) but who was no longer present in the family in 2003 would NOT have any work information reported about her in 2003. And, any woman who was a wife in 2003 but not the biological mom PCG for 2002 would not have her work hours data matched to the child.
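A minimal sketch of that merge follows. It assumes you have already built a PCG key in the CDS file and a matching person key in the file of women, each as 68 ID times 1000 plus person number. The dataset names, the variables child_key, person_key, and hours03, and all values are hypothetical.

```sas
/* CDS children with their PCG key (invented values) */
data kids;
  input child_key pcg_key02 BPI_T02;
  datalines;
4030 4002 8
5031 5002 12
;
run;

/* Core records for women (invented values) */
data women;
  input person_key hours03;
  datalines;
4002 40
6002 25
;
run;

proc sort data=kids;  by pcg_key02;  run;
proc sort data=women; by person_key; run;

/* Keep every CDS child; attach hours only where the keys match */
data kids_linked;
  merge kids  (in=in_kids)
        women (rename=(person_key=pcg_key02));
  by pcg_key02;
  if in_kids;
run;
```

Here the first child picks up 40 hours from her PCG's record, the second child's hours stay missing because her PCG is not in the women's file, and the unmatched woman is dropped. Because siblings can share a PCG, this is a one-to-many match on the women's side, which the DATA step MERGE handles as long as each woman appears only once.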

Summarizing, the Map file provides a fool-proof way to match PCG information taken from the PSID core to a customized CDS dataset. (For example, in Appendix B you can see that the potential problem that we alluded to in Section II-C in the discussion of the use of child relationship variables when linking CDS-II data to data from the 2003 PSID does indeed materialize in the sample exercise presented in this tutorial: there are a few cases in which the wife of the 2003 family turns out to not be the mom who was the PCG in 2002--even though we restricted the child relationship variables specifically so that we would only pull cases where the PCG is the biological mom and the CDS child is a child of the head or the wife/"wife". We can tell that they are different people by comparing their unique identifiers or primary key.)

Additionally, if you are interested in analyzing PCGs whose relationships to their child extend beyond the simple biological mom category, you will be best off using the Map file approach to link info from the PSID core to the CDS children. (For example, if the PCG is the grandmother, but the child lives with her dad, it will not be possible to get information about the grandmother's characteristics from the Head or Wife variables from the family in which the child resides, since the grandmother won't be the head or wife of that family. However, the grandmother might very well be head or wife of a different family and at different time points (current or prior years) in the PSID, since the grandparent family might very well be an original sample member. So information about the grandmother would be in the PSID, and the PCG identifier variables would help you find her.)

Finally, for those who are interested, Appendix C presents the first 12 observations of the Map file to help you visualize what we have said about it. This screen underscores our earlier observation that a child may have an OCG in one period but not the next. For example, the 6th observation represents a child whose OCG in 1997 had a '68 id number of 4 and a PN number of 190, making the OCG's unique identifier variable 4190 (assuming you were to construct such a variable as we described above). Appendix C also illustrates a general principle about CDS children: Generally speaking, a child's PCG might change over time, or a child who was part of CDS I might not appear in CDS II. (Both of these points were alluded to earlier in this tutorial.) Because the Map file that is sent to users contains records for all CDS children, we can see in Appendix C that there are some children (i.e., rows or observations) for which there are numbers in the columns for the PCG '68 ids and PN numbers for 1997, but where there is "." in each respective column for 2002. The first observation in the dataset is a good example. (Note that the Map file contains more observations than the customized dataset that you created, since the Map file contains records for all CDS children while your customized dataset only contains children whose PCG is the biological mom. However, the case count in the Map file should not pose a problem if you decide to use the Map file when doing your analysis, since you will merge the information from it into your customized dataset and any Map file cases for which there is no corresponding child in your customized dataset can be automatically dropped.)
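As a rough illustration of these last points, the Python sketch below reads a few Map-file-style rows in which "." marks a missing value, builds the identifiers, and then keeps only the rows that correspond to children in a customized dataset. The column layout, the rows, and the set of child identifiers are all hypothetical, and the identifier formula ('68 id times 1000 plus person number) is assumed from the 4190 example above.

```python
def parse_value(text):
    """Return an int, or None when the field holds the missing marker '.'."""
    text = text.strip()
    return None if text == "." else int(text)


def person_id(id68, pn):
    """Build the identifier, propagating missing values as None."""
    if id68 is None or pn is None:
        return None
    return id68 * 1000 + pn


# Hypothetical Map file extract, columns in this order:
# child '68 id, child PN, PCG '68 id 1997, PCG PN 1997,
# PCG '68 id 2002, PCG PN 2002
raw_map = [
    ("4", "30", "4", "2", ".", "."),      # in CDS I, no PCG data for 2002
    ("7", "31", "7", "2", "7", "2"),
    ("9", "32", "9", "171", "9", "171"),
]

map_records = []
for child68, childpn, pcg68_97, pcgpn_97, pcg68_02, pcgpn_02 in raw_map:
    map_records.append({
        "child_id": person_id(parse_value(child68), parse_value(childpn)),
        "pcg_id_1997": person_id(parse_value(pcg68_97), parse_value(pcgpn_97)),
        "pcg_id_2002": person_id(parse_value(pcg68_02), parse_value(pcgpn_02)),
    })

# Hypothetical identifiers of the children in the customized dataset.
custom_child_ids = {7031, 9032}

# Drop Map file records with no corresponding child in that dataset.
kept = [r for r in map_records if r["child_id"] in custom_child_ids]
```

The first row keeps its 1997 PCG identifier but has a missing 2002 identifier, and it is dropped at the final step because that child is not in the customized dataset.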

IV. References

Becker, Gary S., A Treatise on the Family, Cambridge, Massachusetts: Harvard University Press, 1981.

Bureau of Labor Statistics' website (www.bls.gov), specifically "Table 5. Employment Status of the Population by Sex, Marital Status, and Presence and Age of Own Children under 18, 2000-2001 Annual Averages," available on-line at www.bls.gov/news.release/famee.t05.htm (or see the BLS' on-line Glossary at http://www.bls.gov/bls/glossary.htm)

Davis, Matthew, McGonagle, Schoeni and Stafford (under review), on the BMI of CDS children, their parents, and their grandparents.

Griliches, Zvi, "Sibling Models and Data in Economics: Beginnings of a Survey," Journal of Political Economy, October 1979, Vol. 87, No. 5, Part 2, pp. S37-S64.


Gustafsson, Siv and Frank P. Stafford, "Three Regimes of Childcare: the United States, the Netherlands, and Sweden," in Social Protection and Economic Flexibility (Rebecca Blank, ed.), National Bureau of Economic Research, University of Chicago Press, 1994.

Hallberg, Daniel and N. Anders Klevmarken, "Time for Children: A Study of Parents’ Time Allocation," Journal of Population Economics, 2003, Vol. 16, No. 2, pp. 205-226.

Heyns, Barbara and Sophia Catsambis, "Mother's Employment and Children's Achievement: A Critique," Sociology of Education, July 1986, Volume 59, Issue 3, pp. 140-151.

Hofferth, Sandra L., "Caring for Children at the Poverty Line," Children and Youth Services Review, 1995, Vol. 17, No. 1-3, pp. 61-90.

Johnson, Rucker and Mary Corcoran, "Welfare Recipient's Road to Economic Self-Sufficiency: Job Quality and Job Transition Patterns Post-PRWORA," unpublished manuscript, University of Michigan Poverty Research and Training Center, March 2002.

Kim, M. K. and Solomon W. Polachek, "Panel Estimates of Male-Female Earnings Functions," Journal of Human Resources, Vol. 29, 1994, pp. 406-428.

Parcel, Toby L. and Elizabeth G. Menaghan, "Early Parental Work, Family Social Capital, and Early Childhood Outcomes," American Journal of Sociology, Vol. 99, Issue 4, January 1994, pp. 972-1009.

Peterson, J.L. and N. Zill, "Marital Disruption, Parent-Child Relationships, and Behavior Problems in Children," Journal of Marriage and the Family, May 1986, Vol. 48, No. 2, pp. 295-307.

Stafford, Frank P. and Marianne Sundström, "Time Out for Childcare: Signaling and Earnings Rebound Effects for Men and Women," Labour, Vol. 10, No. 3, Autumn, 1996, pp. 609-629.

V. Appendix A -- The first seven records in your customized dataset

VI. Appendix B -- The few cases of incorrect assignment that occur when child relationship variables are used to link information from CDS-II to the PSID core

VII. Appendix C -- The first 12 observations in the Map file
