----------------------------------------------------------------------------- ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- CODEBOOK FOR 1998 SURVEY OF CONSUMER FINANCES ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- Survey of Consumer Finances Board of Governors of the Federal Reserve System Mail Stop 153 Washington, DC 20551 ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- To: Users of the 1998 SCF From: Arthur Kennickell, SCF Project Director Date: February 15, 2000 Subject: Description of the processed version of the 1998 SCF ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- WARNING: This codebook contains over 78,000 lines of text, including this introduction, variable descriptions, the program used to collect the survey data, and other material. Most users will probably NOT want to print the entire document. Generally, we work with the file in electronic form. This codebook serves as the principal guide to the variables included on the final public version (February 1, 2000 version) of the 1998 SCF dataset. However, not every variable included in this codebook is actually in the public use dataset. Among other things, the dataset does NOT include most variables related to the sample design, details of geography, or the 3-digit industry and occupation codes. Although we have attempted to mark the variables in the codebook that are not available to the public, there may be errors or omissions. 
The definitive list of the variables included is given at the end of this file. Please consult that list to determine whether a given variable is available to you. For a general overview of the 1998 SCF, see Arthur B. Kennickell, Martha Starr-McCluer, and Brian J. Surette "Recent Changes in U.S. Family Finances: Results from the 1998 Survey of Consumer Finances," Federal Reserve Bulletin, January 2000, pp. 1-29. Results you may obtain from using this release of the 1998 SCF may differ from those reported in this article for several reasons. First, the analysis weights used in that article were altered to provide robust estimates of the detailed categories shown. In brief, the data were examined for extreme outliers, and where a given case was overly influential in determining an outcome, the weight was trimmed and other weights were inflated to maintain a constant population. In some other cases, a broad view of the data in the analytical framework of the article necessitated the assignment of some data to different categories than those implied by a narrower and more straightforward interpretation of the data. Second, as noted below, the public version of the data has been systematically altered to minimize the likelihood that unusual individual cases could be identified. Our analysis of the public dataset suggests that these changes should not alter the conclusions of reasonable analyses of the data. Finally, over time we correct errors that we find in the dataset. In our past experience, the effect of these errors on the estimates has been quite small. 
----------------------------------------------------------------------------- ----------------------------------------------------------------------------- TABLE OF CONTENTS ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- DATA FILES INCLUDED ON SCF WEB SITE QUESTIONNAIRE UNIT OF ANALYSIS SAMPLE DESIGN CODEBOOK CONVENTIONS VARIABLE NAMES GENERAL DATA CONVENTIONS CASE ID NUMBERS "OTHER" CODES GRIDS SUMMARY VARIABLES DATA REVIEW IMPUTATION DISCUSSION OF RANGE DATA COLLECTION AND J-CODES ANALYSIS WEIGHTS SAMPLING ERROR DISCLOSURE REVIEW COMPARISON WITH OTHER DATA ACKNOWLEDGMENTS VARIABLE DEFINITIONS SURVEYCRAFT PROGRAM, MAIN QUESTIONNAIRE (ENGLISH) SURVEYCRAFT PROGRAM, MAIN QUESTIONNAIRE (SPANISH) SURVEYCRAFT PROGRAM, DKDOL SURVEYCRAFT PROGRAM, INTERVIEWER COMMENTS MAPPING FROM SURVEYCRAFT VARIABLES TO SCF VARIABLES LIST OF VARIABLES INCLUDED ON PUBLIC DATASET ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- DATA FILES INCLUDED ON SCF WEB SITE ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- The primary data files for the survey consist of the following pieces: the main dataset and a file of replicate weights corresponding to X42001 (see below for a description of the replicate weights). In addition, an aggregated version of the main dataset containing summary variables corresponding to those used in "Recent Changes in U.S. Family Finances: Results from the 1998 Survey of Consumer Finances" (Arthur B. Kennickell, Martha Starr-McCluer, and Brian J. Surette, Federal Reserve Bulletin, January 2000, v. 86, pp. 1-29) is available in a format compatible with most spreadsheet software. 
To aid users in reconciling their calculations with those found in the January 2000 Bulletin Article, two sets of tables are provided: the first set is based on the current internal version of the data, and the second set is based on the current public version of the data. ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- QUESTIONNAIRE ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- The 1998 SCF was collected using computer-assisted personal interviewing (CAPI). Thus, there is no questionnaire in the usual sense. This codebook serves as the most comprehensive guide to the definitions of variables included in the survey. At the end of this file, a copy of the Surveycraft (SC) program that was used to collect the data is included. The SC program serves as the authoritative reference for questions relating to question ordering and skip sequences. Because question ordering is important in understanding the meaning of many questions, users of the data are encouraged to consult the SC program. At the very end of this file, a translation of most SC variables into SCF variables is provided. Although there is usually a direct correspondence between the SC variables and the final variables listed in this codebook, there are some places where the connections are indirect: in some cases, the same question is asked in two different places, and in the final dataset all instances of answers to the question are mapped into a single location; in other cases variables may be inferred from other information (for example, if a respondent reported a wage on a current job and reported that their employer contributed a certain percent of their wage to a pension plan, then the dollar contribution to the plan would be filled in). 
Almost always, the data rearrangements can be identified from the shadow variables associated with the variables (see section "VARIABLE NAMES" below). ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- UNIT OF ANALYSIS ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- Most of the data in the survey are intended to represent the financial characteristics of a subset of the household unit referred to as the "primary economic unit" (PEU). In brief, the PEU consists of an economically dominant single individual or couple (married or living as partners) in a household and all other individuals in the household who are financially dependent on that individual or couple. For example, in the case of a household composed of a married couple who own their home, a minor child, a dependent adult child, and a financially independent parent of one of the members of the couple, the PEU would be the couple and the two children. Summary information is collected at the end of the interview for all household members who are not included in the PEU. The only variables collected separately for the respondent and the spouse or partner of the respondent are those concerning employment, pension, and demographic characteristics. Throughout the codebook, we refer to the "head" of the household. The use of this term is euphemistic and merely reflects the systematic way in which the dataset has been organized. The head is taken to be the single core individual in a PEU without a core couple. In a PEU with a central couple, the head is taken to be either the male in a mixed-sex couple or the older individual in the case of a same-sex couple. No judgment about the internal organization of the households is implied by this organization of the data. 
When the original respondent was someone other than the person determined to be the head in this sense, all data (including response codes) for the two members of the couple were systematically swapped. The variable X8000 indicates which cases have been subjected to such rearrangement. NOTE: Because only limited information is collected on the ownership of assets and liabilities within the PEU, it is not possible, in general, to make direct separate estimates of the financial characteristics of the individuals in the survey households unless one is prepared to make a number of fairly complex assumptions. To understand this point more thoroughly, there is no substitute for a careful reading of the actual survey questions. ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- SAMPLE DESIGN ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- The SCF is based on a dual-frame sample design (see "Consistent Weight Design for the 1989, 1992, and 1995 SCF, and the Distribution of Wealth," Arthur B. Kennickell and R. Louise Woodburn, June 1997, http://www.federalreserve.gov/pubs/oss/oss2/method.html for more details). One set of the survey cases was selected from a standard multi-stage area-probability design. This part of the sample, which contributed 2,813 cases to the final set of interviews, is intended to provide good coverage of characteristics, such as home ownership, that are broadly distributed in the population. The other set of the survey cases was selected as a list sample from statistical records (the Individual Tax File) derived from tax data by the Statistics of Income Division of the Internal Revenue Service (SOI). 
These records were made available under strict rules governing confidentiality, the rights of potential respondents to refuse participation in the survey, and the types of information that can be made available. This second sample was designed to disproportionately select families that were likely to be relatively wealthy (see "List Sample Design for the 1998 SCF," Arthur B. Kennickell, April 1998, http://www.federalreserve.gov/pubs/oss/oss2/method.html for a more extended discussion of the design of the list sample). The list sample contributed 1,496 cases to the final set of interviews. ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- CODEBOOK CONVENTIONS ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- For many purposes it is useful to know which responses were available to the interviewer and which were actually known by the respondent. Responses that are noted in the codeframes below by an asterisk are ones that were available to the interviewer on the screen. In some cases, codes were only available given a response to an earlier question. One example that appears throughout the interview is the reporting of institutions where the respondent has an account of some type. If the respondent reported fewer than six financial institutions at X305, every time the interviewer came to a question that asked about the institution where the respondent had an account, the screen displayed the names of the already listed institutions (referred to as "Institution 1" etc. in the codebook), a code for "add an institution," and a code to enter to record an unusual type of institution ("other"). 
Once six institutions had been recorded (either at X305 or by adding institutions later in the interview), the screen displayed the names of the six institutions, the "other" field, and a set of codes for the type of institution (i.e., commercial bank, savings and loan or savings bank, credit union, etc.). In general, if a response is given in the codebook in lower case letters, this indicates that it could have been read to the respondent. Responses listed in all upper case letters are ones that were not intended to be read to the respondent. Codes that result from the recoding of responses originally reported as "other" are given in lower case letters. Codes that could have appeared on the computer screen are marked with an asterisk. Question texts given in capital letters are intended as interviewer instructions. For many questions there are multiple versions. Most commonly, there are variants that are appropriate for single individuals and ones appropriate for families of two or more. Some other variants are more complicated. For example, suppose that a respondent lives in a building with multiple housing units (X702=1), the family owns the entire building (X714=1), and they own the unit they live in separately from the rest of the building. The CAPI program stores the information that there is such a property. Later in the interview when the respondent is asked about the number of investment real estate and vacation properties, one variant of question X1701 reminds the respondent to include the property mentioned earlier. There are many other such instances where the computer alters questions to suit the previous answers given by the respondent, and this codebook attempts to provide at least a summary form of all the possible questions. For example, at X1711 (correspondingly at X1811 and X1911), the respondent is asked whether there are any outstanding loans on a property. 
If the respondent had previously reported at X1703 (correspondingly at X1803 and X1903) that the property was a time-share, then the variant for time-shares is asked; otherwise a more generic question is asked. The authoritative references for determining how the information was presented to the respondent are the executable CAPI program, which is available at http://www.federalreserve.gov/pubs/oss/oss2/98/scf98home.html, and the text of the CAPI program appended to this codebook. ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- VARIABLE NAMES ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- This codebook refers to the variables by the names they have in the version of the survey dataset formatted for use with SAS. These names consist of a number prefixed by an "X." We have tried, insofar as it was possible, to retain the variable numbering system used in earlier SCFs. Where the content of a variable has changed in a substantive way, we have assigned a new variable number. Since the 1995 SCF, a small number of questions were added, and a small number were deleted. Each of the variables in the main dataset has a "shadow" variable that describes--in almost all cases--the original state of the variable (i.e., whether it was missing for some reason, a range response was given, etc.). An exception is reported values which have been imputed or otherwise altered to protect the privacy of respondents (see "DISCLOSURE REVIEW" below). Users who so desire may use the shadow variables to restore the data to something very close to their original condition. The shadow variables have the same numbers as the main variable, but have a prefix of "J." 
A list of the values taken by the shadow variables is given in the section below entitled "DISCUSSION OF RANGE DATA COLLECTION AND J-CODES." ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- GENERAL DATA CONVENTIONS ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- Throughout the SCF dataset, a value of zero has only one meaning: that the item in question is inapplicable. That is, if a family does not have a checking account, then the number of checking accounts they own would be coded as a zero. Whenever zero is a legitimate response to a question, a value of -1 is used to signify that value. Other specialized codes are defined for each variable in the codebook. ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- CASE ID NUMBERS ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- Under the original numbering system (XX1), important aspects of the sample design are apparent from the identification numbers. Because such information is not releasable under the agreements which allow us to collect the data, each case included in the public version of the dataset has been given a random identification number (YY1). Users should note that it is not possible to know with certainty from the information provided in the public version of this dataset which cases derive from the list sample. Because we routinely use the original numbers internally, users who direct questions to us about specific cases might want to be sure to emphasize that they are using the external ID number to avoid confusion. 
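As a small illustration of the naming and coding conventions described in the VARIABLE NAMES and GENERAL DATA CONVENTIONS sections above, the following Python sketch may be helpful. It is not part of the SCF software: the function names shadow_name and decode_amount are hypothetical, and the decoding applies only to variables that follow the general convention (zero for inapplicable, -1 for a legitimate zero). Variables with other specialized codes need to be checked individually in the codebook.

```python
def shadow_name(var_name):
    """Return the J-prefixed shadow variable name for an X-prefixed variable.

    Example: the shadow variable for X5729 is J5729.
    """
    if not (var_name.startswith("X") and var_name[1:].isdigit()):
        raise ValueError(f"expected a name like 'X5729', got {var_name!r}")
    return "J" + var_name[1:]


def decode_amount(raw):
    """Translate the general SCF coding of zeros into Python values.

    ASSUMPTION: the variable follows the general convention only:
      0  -> item is inapplicable (returned as None)
      -1 -> a legitimate reported value of zero (returned as 0)
    Any other value is returned unchanged.
    """
    if raw == 0:
        return None
    if raw == -1:
        return 0
    return raw


if __name__ == "__main__":
    print(shadow_name("X5729"))   # J5729
    print(decode_amount(0))       # None (inapplicable)
    print(decode_amount(-1))      # 0 (legitimate zero)
    print(decode_amount(2500))    # 2500
```

Returning None for inapplicable items keeps them out of sums and averages by accident; an analyst who prefers to treat inapplicable items as zero for a particular calculation can make that choice explicitly.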
----------------------------------------------------------------------------- ----------------------------------------------------------------------------- "OTHER" CODES ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- In almost every case where a respondent could supply a response that did not fit in the codeframe offered to interviewers on their computer screens, the CAPI program was constructed to allow the entry of a verbatim response. There were a few open-ended questions that were set up to accept only a verbatim response. All of these responses were run through a standard coding process at NORC. Once the data were at the FRB, strenuous efforts were made to resolve all instances of responses that were unresolved by the NORC coders. Responses that remain coded "other" in the final dataset are unusual, but legitimate responses which do not fit within the existing codeframe; because these responses appear unlikely to reoccur in future surveys, the codeframe was not augmented. Responses that were not informative (or were not answers to the questions that were asked) were treated as missing values and were imputed. An identical process was followed for the 1995 SCF. In earlier surveys, the information recorded for "other" responses was not as complete, and the efforts to recode the available verbatim data were somewhat less stringent. Thus, analysts should exercise caution in time series comparisons of "other" responses from the 1995 and 1998 surveys with those in earlier years. ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- GRIDS ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- Some sets of questions have a natural iterative pattern. 
For example, the survey asks for detailed information on up to the first six checking accounts owned by the PEU, and summary information is collected about all remaining accounts. The detailed questions are the same for each account. In past interviews done with paper and pencil, some respondents have resisted answering all the detailed questions and have been willing to provide only summary information. Typically, interviewers recorded the summary information in the margins of the questionnaire, and editors allocated the data to the skipped questions according to a set of fixed rules. To allow for a variety of respondent-interviewer interactions in the SCF CAPI program, the grid questions were organized to provide a way of collecting summary information in a systematic way. We refer to the associated summary variables as "mop-up variables." Past surveys also indicated that some respondents recalled additional instances of items once they began answering questions in a grid, but interviewers often did not revise the originally reported number. The CAPI procedures were set up to allow for this possibility as well. Consider first a respondent who gives a non-missing response to the question that asks for the number of items of the type to be queried in the grid. The interviewer would ask the respondent the first set of detailed questions on the item. Then, the interviewer would be confronted with a question (not to be read to the respondent): INTERVIEWER: CAN R PROVIDE INFORMATION ABOUT ANOTHER xxxx? The intention of this question was to allow the interviewer to deal with a potentially hostile respondent and immediately branch to the mop-up questions. 
If the respondent was cooperative, the interviewer entered a YES response and continued through an identical procedure for each iteration until either the number of items reported was exhausted, or the maximum number of detailed questions was asked and the mop-up question was asked to get summary information on all remaining items. If the respondent reported a number of items less than the maximum number about which the detailed questions are asked, the following question was asked at the end of the final iteration: Do you (or your family living here) have another xxxx? A YES response here indicates that the respondent recalled an additional instance in the process of answering the detailed questions. A respondent could continue to "add" iterations until the maximum number of iterations is reached and the mop-up questions are asked. Another possibility is that a respondent may either not know (or be unwilling to tell) the number of instances of an item. Because it is known that there is at least one such instance, the first set of detailed questions is asked. Then the respondent is asked: Do you (or your family living here) have another xxxx? The questioning then proceeds exactly as it would for a respondent who recalled additional instances after providing an initial number of instances. In processing the data, several steps were taken to attribute the data collected to their correct location. First, in some cases interviewers answered the question "INTERVIEWER: CAN R PROVIDE INFORMATION ABOUT ANOTHER xxxx?" negatively, even though only one more instance remained. In such cases, the mop-up data were mapped into the appropriate position in the grid. 
This data movement is not directly recorded in the J-variables for such cases, although the movement can be deduced from the patterns of J-variables of other questions within the iteration of the grid that do not have mop-up equivalents: the value of the J-variables for such variables without mop-up equivalents would normally be 2052. Second, when respondents added instances, the originally reported number was updated and stored in the customary SCF variable number. The originally reported number of instances has been retained in the dataset since such information cannot be recovered in any other way from the data made available. Third, when summary information was given by respondents who broke off their responses in a grid prematurely, that information was used to bound the imputations of the detailed data. Data items that have an associated J-variable with a value of 90 are ones where a complete response was given in the parallel mop-up variable, and those with a J-variable of 91 are ones where a range response was given in the parallel mop-up variable. There are some complicated mixed cases where a respondent gave a missing value for the number of instances, but was willing to provide non-missing mop-up data. Though tedious, it is possible to deduce this information from the J-variables provided. ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- SUMMARY VARIABLES ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- We have not included summary variables (e.g., NET WORTH) in the main data set. Although it is complicated to construct such variables, it is our belief that a substantial amount of judgment is involved in defining variables, and that other analysts should make their own decisions. 
However, as a guide to users, we have included on the SCF web site a program written in the SAS language that was used to create the variables used in the January 2000 Federal Reserve Bulletin article on the survey. ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- DATA REVIEW ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- A very large amount of time has been spent in searching for errors in the data and resolving those errors. Many seeming inconsistencies are actually in the raw data and appear to have no obvious reconciliation. Our presumption is always that the respondent understood each question and reported accurately, and that the process of transcription and coding did not distort that information. In the relatively small number of cases where other information led us beyond a reasonable doubt of the validity of the data, we have changed data, either by altering values directly or by setting them to missing and imputing them; in all such cases, the shadow variables indicate that we have overridden reported data (for an overview of the extent of data changes, see "Measuring Data Quality In the 1998 Survey of Consumer Finances," Arthur B. Kennickell, August 1999, http://www.federalreserve.gov/pubs/oss/oss2/method.html). We ask our colleagues who use this dataset to help us in finding any remaining resolvable inconsistencies. The imputations are subject to the hierarchical logical constraints, but otherwise they reflect the fundamental inconsistencies in the other data. For example, total income (X5729) in the reported data is often not equal to the sum of the individual components (X5702 etc.), so this constraint is not applied to the imputed data. Variability in the imputations for a variable in a given case may sometimes be large. 
This variation is a reflection of the fundamental uncertainty about the value of the item. ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- IMPUTATION ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- Most of the variables that originally contained a missing value code have been imputed. The exceptions include such variables as X6695 (which reports the original number of checking accounts reported by the survey respondent) and X6504 (which is the interviewer's description of the property where the respondent lives). The nature of any missing values may be understood by examining the J-codes associated with the variables. The overwhelming majority of variables that originally contained missing values have been imputed five times by drawing repeatedly from an estimate of the conditional distribution of the data. These imputations are stored as five successive replicates ("implicates") of each data record. Thus, the number of observations in the full dataset (21,545) is five times the actual number of respondents (4,309) (see DISCLOSURE REVIEW below for information on the public version of the dataset). The imputation procedure is described in "Multiple Imputation in the Survey of Consumer Finances" (Arthur B. Kennickell, September 1998, http://www.federalreserve.gov/pubs/oss/oss2/method.html). For a general discussion of multiple imputation and its uses, see MULTIPLE IMPUTATION FOR NONRESPONSE IN SURVEYS by Donald B. Rubin, John Wiley and Sons, 1987. Multiple imputation offers two distinct advantages compared with singly-imputed data. First, because multiple imputation yields multiple outcomes from a random process, it supports more efficient estimation than singly-imputed data. 
Second, multiple imputation allows users to make straightforward estimates of the degree of uncertainty associated with the missing information. For users who want to estimate only simple statistics such as means and medians ignoring the effects of imputation error on the standard errors of these estimates, it will probably be sufficient to divide the weights by 5. Software to compute means and medians and their associated standard errors with respect to imputation and sampling error is provided in the section on sampling error later in this codebook. Users who want to estimate more complex statistics, particularly regressions, should be cautious in their treatment of the implicates. Many regression packages will treat each of the five implicates as an independent observation and correspondingly inflate the reported significance of results. Users who want to calculate regression estimates, but who have no immediate use for proper significance tests, could either average the dependent and independent values across the implicates or multiply their standard errors by the square root of five. For an easily understandable discussion of multiple imputation in the SCF from a user's point of view, see Catherine Montalto and Jaimie Sung, "Multiple Imputation in the 1992 Survey of Consumer Finances," Financial Counseling and Planning, Volume 7, 1996, pages 133-146 (or on the Internet at http://www.hec.ohio-state.edu/hanna/imput.htm). That article also contains a set of simple SAS macros to use to compute correct standard errors from multiply imputed data. 
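As a rough illustration of the points above, the following Python sketch shows (a) a weighted population total computed over all five stacked implicates with each weight divided by 5, and (b) the standard combination rule for multiply imputed estimates described by Rubin (1987), in which the final variance is the average within-implicate variance plus (1 + 1/m) times the between-implicate variance of the point estimates. This is a minimal sketch and not part of the SCF software; the function names and the toy numbers are made up for illustration.

```python
import math


def weighted_total(values, weights, n_implicates=5):
    """Weighted total over the stacked implicates.

    Each household appears n_implicates times in the stacked data, so each
    weight is divided by n_implicates to keep the implied population fixed.
    """
    return sum(v * (w / n_implicates) for v, w in zip(values, weights))


def combine_implicates(estimates, std_errors):
    """Combine per-implicate results into one estimate and standard error.

    estimates  : point estimate from each implicate (length m)
    std_errors : standard error of that estimate in each implicate
    Returns (point_estimate, standard_error, t_statistic).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching lists with at least two implicates")
    q_bar = sum(estimates) / m                                # average estimate
    within = sum(s ** 2 for s in std_errors) / m              # average variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total_se = math.sqrt(within + (1 + 1 / m) * between)
    return q_bar, total_se, q_bar / total_se


if __name__ == "__main__":
    # Toy regression coefficient estimated separately on each implicate.
    beta, se, t = combine_implicates([1.0, 2.0, 3.0, 4.0, 5.0], [1.0] * 5)
    print(beta, se, t)
```

Running a model once per implicate and combining the five results in this way avoids the overstated significance that arises when the stacked implicates are treated as independent observations.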
An alternative that is useful for handling the output of general modeling routines is the following set of SAS code: *++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*; *++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*; ******************************************************************************; ***** MACRO MISECOMP *****; ******************************************************************************; * MACRO MISECOMP computes standard errors corrected for multiple imputation; * The input may be regression results, or any other results (e.g., probits) that include a point estimate and a standard error estimate for each implicate; * The datasets are named &DSN.1-&DSN&NIMP (where &DSN and &NIMP are defined below); * The form of the input dataset is described above; * Often, it is quite easy to copy output directly from a statistical procedure into the form of this program without deleting extraneous information; * The required input variables are VARN (a name of the statistic of interest in all NIMP datasets), B1-B&NIMP (a working name for the point estimate of interest for each implicate--where the terminal number corresponds to the terminal number of the input dataset), and S1-S&NIMP (a working name for the standard error of the point estimate in each implicate--where the terminal number corresponds to the terminal number of the input dataset; * The parameters of the MACRO are: NIMP: number of implicates (default is 5) DSN: first part of name of each of the NIMP input datasets (e.g., DSN11, DSN12,...,DSN15 could be results for implicates 1-5 for model 1) (default is DSN1i, where "i" ranges from 1 to NIMP) PRNTPR: determines the number of digits of the output data (default is SAS format 10.6); * The output includes three lines for each unique VARN in the input datasets: the final point estimate, the final standard error, and the final t-statistic; 
******************************************************************************;
* Steps to compute standard errors;
* (1) run each model (regressions, probits, etc.) for each of the five
  implicates separately;
* (2) copy the model outputs into program code as described above;
/* For example,
     DATA DSNij;
     INPUT VARN $ Bj Sj;
     CARDS;
     data here
     ;
     RUN;
   where "i" ranges over the number of distinct models treated, and "j"
   ranges over the number of implicates.
   NOTE: any technique that reads VARN, Bj and Sj into the datasets
   will work. */
* (3) call MISECOMP (MACRO defaults will work correctly for the SCF if
  the dataset names are DSN11, DSN12, DSN13, DSN14, DSN15);
******************************************************************************;
******************************************************************************;
%MACRO MISECOMP(NIMP=5,DSN=DSN1,PRNTPR=10.6);
  /* record the original order of the statistics in the first dataset */
  DATA &DSN.1;
    SET &DSN.1;
    ORD=_N_;
  RUN;
  /* sort every implicate dataset so that they can be merged by VARN */
  %DO I=1 %TO &NIMP;
    PROC SORT DATA=&DSN&I;
      BY VARN;
    RUN;
  %END;
  DATA ALL;
    MERGE %DO I=1 %TO &NIMP; &DSN&I %END; ;
    BY VARN;
    ARRAY BMOD {*} %DO I=1 %TO &NIMP; B&I %END;;
    ARRAY SMOD {*} %DO I=1 %TO &NIMP; S&I %END;;
    BETA=0;
    SIGMA=0;
    ST=0;
    /* average the point estimates and the squared standard errors */
    DO J=1 TO &NIMP;
      BETA=BMOD{J}+BETA;
      SIGMA=SMOD{J}**2+SIGMA;
    END;
    BETA=BETA/&NIMP;
    SIGMA=SIGMA/&NIMP;
    /* sum of squared deviations of the estimates across implicates */
    DO I=1 TO &NIMP;
      ST=ST+(BETA-BMOD{I})**2;
    END;
    /* total variance: within + (1+1/NIMP)*between (uses NIMP, not a fixed 5) */
    SIGMA=SQRT(SIGMA+(1+1/&NIMP)*ST/(&NIMP-1));
    TSTAT=BETA/SIGMA;
  RUN;
  /* restore the original order and print the three output lines */
  PROC SORT DATA=ALL;
    BY ORD;
  RUN;
  DATA ALL;
    SET ALL;
    PUT VARN @15 BETA &PRNTPR / @15 SIGMA &PRNTPR / @15 TSTAT &PRNTPR;
  RUN;
%MEND MISECOMP;
%MISECOMP;
******************************************************************************;
******************************************************************************;
*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;
*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------

          DISCUSSION OF RANGE DATA COLLECTION AND J-CODES

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------

Since the 1995 SCF, the CAPI program has allowed interviewers a variety of ways to enter partial information (for a detailed description and analysis of range data in the 1995 survey, see "Using Range Techniques with CAPI in the 1995 Survey of Consumer Finances," Arthur B. Kennickell, January 1997, http://www.federalreserve.gov/pubs/oss/oss2/method.html). In the past, we had evidence that some respondents volunteered figures in ranges. Good interviewers have always tried to get respondents to settle on a single "best" figure, but sometimes there may be no firm figure (e.g., the value of a privately-held business may be known only at the point it is actually sold), and probing too far could cause the respondent to answer "don't know" or to refuse to answer. The CAPI program allows for responses to be reported in ranges volunteered by respondents.

There is another class of respondents who may not volunteer a range, who do not know (or will not give) an exact figure, but who will give some information about the value. To obtain information from this second group of people, the CAPI program includes two options. First, a respondent who is uncomfortable actually saying an amount may report a letter from a card that specifies a number of ranges. The range card has been used very successfully in earlier waves of the SCF, but CAPI allows the option to be presented consistently. Second, a respondent who declines the use of the range card is asked a series of questions in a "decision tree" that are designed to specify a range. The dollar breaks in the decision trees vary by question (so that, for example, monthly rent is not subject to the same ranges as the value of corporate stock).
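As an illustration of the first option, the range card can be treated as a simple lookup from a letter to a pair of dollar bounds. The Python sketch below is not part of the CAPI program, and its names are hypothetical. The dollar figures come from the range card and from J-codes 901-940 listed later in this section. For letters O through S the lower bounds follow the J-code list (for example, $1,000,001) rather than the card's rounded wording.

```python
# Upper end of each range on the card for letters A through S; T is open-ended.
_UPPER = [100, 500, 1_000, 2_500, 5_000, 7_500, 10_000, 25_000, 50_000,
          75_000, 100_000, 250_000, 500_000, 1_000_000, 5_000_000,
          10_000_000, 25_000_000, 50_000_000, 100_000_000]
_LETTERS = "ABCDEFGHIJKLMNOPQRST"


def _card_index(letter):
    """Position of a card letter (A=0, ..., T=19); rejects anything else."""
    letter = letter.strip().upper()
    if len(letter) != 1 or letter not in _LETTERS:
        raise ValueError(f"not a range card letter: {letter!r}")
    return _LETTERS.index(letter)


def range_card_bounds(letter):
    """Return (low, high) in dollars for a card letter; high is None for T."""
    i = _card_index(letter)
    low = 1 if i == 0 else _UPPER[i - 1] + 1
    high = _UPPER[i] if i < len(_UPPER) else None
    return low, high


def range_card_jcode(letter, via_dkdol):
    """J-code for a card answer: 901-920 via [F9], 921-940 via DKDOL."""
    return (920 if via_dkdol else 900) + _card_index(letter) + 1
```

For instance, `range_card_bounds("H")` gives the bounds of range H, and `range_card_jcode("H", via_dkdol=True)` gives the corresponding DKDOL code.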
The computer sequences used for range followup for all dollar values in the survey (known as "DKDOL") are outlined schematically below. It should be noted that interviewers were strongly instructed that a single dollar value is the best answer to each of these questions. There is a distinct possibility that respondents may become "trained" in the use of the range questions during the course of the interview. The effect of this training is unclear at present: respondents may tend to report "too many" ranges because they know that they are allowed; alternatively, respondents may learn that it is much quicker to give a single dollar figure. In either case, interviewers should be using all of the standard techniques to get respondents to give a single figure where possible. Evidence from the 1995 SCF suggests that this approach dramatically reduces the frequency of "don't know" responses, but it has little effect on refusals. Although the overall proportion of respondents reporting no information is much lower, the proportion providing complete responses generally declined.

*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;
*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;

Schematic diagram of sequence used for all dollar questions:

          Qnn.  How much is your [******]?

level 1:  $________  $___RANGE               $______DK    $__Refuse
                                                 |________________|
level 2:  Confirm    Range card                     Range card?
                     or dollar range?
                     RC       DR             YES      NO/DK      Refuse
level 3:  OUT        Letter   Upper bound    Letter   Decision
                              Lower bound             tree
level 4:             OUT      Confirm        OUT      Confirm    OUT
level 5:                      OUT                     OUT

          (OUT=proceed to next question)

At the first level, the respondent has the option of providing a dollar amount (as in the past, interviewers were strongly urged to obtain a single dollar value where possible), volunteering a range, answering "don't know," or refusing to answer. Each of these responses implies a different sequence of questions.
In the case of a single dollar figure, the CAPI program displays in words the number the interviewer has typed into the computer and proceeds to the next question. If the respondent volunteers a range, there is an option to report either a range in dollars (and in some cases the upper or lower bound of a range may be missing--e.g., as in the case where a respondent answers "greater than a million dollars") or to give a letter from a range card (the ranges are given below). If the respondent answers "don't know" or refuses to answer, the program will present a request to use the range card. If the respondent is unable to use the range card (answers "no" or "don't know"), the program presents a series of questions known as a "decision tree," which is specified in greater detail below. If the respondent refuses when asked to use the range card, the program proceeds to the next question. The exact question text for this sequence is given below.

Because of software limitations, negative ranges presented a special problem. It was not feasible to build in negative ranges directly. As a compromise, interviewers were instructed to collect the ranges in absolute values and to record, in a comment box available in the program, the fact that the range was negative.

Text presented to interviewer at level 2 if R volunteers a range:

     CHOOSE:
          ENTER LETTER FROM RANGE CARD
          ENTER LOW END AND HIGH END OF RANGE

Text presented to interviewer at level 3 if R volunteers a range and chooses the range card at level 2:

     ENTER LETTER FROM RANGE CARD:

     Responses shown on range card:
          A ...... $1 - $100
          B ...... $101 - $500
          C ...... $501 - $1,000
          D ...... $1,001 - $2,500
          E ...... $2,501 - $5,000
          F ...... $5,001 - $7,500
          G ...... $7,501 - $10,000
          H ...... $10,001 - $25,000
          I ...... $25,001 - $50,000
          J ...... $50,001 - $75,000
          K ...... $75,001 - $100,000
          L ...... $100,001 - $250,000
          M ...... $250,001 - $500,000
          N ...... $500,001 - $1 million
          O ...... $1 million - $5 million
          P ...... $5 million - $10 million
          Q ...... $10 million - $25 million
          R ...... $25 million - $50 million
          S ...... $50 million - $100 million
          T ...... More than $100 million

Text presented to interviewer at level 3 if R volunteers a range and gives a dollar range at level 2:

     ENTER LOW END OF RANGE  : $___,___,___.00
     ENTER HIGH END OF RANGE : $___,___,___.00

Text presented to interviewer at level 2 if R answers DK/Ref at level 1:

     Can you give me a range from this card?
     HAND R RANGE CARD.
          YES
          NO

Text presented to interviewer at level 3 if R answers DK/Ref at level 1 and answers YES at level 2:

     ENTER LETTER FROM RANGE CARD:

     Possible card responses shown on range card: See above

Decision tree sequence presented to interviewer at level 3 if R answers DK/Ref at level 1 and NO/DK at level 2:

     CONSIDER THE FOLLOWING 7 NUMBERS WHICH ARE STRICTLY INCREASING IN VALUE: V1, V2, V3, V4, V5, V6, AND V7. RESPONDENTS ARE ASKED A SEQUENCE OF QUESTIONS TO FIND THE INTERVAL DEFINED BY THESE NUMBERS INTO WHICH A GIVEN VARIABLE FALLS.

     Q1. Was it V4 dollars or more?
              YES              --> GO TO Q2
              NO, DK           --> GO TO Q5
              Ref              --> EXIT
     Q2. Was it V5 dollars or more?
              YES              --> GO TO Q3
              NO, DK, Ref      --> EXIT
     Q3. Was it V6 dollars or more?
              YES              --> GO TO Q4
              NO, DK, Ref      --> EXIT
     Q4. Was it V7 dollars or more?
              YES, NO, DK, Ref --> EXIT
     Q5. Was it V1 dollars or more?
              YES              --> GO TO Q6
              NO, DK, Ref      --> EXIT
     Q6. Was it V2 dollars or more?
              YES              --> GO TO Q7
              NO, DK, Ref      --> EXIT
     Q7. Was it V3 dollars or more?
              YES, NO, DK, Ref --> EXIT

To allow for appropriate ranges for all dollar questions, there are eight different versions of the V1 to V7 variables given below.
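The question sequence Q1-Q7 can be sketched as a short Python routine. This is an illustration, not the CAPI program: the function name, the `answer` callback, and the truthful-respondent helper are hypothetical. The example thresholds are the version 4 dollar breaks from the table in this section. Following the rules above, a YES at Vk is taken as a lower bound and a NO as an upper bound, while DK and Ref add no information. How the endpoints themselves are treated is left open here.

```python
def run_decision_tree(thresholds, answer):
    """Walk Q1-Q7 for thresholds V1..V7.

    `answer(dollars)` must return "YES", "NO", "DK" or "REF".
    Returns (path, lower, upper); a bound of None means no information.
    """
    v = dict(zip(range(1, 8), thresholds))  # v[1] .. v[7]
    lower, upper, path = None, None, []

    def ask(question, k):
        nonlocal lower, upper
        reply = answer(v[k])
        path.append((question, reply))
        if reply == "YES":
            lower = v[k]   # "Vk dollars or more" was confirmed
        elif reply == "NO":
            upper = v[k]   # "Vk dollars or more" was denied
        return reply

    first = ask("Q1", 4)
    if first == "YES":
        chain = [("Q2", 5), ("Q3", 6), ("Q4", 7)]
    elif first in ("NO", "DK"):
        chain = [("Q5", 1), ("Q6", 2), ("Q7", 3)]
    else:
        chain = []         # a refusal at Q1 exits immediately
    for question, k in chain:
        if ask(question, k) != "YES":
            break          # NO, DK and Ref all exit the sequence
    return path, lower, upper


VERSION_4 = [5_000, 25_000, 50_000, 100_000, 250_000, 500_000, 1_000_000]


def truthful(value):
    """Hypothetical respondent who knows the value and answers every question."""
    return lambda dollars: "YES" if value >= dollars else "NO"


path, lower, upper = run_decision_tree(VERSION_4, truthful(30_000))
```

In this run the respondent answers NO at Q1 and YES at Q5 and Q6, then NO at Q7, which brackets the value between V2 and V3.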
Version          V1          V2          V3          V4          V5          V6          V7
   1         10,000     100,000     250,000     500,000   1,000,000   5,000,000  10,000,000
   2         50,000     100,000     500,000   1,000,000   5,000,000  10,000,000  25,000,000
   3         50,000     100,000     150,000     250,000     500,000   1,000,000   5,000,000
   4          5,000      25,000      50,000     100,000     250,000     500,000   1,000,000
   5          5,000      10,000      25,000      50,000     100,000     250,000     750,000
   6            500       1,000       5,000      10,000      25,000      75,000     250,000
   7            100         250         500       1,000       2,000      10,000      50,000
   8             50         100         250         500       1,000       5,000      10,000

There are 31 possible unique outcomes of each of the 8 versions of the decision tree:

      1. Q1=NO, Q5=NO
      2. Q1=NO, Q5=DK
      3. Q1=NO, Q5=Ref
      4. Q1=NO, Q5=YES, Q6=NO
      5. Q1=NO, Q5=YES, Q6=DK
      6. Q1=NO, Q5=YES, Q6=Ref
      7. Q1=NO, Q5=YES, Q6=YES, Q7=NO
      8. Q1=NO, Q5=YES, Q6=YES, Q7=DK
      9. Q1=NO, Q5=YES, Q6=YES, Q7=Ref
     10. Q1=NO, Q5=YES, Q6=YES, Q7=YES
     11. Q1=DK, Q5=NO
     12. Q1=DK, Q5=DK  ---> NOTE: RESULTS IN NO BOUNDING INFORMATION
     13. Q1=DK, Q5=Ref ---> NOTE: RESULTS IN NO BOUNDING INFORMATION
     14. Q1=DK, Q5=YES, Q6=NO
     15. Q1=DK, Q5=YES, Q6=DK
     16. Q1=DK, Q5=YES, Q6=Ref
     17. Q1=DK, Q5=YES, Q6=YES, Q7=NO
     18. Q1=DK, Q5=YES, Q6=YES, Q7=DK
     19. Q1=DK, Q5=YES, Q6=YES, Q7=Ref
     20. Q1=DK, Q5=YES, Q6=YES, Q7=YES
     21. Q1=Ref        ---> NOTE: RESULTS IN NO BOUNDING INFORMATION
     22. Q1=YES, Q2=NO
     23. Q1=YES, Q2=DK
     24. Q1=YES, Q2=Ref
     25. Q1=YES, Q2=YES, Q3=NO
     26. Q1=YES, Q2=YES, Q3=DK
     27. Q1=YES, Q2=YES, Q3=Ref
     28. Q1=YES, Q2=YES, Q3=YES, Q4=NO
     29. Q1=YES, Q2=YES, Q3=YES, Q4=DK
     30. Q1=YES, Q2=YES, Q3=YES, Q4=Ref
     31. Q1=YES, Q2=YES, Q3=YES, Q4=YES

*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;
*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;

The shadow variables fall into three large groups. Codes of less than 90 indicate that data were not originally missing (or that they could be inferred with high confidence from other information). Codes with an integer value from 90 through 993 indicate that the respondent provided a range response.
The extensive form of the paths through the range routines encompasses a large number of outcomes, as is reflected in the number of possible range codes. For the codes that indicate a range response, there may also be a decimal component. A code with a decimal part equal to 0.5 indicates that the initial response that the respondent gave to the associated dollar question was "don't know." In every other case, there should be no decimal component to the shadow variable. Codes of 994 or more indicate that the associated data value is completely missing.

There is an important exception to the normal assignment of J-codes. In some cases, it is not known where a reported value should actually be because a higher-order question was missing. In setting up the data, the reported values are put into all possible locations and the variables all retain the J-codes of the reported values. Imputation sets a path through the data, and the values which are imputed to be inapplicable are given a value of zero, even though their J-codes are not necessarily one of the codes for inapplicable data (1, 3 or 14). When columns of a grid need to be shifted after data are removed, the moved variables retain their original J-codes.

*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;
*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;

Definitions of the "J" Variables (1998 version)

  0 = Originally reported value. See above for an exception.
  1 = Question is inapplicable (e.g., R has no checking account so value of checking account is coded as zero.) NOTE: all values of zero in the dataset are in some sense inapplicable [also see J-code value 14]; reported values of zero are typically stored as -1.
  2 = Data moved from another location (e.g., a motorcycle misclassified in the automobile grid moved to the other vehicle grid); data moved from another location and added to data already at the new location (e.g., wage income from spouse reported in independent adult part of section Y added to data reported for R in Section T); data reported in a "mop-up" field that could be directly mapped into the correct final location. These moves and changes may be the result of verbatim responses, interviewer comments, or other information.
  3 = CAPI error: resolution yields an inapplicable code.
  4 = CAPI error: resolution yields a non-missing (non-range) value.
  5 = Indicates a value coded directly by FRB from a verbatim ("other/specify") response or interviewer comments that translate directly into a valid response. Such values indicate the exercise of a bare minimum of judgment in encoding the content of the text data. ("Super-no" corrections are included here.)
  6 = Indicates a value coded directly by NORC from a verbatim ("other/specify") response.
  8 = Variable computed from other non-missing variables.
  9 = Variable overridden by logically equivalent information to maintain consistency of data (e.g., when type of property is a time share (X1703=25), but R says they own the share alone (X1704=1)--rather than saying that the property is a time share (X1704=5)--then the response to X1704 is changed to 5).
 10 = (1) This code applies to variables where part of the value reported was also reported elsewhere and is edited out here (e.g., in the case where the wage income of NPEU member is reported at X6403 and at X5702 along with income of the PEU, the NPEU value is removed from X5702 and J5702=10); (2) this code also applies to higher-order variables that must be reset as a consequence of editing out a lower-order value entirely (e.g., when X6403 and X5702 both contain only the income of the NPEU member, the data are reset as X5701=5, J5701=10, X5702=0 and J5702=14).
 13 = Judgmental override of non-missing reported value.
 14 = Inapplicable code generated by any data adjustment (particularly adjustments associated with J-codes 2, 10, 13, 15, 16, and 17).
 15 = Non-stochastic imputation of missing data (typically based at least in part on other, non-codeable data).
 16 = Override of reported data based on comment/verbatim information or information from another question (e.g., X7360) (other than codes 2, 5, and 13). Assignment of this value requires both text (verbatim/comment) data and significant judgment, or substantial structural information from other reported values together with judgment.
 17 = Value of originally missing data item implied by/computed from other variable(s). Relatively more judgment is implied by this code than by a code 8.
 25 = This code is used only for the make/model of vehicles. If either the make or the model (but not both) was originally missing, but there was sufficient information to approximate the value of the vehicle, this code is assigned.
 30 = This code indicates that the respondent originally offered to provide a dollar range, but when the upper and lower bounds of the range were asked, a single non-zero value was given for both the upper and lower bounds.

ALL RESPONSES THAT FOLLOW HAVE AT LEAST SOME MISSING INFORMATION

 90 = Bounding information available based on summary information provided by respondent (typically, if an R does not know information about items beyond a certain number in a set of detailed questions about a larger number of such items, the R is asked one or a number of summary questions about all remaining instances).
 91 = Same as 90, but R gave range data for the summary information.
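The three broad groups of shadow-variable codes described earlier in this section can be told apart with a few comparisons. The following is a minimal Python sketch with hypothetical function and label names. It uses only the cut points stated in the text (below 90, 90 through 993, 994 or more) and the 0.5 decimal flag, and it does not handle the exception noted above for reported values placed in several possible locations.

```python
import math


def classify_jcode(jcode):
    """Return (group, initial_dk) for a shadow-variable ("J") code.

    group is one of "not originally missing", "range response" or
    "completely missing"; initial_dk is True when the code carries the
    0.5 decimal part marking an initial "don't know" answer.
    """
    whole = math.floor(jcode)
    initial_dk = (jcode - whole) == 0.5
    if whole < 90:
        group = "not originally missing"
    elif whole <= 993:
        group = "range response"
    else:
        group = "completely missing"
    return group, initial_dk
```

For example, a code of 412.5 falls in the range group with the "don't know" flag set, while a code of 13 is treated as not originally missing.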
RANGE RESPONSES: POSITIVE RANGES

DECISION TREE RESPONSES THAT RESULTED IN A BOUND FOR POSITIVE NUMBERS
(NOTE: for decision tree codes, responses that resulted in no usable bounding information are collected separately below)

'*' indicates an open-ended interval

NOTE: for J-code outcomes from 101-878, 921-940, and 971-990, .5 is added to the J-code if the original response was DK

101=Decision tree response, version 1: outcome 1 (*,<=V1)
102=Decision tree response, version 1: outcome 2 (*,<=V4)
103=Decision tree response, version 1: outcome 3 (*,<=V4)
104=Decision tree response, version 1: outcome 4 (>V1,<=V2)
105=Decision tree response, version 1: outcome 5 (>V1,<=V4)
106=Decision tree response, version 1: outcome 6 (>V1,<=V4)
107=Decision tree response, version 1: outcome 7 (>V2,<=V3)
108=Decision tree response, version 1: outcome 8 (>V2,<=V4)
109=Decision tree response, version 1: outcome 9 (>V2,<=V4)
110=Decision tree response, version 1: outcome 10 (>V3,<=V4)
111=Decision tree response, version 1: outcome 11 (*,<=V1)
112=Decision tree response, version 1: outcome 14 (>V1,<=V2)
113=Decision tree response, version 1: outcome 15 (>V1,*)
114=Decision tree response, version 1: outcome 16 (>V1,*)
115=Decision tree response, version 1: outcome 17 (>V2,<=V3)
116=Decision tree response, version 1: outcome 18 (>V2,*)
117=Decision tree response, version 1: outcome 19 (>V2,*)
118=Decision tree response, version 1: outcome 20 (>V3,*)
119=Decision tree response, version 1: outcome 22 (>V4,<=V5)
120=Decision tree response, version 1: outcome 23 (>V4,*)
121=Decision tree response, version 1: outcome 24 (>V4,*)
122=Decision tree response, version 1: outcome 25 (>V5,<=V6)
123=Decision tree response, version 1: outcome 26 (>V5,*)
124=Decision tree response, version 1: outcome 27 (>V5,*)
125=Decision tree response, version 1: outcome 28 (>V6,<=V7)
126=Decision tree response, version 1: outcome 29 (>V6,*)
127=Decision tree response, version 1: outcome 30 (>V6,*)
128=Decision tree response, version 1: outcome 31 (>V7,*)

201=Decision tree response, version 2: outcome 1 (*,<=V1)
202=Decision tree response, version 2: outcome 2 (*,<=V4)
203=Decision tree response, version 2: outcome 3 (*,<=V4)
204=Decision tree response, version 2: outcome 4 (>V1,<=V2)
205=Decision tree response, version 2: outcome 5 (>V1,<=V4)
206=Decision tree response, version 2: outcome 6 (>V1,<=V4)
207=Decision tree response, version 2: outcome 7 (>V2,<=V3)
208=Decision tree response, version 2: outcome 8 (>V2,<=V4)
209=Decision tree response, version 2: outcome 9 (>V2,<=V4)
210=Decision tree response, version 2: outcome 10 (>V3,<=V4)
211=Decision tree response, version 2: outcome 11 (*,<=V1)
212=Decision tree response, version 2: outcome 14 (>V1,<=V2)
213=Decision tree response, version 2: outcome 15 (>V1,*)
214=Decision tree response, version 2: outcome 16 (>V1,*)
215=Decision tree response, version 2: outcome 17 (>V2,<=V3)
216=Decision tree response, version 2: outcome 18 (>V2,*)
217=Decision tree response, version 2: outcome 19 (>V2,*)
218=Decision tree response, version 2: outcome 20 (>V3,*)
219=Decision tree response, version 2: outcome 22 (>V4,<=V5)
220=Decision tree response, version 2: outcome 23 (>V4,*)
221=Decision tree response, version 2: outcome 24 (>V4,*)
222=Decision tree response, version 2: outcome 25 (>V5,<=V6)
223=Decision tree response, version 2: outcome 26 (>V5,*)
224=Decision tree response, version 2: outcome 27 (>V5,*)
225=Decision tree response, version 2: outcome 28 (>V6,<=V7)
226=Decision tree response, version 2: outcome 29 (>V6,*)
227=Decision tree response, version 2: outcome 30 (>V6,*)
228=Decision tree response, version 2: outcome 31 (>V7,*)

301=Decision tree response, version 3: outcome 1 (*,<=V1)
302=Decision tree response, version 3: outcome 2 (*,<=V4)
303=Decision tree response, version 3: outcome 3 (*,<=V4)
304=Decision tree response, version 3: outcome 4 (>V1,<=V2)
305=Decision tree response, version 3: outcome 5 (>V1,<=V4)
306=Decision tree response, version 3: outcome 6 (>V1,<=V4)
307=Decision tree response, version 3: outcome 7 (>V2,<=V3)
308=Decision tree response, version 3: outcome 8 (>V2,<=V4)
309=Decision tree response, version 3: outcome 9 (>V2,<=V4)
310=Decision tree response, version 3: outcome 10 (>V3,<=V4)
311=Decision tree response, version 3: outcome 11 (*,<=V1)
312=Decision tree response, version 3: outcome 14 (>V1,<=V2)
313=Decision tree response, version 3: outcome 15 (>V1,*)
314=Decision tree response, version 3: outcome 16 (>V1,*)
315=Decision tree response, version 3: outcome 17 (>V2,<=V3)
316=Decision tree response, version 3: outcome 18 (>V2,*)
317=Decision tree response, version 3: outcome 19 (>V2,*)
318=Decision tree response, version 3: outcome 20 (>V3,*)
319=Decision tree response, version 3: outcome 22 (>V4,<=V5)
320=Decision tree response, version 3: outcome 23 (>V4,*)
321=Decision tree response, version 3: outcome 24 (>V4,*)
322=Decision tree response, version 3: outcome 25 (>V5,<=V6)
323=Decision tree response, version 3: outcome 26 (>V5,*)
324=Decision tree response, version 3: outcome 27 (>V5,*)
325=Decision tree response, version 3: outcome 28 (>V6,<=V7)
326=Decision tree response, version 3: outcome 29 (>V6,*)
327=Decision tree response, version 3: outcome 30 (>V6,*)
328=Decision tree response, version 3: outcome 31 (>V7,*)

401=Decision tree response, version 4: outcome 1 (*,<=V1)
402=Decision tree response, version 4: outcome 2 (*,<=V4)
403=Decision tree response, version 4: outcome 3 (*,<=V4)
404=Decision tree response, version 4: outcome 4 (>V1,<=V2)
405=Decision tree response, version 4: outcome 5 (>V1,<=V4)
406=Decision tree response, version 4: outcome 6 (>V1,<=V4)
407=Decision tree response, version 4: outcome 7 (>V2,<=V3)
408=Decision tree response, version 4: outcome 8 (>V2,<=V4)
409=Decision tree response, version 4: outcome 9 (>V2,<=V4)
410=Decision tree response, version 4: outcome 10 (>V3,<=V4)
411=Decision tree response, version 4: outcome 11 (*,<=V1)
412=Decision tree response, version 4: outcome 14 (>V1,<=V2)
413=Decision tree response, version 4: outcome 15 (>V1,*)
414=Decision tree response, version 4: outcome 16 (>V1,*)
415=Decision tree response, version 4: outcome 17 (>V2,<=V3)
416=Decision tree response, version 4: outcome 18 (>V2,*)
417=Decision tree response, version 4: outcome 19 (>V2,*)
418=Decision tree response, version 4: outcome 20 (>V3,*)
419=Decision tree response, version 4: outcome 22 (>V4,<=V5)
420=Decision tree response, version 4: outcome 23 (>V4,*)
421=Decision tree response, version 4: outcome 24 (>V4,*)
422=Decision tree response, version 4: outcome 25 (>V5,<=V6)
423=Decision tree response, version 4: outcome 26 (>V5,*)
424=Decision tree response, version 4: outcome 27 (>V5,*)
425=Decision tree response, version 4: outcome 28 (>V6,<=V7)
426=Decision tree response, version 4: outcome 29 (>V6,*)
427=Decision tree response, version 4: outcome 30 (>V6,*)
428=Decision tree response, version 4: outcome 31 (>V7,*)

501=Decision tree response, version 5: outcome 1 (*,<=V1)
502=Decision tree response, version 5: outcome 2 (*,<=V4)
503=Decision tree response, version 5: outcome 3 (*,<=V4)
504=Decision tree response, version 5: outcome 4 (>V1,<=V2)
505=Decision tree response, version 5: outcome 5 (>V1,<=V4)
506=Decision tree response, version 5: outcome 6 (>V1,<=V4)
507=Decision tree response, version 5: outcome 7 (>V2,<=V3)
508=Decision tree response, version 5: outcome 8 (>V2,<=V4)
509=Decision tree response, version 5: outcome 9 (>V2,<=V4)
510=Decision tree response, version 5: outcome 10 (>V3,<=V4)
511=Decision tree response, version 5: outcome 11 (*,<=V1)
512=Decision tree response, version 5: outcome 14 (>V1,<=V2)
513=Decision tree response, version 5: outcome 15 (>V1,*)
514=Decision tree response, version 5: outcome 16 (>V1,*)
515=Decision tree response, version 5: outcome 17 (>V2,<=V3)
516=Decision tree response, version 5: outcome 18 (>V2,*)
517=Decision tree response, version 5: outcome 19 (>V2,*)
518=Decision tree response, version 5: outcome 20 (>V3,*)
519=Decision tree response, version 5: outcome 22 (>V4,<=V5)
520=Decision tree response, version 5: outcome 23 (>V4,*)
521=Decision tree response, version 5: outcome 24 (>V4,*)
522=Decision tree response, version 5: outcome 25 (>V5,<=V6)
523=Decision tree response, version 5: outcome 26 (>V5,*)
524=Decision tree response, version 5: outcome 27 (>V5,*)
525=Decision tree response, version 5: outcome 28 (>V6,<=V7)
526=Decision tree response, version 5: outcome 29 (>V6,*)
527=Decision tree response, version 5: outcome 30 (>V6,*)
528=Decision tree response, version 5: outcome 31 (>V7,*)

601=Decision tree response, version 6: outcome 1 (*,<=V1)
602=Decision tree response, version 6: outcome 2 (*,<=V4)
603=Decision tree response, version 6: outcome 3 (*,<=V4)
604=Decision tree response, version 6: outcome 4 (>V1,<=V2)
605=Decision tree response, version 6: outcome 5 (>V1,<=V4)
606=Decision tree response, version 6: outcome 6 (>V1,<=V4)
607=Decision tree response, version 6: outcome 7 (>V2,<=V3)
608=Decision tree response, version 6: outcome 8 (>V2,<=V4)
609=Decision tree response, version 6: outcome 9 (>V2,<=V4)
610=Decision tree response, version 6: outcome 10 (>V3,<=V4)
611=Decision tree response, version 6: outcome 11 (*,<=V1)
612=Decision tree response, version 6: outcome 14 (>V1,<=V2)
613=Decision tree response, version 6: outcome 15 (>V1,*)
614=Decision tree response, version 6: outcome 16 (>V1,*)
615=Decision tree response, version 6: outcome 17 (>V2,<=V3)
616=Decision tree response, version 6: outcome 18 (>V2,*)
617=Decision tree response, version 6: outcome 19 (>V2,*)
618=Decision tree response, version 6: outcome 20 (>V3,*)
619=Decision tree response, version 6: outcome 22 (>V4,<=V5)
620=Decision tree response, version 6: outcome 23 (>V4,*)
621=Decision tree response, version 6: outcome 24 (>V4,*)
622=Decision tree response, version 6: outcome 25 (>V5,<=V6)
623=Decision tree response, version 6: outcome 26 (>V5,*)
624=Decision tree response, version 6: outcome 27 (>V5,*)
625=Decision tree response, version 6: outcome 28 (>V6,<=V7)
626=Decision tree response, version 6: outcome 29 (>V6,*)
627=Decision tree response, version 6: outcome 30 (>V6,*)
628=Decision tree response, version 6: outcome 31 (>V7,*)

701=Decision tree response, version 7: outcome 1 (*,<=V1)
702=Decision tree response, version 7: outcome 2 (*,<=V4)
703=Decision tree response, version 7: outcome 3 (*,<=V4)
704=Decision tree response, version 7: outcome 4 (>V1,<=V2)
705=Decision tree response, version 7: outcome 5 (>V1,<=V4)
706=Decision tree response, version 7: outcome 6 (>V1,<=V4)
707=Decision tree response, version 7: outcome 7 (>V2,<=V3)
708=Decision tree response, version 7: outcome 8 (>V2,<=V4)
709=Decision tree response, version 7: outcome 9 (>V2,<=V4)
710=Decision tree response, version 7: outcome 10 (>V3,<=V4)
711=Decision tree response, version 7: outcome 11 (*,<=V1)
712=Decision tree response, version 7: outcome 14 (>V1,<=V2)
713=Decision tree response, version 7: outcome 15 (>V1,*)
714=Decision tree response, version 7: outcome 16 (>V1,*)
715=Decision tree response, version 7: outcome 17 (>V2,<=V3)
716=Decision tree response, version 7: outcome 18 (>V2,*)
717=Decision tree response, version 7: outcome 19 (>V2,*)
718=Decision tree response, version 7: outcome 20 (>V3,*)
719=Decision tree response, version 7: outcome 22 (>V4,<=V5)
720=Decision tree response, version 7: outcome 23 (>V4,*)
721=Decision tree response, version 7: outcome 24 (>V4,*)
722=Decision tree response, version 7: outcome 25 (>V5,<=V6)
723=Decision tree response, version 7: outcome 26 (>V5,*)
724=Decision tree response, version 7: outcome 27 (>V5,*)
725=Decision tree response, version 7: outcome 28 (>V6,<=V7)
726=Decision tree response, version 7: outcome 29 (>V6,*)
727=Decision tree response, version 7: outcome 30 (>V6,*)
728=Decision tree response, version 7: outcome 31 (>V7,*)

801=Decision tree response, version 8: outcome 1 (*,<=V1)
802=Decision tree response, version 8: outcome 2 (*,<=V4)
803=Decision tree response, version 8: outcome 3 (*,<=V4)
804=Decision tree response, version 8: outcome 4 (>V1,<=V2)
805=Decision tree response, version 8: outcome 5 (>V1,<=V4)
806=Decision tree response, version 8: outcome 6 (>V1,<=V4)
807=Decision tree response, version 8: outcome 7 (>V2,<=V3)
808=Decision tree response, version 8: outcome 8 (>V2,<=V4)
809=Decision tree response, version 8: outcome 9 (>V2,<=V4)
810=Decision tree response, version 8: outcome 10 (>V3,<=V4)
811=Decision tree response, version 8: outcome 11 (*,<=V1)
812=Decision tree response, version 8: outcome 14 (>V1,<=V2)
813=Decision tree response, version 8: outcome 15 (>V1,*)
814=Decision tree response, version 8: outcome 16 (>V1,*)
815=Decision tree response, version 8: outcome 17 (>V2,<=V3)
816=Decision tree response, version 8: outcome 18 (>V2,*)
817=Decision tree response, version 8: outcome 19 (>V2,*)
818=Decision tree response, version 8: outcome 20 (>V3,*)
819=Decision tree response, version 8: outcome 22 (>V4,<=V5)
820=Decision tree response, version 8: outcome 23 (>V4,*)
821=Decision tree response, version 8: outcome 24 (>V4,*)
822=Decision tree response, version 8: outcome 25 (>V5,<=V6)
823=Decision tree response, version 8: outcome 26 (>V5,*)
824=Decision tree response, version 8: outcome 27 (>V5,*)
825=Decision tree response, version 8: outcome 28 (>V6,<=V7)
826=Decision tree response, version 8: outcome 29 (>V6,*)
827=Decision tree response, version 8: outcome 30 (>V6,*)
828=Decision tree response, version 8: outcome 31 (>V7,*)

RANGE CARD RESPONSES FOR POSITIVE NUMBERS

901=Range card response via [F9]: range A. $1 to $100
902=Range card response via [F9]: range B. $101 to $500
903=Range card response via [F9]: range C. $501 to $1,000
904=Range card response via [F9]: range D. $1,001 to $2,500
905=Range card response via [F9]: range E. $2,501 to $5,000
906=Range card response via [F9]: range F. $5,001 to $7,500
907=Range card response via [F9]: range G. $7,501 to $10,000
908=Range card response via [F9]: range H. $10,001 to $25,000
909=Range card response via [F9]: range I. $25,001 to $50,000
910=Range card response via [F9]: range J. $50,001 to $75,000
911=Range card response via [F9]: range K. $75,001 to $100,000
912=Range card response via [F9]: range L. $100,001 to $250,000
913=Range card response via [F9]: range M. $250,001 to $500,000
914=Range card response via [F9]: range N. $500,001 to $1,000,000
915=Range card response via [F9]: range O. $1,000,001 to $5,000,000
916=Range card response via [F9]: range P. $5,000,001 to $10,000,000
917=Range card response via [F9]: range Q. $10,000,001 to $25,000,000
918=Range card response via [F9]: range R. $25,000,001 to $50,000,000
919=Range card response via [F9]: range S. $50,000,001 to $100,000,000
920=Range card response via [F9]: range T. More than $100,000,000

921=Range card response via DKDOL: range A. $1 to $100
922=Range card response via DKDOL: range B. $101 to $500
923=Range card response via DKDOL: range C. $501 to $1,000
924=Range card response via DKDOL: range D. $1,001 to $2,500
925=Range card response via DKDOL: range E. $2,501 to $5,000
926=Range card response via DKDOL: range F. $5,001 to $7,500
927=Range card response via DKDOL: range G. $7,501 to $10,000
928=Range card response via DKDOL: range H. $10,001 to $25,000
929=Range card response via DKDOL: range I. $25,001 to $50,000
930=Range card response via DKDOL: range J. $50,001 to $75,000
931=Range card response via DKDOL: range K. $75,001 to $100,000
932=Range card response via DKDOL: range L. $100,001 to $250,000
933=Range card response via DKDOL: range M. $250,001 to $500,000
934=Range card response via DKDOL: range N. $500,001 to $1,000,000
935=Range card response via DKDOL: range O. $1,000,001 to $5,000,000
936=Range card response via DKDOL: range P. $5,000,001 to $10,000,000
937=Range card response via DKDOL: range Q. $10,000,001 to $25,000,000
938=Range card response via DKDOL: range R. $25,000,001 to $50,000,000
939=Range card response via DKDOL: range S. $50,000,001 to $100,000,000
940=Range card response via DKDOL: range T. More than $100,000,000

RESPONDENT-PROVIDED DOLLAR RANGE FOR POSITIVE NUMBERS

941=Upper and lower bounds given
942=Upper bound given, lower bound missing
943=Lower bound given, upper bound missing

INTERVIEW COMMENT INDICATES THAT RANGES ARE NEGATIVE

DECISION TREE RESPONSES THAT RESULTED IN A BOUND FOR NEGATIVE NUMBERS
(NOTE: for decision tree codes, responses that resulted in no usable bounding information are collected separately below)

151=Decision tree response, version 1: outcome 1 (negative value)
152=Decision tree response, version 1: outcome 2 (negative value)
153=Decision tree response, version 1: outcome 3 (negative value)
154=Decision tree response, version 1: outcome 4 (negative value)
155=Decision tree response, version 1: outcome 5 (negative value)
156=Decision tree response, version 1: outcome 6 (negative value)
157=Decision tree response, version 1: outcome 7 (negative value)
158=Decision tree response, version 1: outcome 8 (negative value)
159=Decision tree response, version 1: outcome 9 (negative value)
160=Decision tree response, version 1: outcome 10 (negative value)
161=Decision tree response, version 1: outcome 11 (negative value)
162=Decision tree response, version 1: outcome 14 (negative value)
163=Decision tree response, version 1: outcome 15 (negative value)
164=Decision tree response, version 1: outcome 16 (negative value)
165=Decision tree response, version 1: outcome 17 (negative value)
166=Decision tree response, version 1: outcome 18 (negative value)
167=Decision tree response, version 1: outcome 19 (negative value)
168=Decision tree response, version 1: outcome 20 (negative value)
169=Decision tree response, version 1: outcome 22 (negative value)
170=Decision tree response, version 1: outcome 23 (negative value)
171=Decision
tree response, version 1: outcome 24 (negative value) 172=Decision tree response, version 1: outcome 25 (negative value) 173=Decision tree response, version 1: outcome 26 (negative value) 174=Decision tree response, version 1: outcome 27 (negative value) 175=Decision tree response, version 1: outcome 28 (negative value) 176=Decision tree response, version 1: outcome 29 (negative value) 177=Decision tree response, version 1: outcome 30 (negative value) 178=Decision tree response, version 1: outcome 31 (negative value) 251=Decision tree response, version 2: outcome 1 (negative value) 252=Decision tree response, version 2: outcome 2 (negative value) 253=Decision tree response, version 2: outcome 3 (negative value) 254=Decision tree response, version 2: outcome 4 (negative value) 255=Decision tree response, version 2: outcome 5 (negative value) 256=Decision tree response, version 2: outcome 6 (negative value) 257=Decision tree response, version 2: outcome 7 (negative value) 258=Decision tree response, version 2: outcome 8 (negative value) 259=Decision tree response, version 2: outcome 9 (negative value) 260=Decision tree response, version 2: outcome 10 (negative value) 261=Decision tree response, version 2: outcome 11 (negative value) 262=Decision tree response, version 2: outcome 14 (negative value) 263=Decision tree response, version 2: outcome 15 (negative value) 264=Decision tree response, version 2: outcome 16 (negative value) 265=Decision tree response, version 2: outcome 17 (negative value) 266=Decision tree response, version 2: outcome 18 (negative value) 267=Decision tree response, version 2: outcome 19 (negative value) 268=Decision tree response, version 2: outcome 20 (negative value) 269=Decision tree response, version 2: outcome 22 (negative value) 270=Decision tree response, version 2: outcome 23 (negative value) 271=Decision tree response, version 2: outcome 24 (negative value) 272=Decision tree response, version 2: outcome 25 (negative value) 
273=Decision tree response, version 2: outcome 26 (negative value) 274=Decision tree response, version 2: outcome 27 (negative value) 275=Decision tree response, version 2: outcome 28 (negative value) 276=Decision tree response, version 2: outcome 29 (negative value) 277=Decision tree response, version 2: outcome 30 (negative value) 278=Decision tree response, version 2: outcome 31 (negative value) 351=Decision tree response, version 3: outcome 1 (negative value) 352=Decision tree response, version 3: outcome 2 (negative value) 353=Decision tree response, version 3: outcome 3 (negative value) 354=Decision tree response, version 3: outcome 4 (negative value) 355=Decision tree response, version 3: outcome 5 (negative value) 356=Decision tree response, version 3: outcome 6 (negative value) 357=Decision tree response, version 3: outcome 7 (negative value) 358=Decision tree response, version 3: outcome 8 (negative value) 359=Decision tree response, version 3: outcome 9 (negative value) 360=Decision tree response, version 3: outcome 10 (negative value) 361=Decision tree response, version 3: outcome 11 (negative value) 362=Decision tree response, version 3: outcome 14 (negative value) 363=Decision tree response, version 3: outcome 15 (negative value) 364=Decision tree response, version 3: outcome 16 (negative value) 365=Decision tree response, version 3: outcome 17 (negative value) 366=Decision tree response, version 3: outcome 18 (negative value) 367=Decision tree response, version 3: outcome 19 (negative value) 368=Decision tree response, version 3: outcome 20 (negative value) 369=Decision tree response, version 3: outcome 22 (negative value) 370=Decision tree response, version 3: outcome 23 (negative value) 371=Decision tree response, version 3: outcome 24 (negative value) 372=Decision tree response, version 3: outcome 25 (negative value) 373=Decision tree response, version 3: outcome 26 (negative value) 374=Decision tree response, version 3: outcome 27 (negative 
value) 375=Decision tree response, version 3: outcome 28 (negative value) 376=Decision tree response, version 3: outcome 29 (negative value) 377=Decision tree response, version 3: outcome 30 (negative value) 378=Decision tree response, version 3: outcome 31 (negative value) 451=Decision tree response, version 4: outcome 1 (negative value) 452=Decision tree response, version 4: outcome 2 (negative value) 453=Decision tree response, version 4: outcome 3 (negative value) 454=Decision tree response, version 4: outcome 4 (negative value) 455=Decision tree response, version 4: outcome 5 (negative value) 456=Decision tree response, version 4: outcome 6 (negative value) 457=Decision tree response, version 4: outcome 7 (negative value) 458=Decision tree response, version 4: outcome 8 (negative value) 459=Decision tree response, version 4: outcome 9 (negative value) 460=Decision tree response, version 4: outcome 10 (negative value) 461=Decision tree response, version 4: outcome 11 (negative value) 462=Decision tree response, version 4: outcome 14 (negative value) 463=Decision tree response, version 4: outcome 15 (negative value) 464=Decision tree response, version 4: outcome 16 (negative value) 465=Decision tree response, version 4: outcome 17 (negative value) 466=Decision tree response, version 4: outcome 18 (negative value) 467=Decision tree response, version 4: outcome 19 (negative value) 468=Decision tree response, version 4: outcome 20 (negative value) 469=Decision tree response, version 4: outcome 22 (negative value) 470=Decision tree response, version 4: outcome 23 (negative value) 471=Decision tree response, version 4: outcome 24 (negative value) 472=Decision tree response, version 4: outcome 25 (negative value) 473=Decision tree response, version 4: outcome 26 (negative value) 474=Decision tree response, version 4: outcome 27 (negative value) 475=Decision tree response, version 4: outcome 28 (negative value) 476=Decision tree response, version 4: outcome 29 
(negative value) 477=Decision tree response, version 4: outcome 30 (negative value) 478=Decision tree response, version 4: outcome 31 (negative value) 551=Decision tree response, version 5: outcome 1 (negative value) 552=Decision tree response, version 5: outcome 2 (negative value) 553=Decision tree response, version 5: outcome 3 (negative value) 554=Decision tree response, version 5: outcome 4 (negative value) 555=Decision tree response, version 5: outcome 5 (negative value) 556=Decision tree response, version 5: outcome 6 (negative value) 557=Decision tree response, version 5: outcome 7 (negative value) 558=Decision tree response, version 5: outcome 8 (negative value) 559=Decision tree response, version 5: outcome 9 (negative value) 560=Decision tree response, version 5: outcome 10 (negative value) 561=Decision tree response, version 5: outcome 11 (negative value) 562=Decision tree response, version 5: outcome 14 (negative value) 563=Decision tree response, version 5: outcome 15 (negative value) 564=Decision tree response, version 5: outcome 16 (negative value) 565=Decision tree response, version 5: outcome 17 (negative value) 566=Decision tree response, version 5: outcome 18 (negative value) 567=Decision tree response, version 5: outcome 19 (negative value) 568=Decision tree response, version 5: outcome 20 (negative value) 569=Decision tree response, version 5: outcome 22 (negative value) 570=Decision tree response, version 5: outcome 23 (negative value) 571=Decision tree response, version 5: outcome 24 (negative value) 572=Decision tree response, version 5: outcome 25 (negative value) 573=Decision tree response, version 5: outcome 26 (negative value) 574=Decision tree response, version 5: outcome 27 (negative value) 575=Decision tree response, version 5: outcome 28 (negative value) 576=Decision tree response, version 5: outcome 29 (negative value) 577=Decision tree response, version 5: outcome 30 (negative value) 578=Decision tree response, version 5: outcome 
31 (negative value) 651=Decision tree response, version 6: outcome 1 (negative value) 652=Decision tree response, version 6: outcome 2 (negative value) 653=Decision tree response, version 6: outcome 3 (negative value) 654=Decision tree response, version 6: outcome 4 (negative value) 655=Decision tree response, version 6: outcome 5 (negative value) 656=Decision tree response, version 6: outcome 6 (negative value) 657=Decision tree response, version 6: outcome 7 (negative value) 658=Decision tree response, version 6: outcome 8 (negative value) 659=Decision tree response, version 6: outcome 9 (negative value) 660=Decision tree response, version 6: outcome 10 (negative value) 661=Decision tree response, version 6: outcome 11 (negative value) 662=Decision tree response, version 6: outcome 14 (negative value) 663=Decision tree response, version 6: outcome 15 (negative value) 664=Decision tree response, version 6: outcome 16 (negative value) 665=Decision tree response, version 6: outcome 17 (negative value) 666=Decision tree response, version 6: outcome 18 (negative value) 667=Decision tree response, version 6: outcome 19 (negative value) 668=Decision tree response, version 6: outcome 20 (negative value) 669=Decision tree response, version 6: outcome 22 (negative value) 670=Decision tree response, version 6: outcome 23 (negative value) 671=Decision tree response, version 6: outcome 24 (negative value) 672=Decision tree response, version 6: outcome 25 (negative value) 673=Decision tree response, version 6: outcome 26 (negative value) 674=Decision tree response, version 6: outcome 27 (negative value) 675=Decision tree response, version 6: outcome 28 (negative value) 676=Decision tree response, version 6: outcome 29 (negative value) 677=Decision tree response, version 6: outcome 30 (negative value) 678=Decision tree response, version 6: outcome 31 (negative value) 751=Decision tree response, version 7: outcome 1 (negative value) 752=Decision tree response, version 7: outcome 
2 (negative value) 753=Decision tree response, version 7: outcome 3 (negative value) 754=Decision tree response, version 7: outcome 4 (negative value) 755=Decision tree response, version 7: outcome 5 (negative value) 756=Decision tree response, version 7: outcome 6 (negative value) 757=Decision tree response, version 7: outcome 7 (negative value) 758=Decision tree response, version 7: outcome 8 (negative value) 759=Decision tree response, version 7: outcome 9 (negative value) 760=Decision tree response, version 7: outcome 10 (negative value) 761=Decision tree response, version 7: outcome 11 (negative value) 762=Decision tree response, version 7: outcome 14 (negative value) 763=Decision tree response, version 7: outcome 15 (negative value) 764=Decision tree response, version 7: outcome 16 (negative value) 765=Decision tree response, version 7: outcome 17 (negative value) 766=Decision tree response, version 7: outcome 18 (negative value) 767=Decision tree response, version 7: outcome 19 (negative value) 768=Decision tree response, version 7: outcome 20 (negative value) 769=Decision tree response, version 7: outcome 22 (negative value) 770=Decision tree response, version 7: outcome 23 (negative value) 771=Decision tree response, version 7: outcome 24 (negative value) 772=Decision tree response, version 7: outcome 25 (negative value) 773=Decision tree response, version 7: outcome 26 (negative value) 774=Decision tree response, version 7: outcome 27 (negative value) 775=Decision tree response, version 7: outcome 28 (negative value) 776=Decision tree response, version 7: outcome 29 (negative value) 777=Decision tree response, version 7: outcome 30 (negative value) 778=Decision tree response, version 7: outcome 31 (negative value) 851=Decision tree response, version 8: outcome 1 (negative value) 852=Decision tree response, version 8: outcome 2 (negative value) 853=Decision tree response, version 8: outcome 3 (negative value) 854=Decision tree response, version 8: outcome 
4 (negative value) 855=Decision tree response, version 8: outcome 5 (negative value) 856=Decision tree response, version 8: outcome 6 (negative value) 857=Decision tree response, version 8: outcome 7 (negative value) 858=Decision tree response, version 8: outcome 8 (negative value) 859=Decision tree response, version 8: outcome 9 (negative value) 860=Decision tree response, version 8: outcome 10 (negative value) 861=Decision tree response, version 8: outcome 11 (negative value) 862=Decision tree response, version 8: outcome 14 (negative value) 863=Decision tree response, version 8: outcome 15 (negative value) 864=Decision tree response, version 8: outcome 16 (negative value) 865=Decision tree response, version 8: outcome 17 (negative value) 866=Decision tree response, version 8: outcome 18 (negative value) 867=Decision tree response, version 8: outcome 19 (negative value) 868=Decision tree response, version 8: outcome 20 (negative value) 869=Decision tree response, version 8: outcome 22 (negative value) 870=Decision tree response, version 8: outcome 23 (negative value) 871=Decision tree response, version 8: outcome 24 (negative value) 872=Decision tree response, version 8: outcome 25 (negative value) 873=Decision tree response, version 8: outcome 26 (negative value) 874=Decision tree response, version 8: outcome 27 (negative value) 875=Decision tree response, version 8: outcome 28 (negative value) 876=Decision tree response, version 8: outcome 29 (negative value) 877=Decision tree response, version 8: outcome 30 (negative value) 878=Decision tree response, version 8: outcome 31 (negative value) RANGE CARD RESPONSES FOR NEGATIVE NUMBERS 951=Range card response via [F9]: range A. -$1 to -$100 952=Range card response via [F9]: range B. -$101 to -$500 953=Range card response via [F9]: range C. -$501 to -$1,000 954=Range card response via [F9]: range D. -$1,001 to -$2,500 955=Range card response via [F9]: range E. 
-$2,501 to -$5,000 956=Range card response via [F9]: range F. -$5,001 to -$7,500 957=Range card response via [F9]: range G. -$7,501 to -$10,000 958=Range card response via [F9]: range H. -$10,001 to -$25,000 959=Range card response via [F9]: range I. -$25,001 to -$50,000 960=Range card response via [F9]: range J. -$50,001 to -$75,000 961=Range card response via [F9]: range K. -$75,001 to -$100,000 962=Range card response via [F9]: range L. -$100,001 to -$250,000 963=Range card response via [F9]: range M. -$250,001 to -$500,000 964=Range card response via [F9]: range N. -$500,001 to -$1,000,000 965=Range card response via [F9]: range O. -$1,000,001 to -$5,000,000 966=Range card response via [F9]: range P. -$5,000,001 to -$10,000,000 967=Range card response via [F9]: range Q. -$10,000,001 to -$25,000,000 968=Range card response via [F9]: range R. -$25,000,001 to -$50,000,000 969=Range card response via [F9]: range S. -$50,000,001 to -$100,000,000 970=Range card response via [F9]: range T. Less than -$100,000,000 971=Range card response via DKDOL: range A. -$1 to -$100 972=Range card response via DKDOL: range B. -$101 to -$500 973=Range card response via DKDOL: range C. -$501 to -$1,000 974=Range card response via DKDOL: range D. -$1,001 to -$2,500 975=Range card response via DKDOL: range E. -$2,501 to -$5,000 976=Range card response via DKDOL: range F. -$5,001 to -$7,500 977=Range card response via DKDOL: range G. -$7,501 to -$10,000 978=Range card response via DKDOL: range H. -$10,001 to -$25,000 979=Range card response via DKDOL: range I. -$25,001 to -$50,000 980=Range card response via DKDOL: range J. -$50,001 to -$75,000 981=Range card response via DKDOL: range K. -$75,001 to -$100,000 982=Range card response via DKDOL: range L. -$100,001 to -$250,000 983=Range card response via DKDOL: range M. -$250,001 to -$500,000 984=Range card response via DKDOL: range N. -$500,001 to -$1,000,000 985=Range card response via DKDOL: range O. 
-$1,000,001 to -$5,000,000 986=Range card response via DKDOL: range P. -$5,000,001 to -$10,000,000 987=Range card response via DKDOL: range Q. -$10,000,001 to -$25,000,000 988=Range card response via DKDOL: range R. -$25,000,001 to -$50,000,000 989=Range card response via DKDOL: range S. -$50,000,001 to -$100,000,000 990=Range card response via DKDOL: range T. Less than -$100,000,000 RESPONDENT-PROVIDED DOLLAR RANGE FOR NEGATIVE NUMBERS 991=Upper and lower bounds given (negative amount) 992=Upper bound given, lower bound missing (negative amount) 993=Lower bound given, upper bound missing (negative amount) OTHER RANGE RESPONSES THAT YIELDED NO NUMERICAL BOUNDING INFORMATION: ALL VARIABLES WITH J-CODE VALUES BELOW THIS POINT INITIALLY CONTAIN MISSING VALUE CODES AND ALL VARIABLES WITH J-CODE VALUES ABOVE THIS POINT INITIALLY CONTAIN A RANGE MID-POINT OR OTHER SUCH VALUE INTERVIEWER COMMENT INDICATING NEGATIVE NUMBER 994=Decision tree response, any version: outcome 21 (negative amount) 995=Decision tree response, any version: outcome 12 (negative amount) 996=Decision tree response, any version: outcome 13 (negative amount) 997=R reached range card field by agreeing to give a range at a dollar field, volunteered to give a letter from the range card, and subsequently responded DK/Refuse letter from the range card (negative amount) 998=R answered DK/Refused to a dollar question, volunteered to give a letter from the range card, and subsequently responded DK/Refuse letter from the range card (negative amount) 999=R reached a field allowing both an upper bound and a lower bound for a dollar amount by volunteering to give a range, but subsequently responded DK/Ref to both upper and lower bound (negative amount) 1000=R answered DK to main $ question, and refused following question requesting a range from the range card (negative amount) 1001=R answered Ref to main $ question, and refused following question requesting a range from the range card (negative amount) NO 
INDICATION OF NEGATIVE NUMBER 1094=Decision tree response, any version: outcome 21 1095=Decision tree response, any version: outcome 12 1096=Decision tree response, any version: outcome 13 1097=R reached range card field by agreeing to give a range at a dollar field, volunteered to give a letter from the range card, and subsequently responded DK/Refuse letter from the range card 1098=R answered DK/Refused to a dollar question, volunteered to give a letter from the range card, and subsequently responded DK/Refuse letter from the range card 1099=R reached a field allowing both an upper bound and a lower bound for a dollar amount by volunteering to give a range, but subsequently responded DK/Ref to both upper and lower bound 1100=R answered DK to main $ question, and refused following question requesting a range from the range card 1101=R answered Ref to main $ question, and refused following question requesting a range from the range card OTHER CODES FOR MISSING DATA 2050 = Original response was DK. 2052 = Original response missing as a result of missing information for a higher-order question (typically a YES/NO cut question). In this case, the higher-order question has been imputed in such a way as to render the response appropriate. Also includes some other miscellaneous cases: (1) if a dollar variable was missing and DKDOL returned a DK/REF, the corresponding frequency is given a missing value code equal to that of the dollar field; (2) similarly, for clusters of variables containing a dollar amount and percent options. 2053 = Original response was refused 2054 = Original response was "some, DK how many" (see B6). 2056 = Missing value determined from verbatim response by NORC coders. 2060 = Unresolved data problem (none should remain in final dataset). 2079 = Data missing because of questionnaire error, or data not collected. 2080 = Recode variable, missing because data not collected for sub-group, data to be imputed. 
2081 = Recode variable, some, but not all components originally missing. 2082 = Recode variable, all components originally missing. 2097 = Override of reported information with (at least partially) imputed data 2098 = Override of reported/inap./other information with a missing value. 2099 = Used for absent spouse for J104 or J105 when X104 or X105 < 0. 3000 = Data missing because R broke off the interview (each of these cases reviewed to be sure that sufficient information is reported that the case can count as a "partial accepted as complete") 3001 = Program, reporting or recording error. 3002 = Temporary value given to variables containing illegal values. These will all be resolved in editing and converted to other existing codes. (includes "range U") 3003 = Illegal zeroes 3004 = Uninformative/irrelevant verbatim response 3005 = Data not available (applies to data from HEF) 3500 = Data set to missing and imputed for disclosure avoidance General instructions for J variable coding for recoded variables: When a recoded variable is taken directly from another single X-variable, it should have the same J-variable code. When a recoded variable may come from a single variable in the original X-variables, or as the result of a calculation based on some number of X-variables, it is important to distinguish the information content in the J-variables. As noted above, when the value is taken directly, the J-variable should have exactly the same value as that for the X-variable's shadow J-variable. However, when some calculation is involved, this should be reflected in the J-variable -- codes 8, 2081, and 2082. When a recode cannot be computed because some part of the underlying information was not collected for some subset of cases, the recode's J-variable should be coded 9 or 2080. 
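The range-card and decision-tree codes listed above follow a regular numeric pattern, so they can be decoded arithmetically rather than by lookup. The following is a minimal Python sketch of that idea, not part of the SCF programs. The function names are hypothetical, and the assumption that the positive decision-tree codes for versions 1 through 7 follow the same layout shown for version 8 should be checked against the full code list.

```python
# Hypothetical helpers; the code ranges are taken from the J-code list above.
LETTERS = "ABCDEFGHIJKLMNOPQRST"
# Upper dollar bound of range-card letters A..S; range T is open-ended.
UPPER = [100, 500, 1_000, 2_500, 5_000, 7_500, 10_000, 25_000, 50_000, 75_000,
         100_000, 250_000, 500_000, 1_000_000, 5_000_000, 10_000_000,
         25_000_000, 50_000_000, 100_000_000]
# Decision-tree outcomes in code order. Outcomes 12, 13 and 21 give no bounds
# and are coded separately (994-996 and 1094-1096).
OUTCOMES = list(range(1, 12)) + list(range(14, 21)) + list(range(22, 32))


def range_card_bounds(jcode):
    """Return (letter, entry_route, low, high), or None if not a range-card code."""
    if 901 <= jcode <= 940:
        offset, negative = jcode - 901, False
    elif 951 <= jcode <= 990:
        offset, negative = jcode - 951, True
    else:
        return None
    route = "F9" if offset < 20 else "DKDOL"
    idx = offset % 20
    low = 1 if idx == 0 else UPPER[idx - 1] + 1
    high = UPPER[idx] if idx < len(UPPER) else None  # None means unbounded
    if negative:
        # Mirror the magnitudes so that low <= high numerically.
        low, high = (None if high is None else -high), -low
    return LETTERS[idx], route, low, high


def decision_tree_outcome(jcode):
    """Return (version, outcome, negative), or None if not a bounded tree code."""
    version, rest = divmod(jcode, 100)
    if not 1 <= version <= 8:
        return None
    if 1 <= rest <= 28:       # assumed layout for positive amounts
        return version, OUTCOMES[rest - 1], False
    if 51 <= rest <= 78:      # negative amounts
        return version, OUTCOMES[rest - 51], True
    return None
```

For example, `range_card_bounds(926)` yields range F reached via DKDOL with bounds 5001 and 7500, and `decision_tree_outcome(162)` yields version 1, outcome 14, negative.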
*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;
*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
                              ANALYSIS WEIGHTS
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------

Because the SCF sample is not an equal-probability design, weights play a critical role in interpreting the survey data. The main dataset contains the final nonresponse-adjusted sampling weights. These weights are intended to compensate for unequal probabilities of selection in the original design and for unit nonresponse (failure to obtain an interview).

The weight (X42001) is a partially design-based weight constructed at the Federal Reserve using original selection probabilities and frame information along with aggregate control totals estimated from the Current Population Survey. This weight is a relatively minor revision of the consistent weight series (X42000) maintained for the SCFs beginning with 1989 (for a detailed discussion of these weights, see "Consistent Weight Design for the 1989, 1992, and 1995 SCFs and the Distribution of Wealth," by Arthur B. Kennickell and R. Louise Woodburn, Review of Income and Wealth, Series 45, Number 2, June 1999, pp. 193-215, or the longer version given on the SCF web site at http://www.federalreserve.gov/pubs/oss/oss2/method.html). The nature of the revisions to the consistent weights is described in "Revisions to the SCF Weighting Methodology: Accounting for Race/Ethnicity and Homeownership" (Arthur B. Kennickell, January 1999, http://www.federalreserve.gov/pubs/oss/oss2/method.html).
A version of the revised weight has been computed for all the surveys beginning with 1989, and this variable has been added to the public versions of the SCF datasets.

Users should be aware that the population defined by the weights for *each implicate* (see above) is 102.5 million households: the sum of the weights over all sample cases and imputation replicates is equal to five times the number of households in the sample universe.

Although the weights should produce reliable results at the level of broad aggregates (e.g., net worth and income), it is important to note that many of the variables collected in the SCF are highly skewed in their distributions and that many such variables will apply to only a relatively small fraction of the sample; thus, estimates of characteristics of such variables may be distorted by outliers. In the SCF group at the Federal Reserve, we routinely review our calculations for the presence of overly influential outliers, and robust techniques are applied when appropriate. We encourage other users to exercise similar care in analyzing the data.

The issue of weighting in regressions has long been controversial. Users of the SCF may find two references particularly useful:

(1) Analysis of Complex Surveys, C.J. Skinner, D. Holt, and T.M.F. Smith (editors), John Wiley and Sons, 1989 (see particularly pages 8-10, 154-157, and 286-287).

(2) The Analysis of Household Surveys: A Microeconometric Approach to Development Policy, Angus Deaton, Johns Hopkins University Press, 1997 (see particularly pages 67-73).

At the least, users should think carefully about the effects of weights in their particular models. Weighted estimates may be dramatically less efficient than unweighted estimates. If one is interested in estimating descriptions of the population--rather than "structural models"--there are some clear justifications for weighting.
If weights make a substantial difference in regression estimates, analysts may want to consider the possibility that their models omit some key structure that could be controlled for in a way other than weighting.

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
                               SAMPLING ERROR
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------

The SCF is a scientific instrument designed for the measurement of behavior. However, even under ideal operational conditions, the measurements of the survey are limited in a fundamental way by the fact that it is based on a sample of respondents rather than the entire population. Variability of estimates based on sample data can be estimated. Because we cannot release even the most basic sample-design information about cases in the dataset, users cannot compute reasonable estimates of the sampling variances of their estimates on their own. To facilitate such estimation, we have included a file of replicate weights and multiplicity factors corresponding to X42001. Using detailed information about the original sample design, we selected 999 sample replicates from the final set of completed cases in a way intended to capture the important dimensions of sample variation (for details see "Weighting design for the 1992 Survey of Consumer Finances," Arthur Kennickell, Douglas McManus and Louise Woodburn, December 1996, http://www.federalreserve.gov/pubs/oss/oss2/method.html). For each survey case and each replicate, the file contains a weight (WT1B1-WT1B999) and the number of times the case was selected in the replicate (MM1-MM999). We computed weights for each replicate using exactly the same procedures we used for the main weights. Replicate weights were computed only for the first implicate of each case.
For many purposes, users will probably want to multiply the weight by the multiplicity: in all cases the sum of the weights times the corresponding multiplicities of the cases equals the total number of households. To estimate the sampling variance of the mean of family income, for example, a user would estimate the mean 999 times using the replicate weights and compute the variance of the 999 resulting estimates; its square root is the standard error due to sampling. An estimate of the total standard error attributable to imputation and sampling is given by SQRT((6/5)*imputation variance + sampling variance).

A simple SAS program to compute the standard error due to sampling and imputation for the mean and median of a given variable is provided below. This program may be adapted easily for other types of calculations. For example, to compute the standard error of a proportion, create a zero/one dummy variable to indicate the presence of the item; the standard error of the mean will be the correct standard error of the proportion. To conserve memory, the program computes sampling error using blocks of 100 replicate weights rather than the full set at once. Users with large amounts of RAM may wish to increase the size of these blocks, and those with smaller amounts may wish to decrease the size.
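The combination rule in the preceding paragraphs can be illustrated with a few lines of Python. This is only a sketch under stated assumptions, separate from the SAS program. The function names are hypothetical and the numbers are toy values, not SCF data. It assumes the per-implicate and per-replicate estimates have already been computed, with each replicate weight multiplied by its multiplicity.

```python
import math
from statistics import mean, variance


def weighted_mean(values, weights):
    """Weighted mean of `values`; `weights` must not sum to zero."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def combined_standard_error(implicate_estimates, replicate_estimates):
    """SQRT(((m+1)/m) * imputation variance + sampling variance)."""
    m = len(implicate_estimates)                    # 5 implicates in the SCF
    imputation_var = variance(implicate_estimates)  # divisor m - 1
    sampling_var = variance(replicate_estimates)    # divisor n - 1
    return math.sqrt((m + 1) / m * imputation_var + sampling_var)


# Toy numbers, purely illustrative (not SCF data).
implicate_means = [10.0, 12.0, 11.0, 13.0, 14.0]
replicate_means = [11.0, 12.0, 13.0]    # the real file has 999 replicates

# The point estimate is the average of the five implicate estimates.
point_estimate = mean(implicate_means)
total_se = combined_standard_error(implicate_means, replicate_means)

# One replicate estimate: replicate weight times multiplicity is the weight.
rep_weights = [w * mult for w, mult in zip([1.5, 2.0], [2, 0])]
one_replicate_mean = weighted_mean([20.0, 40.0], rep_weights)
```

With five implicates the factor `(m + 1) / m` equals the 6/5 in the formula above. A case with multiplicity zero drops out of that replicate entirely.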
*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;
*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*;
* MACRO MEANIT;
* AK May 22, 1998 version;
* DSN specifies the name of the dataset to be used (the dataset should
  contain the following: the main weight renamed as WGT0, a set of
  variables WGT1-WGT999 equal to the replicate weights multiplied by the
  multiplicity factors, a variable for which one wishes to compute the
  standard error due to imputation and sampling for the mean and median,
  and a variable IMPLIC equal to the implicate number of each case)
  VAR contains the name of the variable for which one desires standard
  errors
  PFLAG: blank prints interim statistics/any character string (e.g., NO)
  suppresses printing
  WHERE: defines subsets of data (use IML conventions, e.g.,
  ((X333=3 | X444=4) & X555=5 & X666^=6));
* NOTE: the calculation excludes observations with missing values.
  Thus, if one wants to make the calculation for only non-INAP values,
  a convenient shortcut might be to set all INAP values (normally zero
  in the main SCF database) to a missing value (a WHERE condition would
  also work). The program assumes that missing value patterns are
  consistent across implicates--if this is not the case, a WHERE
  condition should be used;
* WARNING: if one uses this MACRO to compute variances for very small
  sub-populations, there is a chance that some of the replicates may
  contain no cases where the condition defining the sub-population holds.
In this case, the program will return a fatal error; %MACRO MEANIT(DSN=,VAR=NW2,PFLAG=,WHERE=); PROC SORT DATA=&DSN; BY &VAR; RUN; * compute pooled (over implicates) global mean/median; PROC UNIVARIATE DATA=&DSN; %IF (&WHERE NE ) %THEN %DO; WHERE (&WHERE & &VAR>.Z); %END; %ELSE %DO; WHERE (&VAR>.Z); %END; FREQ WGT0; VAR &VAR; RUN; PROC IML WORKSPACE=9000 SYMSIZE=5000; RESET LOG LINESIZE=78; PRINT "CALCULATION FOR &VAR"; * first imputation variance; EDIT &DSN; TEMP={IMPLIC &VAR WGT0}; %IF (&WHERE EQ ) %THEN %DO; READ ALL VAR TEMP WHERE (&VAR>.Z) INTO MDATA; %END; %ELSE %DO; READ ALL VAR TEMP WHERE (&WHERE) INTO MDATA; %END; * total population; %IF (&WHERE EQ ) %THEN %DO; POP=SUM(MDATA[,3])/5; %END; * create matrix to hold values of means/medians by implicates; IM=SHAPE(0,1,5); ID=SHAPE(0,1,5); * compute mean/median; DO I=1 TO 5; IMP=MDATA[LOC(MDATA[,1]=I),2:3]; * compute mean; MM=IMP[,1]#IMP[,2]; %IF (&WHERE NE ) %THEN %DO; POP=SUM(IMP[,2]); %END; IM[1,I]=MM[+,]/POP; * compute median; IMP[,2]=CUSUM(IMP[,2])/POP; ID[1,I]=IMP[MIN(LOC(IMP[,2]>=.5)),1]; FREE IMP MM; END; IMEAN=IM[,+]/5; IMEDIAN=ID[,+]/5; PRINT "MEAN OVER IMPLICATES " IMEAN; PRINT "MEDIAN OVER IMPLICATES " IMEDIAN; FREE MDATA IMEAN IMEDIAN; %IF (&PFLAG EQ ) %THEN %DO; PRINT IM ID; %END; * next sampling variance; * create matrix to hold values of means/medians by replicates; RM=SHAPE(0,1,999); RD=SHAPE(0,1,999); %DO I=1 %TO 10; %IF (&PFLAG EQ ) %THEN %PUT CLUMP NUMBER &I; %IF (&I EQ 1) %THEN %DO; %LET TOP=99; %LET BOT=1; %LET LEN=100; %END; %ELSE %DO; %LET BOT=%EVAL(&TOP+1); %LET TOP=%EVAL(&TOP+100); %LET LEN=101; %END; %LET WSTR=%STR(); %DO J=&BOT %TO &TOP; %LET WSTR=&WSTR WGT&J; %END; EDIT &DSN; TEMP={&VAR &WSTR}; %IF (&WHERE EQ ) %THEN %DO; READ ALL VAR TEMP WHERE (IMPLIC=1 & &VAR>.Z) INTO MDATA; %END; %ELSE %DO; READ ALL VAR TEMP WHERE (IMPLIC=1 & &WHERE) INTO MDATA; %END; * compute means; MEAN=MDATA[,2:&LEN]#MDATA[,1]; %IF (&WHERE NE ) %THEN %DO; POP=MDATA[+,2:&LEN]; 
RM[,&BOT:&TOP]=MEAN[+,]/POP[,1:&LEN-1]; %END; %ELSE %DO; RM[,&BOT:&TOP]=MEAN[+,]/POP; %END; * compute medians; DO I=2 TO &LEN; %IF (&WHERE NE ) %THEN %DO; MDATA[,I]=CUSUM(MDATA[,I])/POP[I-1]; %END; %ELSE %DO; MDATA[,I]=CUSUM(MDATA[,I])/POP; %END; RD[&BOT+I-2]=MDATA[MIN(LOC(MDATA[,I]>=.5)),1]; END; FREE MDATA; %END; %IF (&PFLAG EQ ) %THEN %DO; PRINT RM RD; %END; * finally, compute standard error wrt imputation/sampling; * (X-X-bar)**2/(n-1); IVM=(IM-IM[,+]/5)##2; IVM=IVM[,+]/4; IVD=(ID-ID[,+]/5)##2; IVD=IVD[,+]/4; RVM=(RM-RM[,+]/999)##2; RVM=RVM[,+]/998; RVD=(RD-RD[,+]/999)##2; RVD=RVD[,+]/998; * SQRT((((ni+1)/ni))*(SIGMAI**2) + SIGMAR**2); TVM=SQRT((6/5)*IVM+RVM); TVD=SQRT((6/5)*IVD+RVD); IVM=SQRT(IVM); IVD=SQRT(IVD); RVM=SQRT(RVM); RVD=SQRT(RVD); PRINT "STD DEV IMPUTATION: MEAN: " IVM " MEDIAN: " IVD; PRINT "STD DEV SAMPLING: MEAN: " RVM " MEDIAN: " RVD; PRINT "COMBINED STD DEV: MEAN: " TVM " MEDIAN: " TVD; QUIT; %MEND MEANIT; * create dataset from main dataset and replicate weight file; DATA DAT(KEEP=NW IMPLIC WGT0-WGT999); MERGE xxx.main_ds(KEEP=Y1 X42001 ...) 
xxx.rep_wgts(KEEP=Y1 MM1-MM999 WT1B1-WT1B999); BY Y1; * multiply replicate weights by the multiplicity; ARRAY MULT {*} MM1-MM999; ARRAY RWGT {*} WT1B1-WT1B999; ARRAY WGTS {*} WGT1-WGT999; DO I=1 TO DIM(MULT); * take max of multiplicity/weight: where cases not selected for a replicate, there are missing values in these variables; WGTS{I}=MAX(0,MULT{I})*MAX(0,RWGT{I}); END; WGT0=X42001; * define implicate number of case; IMPLIC=Y1-10*YY1; * define net worth (for example); NW=.......; RUN; * run the macro; %MEANIT(DSN=DAT,VAR=NW); *++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*; *++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*; ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- DISCLOSURE REVIEW ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- A paramount goal of the survey is to protect the privacy of the participants, who generously shared their personal information. In light of this goal, the data in this release have been systematically altered by several means to minimize the possibility of identifying any survey respondent. For some discrete variables, small or unusual cells were collapsed as noted in the individual variable descriptions below. Continuous variables were rounded. Data were also blurred by other unspecified means. In addition, a number of other cases were identified for more extensive treatment. Some of these cases were selected on the basis of extreme or unusual data values; other cases were selected at random. For each of these cases, a selection of critical variables was set to missing and statistically imputed subject to constraints designed to ensure that any distortions induced in key population statistics would be minimal. 
The geographic identifiers were also systematically altered for a subset of respondents using imputation methods conditioning on key financial characteristics. Where relevant, the codebook provides more detailed information on cell collapsing and other techniques. By design, the SCF sample excludes people who are included in the Forbes Magazine list of the 400 wealthiest people in the U.S. (see references in "SAMPLE DESIGN" above). However, there are several reasons why respondents with wealth at this level could appear in the sample anyway. In the 1998 survey, there were four observations that had net worth greater than the minimum level needed to qualify for the Forbes list. Because it would be very difficult to sufficiently obscure the identity of such people without rendering their data virtually useless, it was decided to remove them from the public version of the dataset. Thus, the public version of the dataset contains 4,305 of the 4,309 observations in the full dataset. It is important to note that aside from the cell collapsing, there is no key in this codebook or in the dataset that would allow users to identify with certainty either which data items have been smoothed or otherwise altered, or which cases were selected for imputation of critical values (that is, the shadow variables in this dataset may not always reflect the true original status of every variable). Although this blurring of the data will have some effect on analysis, that effect should be negligible in most cases. For further details on the procedures taken to protect the identity of respondents, see "Analyzing the Disclosure Review Procedures for the 1995 Survey of Consumer Finances" (Gerhard Fries, Barry W. Johnson, and R. Louise Woodburn, September 1997, http://www.federalreserve.gov/pubs/oss/oss2/method.html) and "Multiple Imputation and Disclosure Protection: The Case of the 1995 SCF" (Arthur B. Kennickell, November 1997, http://www.federalreserve.gov/pubs/oss/oss2/method.html). 
The disclosure protections applied to the data are the product of an agreement between the Federal Reserve Board, NORC, and SOI. Users who feel that the restrictions imposed on the public dataset are too constricting are encouraged to submit written proposals for expanded data release, and those requests will be given serious consideration in the release of data from future surveys. Note that dollar variables, except X3918 X3920, in the public dataset have been rounded according to the following scheme which preserves the population mean on average: * All dollar variables except wages; ARRAY AMT {*} X412 X413 X414 X420 X421 X423 X424 X426 X427 X429 X430 X7575 X505 X510 X513 X518 X521 X526 X602 X604 X607 X612 X614 X617 X619 X623 X627 X631 X635 X703 X708 X716 X717 X721 X804 X805 X808 X813 X812 X904 X905 X908 X913 X912 X1004 X1005 X1008 X1013 X1012 X1035 X1039 X1040 X1044 X7141 X1108 X1109 X1104 X7142 X1119 X1120 X1115 X7143 X1130 X1131 X1126 X1136 X8401 X1202 X1206 X1210 X1211 X1215 X1219 X1220 X1224 X1405 X1408 X1409 X1410 X1415 X1417 X1505 X1508 X1509 X1510 X1515 X1517 X1605 X1608 X1609 X1610 X1615 X1617 X1619 X8402 X1621 X8404 X1706 X1709 X1714 X1715 X1718 X1723 X1722 X1730 X1806 X1809 X1814 X1815 X1818 X1823 X1822 X1830 X1906 X1909 X1914 X1915 X1918 X1923 X1922 X1930 X2002 X2003 X2006 X2007 X2010 X2012 X2013 X2016 X2017 X2020 X8406 X8407 X8410 X8411 X8414 X8416 X8417 X8420 X8421 X8424 X3121 X3124 X3126 X3129 X3130 X3131 X3132 X3221 X3224 X3226 X3229 X3230 X3231 X3232 X3321 X3324 X3326 X3329 X3330 X3331 X3332 X3335 X8425 X3336 X8426 X3337 X8427 X3408 X3409 X3410 X3412 X3413 X3414 X3416 X3417 X3418 X3420 X3421 X3422 X3424 X3425 X3426 X3428 X8452 X3429 X8453 X3430 X8454 X2105 X2112 X2117 X8428 X2209 X2213 X2214 X2218 X2309 X2313 X2314 X2318 X2409 X2413 X2414 X2418 X7158 X7162 X7164 X7169 X2422 X8430 X2424 X8432 X2425 X8433 X2506 X2510 X2514 X2515 X2519 X2606 X2610 X2614 X2615 X2619 X2623 X8435 X2625 X8437 X2626 X8438 X7805 X7815 X7817 X7824 X7828 X7838 X7840 X7847 
X7851 X7861 X7863 X7870 X7905 X7915 X7917 X7924 X7928 X7938 X7940 X7947 X7951 X7961 X7963 X7970 X7179 X8440 X7180 X8441 X2714 X2718 X2719 X2723 X2731 X2735 X2736 X2740 X2814 X2818 X2819 X2823 X2831 X2835 X2836 X2840 X2914 X2918 X2919 X2923 X2931 X2935 X2936 X2940 X7183 X8443 X7184 X8444 X7187 X3506 X3510 X3514 X3518 X3522 X3526 X3529 X8446 X3610 X3620 X3630 X3706 X3711 X3716 X3718 X8447 X3721 X3804 X3807 X3810 X3813 X3816 X3818 X8448 X3822 X3824 X3826 X3828 X3830 X6704 X3833 X3835 X3902 X3906 X7635 X3908 X7636 X3910 X7637 X7633 X7638 X7634 X7639 X6705 X6706 X3915 X3922 X7641 X3930 X3932 X6817 X6820 X6832 X6835 X4003 X4005 X4006 X4010 X4011 X4014 X4016 X4018 X4022 X4026 X4030 X4032 X4204 X4207 X4210 X4214 X4220 X4229 X7211 X4224 X4226 X7697 X4304 X4307 X4310 X4314 X4320 X4329 X7220 X4324 X4326 X7698 X4404 X4407 X4410 X4414 X4420 X4429 X7229 X4424 X4426 X7699 X4436 X8449 X4804 X4807 X4810 X4814 X4820 X4829 X7278 X4824 X4826 X7724 X4904 X4907 X4910 X4914 X4920 X4929 X7287 X4924 X4926 X7725 X5004 X5007 X5010 X5014 X5020 X5029 X7296 X5024 X5026 X7726 X5036 X8450 X5306 X5311 X5318 X5326 X5334 X5418 X5426 X5434 X6804 X8455 X5504 X5507 X5510 X5513 X5516 X6806 X8457 X5604 X5608 X5612 X5616 X5620 X5624 X5628 X5632 X5636 X5640 X5644 X5648 X6807 X8458 X5702 X5704 X5706 X5708 X5710 X5712 X5714 X5716 X5718 X5720 X5722 X5724 X5729 X7362 X5732 X5734 X5751 X7651 X7652 X5804 X5809 X5814 X5818 X8451 X5821 X5823 X5926 X5928 X6650 X6652 X7666 X6403 X6415 X6418 X6421 X6432 X6436 X6437 X6439 X8163 X8164 X8166 X8167 X8168 X8188; DO I = 1 TO DIM(AMT); IF (0 < AMT{I} < 5) THEN AMT{I}=1; ELSE IF (5 <= AMT{I} < 1000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT{I},10); IF (RAN>PROB/10) THEN AMT{I}=10*(INT(AMT{I}/10)); ELSE AMT{I}=10*(1+INT(AMT{I}/10)); IF AMT{I}=0 THEN AMT{I}=5; END; ELSE IF (1000 <= AMT{I} < 10000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT{I},100); IF (RAN>PROB/100) THEN AMT{I}=100*(INT(AMT{I}/100)); ELSE AMT{I}=100*(1+INT(AMT{I}/100)); END; ELSE IF (10000 <= AMT{I} < 
1000000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT{I},1000); IF (RAN>PROB/1000) THEN AMT{I}=1000*(INT(AMT{I}/1000)); ELSE AMT{I}=1000*(1+INT(AMT{I}/1000)); END; ELSE IF (1000000 <= AMT{I}) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT{I},10000); IF (RAN>PROB/10000) THEN AMT{I}=10000*(INT(AMT{I}/10000)); ELSE AMT{I}=10000*(1+INT(AMT{I}/10000)); END; ELSE IF (-1000 <= AMT{I} < - 5) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT{I},10); IF (RAN>PROB/10) THEN AMT{I}=10*(INT(AMT{I}/10)); ELSE AMT{I}=10*(1+INT(AMT{I}/10)); END; ELSE IF (-10000 <= AMT{I} < -1000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT{I},100); IF (RAN>PROB/100) THEN AMT{I}=100*(INT(AMT{I}/100)); ELSE AMT{I}=100*(1+INT(AMT{I}/100)); END; ELSE IF (-1000000 < AMT{I} < -10000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT{I},1000); IF (RAN>PROB/1000) THEN AMT{I}=1000*(INT(AMT{I}/1000)); ELSE AMT{I}=1000*(1+INT(AMT{I}/1000)); END; ELSE IF .Z < AMT{I} <= -1000000 THEN AMT{I}=-1000000; END; * wages: special treatment for hourly wages <=25; ARRAY AMT2 {*} X4112 X4131 X4509 X4520 X4532 X4540 X4605 X4613 X4712 X4731 X5109 X5120 X5132 X5140 X5205 X5213; ARRAY PER2 {*} X4113 X4132 X4510 X4521 X4533 X4541 X4606 X4614 X4713 X4732 X5110 X5121 X5133 X5141 X5206 X5214; DO I=1 TO DIM(AMT2); IF PER2{I}=18 THEN DO; IF (AMT2{I} < 25 AND AMT2{I} > 0) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},.1); IF (RAN>PROB/.1) THEN AMT2{I}=.1*(INT(AMT2{I}/.1)); ELSE AMT2{I}=.1*(1+INT(AMT2{I}/.1)); END; ELSE IF (25 <= AMT2{I} < 1000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},10); IF (RAN>PROB/10) THEN AMT2{I}=10*(INT(AMT2{I}/10)); ELSE AMT2{I}=10*(1+INT(AMT2{I}/10)); END; ELSE IF (1000 <= AMT2{I} < 10000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},100); IF (RAN>PROB/100) THEN AMT2{I}=100*(INT(AMT2{I}/100)); ELSE AMT2{I}=100*(1+INT(AMT2{I}/100)); END; ELSE IF (10000 <= AMT2{I} < 1000000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},1000); IF (RAN>PROB/1000) THEN AMT2{I}=1000*(INT(AMT2{I}/1000)); ELSE 
AMT2{I}=1000*(1+INT(AMT2{I}/1000)); END; ELSE IF (1000000 <= AMT2{I}) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},10000); IF (RAN>PROB/10000) THEN AMT2{I}=10000*(INT(AMT2{I}/10000)); ELSE AMT2{I}=10000*(1+INT(AMT2{I}/10000)); END; ELSE IF (-1000 <= AMT2{I} < - 5) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},10); IF (RAN>PROB/10) THEN AMT2{I}=10*(INT(AMT2{I}/10)); ELSE AMT2{I}=10*(1+INT(AMT2{I}/10)); END; ELSE IF (-10000 <= AMT2{I} < -1000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},100); IF (RAN>PROB/100) THEN AMT2{I}=100*(INT(AMT2{I}/100)); ELSE AMT2{I}=100*(1+INT(AMT2{I}/100)); END; ELSE IF (-1000000 < AMT2{I} < -10000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},1000); IF (RAN>PROB/1000) THEN AMT2{I}=1000*(INT(AMT2{I}/1000)); ELSE AMT2{I}=1000*(1+INT(AMT2{I}/1000)); END; ELSE IF .Z < AMT2{I} <= -1000000 THEN AMT2{I}=-1000000; END; ELSE DO; IF (0 < AMT2{I} < 5) THEN AMT2{I}=1; ELSE IF (5 <= AMT2{I} < 1000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},10); IF (RAN>PROB/10) THEN AMT2{I}=10*(INT(AMT2{I}/10)); ELSE AMT2{I}=10*(1+INT(AMT2{I}/10)); END; ELSE IF (1000 <= AMT2{I} < 10000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},100); IF (RAN>PROB/100) THEN AMT2{I}=100*(INT(AMT2{I}/100)); ELSE AMT2{I}=100*(1+INT(AMT2{I}/100)); END; ELSE IF (10000 <= AMT2{I} < 1000000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},1000); IF (RAN>PROB/1000) THEN AMT2{I}=1000*(INT(AMT2{I}/1000)); ELSE AMT2{I}=1000*(1+INT(AMT2{I}/1000)); END; ELSE IF (1000000 <= AMT2{I}) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},10000); IF (RAN>PROB/10000) THEN AMT2{I}=10000*(INT(AMT2{I}/10000)); ELSE AMT2{I}=10000*(1+INT(AMT2{I}/10000)); END; ELSE IF (-1000 <= AMT2{I} < - 5) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},10); IF (RAN>PROB/10) THEN AMT2{I}=10*(INT(AMT2{I}/10)); ELSE AMT2{I}=10*(1+INT(AMT2{I}/10)); END; ELSE IF (-10000 <= AMT2{I} < -1000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},100); IF (RAN>PROB/100) THEN AMT2{I}=100*(INT(AMT2{I}/100)); ELSE 
AMT2{I}=100*(1+INT(AMT2{I}/100)); END; ELSE IF (-1000000 < AMT2{I} < -10000) THEN DO; RAN=UNIFORM(5555555); PROB=MOD(AMT2{I},1000); IF (RAN>PROB/1000) THEN AMT2{I}=1000*(INT(AMT2{I}/1000)); ELSE AMT2{I}=1000*(1+INT(AMT2{I}/1000)); END; ELSE IF .Z < AMT2{I} <= -1000000 THEN AMT2{I}=-1000000; END; END; ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- COMPARISON WITH OTHER DATA ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- In general, medians of financial characteristics estimated from the SCF should compare well with medians estimated from other surveys using comparable population definitions. However, estimates of means will often differ, largely for two reasons. First, means of many financial characteristics may not be very robustly estimated in surveys that interview only a relatively small number of wealthy households. The distribution of many financial characteristics (e.g., net worth) is highly skewed, and sparse representation of the upper tail will translate into a noisy estimate of statistics, such as the mean, that are strongly affected by the top of the distribution. Second, there may also be a degree of bias in the measurement of some financial characteristics. Evidence suggests that there is differentially higher nonresponse among wealthy households. Failure to account for such differences in the creation of analysis weights leads to a misrepresentation of the size of the upper tail of wealth and characteristics associated with being in that tail. By using frame data for the list sample, the SCF has the means to identify and make some corrections for such nonresponse. However, this option is not available in most other surveys. 
The SCF may also be compared with aggregate statistics, such as the flow of funds accounts (FFA), which are constructed by the Board of Governors of the Federal Reserve System. An extensive analysis of the differences in these two sources is provided by Rochelle Antoniewicz ("A Comparison of the Household Sector from the Flow of Funds Accounts and the Survey of Consumer Finances," Finance and Economics Discussion Series 1996-26, Board of Governors of the Federal Reserve System, June). As discussed in that paper in detail, there are many conceptual differences between the SCF and the FFA. Three of these differences are particularly noteworthy here. First, the FFA "household sector" includes the holdings of nonprofits, and estimates of those holdings must be made to create a population basis closer to that used in the SCF. Second, the financial concepts used in the FFA often differ from those used in the SCF. Substantial effort is usually required to align the concepts in the two sources, and in some cases there is no clear way of doing so. Third, both the FFA and the SCF provide estimates. Because the two series are developed from independent sources of information with different statistical properties, it would be surprising if they yielded precisely the same totals even if the populations and concepts could be made perfectly consistent. ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- ACKNOWLEDGMENTS ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- The SCF is a large project that involves intense commitment by many people. 
At the Federal Reserve, the main project staff involved with the creation of the data has included Gerhard Fries, Arthur Kennickell, Annelise Li, Amber Lytle, Kevin Moore, Martha Starr-McCluer, Amy Stubbendick, Annika Sunden, and Brian Surette. Important support has come from the FRB officer corps, particularly Edward Ettin and Myron Kwast, who have invested their credibility in making the project possible. In addition to funding the survey, the individual members of the Board of Governors have actively encouraged the development and use of the survey. Support from the Statistics of Income Division at the IRS has been essential. Barry Johnson has been tireless in his work to obtain the necessary data for the selection of the list sample, in his work on the disclosure review, and in sharing the insights he has gained in working with the IRS estate tax data. Dan Skelly, the director of SOI, and James Nunns at the Office of Tax Analysis at the Department of the Treasury have encouraged and supported us through many difficult periods. At the National Opinion Research Center at the University of Chicago, very many people have touched the project in important ways. The project director for the 1998 SCF at NORC was Lisa Thalji, and the Associate Project Director was Mary Hess. Their commitment and organization were critical in the success of the 1998 survey. Geoff Walker tirelessly revised the CAPI software to produce the best possible instrument for the interviewers, and he provided a variety of other types of technical support. The coding of respondents' verbatim responses was done by Pam Melendez, Darius Stroud, and Betty Williams. Pat Smillie was the coordinator for coding, and Denise Ellison was the coding supervisor. Deep thanks are also due to many others at the NORC central office, including Dan Bartels, Val Geralds, Rachael Harter, Nick Holt, Jim Rogers, Dave Riemer, Rebecca Smith, Suzanne Turner, Karen Veldman, Robert Wagers, and Ken Wilmer. 
Phil Depoy, president of NORC, provided essential support and personal encouragement throughout the project. I apologize to the many other people at NORC whose names I cannot remember. One of the greatest strengths of NORC is its very talented field staff. Pat Phillips served as the principal liaison to the field, and Sandy Pitzer and Linda Wiedmer served as the Field Project Managers. The Field Managers included Julie Feldman, Lynn Gallagher, Marty Hargas, Barbara Lawrence, Susan Miller, Nancy Mutz, Carolyn Nicholas, Joseph Pierce, and Barbara Watt. These are all very creative and dedicated people. We all deeply mourn the passing of Madeline Lauer, a devoted NORC Field Manager who had signed up to work on the 1998 survey, but who was diagnosed with a terminal illness just before training began. The interviewers, some of whom may prefer not to be named, were the people who did the hardest work. In 1998, SCF interviewers included many experienced interviewers--some on earlier SCFs--and others from a wide variety of backgrounds. They deserve the deepest gratitude of all users of the SCF data. The only people who gave more than the interviewers were the survey respondents, who are necessarily anonymous. May every user remember that some person gave his or her time in the public interest to create the data that make their analysis possible. No set of acknowledgements would be complete without mentioning three people: Fritz Scheuren, who provided early and continuing encouragement, insights, and support for the SCF project; Robert Avery, my predecessor as director of the SCF, a colleague who not only created the atmosphere that made the current development of the project possible, but who also continues to contribute as a sounding board for our ideas; and Dorothy S. Projector, project director of the Federal Reserve's landmark 1962-63 Survey of Financial Characteristics of Consumers, who set a very high standard for all future work on household wealth surveys. 
----------------------------------------------------------------------------- ----------------------------------------------------------------------------- CONTACT INFORMATION ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- It is likely that some users will have trouble understanding the organization of the data at first. IF AFTER HAVING FRAMED A FOCUSED QUESTION AND EXHAUSTED ALL OF YOUR LOCAL RESOURCES, YOUR PROBLEM PERSISTS, you may call Gerhard Fries ((202) 452-2578 or e-mail Gerhard.Fries@frb.gov) or me ((202) 452-2247 or e-mail Arthur.Kennickell@frb.gov). ****We prefer correspondence via e-mail.**** While we would like to be helpful to you, please realize that we are not set up to provide extensive services to users. We hope that by persistence, you will almost always be able to figure out what you need by consulting the questionnaire and the codeboo