This document reports results from a workshop on "Methodological Advances and the Human Capital Initiative" held 12 July 1996 at the National Science Foundation. The views and comments contained in this document are not necessarily those of the National Science Foundation, but are instead exclusively those of the workshop participants. For further information or additional copies of this report, contact Cheryl L. Eavey, National Science Foundation, at (703) 306-1729, or CEAVEY@NSF.GOV.
The Human Capital Initiative offers researchers a unique opportunity to
address important substantive issues, ones that often require new ways of
conceptualizing or combining data, new or modified methods of analysis
or new formal models for relating the constructs of interest to observable
data. The HCI thus provides researchers a vehicle for advancing the
measurement, methodological, and statistical components of their
disciplines. Such advances, made in the context of one or more
disciplines, open up issues not previously amenable to empirical or
theoretical analysis.
The Methodology, Measurement, and Statistics (MMS) Program of the
National Science Foundation invites proposals that embed advances in
methodology, data analysis, and/or formal modeling within the context of
well-justified substantive research issues, as well as more generally.
MMS recognizes that methodological developments relevant for human
capital issues may require substantial background, both substantively and
methodologically. MMS thus encourages collaborations across the social,
behavioral, economic, and statistical sciences. Proposals for conferences
and/or workshops on methodological topics appropriate for addressing HCI
issues also are welcome.
In order to stimulate discussion of methodological needs in human capital
research and to identify potential areas of research, the Methodology,
Measurement, and Statistics Program convened a workshop on 12 July
1996 to address the topic "Methodological Advances and the Human
Capital Initiative." Workshop discussions were informed by and built
upon the agenda for HCI research as described in the various NSF
brochures and announcements. The document, Investing in Human
Resources: A Strategic Plan for the Human Capital Initiative, outlines a
strategy for HCI research "designed to increase understanding of the
nature and causes of existing problems and to evaluate the effectiveness of
policies aimed at improving the human resources of America's citizens"
(NSF, 1994). In addition to formulating research agendas for HCI's six
substantive areas, the report also briefly suggests data and methodological
needs. Data needs identified include the extension of longitudinal data
sets, the collection of data from multiple sources, and embedded studies
that merge alternative forms of empirical analysis. Methodological needs
identified include expanded methodologies for dynamic modeling and
models that link micro-level behavior of individuals with macro-level
institutions and environments.
MMS workshop participants agreed with the focus on understanding causal
relations and on the importance of the data and methodological needs
identified in the strategic plan. Indeed, the centrality of longitudinal data
for addressing many important human capital questions led to long
discussions of design and analysis issues relevant for studies based on
panel data. Ultimately, discussions converged on the following topics:
1) Design and Analysis of Longitudinal Data
Examples of specific methodological questions related to each of these
topics are given below. These topics are intended to be illustrative of
some important areas of methodological and/or statistical research relevant
to the MMS Program and consistent with the goals and strategic plan of
the Human Capital Initiative.
Most longitudinal studies approach design issues that follow initial sample
selection on an ad hoc basis; thus, we have limited cumulative knowledge
and systematic study to bring to bear on new investigations. How and
when should attrition be modelled? Should data always be gathered at
equally-spaced time intervals? As data are gathered over time, the units of
measurement change in a dynamic fashion. Designs for continued data
collection inevitably require choices that have serious implications for
analysis and inference. MMS welcomes proposals addressing questions of
design in longitudinal studies.
Beyond questions of design, the analysis of longitudinal data invariably
leads to specific methodological problems. Advances on the topics
identified below would enhance the value of longitudinal data for
addressing complex human capital questions.
Nonparametric methods. Large longitudinal data sets, some containing
hundreds of thousands of person-years in observations, often are useful for
addressing issues related to HCI. For example, major datasets on labor
issues, such as the National Longitudinal Survey (NLS) and the Panel
Study of Income Dynamics (PSID), can be used to study the number and
duration of poverty spells for individuals with different levels of
education. Motivated by the sensitivity of results to specific functional
form assumptions, recent research has developed less restrictive
procedures for use in large samples. These include classical smoothing
procedures, neural networks, and Bayesian hierarchical models. MMS
welcomes research that further refines and applies such nonparametric
methods to human capital issues.
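As one illustration of the classical smoothing procedures mentioned above, the following minimal sketch implements a Nadaraya-Watson kernel regression estimator. It is offered as an assumption-laden example, not as a method prescribed by the workshop: the function names, the Gaussian kernel, the bandwidth, and the toy numbers are hypothetical and are not drawn from the NLS or the PSID.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_smooth(x_obs, y_obs, x_eval, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x_eval].

    The estimate is a weighted average of the observed y values, with
    weights that decline as observations fall farther from x_eval. No
    functional form (linear, quadratic, ...) is imposed on the relation.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if len(x_obs) != len(y_obs) or not x_obs:
        raise ValueError("x_obs and y_obs must be non-empty and equal in length")
    weights = [gaussian_kernel((x_eval - x) / bandwidth) for x in x_obs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observations close enough to x_eval for this bandwidth")
    return sum(w * y for w, y in zip(weights, y_obs)) / total


if __name__ == "__main__":
    # Hypothetical toy data: years of education and months spent in poverty.
    education_years = [8.0, 10.0, 12.0, 12.0, 14.0, 16.0, 18.0]
    poverty_months = [30.0, 26.0, 15.0, 18.0, 9.0, 4.0, 3.0]
    for years in (10.0, 12.0, 16.0):
        estimate = kernel_smooth(education_years, poverty_months, years, bandwidth=1.5)
        print(years, round(estimate, 2))
```

The bandwidth governs the trade-off between smoothness and fidelity to the data. In practice it would be chosen by a data-driven rule such as cross-validation, which this sketch omits.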
Discrete choice. Many individual decisions in the human capital
accumulation process are discrete; for example, decisions to leave or
return to school, fertility, etc. Recent advances have made it possible to
study simple models of sequences of discrete choice over time, such as
labor market participation. Methods for inference in structural models of
dynamic decision making are a promising line of research for
understanding the human capital investment process.
Individual, cohort, and age effects. Several important policy questions
address changes over time in individuals' responses to opportunities
available to them, such as propensities to invest in education. Separating
secular changes from individual heterogeneity and changes over the life cycle
raises specific methodological questions. The ability to address these
policy questions is determined, in large part, by advances in methods that
address these issues.
Small-area data estimation. As analysts strive to explain the decisions
made by economic agents, there is increasing pressure to move from
macro to micro, and ultimately, individual-level scales of analyses.
Moving to higher levels of resolution often requires estimating small-area
attribute values (for example of households or places of work) from larger
units of analysis. We are just beginning to understand how to perform
these estimations.
Estimating missing values. A particularly pressing problem is the
estimation of missing values for individual-level data that generate
samples with large numbers of zeros because of privacy concerns and/or
missing measurements.
Boundary value estimation and/or transformations. The artificial
truncation of a spatial process raises particular concerns about the value of
the recorded measurements in the areal units at the boundary. This is
similar to the problem of truncation in event history analysis. Corrections
that have been proposed are arbitrary in theory and computationally
intensive.
Extraction and exploration techniques. As different kinds of agencies
move to collect data in a geo-referenced framework, researchers will have
to deal with the computational and storage burden this referencing entails.
Even when researchers have no interest in maintaining the geo-
referencing, they may require techniques to extract the data from the data
set. Further, faced with the volume of data described above, they may
need to develop new tools for data visualization as a means of exploring
such data and developing preliminary research questions.
"Establishment surveys," in which the units studied are organizations, such
as workplaces, schools, hospitals, or agencies of the government, are
natural vehicles for addressing such questions. In such surveys, one or
more individual informants provide data on behalf of the establishment.
Establishment survey data are sometimes integrated with individual-level
data on organizational members, with archival data on the places, or with
industries that constitute an establishment's setting or competitive
environment.
Establishment surveys have long been used as components of systems of
national accounts, for the estimation of population totals such as
employment or output levels. Scholars in the social sciences are now
turning to establishment surveys for different purposes -- to develop, for
example, knowledge about work organizations, school processes and
effects, the development and diffusion of human resources practices, and
innovation, as well as to study schools and work organizations as contexts
for learning and skills acquisition.
The methodological literature on establishment surveys is much less
extensive than that on surveys of individuals, and many methodological
problems involved in such studies have been little studied or are poorly
understood. Other problems result from the changing purposes of
establishment surveys and the changing nature of organizational
phenomena. Problems requiring attention include, but are not limited to,
the following:
Such effects of design and context might be considered as introducing their
own components of variance, which are part of the uncertainty of the
resulting information but are not captured in estimates of sampling
standard error. That is, if we consider several surveys that made
different detailed design choices, those surveys would produce estimates
that are much more variable than would be expected due to sampling
errors alone.
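The point can be made concrete with a small simulation. The sketch below assumes, purely for illustration, that each survey's design choices shift its estimate by an independent normal draw with standard deviation design_sd, in addition to ordinary sampling error. The function names, the additive normal model, and all parameter values are hypothetical.

```python
import random
import statistics


def expected_variances(n_respondents, design_sd, unit_sd):
    """Return (sampling variance, total variance) of a survey mean.

    Under the assumed model, the total variance is the usual sampling
    variance plus a design component that does not shrink with sample size.
    """
    sampling_var = unit_sd ** 2 / n_respondents
    total_var = sampling_var + design_sd ** 2
    return sampling_var, total_var


def simulate_survey_estimates(n_surveys, n_respondents, design_sd, unit_sd,
                              true_mean, seed):
    """Simulate the mean reported by each of several independent surveys."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_surveys):
        # One shift per survey, standing in for its particular design choices.
        design_shift = rng.gauss(0.0, design_sd)
        sample = [true_mean + design_shift + rng.gauss(0.0, unit_sd)
                  for _ in range(n_respondents)]
        estimates.append(sum(sample) / len(sample))
    return estimates


if __name__ == "__main__":
    sampling_var, total_var = expected_variances(100, 2.0, 10.0)
    estimates = simulate_survey_estimates(200, 100, 2.0, 10.0, 50.0, seed=1)
    print("variance implied by sampling error alone:", sampling_var)
    print("variance implied by sampling plus design:", total_var)
    print("observed variance across simulated surveys:",
          statistics.variance(estimates))
```

Under these assumptions, a standard error computed within any single survey reflects only the first component, so it understates the variability that would be observed across surveys making different design choices.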
Policy decisions typically are concerned with questions that are broader
than the particularities of a single data collection design. For example,
policy makers want to know how social class is related to reading
achievement, not how social class measured in a specific way is related to
a score on a particular set of achievement test items. The standard method
of calculating uncertainty in information and policy analyses is based on
sampling uncertainty; but for the reasons outlined above, sampling
standard errors alone provide an underestimate of the uncertainty in the
data and in summaries produced from it.
Studies that provide insight about more realistic estimates of uncertainty of
information produced by human capital research would be highly
desirable. Such studies might include systematic investigations of
variations in procedures, and models for a reasonable distribution of
variation across those procedures. Studies of actual replications that have already been
conducted might serve as "natural experiments." Ideas for general
methods that might be broadly applicable would be particularly interesting.
Questions include, for example, whether the current concept of a
household adequately captures the diverse living arrangements of
Americans, including the phenomena of blended families, children in joint
custody, or even the homeless. How can researchers model appropriately
the multiple ethnic identities of Americans and discover and test the
salience of particular categories? How do concepts and classification
schemes taken from existing data systems affect the research process of
the secondary data user? What new concepts and classification systems
need to be developed to meet the needs of the Human Capital Initiative?
How can or should one develop common measurements to capture the
experience, for example, at 'home' and 'work'? What measurement
schemes are required for units of analysis at different levels of
aggregation, for example, poor people versus poor neighborhoods?
This project uses the Gibbs sampler to develop and implement estimation
strategies that will enable researchers to obtain robust estimates of
parameters and appropriate intervals in applications of hierarchical models
with dichotomous outcomes in small-sample, social research settings.
Guidelines for proper implementation and use of these strategies will be
developed through analyses of a series of simulated data sets and through
analyses of the data from two studies: A multi-site evaluation of a dropout
prevention initiative, and an NSF-funded study of the effects of different
mathematics.
SBR-9631387: "Project to Revise the Historical Labor Statistics of the
United States"
This project will revise Chapter D (Labor) of the United States Census
Bureau's Historical Statistics of the United States. Historical Statistics is a
massive, two-volume compendium of 54 chapters on topics touching all of
the social, behavioral, humanistic, and natural sciences. This award
assists in a major collaborative effort to produce an updated, revised,
expanded, and electronically-accessible "millennial edition" of Historical
Statistics. In addition to a completed revision of Chapter D, this project
will develop a protocol for the revision of the remaining chapters of
Historical Statistics.
SBR-9515136: "Improving Within-School and School-Community
Systemic Linkages for At-Risk Students"
This project investigates empirically the impact of recent federal reform
initiatives legislated by Title I of the Elementary and Secondary Education
Act on the narrowing of the achievement gap between educationally at-risk
students and their more advantaged peers. The project makes use of the
comprehensive, Congressionally mandated Prospects data files, which
consist of standardized reading and math achievement scores for a
nationally-representative sample of nearly 40,000 students, and detailed
information regarding the students themselves, and their schools,
classrooms, and families. The investigators' analyses will generate
national estimates of the extent and intensity of these reform activities, and
will produce empirically-based paradigms for the improvement of federal
Title I programs and the schools that serve at-risk students.
SBR-9423018: "Causal Inference Applied to Income Effects"
The objective of this project is to measure validly the treatment effects of
giving additional income to low and middle income families. The study
uses the Massachusetts State Lottery as a natural experiment in which
some families are randomly assigned additional income and some are not.
Subjects in both the treatment and control group will be surveyed by mail
and by phone. The use of this natural experiment will allow the
researchers to make valid inferences about the effects of additional income
on these families using a rigorous definition of causality. The data from
the surveys will be linked to earnings records from the Social Security
Administration.
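The logic of using random assignment to identify an income effect can be sketched with a simple difference in means. This is a deliberately simplified illustration and is not the investigators' estimation procedure: the function name, the outcome variable, and the toy values are hypothetical, and the standard error shown is the ordinary two-sample formula with unequal variances.

```python
import math
import statistics


def difference_in_means(treated, control):
    """Estimate an average treatment effect and its standard error.

    When treatment is randomly assigned, the treated and control groups are
    comparable on average, so the difference in mean outcomes estimates the
    average effect of the treatment.
    """
    if len(treated) < 2 or len(control) < 2:
        raise ValueError("each group needs at least two observations")
    effect = statistics.mean(treated) - statistics.mean(control)
    std_error = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    return effect, std_error


if __name__ == "__main__":
    # Hypothetical outcome: weekly hours worked after the lottery drawing.
    winners = [32.0, 35.0, 38.0, 30.0, 36.0]
    non_winners = [40.0, 38.0, 41.0, 37.0, 39.0]
    effect, std_error = difference_in_means(winners, non_winners)
    print("estimated effect:", effect, "standard error:", std_error)
```

A fuller analysis would need to address survey nonresponse and differences in the amount of income received, neither of which this sketch attempts.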
Carl Amrhein
Cheryl Eavey
Stephen Fienberg
John Geweke
Larry Hedges
Charles Manski
Peter Marsden
John Sprague
Thomas Wallsten
- Research on methodological aspects of new or existing
procedures for data collection; research to evaluate or
compare existing data bases and data collection
procedures; and the collection of unique databases with
cross disciplinary implications, especially when paired
with developments in measurement or methodology.
- The methodological infrastructure of social and
behavioral research.
Up-to-date information on the program, including recent awards lists and
announcements of special funding opportunities, is available on the MMS
Home Page:
The Foundation provides awards for research and education in the sciences
and engineering. The awardee is wholly responsible for the conduct of
such research and preparation of the results for publication. The
Foundation, therefore, does not assume responsibility for the research
findings or their interpretation.
The Foundation welcomes proposals from all qualified scientists and
engineers and strongly encourages women, minorities, and persons with
disabilities to compete fully in any of the research and education related
programs described here. In accordance with federal statutes, regulations,
and NSF policies, no person on grounds of race, color, age, sex, national
origin, or disability shall be excluded from participation in, be denied the
benefits of, or be subject to discrimination under any program or activity
receiving financial assistance from the National Science Foundation.
Facilitation Awards for Scientists and Engineers with Disabilities (FASED)
provide funding for special assistance or equipment to enable persons with
disabilities (investigators and other staff, including student research
assistants) to work on NSF projects. See the program announcement or
contact the program coordinator at (703) 306-1636.
The National Science Foundation has TDD (Telephonic Device for the
Deaf) capability, which enables individuals with hearing impairment to
communicate with the Foundation about NSF programs, employment, or
general information. To access NSF TDD dial (703) 306-0090; for FIRS,
1-800-877-8339.
Methodological Advances and the Human Capital Initiative
Since Fiscal Year 1995, the National Science Foundation has provided
funding opportunities for research on human capital issues. NSF's Human
Capital Initiative (HCI) supports fundamental research "which advances
basic understanding of the causes of the psychological, social, economic,
and cultural capacities for productive citizenship" (NSF 95-8). In
particular, background material for the initiative identified research
agendas for six high priority thematic areas: Workplace, Education,
Families, Neighborhoods, Disadvantage, and Poverty (NSF, 1994).
2) Data Integration
3) Establishment Surveys
4) Data Collection Procedures in Survey Research
5) Survey Research and Measurement Issues
6) Methods for Linking Diverse Approaches to
Understanding Behavior
Longitudinal Data: Issues of Design and Analysis
Longitudinal studies play a central role in many human capital research
projects, and renewed attention to the design of such studies is critical to
understanding a broad array of human capital issues. Many longitudinal
studies begin with a single or a small group of cohorts and follow them
over time. Samples need to be refreshed, however, either because of
attrition or changes in the population after the cohorts were initiated (e.g.,
immigrants). In addition, often there is interest in generalizing beyond
those specific cohorts to a larger population.
Data Integration
Micro-level data present many problems, including availability, reliability,
and continuity. A researcher seeking, for example, to relate the
environment experienced by young males in inner-city neighborhoods to
their economic productivity would have to look at health, physical
environment, crime, education, and economic standing (among other
factors) to help explain the productivity of these citizens. No single
source of data is likely to be adequate. Integrating data from disparate
sources and/or different measurement scales presents a host of
statistical, procedural, and computational problems. Research that
addresses the barriers to data integration across space, time, and
sources will benefit researchers working in many empirical settings.
Methodological issues that require additional research include the
following:
Establishment Surveys
Many substantive questions highlighted by the strategic plan for the
Human Capital Initiative have to do with the performance of organizations.
Examples include the capacity of educational institutions or training
vendors to contribute to building a skilled workforce, the capacity of
employers to renew and increase the training and skills of their employees,
and the ability of work organizations to innovate and to produce products
and services effectively and efficiently. Both short- and long-term
organizational performance -- current output levels as well as the
infrastructural capacity to sustain and increase outputs -- are of
importance.
Data Collection Procedures in Survey Research
Social data are affected by choices made in design and implementation of
data collection procedures. For example, choices of how questions are
asked, the details of sampling frames, actual times and conditions of
measurement, etc., must be made in every data collection. Examples of
how such choices affect social data include the considerable work on the
effects of how questions are asked and of which respondent supplies data
when more than one respondent might have them, as well as the effects of
question context and time of measurement found in the National Assessment
of Educational Progress (NAEP). In many cases, the actual choices made
are not the only ones possible. The choices are to some extent arbitrary,
and reasonable people might choose other alternatives that most
researchers would regard as equally valid.
Survey Research and Measurement Issues
Research designed to meet the goals of the Human Capital Initiative may
require conceptual and definitional innovations in existing (canonical)
categories of social measurement. MMS welcomes proposals aimed at
improving the core concepts of human capital research, the classification
systems used in surveys, and the implications of categories and concepts
embedded in administrative data systems. Among the most obvious are
the core concepts grounding the social, behavioral, and economic sciences,
including household, race and ethnicity, neighborhood, establishment, or
occupation. Boundary questions and categorization issues abound in such
concepts. The social sciences have always been sensitive to these
questions. The core concepts and classification systems themselves are the
result of the assumptions and procedures of earlier generations of users,
which themselves were embedded in the technologies and questions of
interest to their creators. Nevertheless, the consciously interdisciplinary
focus of MMS and the Human Capital Initiative provides an opportunity to
foster the development of new methods attuned to the particular
substantive issues raised in the intersecting conceptual and classifying
devices used to analyze human experience.
Methods for Linking Diverse Approaches to Understanding Behavior
Understanding the objective and subjective determinants of behavior is
among the most challenging problems facing empirical researchers seeking
to understand schooling decisions, labor supply, and other behaviors
related to human capital formation. How do individuals form expectations
about the consequences of alternative actions? How do the decisions
people make depend on these expectations, on their preferences, and on
the constraints they face? Different behavioral and social science
disciplines have used distinctive empirical research strategies to address
these fundamental questions. Economists have sought to infer the
structure of decision making almost entirely from data on actual choices.
Sociologists and social psychologists have collected and analyzed data on
attitudinal measures elicited from respondents in sample surveys.
Cognitive psychologists have conducted experiments aimed at
understanding the subjective constructs that people use in framing
alternatives and reaching decisions. MMS invites proposals that aim to
extract the best elements of these strategies, to creatively synthesize them,
and to improve upon them. Proposed research might, for example, seek
to integrate experimental and survey approaches to illuminate issues of
common concern.
Recent Projects Related to the Human Capital Initiative Supported by the MMS Program
SBR-9422901: "Estimation of Hierarchical Models with Dichotomous
Outcomes in Small-Sample, Social Research Settings"
Michael Seltzer, University of California/Los Angeles
Susan B. Carter, University of California/Riverside
Kenneth K. Wong, University of Chicago
Larry Hedges, University of Chicago
Donald B. Rubin, Harvard University
Guido Imbens, Harvard University
Workshop Participants
History and Urban Affairs
University of Wisconsin
Department of Geography
University of Toronto
Methodology, Measurement, and
Statistics Program
National Science Foundation
Department of Statistics
Carnegie Mellon University
Department of Economics
University of Minnesota
Department of Education
University of Chicago
Department of Economics
University of Wisconsin
Department of Sociology
Harvard University
Department of Political Science
Washington University
Department of Psychology
University of North Carolina, Chapel Hill
ADDENDUM
- The development, application, and extension of formal
models and methodologies for social and behavioral
research, including methods for improving
measurement.