Finding Social Science Data
Outline of talk by Daniel C. Tsang
Fulbright Scholar to Vietnam
Social Science Data Librarian, University of California, Irvine
Introduction
There is a variety of social science data now available on the Web.
These data can be:
- Freely accessible
- Documented online, with the documentation [or meta-data, i.e., data about data] freely available
- Licensed access or fee-based access
- Restricted access
Often, if the data are not available on the Web, information will be
given on how to obtain the data, usually after registration.
Social Science Data Archives (maps from CESSDA)
Europe: http://www.nsd.uib.no/cessda/europe.html
North America: http://www.nsd.uib.no/cessda/namer.html
Other: http://www.nsd.uib.no/cessda/other.html
Specific web sites
- ICPSR: http://www.icpsr.umich.edu
ICPSR stands for Inter-university Consortium for Political and Social
Research. This data archive is the largest social science data archive
in the U.S. Its members, who pay an annual membership fee that depends on
the type of institution, have access at no additional charge. Selected datasets,
as well as all codebooks (documentation), are freely available to everyone. Among
the codebooks freely available on this site are earlier waves of the
World Values Survey (WVS; the data are available to licensed users).
This site is also the archive for the 2001 WVS, which includes the Vietnam component.
Datasets are freely downloadable from ICPSR's Publication-Related Data
Archive (http://www.icpsr.umich.edu/pra/index.html).
This archive is an excellent site to locate data associated with a
publication, and to deposit such data, as many journals now require
authors to do.
ICPSR also has a freely accessible Bibliography of Data-Related Literature
(http://www.icpsr.umich.edu/citations/index.html)
and a Social Science Variables Database (http://webapp.icpsr.umich.edu/cocoon/SSVD/basicSrch).
- UCI Social Science Data Archives http://data.lib.uci.edu
This is one example of a Web page that links to a variety of online
data sites. Click on "Data Sources" in the left menu for links
to freely accessible data and documentation,
as well as links to statistical tables. There are also links there to
licensed data, i.e., data licensed specifically to the institution,
in this case UCI. There is also a newsletter, Social Science Data Archivist:
http://data.lib.uci.edu/ssda/archabout.htm
Click on "V" in the alphabetical listing at the top of the
data sources page for datasets and/or documentation relating to Vietnam.
Among the studies or data sites linked are:
Vietnam: ARIC [Asia Recovery Information Center] Indicators: http://aric.adb.org/pre_defined_indicators.asp?id=1
Vietnam Business Firms Survey, 1995-1997: http://www2-irps.ucsd.edu/faculty/cwoodruff/data.htm
Vietnam Census, 1989, 1999 from Integrated Public-Use Microdata Series: http://www.ipums.umn.edu/
Vietnam: Davidson Data Center & Network from University of Michigan Business School gateway [economic and business data]: http://ddcn.prowebis.com/
Vietnam Life History Survey 1991: http://www.csde.washington.edu/research/vietnam/vlhswebdocs/data.html
Vietnam Living Standards Measurement Study: http://www.worldbank.org/html/prdph/lsms/country/vn98/vn98bif.pdf
Obtaining data: http://www.worldbank.org/html/prdph/lsms/guide/gnlacc.html
Vietnam Longitudinal Survey, 1995-: http://www.csde.washington.edu/research/vietnam/vls.html
Vietnam: Social Accounting Matrix (SAM), 1996-97: http://www.ifpri.org/data/VietNam01.htm
Vietnam: World Values Survey 2001: http://www.democ.uci.edu/democ/archive/vietnam.htm
- Orange County Surveys: A Digital Archive: http://ocsurveys.lib.uci.edu
This site, mounted at UCI, offers over two decades of annual public opinion
data on Orange County, California. The datasets for each year
are freely available for downloading, in a variety of formats (including
SPSS, SAS and Stata) together with an electronic codebook that gives
the frequencies for each variable (or question). Users can also pick
variables and do cross-tabs of the selected variables online. They can
also pick selected variables and generate a customized codebook. The
site uses the Survey Documentation and Analysis (SDA) software licensed
from UC Berkeley.
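To make the terms above concrete: a frequency counts how often each response to one question occurs, and a cross-tab counts each combination of responses to two questions. The Python sketch below computes both on a tiny made-up dataset. The variable names (`city`, `optimism`) and all responses are invented for illustration; they do not come from the Orange County surveys, and the sketch does not use or reproduce the SDA software.

```python
from collections import Counter

# Hypothetical survey records; variable names and values are invented.
RESPONSES = [
    {"city": "Irvine", "optimism": "high"},
    {"city": "Irvine", "optimism": "low"},
    {"city": "Santa Ana", "optimism": "high"},
    {"city": "Irvine", "optimism": "high"},
    {"city": "Santa Ana", "optimism": "low"},
]


def frequencies(records, variable):
    """Count how many records carry each value of one variable."""
    return Counter(record[variable] for record in records)


def crosstab(records, row_var, col_var):
    """Count records for every (row value, column value) combination."""
    return Counter((record[row_var], record[col_var]) for record in records)


if __name__ == "__main__":
    print(frequencies(RESPONSES, "city"))
    table = crosstab(RESPONSES, "city", "optimism")
    for (row, col), count in sorted(table.items()):
        print(f"{row:10} {col:5} {count}")
```

A real analysis would normally also apply the survey's weights, which this sketch leaves out.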
- The Pacific Poll: http://pacpoll.lib.uci.edu
Another digital archive at UCI focusing on public opinion in the Pacific
Rim. Current datasets include those on Mexican Americans.
- General Social Survey, 1972-2002: http://www.icpsr.umich.edu/GSS99/
This is a national U.S. survey repeated every few years. This site
offers the ability to manipulate selected variables using the SDA software.
Click on the tab for "Analyze" to access the SDA page.
- Other SDA sites: http://www.icpsr.umich.edu/ACCESS/sda.html
ICPSR is a good example of a website that has many studies mounted
on SDA. They are found in ICPSR's "topical archives":
Health and Medical Care Archive | International Archive of Educational
Data | National Archive of Computerized Data on Aging | National Archive
of Criminal Justice Data | Substance Abuse & Mental Health Data
Archive
- ProQuest Reference Asia: http://asia.proquestreference.com/pqrasia
Licensed access. Current news analysis and research reports on social,
political, economic, business and cultural conditions in Greater China
(China, Taiwan, Hong Kong, Macau) and in Southeast Asia (Brunei, Cambodia,
Indonesia, Laos, Malaysia, Myanmar, Philippines, Singapore, Thailand
and Vietnam). Includes ASEAN Secretariat material. Includes country
information, statistical agency reports/yearbooks,
policy documents and industry analyses, and articles from selected journals
in the countries covered. Key individuals are also profiled. Browsable
by region or subject. Includes hyperlinks to non-governmental organization
(NGO) web sites. Available data tables are downloadable as Excel files. Includes
the Hanoi Statistical Yearbook.
- Social Science Electronic Library: http://www.socio.com/edl.htm
Licensed access. Data from health and sexuality studies. Includes datasets
on HIV/AIDS, drug use, pregnancy, family studies.
- The Data Web: http://www.thedataweb.org/index.html
TheDataWeb is a network of U.S. Government and non-governmental online
data libraries that have mounted Census data, economic data, health
data, income and unemployment data, population data, labor data, cancer
data, crime and transportation data, family dynamics, and vital statistics
data. A "DataFerret" Java applet or application must be downloaded
to fully utilize this site. Studies include U.S. Census of Population
and Housing, American Community Survey, Current Population Survey, American
Housing Survey, National Health and Nutrition Examination Survey, etc.
The DataFerret tool allows for creation of charts, tables, and maps.
Data can be output in a variety of formats including SPSS, SAS and Excel.
User's guide: http://www.thedataweb.org/support/user/index.html
- IMSMA: Information Management System for Mine Action http://www.imsma.ch/
IMSMA is a database for humanitarian de-mining projects. Field results
from the Vietnam UXO (Unexploded Ordnance)/Landmine Impact Assessment & Technological
Survey, being conducted under an agreement between the U.S. (via the
Vietnam Veterans of America Foundation, VVAF) and the Ministry of Defense
of Vietnam, are expected to be entered into this database when the project
is completed. At present, digital maps are available as "Webreports"
on Chad and Yemen. For more on the VVAF,
see: http://www.vvaf.org.
- The Southeast Asia Digital Atlas On-Line Demonstration: http://www.gisc.berkeley.edu/seadca/coverpage.html
This website is a demonstration project for the use of Geographic Information
Systems to spatially organize and orient research data.
This interface allows users to select data layers for display on an
interactive map. Many of the points of the data layers are linked to
additional information such as text, historic photographs, images of
sites or objects, or audio and video files. The sample research datasets
included in the on-line demonstration represent a range of data types
created for varied purposes.
The demonstration comprises: Isan Travels of Etienne Aymonnier, Ceramics
Trade Shipwrecks, Monuments of Angkor, McFarland Missionary Photo Collection,
Inscriptions of Angkor, and Lao Temple Murals.
- TimeMap: Showcase applications: http://www.timemap.net/showcase/applications.html
More Geographic Information Systems applications with interactive mapping
of changes over time.
Tools for data archiving and data archivists
- Data Use Tutorial: http://www.icpsr.umich.edu/help/newuser.html
From ICPSR. Includes an explanation of what a data definition statement is,
and instructions on importing raw (ASCII) data files into SPSS etc.
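A data definition statement tells a statistical package which columns of each raw ASCII record hold which variable. As a rough illustration of the same idea outside SPSS, the Python sketch below slices fixed-width records according to a column layout. The layout, the variable names (`caseid`, `age`, `sex`), and the sample records are all hypothetical and are not taken from any ICPSR study.

```python
# Hypothetical column layout: variable name -> (first column, last column),
# 1-based and inclusive, the way codebooks usually state column locations.
LAYOUT = {
    "caseid": (1, 4),
    "age": (5, 6),
    "sex": (7, 7),
}

# Invented raw records in fixed-width ASCII form.
RAW_LINES = [
    "0001341",
    "0002272",
]


def parse_record(line, layout):
    """Slice one fixed-width record into a dict of integer variables."""
    record = {}
    for name, (start, end) in layout.items():
        # Convert 1-based inclusive columns to a Python slice.
        record[name] = int(line[start - 1:end])
    return record


if __name__ == "__main__":
    for line in RAW_LINES:
        print(parse_record(line, LAYOUT))
```

In practice the column positions would be copied from the study's codebook, and non-numeric or missing-value codes would need separate handling.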
- SDA: Survey Documentation & Analysis: http://csa.berkeley.edu:7502/
Manuals etc. on mounting SDA software (which has to be licensed).
- DDI: Data Documentation Initiative: http://www.icpsr.umich.edu/DDI/
This is a new standard for data documentation or meta-data. It involves
XML tagging of each line of a codebook, enabling online retrieval of
information at the variable level. From the site at ICPSR: "The Data Documentation
Initiative (DDI) is a project to establish an international standard
for social science documentation. The DDI site links to the most current
version of the standard and also provides information on the committee
process for developing the standard, a tag library to assist users in
preparing DDI-compliant documents, and general information related to
the eXtensible Markup Language (XML) and software tools."
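To suggest how XML tagging enables retrieval at the variable level, the Python sketch below looks up one variable in a small tagged codebook. The fragment is invented and heavily simplified: its element names are modeled on DDI-style markup, but it is not a complete or validated DDI document, and the function `describe_variable` is a hypothetical helper. The current version of the standard should be consulted for the actual tag library.

```python
import xml.etree.ElementTree as ET

# Simplified, invented codebook fragment. Element names are modeled on
# DDI-style markup but this is NOT a complete or validated DDI document.
CODEBOOK_XML = """
<codeBook>
  <dataDscr>
    <var name="V1">
      <labl>Respondent sex</labl>
      <catgry><catValu>1</catValu><labl>Male</labl></catgry>
      <catgry><catValu>2</catValu><labl>Female</labl></catgry>
    </var>
    <var name="V2">
      <labl>Age in years</labl>
    </var>
  </dataDscr>
</codeBook>
"""


def describe_variable(xml_text, var_name):
    """Return the label and category labels for one variable, or None."""
    root = ET.fromstring(xml_text.strip())
    for var in root.iter("var"):
        if var.get("name") == var_name:
            # Map each category value to its label (empty if uncategorized).
            categories = {
                cat.findtext("catValu"): cat.findtext("labl")
                for cat in var.findall("catgry")
            }
            return {"label": var.findtext("labl"), "categories": categories}
    return None


if __name__ == "__main__":
    print(describe_variable(CODEBOOK_XML, "V1"))
```

Because each variable is a separate tagged element, a search tool can return just that variable's label and categories instead of the whole codebook.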
- NESSTAR: http://www.nesstar.com
Nesstar software is an integrated suite of products aimed at facilitating
the location and use of socio-economic, and similarly structured, data.
It has been developed at the UK Data Archive in the University of Essex
and the Norwegian Social Science Data Services in Bergen building upon
the R&D of the EC funded NESSTAR and FASTER projects.
It allows users to browse distributed data catalogues over the Web,
examine detailed information about the data (metadata), carry out simple
data analysis (e.g. tabulations and graphical displays) and then download
data, in whole or part, in one of a number of popular formats. The system
contains registration and authentication facilities to filter access
to data as necessary and a suite of data publishing and server management
tools. [Description from website]
- IASSIST: http://datalib.library.ualberta.ca/iassist/
IASSIST, which stands for International Association for Social Science
Information, Service, and Technology, is the professional organization
for social science data archivists and data producers. It meets annually
and produces a quarterly newsletter that has mainly published conference
papers. Its workshops and programs focus on cutting-edge developments
in the field. Membership is open to individuals. There is limited funding
available on a competitive basis for international conference attendees.
Contact info:
Daniel C. Tsang
Social Science Data Librarian
380 Langson Library, University of California
PO Box 19557, Irvine CA 92623-9557 USA
Tel: 1-949-824-4978
Fax: 1-949-824-2700
E-mail: dtsang@uci.edu
Drafted: 14 March 2004; Revised 22 March 2004; Revised 25 March 2004