Finding Social Science Data

Outline of talk by Daniel C. Tsang
Fulbright Scholar to Vietnam
Social Science Data Librarian, University of California, Irvine


There is a variety of social science data now available on the Web. These data can be:

    - Freely accessible

    - Documentation [or meta-data, i.e., data about data] freely available

    - Licensed access or fee-based access

    - Restricted access

Often, if the data are not available on the Web, information will be given on how to obtain the data, usually after registration.

Social Science Data Archives (maps from CESSDA)

Europe: http://www.nsd.uib.no/cessda/europe.html

North America: http://www.nsd.uib.no/cessda/namer.html

Other: http://www.nsd.uib.no/cessda/other.html

Specific web sites

- ICPSR: http://www.icpsr.umich.edu

ICPSR stands for Inter-university Consortium for Political and Social Research. It is the largest social science data archive in the U.S. Access is free to its members, who pay an annual membership fee that depends on the type of institution. Selected datasets, as well as all codebooks (documentation), are freely available. Among the codebooks freely available on this site are earlier waves of the World Values Survey (WVS; the data are available to licensed users). This site is also where the 2001 WVS, which includes the Vietnam component, is located. Datasets are freely downloadable from ICPSR's Publication-Related Data Archive (http://www.icpsr.umich.edu/pra/index.html). This archive is an excellent site to deposit and to locate data associated with a publication, as many journals now require authors to do. ICPSR also has a freely accessible Bibliography of Data-Related Literature (http://www.icpsr.umich.edu/citations/index.html) and a Social Science Variables Database (http://webapp.icpsr.umich.edu/cocoon/SSVD/basicSrch).

- UCI Social Science Data Archives: http://data.lib.uci.edu

This is one example of a Web page that links to a variety of online data sites. Click on "Data Sources" in the left menu for links to freely accessible data and documentation, as well as links to statistical tables. There are also links there to licensed data, i.e., data licensed specifically to the institution, in this case UCI. There is also a newsletter, Social Science Data Archivist.

Click on "V" in the alphabetic listing at the top of the data sources page for datasets and/or documentation relating to Vietnam. Among the studies or data sites linked are:

    Vietnam: ARIC [Asia Recovery Information Center] Indicators : http://aric.adb.org/pre_defined_indicators.asp?id=1

    Vietnam Business Firms Survey, 1995-1997: http://www2-irps.ucsd.edu/faculty/cwoodruff/data.htm

    Vietnam Census, 1989, 1999 from Integrated Public-Use Microdata Series: http://www.ipums.umn.edu/

    Vietnam: Davidson Data Center & Network from University of Michigan Business School gateway [economic and business data]: http://ddcn.prowebis.com/

    Vietnam Life History Survey 1991: http://www.csde.washington.edu/research/vietnam/vlhswebdocs/data.html

    Vietnam Living Standards Measurement Study: http://www.worldbank.org/html/prdph/lsms/country/vn98/vn98bif.pdf

    Obtaining data: http://www.worldbank.org/html/prdph/lsms/guide/gnlacc.html

    Vietnam Longitudinal Survey, 1995- : http://www.csde.washington.edu/research/vietnam/vls.html

    Vietnam: Social Accounting Matrix (SAM), 1996-97: http://www.ifpri.org/data/VietNam01.htm

    Vietnam: World Values Survey 2001: http://www.democ.uci.edu/democ/archive/vietnam.htm

- Orange County Surveys: A Digital Archive: http://ocsurveys.lib.uci.edu

This site, mounted at UCI, offers over two decades of annual public opinion data on Orange County, California. The datasets for each year are freely available for downloading in a variety of formats (including SPSS, SAS and Stata), together with an electronic codebook that gives the frequencies for each variable (or question). Users can also pick variables and run cross-tabs of the selected variables online, or generate a customized codebook for the variables they select. The site uses the Survey Documentation and Analysis (SDA) software licensed from UC Berkeley.
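For readers unfamiliar with codebook frequencies and cross-tabs, the following is a minimal Python sketch of the two ideas. It is not how SDA itself works; the records, variable names ("region", "owns_home") and helper functions are all made up for illustration.

```python
from collections import Counter

# Hypothetical survey records; variable names and values are invented.
responses = [
    {"region": "north", "owns_home": "yes"},
    {"region": "north", "owns_home": "no"},
    {"region": "south", "owns_home": "yes"},
    {"region": "south", "owns_home": "yes"},
    {"region": "north", "owns_home": "yes"},
]


def frequencies(records, variable):
    """Count how often each value of one variable occurs (codebook-style)."""
    return Counter(r[variable] for r in records)


def crosstab(records, row_var, col_var):
    """Count records for every (row value, column value) pair."""
    return Counter((r[row_var], r[col_var]) for r in records)


freq = frequencies(responses, "owns_home")
table = crosstab(responses, "region", "owns_home")

# Print one line per non-empty cell of the cross-tab.
for (row, col), n in sorted(table.items()):
    print(f"region={row} owns_home={col}: {n}")
```

A frequency is a count over one variable; a cross-tab is the same count taken over pairs of values from two variables.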

- The Pacific Poll: http://pacpoll.lib.uci.edu

Another digital archive at UCI focusing on public opinion in the Pacific Rim. Current datasets include those on Mexican Americans.

- General Social Survey, 1972-2002: http://www.icpsr.umich.edu/GSS99/

This is a national U.S. survey repeated every few years. This site offers the ability to manipulate selected variables using the SDA software. Click on the tab for "Analyze" to access the SDA page.

- Other SDA sites: http://www.icpsr.umich.edu/ACCESS/sda.html

ICPSR is a good example of a website that has many studies mounted on SDA. These are called ICPSR "topical archives":

Health and Medical Care Archive | International Archive of Educational Data | National Archive of Computerized Data on Aging | National Archive of Criminal Justice Data | Substance Abuse & Mental Health Data Archive

- ProQuest Reference Asia: http://asia.proquestreference.com/pqrasia

Licensed access. Current news analysis and research reports on social, political, economic, business and cultural conditions in Greater China (China, Taiwan, Hong Kong, Macau) and in Southeast Asia (Brunei, Cambodia, Indonesia, Laos, Malaysia, Myanmar, Philippines, Singapore, Thailand and Vietnam). Includes ASEAN Secretariat material. Includes country information, statistical agency reports/yearbooks, policy documents and industry analyses, and articles from selected journals in the countries covered. Key individuals are also profiled. Browsable by region or subject. Includes hyperlinks to non-governmental organization (NGO) web sites. Available data tables are downloadable as Excel files. Includes Hanoi Statistical Yearbook.

- Social Science Electronic Library: http://www.socio.com/edl.htm

Licensed access. Data from health and sexuality studies. Includes datasets on HIV/AIDS, drug use, pregnancy, family studies.

- The Data Web: http://www.thedataweb.org/index.html

TheDataWeb is a network of U.S. Government and non-governmental online data libraries that have mounted Census data, economic data, health data, income and unemployment data, population data, labor data, cancer data, crime and transportation data, family dynamics, and vital statistics data. A "DataFerret" Java applet or application must be downloaded to fully utilize this site. Studies include the U.S. Census of Population and Housing, American Community Survey, Current Population Survey, American Housing Survey, National Health and Nutrition Examination Survey, etc. The DataFerret tool allows for creation of charts, tables, and maps. Data can be output in a variety of formats including SPSS, SAS and Excel. User's guide: http://www.thedataweb.org/support/user/index.html

- IMSMA: Information Management System for Mine Action: http://www.imsma.ch/

IMSMA is a database for humanitarian de-mining projects. Field results from the Vietnam UXO/Landmine Impact Assessment & Technological Survey, being conducted under an agreement between the U.S. (via the Vietnam Veterans of America Foundation) and the Ministry of Defense of Vietnam, are expected to be entered into this database when the project is completed. Right now, digital maps are available as "Webreports" on Chad and Yemen. UXO means Unexploded Ordnance. For more on the VVAF, see: http://www.vvaf.org.

- The Southeast Asia Digital Atlas On-Line Demonstration: http://www.gisc.berkeley.edu/seadca/coverpage.html

This website is a demonstration project for the use of Geographic Information Systems to spatially organize and orient research data.

This interface allows users to select data layers for display on an interactive map. Many of the points of the data layers are linked to additional information such as text, historic photographs, images of sites or objects, or audio and video files. The sample research datasets included in the on-line demonstration represent a range of data types created for varied purposes.

The demonstration comprises: Isan Travels of Etienne Aymonnier; Ceramics Trade Shipwrecks; Monuments of Angkor; McFarland Missionary Photo Collection; Inscriptions of Angkor; and Lao Temple Murals.

- TimeMap: Showcase applications: http://www.timemap.net/showcase/applications.html

More Geographic Information Systems applications with interactive mapping of changes over time.

Tools for data archiving and data archivists

- Data Use Tutorial: http://www.icpsr.umich.edu/help/newuser.html

From ICPSR. Includes information on what a data definition statement is, and instructions on importing raw (ASCII) data files into SPSS etc.
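To illustrate what a data definition statement does, here is a minimal Python sketch that reads fixed-width raw (ASCII) records using a column layout. The layout, variable names and sample records are hypothetical; a real study's codebook supplies the actual column positions, and statistical packages such as SPSS use their own syntax for this.

```python
# Hypothetical column layout, playing the role of a data definition
# statement: variable name -> (start column, end column), 1-based and
# inclusive, as column positions are usually listed in a codebook.
LAYOUT = {
    "case_id": (1, 4),
    "age": (5, 6),
    "sex": (7, 7),
}


def parse_record(line, layout):
    """Slice one fixed-width line into named fields using the layout."""
    record = {}
    for name, (start, end) in layout.items():
        # Convert 1-based inclusive columns to a Python slice.
        record[name] = line[start - 1:end].strip()
    return record


# Two made-up raw records: 4-digit case ID, 2-digit age, 1-digit sex code.
raw_lines = ["0001342", "0002571"]
records = [parse_record(line, LAYOUT) for line in raw_lines]
print(records)
```

Without the layout, the raw file is only a string of digits; the column definitions are what turn it into named variables.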

- SDA: Survey Documentation & Analysis: http://csa.berkeley.edu:7502/

Manuals etc. on mounting SDA software (which has to be licensed).

- DDI: Data Documentation Initiative: http://www.icpsr.umich.edu/DDI/

This is a new standard for data documentation or meta-data. It involves XML tagging of each line of a codebook, enabling online retrieval of information at the variable level. From the site at ICPSR: "The Data Documentation Initiative (DDI) is a project to establish an international standard for social science documentation. The DDI site links to the most current version of the standard and also provides information on the committee process for developing the standard, a tag library to assist users in preparing DDI-compliant documents, and general information related to the eXtensible Markup Language (XML) and software tools."
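As a rough illustration of variable-level retrieval from an XML-tagged codebook, the Python sketch below looks up one variable in a small, made-up fragment. The element names are loosely modeled on DDI and simplified; treat them, the sample variables and the helper function as assumptions, and consult the DDI tag library for the actual structure.

```python
import xml.etree.ElementTree as ET

# A simplified, invented codebook fragment. Element names are assumptions
# loosely modeled on DDI, not a complete or validated DDI document.
CODEBOOK_XML = """
<codeBook>
  <dataDscr>
    <var name="V1">
      <labl>Respondent age</labl>
    </var>
    <var name="V2">
      <labl>Trust in government</labl>
      <catgry><catValu>1</catValu><labl>High</labl></catgry>
      <catgry><catValu>2</catValu><labl>Low</labl></catgry>
    </var>
  </dataDscr>
</codeBook>
"""


def describe_variable(xml_text, var_name):
    """Return the label and category labels for one variable, or None."""
    root = ET.fromstring(xml_text)
    for var in root.iter("var"):
        if var.get("name") == var_name:
            categories = {
                c.findtext("catValu"): c.findtext("labl")
                for c in var.findall("catgry")
            }
            return {
                "name": var_name,
                "label": var.findtext("labl"),
                "categories": categories,
            }
    return None


info = describe_variable(CODEBOOK_XML, "V2")
print(info)
```

Because each variable is tagged separately, a search tool can return the description of a single question without the user reading the whole codebook.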

- NESSTAR: http://www.nesstar.com

Nesstar software is an integrated suite of products aimed at facilitating the location and use of socio-economic, and similarly structured, data. It has been developed at the UK Data Archive in the University of Essex and the Norwegian Social Science Data Services in Bergen building upon the R&D of the EC funded NESSTAR and FASTER projects.

It allows users to browse distributed data catalogues over the Web, examine detailed information about the data (metadata), carry out simple data analysis (e.g. tabulations and graphical displays) and then download data, in whole or part, in one of a number of popular formats. The system contains registration and authentication facilities to filter access to data as necessary and a suite of data publishing and server management tools. [Description from website]

- IASSIST: http://datalib.library.ualberta.ca/iassist/

IASSIST, which stands for International Association for Social Science Information, Service, and Technology, is the professional organization for social science data archivists and data producers. It meets annually and produces a quarterly newsletter that has mainly published conference papers. Its workshops and programs focus on cutting-edge developments in the field. Membership is open to individuals. Limited funding is available on a competitive basis for international conference attendees.

Contact info:
Daniel C. Tsang
Social Science Data Librarian
380 Langson Library, University of California
PO Box 19557, Irvine CA 92623-9557 USA
Tel: 1-949-824-4978
Fax: 1-949-824-2700
E-mail: dtsang@uci.edu
Drafted: 14 March 2004; Revised 22 March 2004; Revised 25 March 2004

