Research Data Management

Ethical Considerations

National Statement on Ethical Conduct in Human Research (2007) (Updated May 2015)

Section 3: Ethical considerations specific to research methods or fields.

"This section discusses various research methods and fields. Some chapters are a result of the further expansion of this revised National Statement beyond health and medical research. The focus is on general principles – the section is not intended to be exhaustive. It reflects the interdisciplinary nature of many types of research and the use, in some research projects, of a number of different research methods."

Internet-based research is a relatively new area, and its ethical practice deserves careful consideration.
Convery, I., & Cox, D. (2012). A review of research ethics in internet-based research. Practitioner Research in Higher Education, 6(1), 50-57.
Bond, C. S., Ahmed, O. H., Hind, M., Thomas, B., & Hewitt-Taylor, J. (2013). The conceptual and practical ethical dilemmas of using health discussion board posts as research data. Journal of Medical Internet Research, 15(6), e112.
DOI: 10.2196/jmir.2435
Clark, K., Duckham, M., Guillemin, M., Hunter, A., McVernon, J., O’Keefe, C., Pitkin, C., Prawer, S., Sinnott, R., Warr, D., & Waycott, J. (2015). Guidelines for the ethical use of digital data in human research. The University of Melbourne.

Maintaining Integrity

This learning module, from the Responsible Conduct in Data Management tutorial developed by Northern Illinois University, describes issues related to maintaining the integrity of the data collection process.

"Data collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes. The data collection component of research is common to all fields of study including physical and social sciences, humanities, business, etc. While methods vary by discipline, the emphasis on ensuring accurate and honest collection remains the same."

SAGE Research Methods Online

Anonymising Data

The broad term ‘anonymisation’ can be used to cover various techniques which convert personal data into de-identified data, to assist in making rich data resources available whilst protecting individuals’ privacy.

Some common techniques are described here. Each has pros and cons, and none is guaranteed to remove risk completely: the remaining risk depends on the data and on any additional information an intruder has access to.

The information here is drawn from Anonymisation: managing data protection risk code of practice, published by the UK Information Commissioner’s Office, which gives more detail about each of these techniques as well as case studies.

  • Data masking
    This involves stripping out obvious personal identifiers, such as names, from a piece of information to create a data set in which no personal identifiers are present. Masking can be partial, or can use a linking ‘key’.
  • Pseudonymisation
    De-identifying data so that a coded reference or pseudonym is attached to a record to allow the data to be associated with a particular individual without the individual being identified.
  • Aggregation
    Data is displayed as totals, so no data relating to or identifying any individual is shown. Small numbers in totals are often suppressed through ‘blurring’ or by being omitted altogether.
  • Derived data items and banding
    Derived data is a set of values that reflect the character of the source data but hide the exact original values. This is usually done by using banding techniques to produce coarser-grained descriptions of values than in the source dataset, e.g. replacing dates of birth with ages or years, replacing addresses with areas of residence or wards, using partial postcodes, or rounding exact figures so they appear in a normalised form.
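
The four techniques above can be sketched in a few lines of Python. This is a minimal illustration only, not taken from the ICO code of practice: the records, field names, secret key, 10-year age bands, two-character postcode prefix and suppression threshold of 3 are all hypothetical choices made for the example. A keyed hash (HMAC) is used here as one possible way to generate pseudonyms; the key plays the role of the linking ‘key’ and would need to be stored separately from any shared data.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical records: every name, field and value is invented for illustration.
RECORDS = [
    {"name": "Ana Lee", "birth_year": 1984, "postcode": "2150", "condition": "asthma"},
    {"name": "Ben Ono", "birth_year": 1979, "postcode": "2151", "condition": "asthma"},
    {"name": "Cy Park", "birth_year": 1990, "postcode": "2750", "condition": "asthma"},
    {"name": "Di Wong", "birth_year": 1962, "postcode": "2751", "condition": "diabetes"},
]

# Assumption: the key is held securely and never released with the data.
SECRET_KEY = b"replace-with-a-securely-stored-key"


def mask(record):
    """Data masking: drop the obvious direct identifier (here, the name)."""
    return {k: v for k, v in record.items() if k != "name"}


def pseudonymise(record):
    """Pseudonymisation: replace the name with a coded reference.

    The same name always maps to the same code, so records can still be
    linked, but the code cannot be recomputed without the secret key.
    """
    digest = hmac.new(SECRET_KEY, record["name"].encode("utf-8"), hashlib.sha256)
    out = mask(record)
    out["pseudonym"] = digest.hexdigest()[:12]
    return out


def band(record, reference_year=2015):
    """Banding: coarsen birth year to a 10-year age band and truncate the postcode."""
    age = reference_year - record["birth_year"]
    low = (age // 10) * 10
    out = mask(record)
    del out["birth_year"]
    out["age_band"] = f"{low}-{low + 9}"
    out["postcode"] = record["postcode"][:2] + "**"
    return out


def aggregate(records, field, min_count=3):
    """Aggregation: show totals only, suppressing counts below min_count."""
    counts = Counter(r[field] for r in records)
    return {k: (n if n >= min_count else f"<{min_count}") for k, n in counts.items()}


if __name__ == "__main__":
    print(mask(RECORDS[0]))
    print(pseudonymise(RECORDS[0]))
    print(band(RECORDS[0]))
    print(aggregate(RECORDS, "condition"))
```

As noted above, none of these steps removes risk on its own. For instance, the banded record still carries a condition and a partial postcode, which may identify someone in a small population.
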
Remember to factor in the time and money to prepare data for sharing.

A recent paper published in Trials (BioMed Central) examined these factors and, from two examples, established that around 40-50 hours of staff time and £2,000 - £3,000 were needed to prepare patient-level data from clinical trials.

The Anonymisation Decision-Making Framework

Preparing Data For Sharing: Guide to Social Science Data Archiving

Data Documentation

The UK Data Archive provides a comprehensive overview of data documentation at various levels of granularity.

"A crucial part of making data user-friendly, shareable and with long-lasting usability is to ensure they can be understood and interpreted by any user. This requires clear data description, annotation, contextual information and documentation."

File Formats and File Naming

© Western Sydney University, unless otherwise attributed.
Library guide created by Western Sydney University Library staff is licensed under a Creative Commons Attribution 4.0 International (CC BY) licence.