Research Guides: Systematic Reviews & Evidence Synthesis: Step 7: Data Extraction

Step 7: Data Extraction (DE)

In Step 7, you will collect preselected information about included studies in a table format to summarize them and make them easier to compare and synthesize. You will: 

Choose the data you want collected from each study.
Choose a method for collecting the data.
Create the data extraction table (ideally, this step includes testing the table).
Collect or "extract" the data. 
Review the data for any errors.

How a librarian can help with this step

During Step 7, ZSR can advise you on:

What the data extraction stage entails
Finding examples of completed data tables in the literature
Choosing what data to extract from your included articles
Creating a random sample of citations for beta testing the DE phase before full implementation
Best practices for reporting the data you choose to extract
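
As an illustration of drawing a random sample of citations for beta testing, here is a minimal Python sketch. The citation IDs, the sample size, and the function name `pilot_sample` are all hypothetical; the fixed seed is only one way to let the whole team reproduce the same draw.

```python
import random

def pilot_sample(citation_ids, k=10, seed=2024):
    """Return k citation IDs drawn at random, without replacement."""
    rng = random.Random(seed)        # fixed seed so the draw can be repeated
    k = min(k, len(citation_ids))    # never ask for more items than exist
    return sorted(rng.sample(list(citation_ids), k))

# Hypothetical IDs standing in for the studies that passed screening
included = [f"REF{n:03d}" for n in range(1, 51)]
sample = pilot_sample(included, k=5)
print(sample)
```

Each extractor would then complete the draft form for the sampled studies, and the team would revise the form before extracting from the full set.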

What is data extraction (DE) all about?

In the DE step, your SR team will develop (1) an evidence table, which gives detailed information for each study included in your review. You will then create (2) a summary table, which gives a high-level overview of the findings of your review; this is the table that is often included in published SRs.

Project teams often use their research question framework - like PICO - to determine what data to extract from their studies. Whatever framework you use, your tables should describe study characteristics and results, because these will help you determine which studies are eligible for synthesis.
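
To make the idea of a framework-driven table concrete, the sketch below builds a blank extraction table with one column per PICO element plus a few study-level columns. The column names and study IDs are assumptions chosen for illustration, not a required template; adapt them to your own research question.

```python
import csv
import io

# Hypothetical column names: PICO elements plus basic study details
COLUMNS = [
    "study_id", "extractor", "extraction_date", "study_design",
    "population", "intervention", "comparison", "outcome",
    "results", "notes",
]

def blank_extraction_table(study_ids):
    """Return CSV text with a header row and one empty row per study."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for study_id in study_ids:
        writer.writerow({"study_id": study_id})  # other cells stay empty
    return buffer.getvalue()

template = blank_extraction_table(["Smith2020", "Lee2021"])
print(template)
```

The resulting text can be saved as a .csv file and opened in Excel or Google Sheets.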

Best practice (as with the screening step) suggests having at least two reviewers extract the data to reduce errors and bias in your review. This step requires a lot of planning but can be made more efficient by using DE tools. Also, don't forget that beta testing this step (and the tools you choose to use) can make a big difference in project efficiency. Read on!

Data extraction (DE) tools

Systematic Review Software (Covidence)

Pros

FREE unlimited use for wfu.edu and wakehealth.edu users
Single system for all SR review elements (screening, extraction, etc.)
Automatically highlights discrepancies
Helps assure blinding during the extraction process
Side-by-side PDF / data extraction panels

Cons

Customizing DE forms can take time and training (although training is readily available from Covidence)

Excel Spreadsheets, Google Sheets

Pros

FREE! Microsoft programs are free for WFU users and WFU is a Google-friendly campus
Quick start - easy to learn, use, and customize

Cons

Reviewers need to manually find and resolve discrepancies (may increase errors and decrease accuracy)
Increased potential for bias (these platforms don't have embedded blinding tools)
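
For teams that do extract in spreadsheets, part of the manual discrepancy check can be scripted. The following is a rough sketch, assuming each reviewer's sheet has been loaded into a dictionary keyed by study ID; the field names and values are invented for the example, and the function `find_discrepancies` is hypothetical, not a feature of Excel or Google Sheets.

```python
def find_discrepancies(reviewer_a, reviewer_b):
    """Compare two {study_id: {field: value}} dicts and list mismatches."""
    discrepancies = []
    for study_id in sorted(set(reviewer_a) | set(reviewer_b)):
        row_a = reviewer_a.get(study_id, {})
        row_b = reviewer_b.get(study_id, {})
        for field in sorted(set(row_a) | set(row_b)):
            # Ignore differences in case and surrounding spaces only
            value_a = str(row_a.get(field, "")).strip().lower()
            value_b = str(row_b.get(field, "")).strip().lower()
            if value_a != value_b:
                discrepancies.append(
                    (study_id, field, row_a.get(field), row_b.get(field))
                )
    return discrepancies

# Invented example data for two reviewers
sheet_a = {
    "Smith2020": {"design": "RCT", "n_randomized": "120"},
    "Lee2021": {"design": "Crossover", "n_randomized": "64"},
}
sheet_b = {
    "Smith2020": {"design": "rct ", "n_randomized": "102"},
    "Lee2021": {"design": "Crossover", "n_randomized": "64"},
}
conflicts = find_discrepancies(sheet_a, sheet_b)
for study_id, field, a, b in conflicts:
    print(f"{study_id} / {field}: reviewer A = {a!r}, reviewer B = {b!r}")
```

A script like this only flags disagreements; the reviewers still need to return to the source report to resolve each one.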

Poll Everywhere, Qualtrics, Google Forms

Pros

FREE! Poll Everywhere and Qualtrics are free for WFU users and WFU is a Google-friendly campus
Offers blinding between reviewers during DE (and therefore decreased potential for bias)

Cons

Free versions may have limited options and capabilities
Data will need to be downloaded in a usable format (all of these tools offer an export function, but the export formats are not consistent across tools)

Microsoft Word, Google Docs

Pros

FREE! Microsoft programs are free for WFU users and WFU is a Google-friendly campus
Quick start - easy to learn, use, and customize

Cons

Reviewers need to manually find and resolve discrepancies (may increase errors and decrease accuracy)
Increased potential for bias (these platforms don't have embedded blinding tools)

Cochrane RevMan

RevMan offers collection forms for population, interventions, and outcomes, quality assessments / risk of bias, AND for data for analysis and forest plots. RevMan is a free software download and Cochrane offers a comprehensive Knowledge Base for support and training.

Pros

FREE!
Compatible with Covidence
Includes tools to assist with writing the full review

Cons

Project teams would have to learn how to use the software
Form elements are not customizable; data must be entered manually.

What data should be extracted?

Extract the data that is relevant to answering your research question. A good practice is to review your exemplar SRs for guidance on what data to collect. The following items are listed in Table 5.3.a of the Cochrane Handbook for Systematic Reviews of Interventions and should be considered when choosing the types of data to extract during this phase.

Data Type	Descriptions
Information about DE from reports	Name of data extractors, date of data extraction, and identification features of each report from which data are being extracted
Eligibility Criteria	Confirm eligibility of the study for the review • Reason for exclusion
Study Methods	Study design (parallel, factorial, crossover, cluster aspects of design for randomized trials, and/or study design features for non-randomized studies; single or multicentre study; if multicentre, number of recruiting centres) • Recruitment and sampling procedures used • Enrolment start and end dates; length of participant follow-up • Details of random sequence generation, allocation sequence concealment, and masking for randomized trials, and methods used to prevent and control for confounding, selection biases, and information biases for non-randomized studies* • Methods used to prevent and address missing data* • Statistical analysis: unit of analysis (e.g. individual participant, clinic, village, body part); statistical methods used if computed effect estimates are extracted from reports, including any covariates included in the statistical model • Likelihood of reporting and other biases* • Source(s) of funding or other material support for the study • Authors’ financial relationship and other potential conflicts of interest
Participants	Setting • Region(s) and country/countries from which study participants were recruited • Study eligibility criteria, including diagnostic criteria • Characteristics of participants at the beginning (or baseline) of the study (e.g. age, sex, comorbidity, socio-economic status)
Interventions	Description of the intervention(s) and comparison intervention(s), ideally with sufficient detail for replication: • Components, routes of delivery, doses, timing, frequency, intervention protocols, length of intervention • Factors relevant to implementation (e.g. staff qualifications, equipment requirements) • Integrity of interventions (i.e. the degree to which specified procedures or components of the intervention were implemented as planned) • Description of co-interventions • Definition of ‘control’ groups (e.g. no intervention, placebo, minimally active comparator, or components of usual care) • Components, dose, timing, frequency • For observational studies: description of how intervention status was assessed; length of exposure, cumulative exposure
Outcomes	For each pre-specified outcome domain (e.g. anxiety) in the systematic review: • Whether there is evidence that the outcome domain was assessed (especially important if the outcome was assessed but the results not presented; see Chapter 13) • Measurement tool or instrument (including definition of clinical outcomes or endpoints); for a scale, name of the scale (e.g. the Hamilton Anxiety Rating Scale), upper and lower limits, and whether a high or low score is favourable, definitions of any thresholds if appropriate • Specific metric (e.g. post-intervention anxiety, or change in anxiety from baseline to a post-intervention time point, or post-intervention presence of anxiety (yes/no)) • Method of aggregation (e.g. mean and standard deviation of anxiety scores in each group, or proportion of people with anxiety) • Timing of outcome measurements (e.g. assessments at end of eight-week intervention period, events occurring during the eight-week intervention period) • Adverse outcomes need special attention depending on whether they are collected systematically or non-systematically (e.g. by voluntary report)
Results	For each group, and for each outcome at each time point: number of participants randomly assigned and included in the analysis; and number of participants who withdrew, were lost to follow-up or were excluded (with reasons for each) • Summary data for each group (e.g. 2×2 table for dichotomous data; means and standard deviations for continuous data) • Between-group estimates that quantify the effect of the intervention on the outcome, and their precision (e.g. risk ratio, odds ratio, mean difference) • If subgroup analysis is planned, the same information would need to be extracted for each participant subgroup
Miscellaneous	Key conclusions of the study authors • Reference to other relevant studies • Correspondence required • Miscellaneous comments from the study authors or by the review authors

*Full description required for assessments of risk of bias (see Chapters 8, 23 and 25 of the Cochrane Handbook).
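
The Results row above mentions 2×2 tables and between-group estimates such as the risk ratio and odds ratio. As a sketch of how extracted 2×2 counts relate to those estimates (useful, for example, when checking an extracted estimate against the summary data in the same report), the code below uses the standard large-sample formulas with a 95% confidence interval on the log scale. The counts are invented, the function names are hypothetical, and the formulas fail when any cell is zero; this is not a substitute for the analysis tools your team chooses.

```python
import math

def risk_ratio(a, b, c, d, z=1.96):
    """Risk ratio and approximate 95% CI from a 2x2 table.

    a, b = events / non-events in the intervention group
    c, d = events / non-events in the control group
    Assumes no cell is zero.
    """
    rr = (a / (a + b)) / (c / (c + d))
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    low = math.exp(math.log(rr) - z * se)
    high = math.exp(math.log(rr) + z * se)
    return rr, low, high

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from the same 2x2 table."""
    odds = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds) - z * se)
    high = math.exp(math.log(odds) + z * se)
    return odds, low, high

# Invented counts: 15 of 100 events (intervention), 30 of 100 (control)
rr, rr_low, rr_high = risk_ratio(15, 85, 30, 70)
or_est, or_low, or_high = odds_ratio(15, 85, 30, 70)
print(f"RR = {rr:.2f} ({rr_low:.2f} to {rr_high:.2f})")
print(f"OR = {or_est:.2f} ({or_low:.2f} to {or_high:.2f})")
```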