Writing Systematic Reviews

It Takes a Village

By Diana L. Thompson
[Somatic Research]

I recently attended a workshop on conducting a systematic review (SR) at the Samueli Institute, a nonprofit organization in Alexandria, Virginia, that facilitates research on healing and wellness.1 Motivated by the growing body of massage research and lack of conclusive reviews, I wanted to determine if an individual like me, without a background in research, could assess the data and write a definitive SR that would help inform our profession. 

For the workshop, I was asked to identify an SR topic in advance and come prepared to begin work on the review onsite. I mentally scanned the topics of recent research conference presentations to see what stood out. Several hospitals have presented on using massage therapy in acute care settings; hospital massage has demonstrated immediate and measurable results, and potentially opens up job opportunities for massage therapists. Conclusive evidence on massage for acute care is critical as we attempt to integrate into hospital and medical clinic settings. Perhaps this would make an interesting SR.

I arrived in Virginia with the following goal in mind: to investigate the effectiveness of massage therapy for postoperative care in hospital settings, specifically on pain, function, patient satisfaction, and reduced hospital stays. 

What Is a Systematic Review?

An SR is a literature review focused on a research question that tries to identify, appraise, select, and synthesize all high-quality research evidence relevant to that question.2 The goal of an SR is to draw conclusions useful for clinical practice and policy-making, and identify future directions for research.3 Most importantly, a sound, conclusive SR is a tool to assist in improving the care you provide, marketing your practice, and enhancing access to your services.

SRs are a critical component of evidence-based health care, a topic often discussed in this column. In order for evidence to better inform practice, there must be consensus across a wide range of studies, not just a single study. The specific aims of an SR are to identify whether sufficient research exists on a particular topic, evaluate the body of research for methodological quality, and determine whether cross-study consensus is sufficient to draw meaningful conclusions. SRs can also identify gaps in the evidence that can then be used to inform future studies.

A good SR starts with a clearly defined research question; conducts a thorough, systematic search of the literature; rates the quality of all studies that meet the predetermined criteria; and critically presents both positive and negative findings. All this is done with a team of experts, including subject matter experts, reviewers, a librarian, and sometimes a statistician. 

Identifying a Research Question

An interesting and productive research question starts with a topic pertinent to a common clinical dilemma or condition. With postoperative somatic therapy, we might be concerned with feasibility (are hospitals willing to provide massage or bodywork to patients?) and with outcomes (how does massage benefit postoperative patients?).

The next step is to clearly define the research question. Which should we explore: massage therapy in general or specific bodywork modalities? Which measures and outcomes will best represent acute care outcomes: reducing pain, relieving stress, or improving function? The more specific the question, the more targeted the search and the greater the possibility for drawing meaningful conclusions.4 To clarify the research question, use a process of defining and refining the population, intervention, comparison, and outcomes, also known as PICO.

Population

Identify the characteristics of the population you wish to investigate, specifically the important characteristics of the typical patient: the condition or primary problem, the chief complaints or symptoms, and any added information that might influence the results. Consider age, ethnicity, gender, or groups that define the population, such as athletes, baby boomers, military personnel, twins, etc. 

These populations will also have several types of surgical interventions specific to their needs. For example, cesarean sections are specific to women between puberty and middle age. Instead of limiting the type of surgery, it makes more sense to limit the population. Until massage is more common in postoperative care hospital units, however, we are interested in people of all ages, ethnicities, and genders following any surgical intervention. 

Intervention

The intervention identifies the techniques or disciplines used to address postoperative care, the frequency and duration of the care provided, and the qualifications of the person(s) applying the techniques. 

Massage therapy is a general term used by most researchers, and often incorporates other somatic practices, such as craniosacral therapy, lymphatic drainage, and neuromuscular therapy. Energy work and techniques that are regulated differently are often segregated, such as aromatherapy, reflexology, reiki, Therapeutic Touch, etc., and can be searched separately. It is less common to find research on individual hands-on bodywork approaches. Therefore, rather than narrowing the search to a technique that may not yield very many results, I chose to use the general term massage and cast a wider net.  

Specifying dosing—when, how often, and how long the massage would be administered—further clarifies the research question. In order to have a direct effect on acute postoperative pain, I felt massage needed to be administered within three days of the surgical intervention, performed at least once, and applied for at least 20 minutes.

Finally, we must determine the criteria for those implementing the protocol. For this SR to be useful for influencing the integration of massage therapy in health-care settings, it seems important that the massage is performed by licensed or certified massage therapists, rather than untrained touch administered by family members or massage performed mechanically. I also decided to rule out nursing studies unless the nurses were dual-licensed as massage therapists.

The intervention is defined as non-specified massage performed by licensed or certified massage therapists at least once within three days of the surgical intervention for at least 20 minutes. 
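
As a rough illustration, the intervention definition above could be written down as an explicit checklist so that every reviewer applies it the same way. The Python sketch below is only one way to do that. The StudyIntervention record and its field names are hypothetical, invented for this example; the thresholds are simply the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass
class StudyIntervention:
    """Hypothetical record of how a study delivered massage."""
    provider_is_licensed_mt: bool  # licensed or certified massage therapist
    days_after_surgery: int        # day of first session, 0 = day of surgery
    sessions: int                  # number of massage sessions given
    minutes_per_session: int       # length of each session


def meets_intervention_criteria(s: StudyIntervention) -> bool:
    """Apply the intervention definition stated in the text."""
    return (
        s.provider_is_licensed_mt
        and s.days_after_surgery <= 3    # within three days of surgery
        and s.sessions >= 1              # performed at least once
        and s.minutes_per_session >= 20  # at least 20 minutes
    )


# Two made-up studies: one by a licensed therapist, one by an unlicensed provider.
therapist_led = StudyIntervention(True, 1, 2, 30)
unlicensed = StudyIntervention(False, 1, 2, 30)

print(meets_intervention_criteria(therapist_led))  # True
print(meets_intervention_criteria(unlicensed))     # False
```

Writing the criteria this explicitly also makes it obvious when a study report leaves out a detail, such as session length, that the reviewers need in order to decide.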

Comparison

Comparison specifies the type of control group used in the massage studies. Many SRs review randomized controlled trials or clinical comparison trials and ignore smaller pilot studies, making the control group an important factor in the article selection process. However, massage therapy does not easily lend itself to placebo trials, so it may be helpful to conduct reviews that include smaller studies, which may not include comparison groups, so we can begin to draw meaningful conclusions.

“Grey” literature includes studies that have yet to be published in peer-reviewed journals. Conference proceedings, dissertations, and studies that are still in process can be searched and included in the reviews. I have been citing grey literature in the form of conference proceedings in an attempt to bring forth current information, even though it has not yet been published in peer-reviewed journals. Articles that have not been peer-reviewed, however, have not been critically assessed for bias, and therefore warrant a rigorous review before inclusion in SRs.

Outcomes

Outcomes identify the measures that will be used to note the efficacy, success, or progress of the intervention. Pain is the primary symptom in postoperative patients. Measures for pain can vary, most commonly including a zero-to-10 numerical rating scale or a visual analog scale by which patients mark their current level of pain on a continuum between no pain and unbearable pain. Pain can also affect blood pressure, function, heart rate, and sleep, and contribute to anxiety, fatigue, stress, or tension, inviting many other effect measures. I refined my outcomes criteria to include three categories of information: pain, function (including but not limited to sleep), and anxiety (including but not limited to relaxation, stress, and tension).  

Selecting Articles for Review

The next step is to select articles to inform the review. First, identify relevant keywords or search terms. Here is the list of keywords culled from the work we did above by defining PICO: “massage,” “massage therapy,” “postsurgery,” “acute pain,” “sleep,” “patient satisfaction,” “stress,” “anxiety,” “depression.” 
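
One common way to turn a keyword list like this into a database search is to join synonyms with OR and join the separate concepts with AND. The Python sketch below shows the idea using the keywords above. How the keywords are split into three groups is an assumption made for this example, and each database has its own search syntax, so the resulting string would likely need adjusting (ideally with a librarian's help).

```python
def or_group(terms):
    """Join terms with OR, quoting each one so it is searched as a phrase."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(*groups):
    """AND together several OR-groups of keywords."""
    return " AND ".join(or_group(g) for g in groups)


# The grouping below is an assumption for illustration only.
intervention = ["massage", "massage therapy"]
population = ["postsurgery"]
outcomes = ["acute pain", "sleep", "patient satisfaction",
            "stress", "anxiety", "depression"]

query = build_query(intervention, population, outcomes)
print(query)
```

The printed string reads ("massage" OR "massage therapy") AND ("postsurgery") AND ("acute pain" OR "sleep" OR "patient satisfaction" OR "stress" OR "anxiety" OR "depression"). Saving the exact string that was run against each database is an easy way to document the search.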

Next, identify which databases to search. In order to conduct a comprehensive search and include as many articles on the subject as possible, a wide range of databases should be included. I only have access to publicly available databases, and can only afford to include articles that are open access—this was my first clue that I needed help to conduct an SR. 

Methodological bias is the biggest threat to the validity of an SR.5 When selecting articles for review, include all that meet the predetermined selection criteria. It is unacceptable to rule out articles because of poor outcomes, and positive results should never be overstated, whether to gain favor with a publisher or to enhance the outcomes. 

SRs often exclude studies if they do not conform to certain study designs, are not written in English, or do not occur within a certain timeframe.6 The best tool for mitigating bias is transparency. Define eligibility requirements in advance, state every step of the SR process explicitly, and document how each decision was made. A description of the process must be included in the final manuscript.

Screening, Reviewing, and Interpreting

Once the citations and abstracts of all the results of the keyword searches are compiled, create a list of questions to standardize the screening process. The goal is to create a standardized process for determining whether the titles and abstracts match the inclusion/exclusion criteria. Those that are clearly irrelevant are excluded, full-text papers are obtained for the remaining articles, and the criteria are applied again. Those that meet the criteria are included in the review and move on to the scoring phase.7 
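
The screening step described above can be pictured as running every record through the same short list of questions and writing down the outcome. The Python sketch below is a simplified illustration, not a real screening tool. The two questions, the record layout, and the three abstracts are all made up for this example, and real screening relies on a reviewer's judgment rather than keyword matching.

```python
from collections import Counter


# Hypothetical screening questions; each returns True when the record passes.
def mentions_massage(record):
    return "massage" in record["text"].lower()


def is_postoperative(record):
    text = record["text"].lower()
    return "postoperative" in text or "surgery" in text


# Each question is paired with the exclusion reason logged when it fails.
QUESTIONS = [
    ("not about massage", mentions_massage),
    ("not postoperative", is_postoperative),
]


def screen(records):
    """Return (included, log); log holds one (id, decision) pair per record."""
    included, log = [], []
    for record in records:
        # The first failed question becomes the documented exclusion reason.
        reason = next((why for why, ok in QUESTIONS if not ok(record)), None)
        if reason is None:
            included.append(record)
            log.append((record["id"], "include"))
        else:
            log.append((record["id"], "exclude: " + reason))
    return included, log


# Made-up abstracts, not real studies.
abstracts = [
    {"id": "A", "text": "Massage after cardiac surgery and pain scores"},
    {"id": "B", "text": "Massage for chronic low-back pain"},
    {"id": "C", "text": "Music therapy following surgery"},
]

kept, decisions = screen(abstracts)
print([r["id"] for r in kept])           # ['A']
print(Counter(d for _, d in decisions))  # tally of decisions for the write-up
```

The same function could be run a second time on the full-text papers, and the decision log gives the team a record of why each article was excluded.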

Next, the articles are scored independently by at least two reviewers. Multiple reviewers are essential for limiting bias, which makes it impossible for me, alone, to write an SR. Standardized, objective tools are available for scoring articles, or a customized checklist can be created. Every reviewer uses the same tool, and conflicts are tracked and discussed until consensus is reached.8 The review team stays in close communication throughout the rest of the process.
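
To show what tracking conflicts between reviewers might look like in practice, here is a small Python sketch that compares two reviewers' ratings and lists the studies they need to discuss. The rating labels and study identifiers are invented, and the sketch assumes both reviewers rated exactly the same set of studies. Percent agreement is used here only because it is simple; it is not a measure the workshop materials are known to prescribe.

```python
def find_conflicts(scores_a, scores_b):
    """Return the study ids where the two reviewers' ratings differ."""
    return sorted(sid for sid in scores_a if scores_a[sid] != scores_b[sid])


def percent_agreement(scores_a, scores_b):
    """Share of studies on which both reviewers gave the same rating."""
    agreed = sum(scores_a[sid] == scores_b[sid] for sid in scores_a)
    return agreed / len(scores_a)


# Invented ratings for four hypothetical studies.
reviewer_1 = {"S1": "high", "S2": "low", "S3": "moderate", "S4": "high"}
reviewer_2 = {"S1": "high", "S2": "moderate", "S3": "moderate", "S4": "high"}

print(find_conflicts(reviewer_1, reviewer_2))     # ['S2']
print(percent_agreement(reviewer_1, reviewer_2))  # 0.75
```

Only study S2 would go to discussion in this example; the other three ratings already match.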

Reviewers assess the internal validity (the outcome is a direct result of the intervention), external validity (the outcomes are generalizable to a larger population), and model validity (the study is representative of practice). Together, these determine the strength of the research data. Quantity and quality are both key: a larger body of high-quality research supports stronger, more definitive conclusions.

Once the studies are scored, the results are summarized into tables. All reviewers must gather at this point and interpret the data together. This is the time to critically appraise both positive and negative findings, discuss the results, and draw conclusions. Everyone participates in writing the report, dividing up the sections according to roles in the project, and the final document is edited and approved as a group.  

Collaboration in Quality

Through this valuable experience, I learned that it takes a village to write a quality SR, one that can advance the profession, help identify policy, and inform best practices. While I will continue to write articles on my own that summarize research on topics of interest, I more fully understand the difference between translating a few articles and conducting a comprehensive SR.

I am encouraged and motivated by the newfound relationship with the Samueli Institute. At any time, a group within the profession can put forth a project by identifying a topic of interest, gathering the necessary funds, appointing a primary investigator to lead the project, and enlisting an organization like the Samueli Institute to provide the necessary tools and skilled personnel to conduct a defensible SR. I believe the time has come for more conclusive reviews on somatic therapies. 


1. Samueli Institute, “News and Information,” accessed August 2012, www.siib.org/news/news-home/120-SIIB.html.

2. Deborah J. Cook et al., “Systematic Reviews: Synthesis of Best Evidence for Clinical Decisions,” Annals of Internal Medicine 126, no. 6 (1997): 376–80.

3. C. Crawford, S. Jain, and W.B. Jonas, Introduction to Systematic Reviews Workbook (Samueli Institute, 2012). Author workbook for private seminar.

4. University of Southern California, “Asking a Good Question (PICO),” accessed August 2012, www.usc.edu/hsc/ebnet/ebframe/PICO.htm.

5. Crawford, Introduction to Systematic Reviews Workbook.

6. Ibid.

7. Ibid.

8. Ibid.

  A licensed massage practitioner since 1984, Diana L. Thompson has created a varied and interesting career out of massage: from specializing in pre- and postsurgical lymph drainage to teaching, writing, consulting, and volunteering. Her consulting includes assisting insurance carriers on integrating massage into insurance plans and educating researchers on massage therapy theory and practice to ensure research projects and protocols are designed to match how we practice. Contact her at soapsage@comcast.net.