Summary of Trial Bank Project

Trial Banks: An Informatics Infrastructure for Evidence-Based Medicine

Keywords: Evidence-based medicine, meta-analysis, electronic publication, heterogeneous databases, decision-support systems, ontology

This doctoral dissertation is available in PDF here (4.7 megabytes).


Randomized controlled trials (RCTs) are one of the best sources of evidence for the scientific practice of medicine, but substantial gaps exist between everyday practice and "best practice" as defined by RCTs [1].

In recent years, computer-based approaches to managing the clinical literature have become widespread, and hopes are high that information technology can help close the proof-to-practice gap and improve health care quality. However, because clinical research results are published only as text, which computers cannot interpret, computers are effectively blind to the evidence they are supposed to help clinicians apply. Without "knowing" the clinical details of RCTs, today's decision support systems cannot deliver RCT evidence as selectively and contextually as practicing clinicians want [2,3].

Furthermore, in many situations, clinicians are best served by information from systematic reviews that critically appraise and synthesize evidence from all relevant RCTs. However, systematic reviews are difficult and time-consuming to perform, because of incomplete reporting of study methods and results [4,5] and because information must be manually abstracted from text articles. Although reporting problems have decreased with the widely adopted CONSORT reporting recommendations [6], systematic reviewers such as those from the Cochrane Collaboration could benefit greatly from complete databases of RCT information.

In summary, while billions of dollars are spent every year conducting RCTs, the results of these RCTs are difficult for both clinicians and systematic reviewers to find, interpret, or apply to clinical practice. The result is an inefficient transfer of evidence from research to practice, and missed opportunities for improving health outcomes [1].

The Trial Bank Project

The Trial Bank Project captured RCT information into RCT Bank, an electronic knowledge base specifically designed to support systematic reviewing and evidence-based practice [7]. The goal of the Trial Bank Project was to demonstrate the value of RCT Bank as a computer-understandable repository of detailed RCT information, and to demonstrate proof of concept for sustainable ways to capture a wide variety of RCTs into RCT Bank.

RCT Bank and RCT Schema

The data model for RCT Bank (called RCT Schema) was designed to contain all the trial information needed for rigorously applying RCT evidence to practice, as derived from a task analysis of systematic reviewing. RCT Bank collected information about a trial's protocol (e.g., intervention, outcomes, eligibility criteria), execution, follow-up, and summary and/or subject-level results, to allow computers to

  • analyze trial information more completely
  • identify and retrieve trials more accurately
  • facilitate critical appraisal of a trial
  • facilitate systematic review across trials
  • provide a knowledge base for clinical decision support
  • identify potential new findings and gaps in the evidence through data mining and other knowledge discovery methods
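
As one illustration of what structured, computer-readable results make possible for systematic review, the sketch below pools a dichotomous outcome across two-arm trials using a fixed-effect, inverse-variance meta-analysis of log odds ratios. The record type TwoArmResult, its field names, and the function names are hypothetical; they are not taken from RCT Schema or from any RCT Bank API.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class TwoArmResult:
    """Event counts for one dichotomous outcome in a two-arm trial (hypothetical record)."""
    trial_id: str
    events_treat: int
    n_treat: int
    events_ctrl: int
    n_ctrl: int


def log_odds_ratio(r: TwoArmResult) -> Tuple[float, float]:
    """Return (log odds ratio, variance of the log odds ratio) for one trial."""
    a = float(r.events_treat)
    b = float(r.n_treat - r.events_treat)
    c = float(r.events_ctrl)
    d = float(r.n_ctrl - r.events_ctrl)
    if min(a, b, c, d) == 0:
        # Continuity correction: add 0.5 to every cell when any cell is empty.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance


def pool_fixed_effect(results: Iterable[TwoArmResult]) -> Tuple[float, float, float]:
    """Return (pooled odds ratio, lower 95% limit, upper 95% limit)."""
    weight_sum = 0.0
    weighted_sum = 0.0
    for r in results:
        log_or, variance = log_odds_ratio(r)
        weight = 1.0 / variance  # inverse-variance weight
        weight_sum += weight
        weighted_sum += weight * log_or
    if weight_sum == 0.0:
        raise ValueError("at least one trial is required")
    pooled = weighted_sum / weight_sum
    se = math.sqrt(1.0 / weight_sum)
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se))
```

With results held as structured records, pooling reduces to one call such as pool_fixed_effect(records), with no manual abstraction of counts from article text. A real systematic review would also assess heterogeneity and trial quality, which this sketch omits.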

As an open-access repository of RCT evidence for computer-assisted decision support systems, RCT Bank contained much more information than other RCT databases that aim to increase subject enrollment or to catalog trials, used a standard medical vocabulary (UMLS) to code clinical concepts, and could be queried directly by other machines through its Java or Perl-based APIs [8].

RCT Schema is capable of capturing multi-armed trials, and a wide range of interventions (e.g., drugs, procedures, devices, behavioral interventions) and outcomes (e.g., dichotomous, continuous, categorical, survival, cost) in any clinical discipline.
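
To make the scope described above concrete, the following is a minimal sketch of a trial record that allows any number of arms and the listed intervention and outcome types. All class names, field names, and the example trial are hypothetical illustrations; they do not reproduce the actual RCT Schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class InterventionType(Enum):
    DRUG = "drug"
    PROCEDURE = "procedure"
    DEVICE = "device"
    BEHAVIORAL = "behavioral"


class OutcomeType(Enum):
    DICHOTOMOUS = "dichotomous"
    CONTINUOUS = "continuous"
    CATEGORICAL = "categorical"
    SURVIVAL = "survival"
    COST = "cost"


@dataclass
class Intervention:
    name: str
    kind: InterventionType
    # Placeholder for a code from a standard vocabulary; left unset here.
    concept_code: Optional[str] = None


@dataclass
class Arm:
    label: str
    interventions: List[Intervention]
    n_randomized: int


@dataclass
class Outcome:
    name: str
    kind: OutcomeType


@dataclass
class Trial:
    title: str
    eligibility_criteria: List[str]
    arms: List[Arm]          # any number of arms, not only two
    outcomes: List[Outcome]

    def is_multi_armed(self) -> bool:
        """True when the trial compares more than two arms."""
        return len(self.arms) > 2


# Illustrative three-arm trial with made-up content.
trial = Trial(
    title="Example three-arm trial",
    eligibility_criteria=["adults aged 18 or older"],
    arms=[
        Arm("drug", [Intervention("example drug", InterventionType.DRUG)], 100),
        Arm("counseling", [Intervention("example counseling", InterventionType.BEHAVIORAL)], 100),
        Arm("control", [], 100),
    ],
    outcomes=[
        Outcome("event within one year", OutcomeType.DICHOTOMOUS),
        Outcome("change in a lab value", OutcomeType.CONTINUOUS),
    ],
)
```

Modeling arms and outcomes as lists, with the type carried by an enumeration, is one simple way to keep a single record shape usable across clinical disciplines.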

Trial Bank Publishing

We collaborated with JAMA and the Annals of Internal Medicine (Annals) to capture into RCT Bank the RCTs they publish. In the first phase of this collaboration, Trial Bank Project staff entered data from the articles into RCT Bank using a secure, web-based program called Bank-a-Trial. As trial-bank publishing takes hold, we anticipate that authors will themselves submit RCT Bank entries in conjunction with submitting their manuscripts for peer review. RCT Bank also includes trials that are directly acquired from trial investigators independent of journal publishing.

Journal-associated databank publishing already exists. Major molecular biology journals require that authors submit their genomic sequence data directly into a database called GenBank. When the paper-based article is published, the corresponding GenBank accession number is appended to it, and readers have immediate and direct access to the sequence data to which the article refers.

Between January 2002 and July 2003, twelve JAMA and two Annals RCTs were co-published in RCT Bank and could be browsed using RCT Presenter. Overall, 51% of authors contacted agreed to participate. The participation rate increased substantially (from 43% to 76%) after the first trial was published in Presenter. In 2003 we conducted a web-based survey that asked users to evaluate a clinical trial using both Presenter and the journal article. Seventy percent of the respondents rated Presenter as good as or better than the journal article for all the attributes evaluated.

External Collaborations

The Trial Bank Project collaborated with the National Center for Biomedical Ontology (NCBO) as a Driving Biological Project to stimulate the development of ontology-based services and technology for Analyzing Evidence from HIV Trials. CTeXplorer is one of the products of this collaboration.

We also collaborated with the Electronic Primary Care Research Network (ePCRN) project, which seeks to build an electronic infrastructure to facilitate the recruitment of subjects and the performance of randomized trials in any U.S. primary care practice with web access.


RCT Bank captured essential RCT information into an open-access, electronic knowledge base that was specifically designed to support rigorous, computer-assisted evidence-based practice. We promoted the case that simply publishing science in electronic text is no longer good enough for the needs of clinical care and research. The Trial Bank Project made the following contributions:

  • defined the first detailed information-model ontology of the design and results of randomized trials. This work directly led to our current work on the Ontology of Clinical Research.
  • demonstrated proof of concept of trial-bank publishing as a new publishing model, founded on the principle that scientific knowledge should be disseminated in the form that best facilitates its use
  • promoted the idea that RCTs can and should be computerized at a large scale

The Human Studyome Project is now the follow-on to the Trial Bank Project.

Funding: LM-06780 from the National Library of Medicine and the United States Presidential Early Career Award in Science and Engineering.
Past funding: VA Health Services Research and Development Service, LM-05305 from the National Library of Medicine, and HS-08362 from the Agency for Healthcare Research and Quality.


