Task II: Judging Internal Validity



Competencies

  1. Was the statistical design of the trial appropriate?
  2. Was there any intervention assignment bias?
  3. Were the intervention groups comparable?
  4. Was there any intervention-related bias?
  5. Were there co-interventions that may have confounded the results?
  6. Are the outcome variables meaningful?
  7. Was there any outcome assessment or measurement bias?
  8. Was there any follow-up bias?
  9. Were the results analyzed appropriately?
  10. What biases might the trial personnel have introduced?
  11. Is the trial internally valid?

Competency Decomposition

Competency A: Was the statistical design of the trial appropriate?
Subcompetency
Justification
Data Requirement
1. What were the study hypotheses? Primary hypothesis is the question the trial was most designed to answer a. primary hypothesis
Secondary hypotheses are the questions that data were collected for, but not necessarily for a definitive answer b. secondary hypotheses 
Findings for post-hoc hypotheses are less persuasive than for a priori hypotheses c. post-hoc hypotheses
2. Were the analysis groups and subgroups appropriate? Specification of a priori subgroups should relate to the study hypotheses a. a priori subgroups
Findings for post-hoc subgroup analyses are less persuasive than for a priori ones b. post-hoc subgroups
3. Was the trial designed with reasonable power to answer the primary hypothesis? The outcome on which the power calculations are performed should be related to the primary hypothesis a. powered outcome
The difference in effect size that the trial is designed to detect should be clinically significant b. hypothesized difference in effect size 
A higher alpha (typically 0.05) gives a greater chance of a false positive result in one or both directions c. i. alpha level, ii. one- or two-tailed
A higher power (typically 0.80, i.e. beta = 0.20) gives a lower chance of a false negative result d. power
The method for calculating a target sample size depends on the type of variable being analyzed e. i. target sample size, and ii. method of calculation
Target enrollment is the recruitment goal f. target enrollment
Target enrollment may have to be inflated above the target sample size to allow for enrollment refusals, dropouts, etc. These data also help future planning for related trials. g. explanation of any difference between target sample size and target enrollment
The actual power achieved by the trial depends on the sample size achieved, and may lead to less negative predictive value than anticipated h. actual sample size
4. Was the trial monitored appropriately? Information on the monitoring committee(s) is needed to see who performed the interim analyses and who had the power to stop the trial a. i. name, and ii. makeup of monitoring committee(s)
Details of interim analysis plans are needed to assess whether bias may have been introduced in the subsequent conduct of the trial b. interim analysis plans
How the results affected execution of the trial is helpful for determining presence of any bias c. procedure for reporting to investigators findings of monitoring
If no committee members were trained in statistics, they may miss errors d. statisticians on monitoring committee?
Area of specialization of committee members may bias oversight e. areas of specialization of monitoring committee members
A data monitoring committee member who was also an author may not be independent. f. monitoring committee members authors?
5. Was the trial stopped prematurely? Details of the stopping rule used are required a. description of stopping rule
If the stopping rule was not defined a priori, the decision of when to stop the trial may be biased b. when stopping rule defined
How often, and when, were the data examined? What adjustments were made for these multiple looks? c. i. monitoring schedule, ii. adjustment for multiple looks
How premature was the trial stoppage? Premature termination of a trial may exaggerate its findings, and may leave secondary hypotheses unanswered d. when trial stopped relative to planned
6. Were there important differences between the trial's design and its execution? The stage of the trial must be known to determine what can be critiqued a. current stage of trial
If the protocol changed from design to execution, the trial may no longer be a valid test of the trial hypotheses b.i. changes between intended and executed protocols, ii. reasons for the changes
Knowing when protocol changed gives idea of how many subjects were affected by the change c. date of protocol changes
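
The relationship among the data requirements in subcompetency 3 (hypothesized difference, alpha, tails, power, target sample size, target enrollment) can be illustrated with a minimal sketch. It assumes a binary outcome compared between two equal-sized groups and uses the standard normal-approximation formula; the function names and the example event rates are hypothetical, and a real trial may well have used a different calculation method, which is why the method itself is a data requirement.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p_control, p_experimental,
                                alpha=0.05, power=0.80, two_tailed=True):
    """Per-group sample size for comparing two proportions.

    Normal-approximation formula; an illustrative assumption, not
    necessarily the method used by any particular trial.
    """
    z = NormalDist()
    # Critical value for the chosen alpha (split in two if two-tailed).
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    # Critical value for the chosen power (power = 1 - beta).
    z_beta = z.inv_cdf(power)
    variance = (p_control * (1 - p_control)
                + p_experimental * (1 - p_experimental))
    difference = p_control - p_experimental
    return math.ceil((z_alpha + z_beta) ** 2 * variance / difference ** 2)


def target_enrollment(target_sample_size, expected_loss_fraction):
    """Inflate the target sample size to allow for refusals and dropouts."""
    return math.ceil(target_sample_size / (1 - expected_loss_fraction))


# Hypothetical example: 30% event rate in control, 20% hypothesized
# in the experimental group, alpha 0.05 two-tailed, power 0.80.
n_per_group = sample_size_two_proportions(0.30, 0.20)
enroll_per_group = target_enrollment(n_per_group, 0.10)
print(n_per_group, enroll_per_group)
```

Halving the hypothesized difference roughly quadruples the required sample size, which is one reason a reader needs the hypothesized difference and the actual sample size, not just the reported power.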

Competency B: Was there any intervention assignment bias?
Subcompetency
Justification
Data Requirement
1. What was the unit of randomization? Definition of unit of randomization necessary to judge appropriateness of statistical analysis a. unit of randomization
2. Was the randomization schedule truly random? Randomized allocation minimizes selection bias by equally distributing unknown confounders between the intervention groups a. random sequence generation method
If fixed randomization scheme: was one group oversampled? Variables that are stratified are not randomly distributed in the intervention groups; smaller blocking sizes can interfere with randomization b.i. allocation ratio, ii. stratification variables, iii. blocking scheme
If adaptive randomization scheme: describe method (number, baseline, outcome adaptive?)  c. adaptive randomization method
3. Was intervention allocated randomly?  Subjects have to be allocated to an intervention based on some application of the randomization schedule a. method of intervention allocation
Unconcealed allocation is associated with exaggerated outcomes b. method of allocation concealment
4. How effective was allocation concealment? Data on whether the person in charge of allocating interventions could guess which intervention upcoming subjects were to get indicate whether that person could have second-guessed the allocation a. allocator's guess of intervention allocation
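
As an illustration of the fixed randomization data requirements in subcompetency 2 (allocation ratio, blocking scheme), the following is a minimal sketch of a permuted-block schedule with a 1:1 allocation ratio. The function name, arm labels, and block size are hypothetical, and the sketch omits stratification. It also shows why small blocks matter: once all but one assignment in a block is known, the last one is fully determined.

```python
import random


def permuted_block_schedule(n_subjects, block_size=4,
                            arms=("experimental", "control"), seed=None):
    """Generate a randomization schedule using permuted blocks.

    Each block contains every arm equally often, in random order,
    so group sizes stay balanced throughout enrollment.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    schedule = []
    while len(schedule) < n_subjects:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_subjects]


# Hypothetical example: 20 subjects in blocks of 4.
schedule = permuted_block_schedule(20, block_size=4, seed=2002)
for start in range(0, len(schedule), 4):
    block = schedule[start:start + 4]
    # The final assignment in each block follows from the first three.
    print(block)
```

With a block size of 2, every second assignment would be predictable to anyone who knew the previous one, which is why the blocking scheme and the method of allocation concealment are both listed as data requirements.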
 
Competency C: Were the intervention groups comparable?
Subcompetency
Justification
Data Requirement
1. How effective was the randomization? If baseline characteristics are equally distributed statistically between the randomized groups, unknown characteristics are also likely to be equally distributed. a. i. baseline characteristics, ii. statistical test for difference, iii. statistical result
2. Were groups comparable after randomization? Subject characteristics could have changed between eligibility determination and randomization, such that intervention groups become less comparable than at enrollment a. time interval between enrollment and randomization 
Subject characteristics could have changed between  randomization and intervention, such that intervention groups become less comparable than at randomization b. time interval between randomization and intervention
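
The data requirement in subcompetency 1 (baseline characteristic, statistical test for difference, statistical result) could be recorded along the following lines. This is only a sketch: it assumes a binary baseline characteristic and uses a two-proportion z-test, which is one possible test among several; the function name and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def compare_baseline_proportion(count_a, n_a, count_b, n_b):
    """Two-proportion z-test for a binary baseline characteristic.

    Returns (difference in proportions, z statistic, two-sided p-value).
    Assumes the pooled proportion is strictly between 0 and 1.
    """
    p_a = count_a / n_a
    p_b = count_b / n_b
    pooled = (count_a + count_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_a - p_b, z, p_value


# Hypothetical example: 48 of 100 subjects in one group and 52 of 100
# in the other have the characteristic (e.g. are female).
difference, z, p_value = compare_baseline_proportion(48, 100, 52, 100)
print(f"difference={difference:.2f}, z={z:.2f}, p={p_value:.2f}")
```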
 
Competency D: Was there any intervention-related bias?
Subcompetency
Justification
Data Requirement
1. What was the experimental intervention? The intended intervention is what the trial was designed to test. Particular details depend on the type of intervention (drug, procedure, behavioral, environmental). a. description of intervention i. type, and ii. type-specific details
Intended intervention may include modifications for specific subject circumstances b. subject-specific adjustments allowed
Intervention effect can only be ascertained if it was clear who got what intervention  c. which intervention groups assigned to intervention
Performance bias may exist if intervention received differed substantially from what was intended d. differences between planned and actual intervention
2. What was the control intervention? Since the intervention effect is specified as a comparison to the control, we must know what the control intervention was a. description of control i. type, and ii. type-specific details
Rationale for a placebo control should be explicitly discussed b. justification for type of control
Explicit description of similarity of interventions yields information on probability of success in masking intervention c. similarity of control and experimental intervention
Intervention effect can only be ascertained if it was clear who got what intervention d. which intervention groups assigned to control
3. Was there differential compliance across the intervention and control groups? Exclusion bias can result if certain types of subjects are more likely not to complete assigned intervention. a. what proportion of each intervention group completed their assigned intervention
Subjects who complete their assigned intervention but do so with less than 100% compliance dilute the intervention effect b. compliance in each intervention group
Presence of systematically different reasons between intervention groups to discontinue assigned intervention introduces a hidden bias c. i. reasons for not completing assigned intervention, ii. number of subjects for each reason in each intervention group
Subjects who cross-over dilute the intervention effect d. number who crossed over to other intervention 
4. Was intervention masking achieved? Unblinding of subjects may lead to performance bias a.i. method, and ii. efficacy of blinding of subjects to intervention
Unblinding of care providers may lead to performance bias b. i. method, and ii. efficacy of blinding of provider(s) to intervention
Unblinding of study nurses may lead to performance bias c. i. method, and ii. efficacy of blinding of study nurse(s) to intervention
Unblinding of investigators may lead to performance bias d. i. method, and ii. efficacy of blinding of investigator(s) to intervention
5. Were trial participants blinded to interim trial results? Unblinding of subjects to results may lead to performance bias a. i. method, and ii. efficacy of blinding of subjects to results
Unblinding of care providers to results may lead to performance bias b. i. method, and ii. efficacy of blinding of provider(s) to results
Unblinding of study nurses to results may lead to performance bias c. i. method, and ii. efficacy of blinding of study nurse(s) to results
Unblinding of investigators to results may lead to performance bias d. i. method, and ii. efficacy of blinding of investigator(s) to results
 
Competency E: Were there co-interventions that may have confounded the results?
Subcompetency
Justification
Data Requirement
1. Could pre-enrollment interventions have confounded the results? If used, how long was the washout time? A prior intervention may still be a confounder if its effects last longer than washout period a. duration of washout period
2. Were there co-interventions that may have confounded the results? Knowing which co-interventions were allowed helps in judging generalizability a. description of allowed co-interventions i. type, and ii. type-specific details
Effects that are in fact due to co-interventions may be falsely attributed to the intervention b. i. type, and ii. type-specific details of actual co-interventions, iii. by which intervention groups
If co-interventions were disproportionately taken by one group, then the observed effect cannot so easily be ascribed only to the tested intervention c. proportion of each intervention group taking each co-intervention
3. Could follow-up activities have confounded the results? Frequent clinic visits during trial follow-up may lead to improved outcomes that are not generalizable to the non-experimental setting a. schedule of follow-up visits
Actions at each follow-up could constitute additional therapy, or may lead to casefinding bias b. actions during follow-up
Follow-up personnel could have contributed an intervention effect, e.g. friendly nurses c. personnel that carried out the follow-up activities
Performance bias may exist if intervention groups received follow-up activities differentially d. proportion receiving follow-up activities per intervention group
 
Competency F: Are the outcome variables meaningful?
Subcompetency
Justification
Data Requirement
1. What were the outcome variables? Well-defined outcomes (e.g. death) are less subject to error in measurement than poorly defined ones a. outcome definitions
Timing of outcome assessment should make sense pathophysiologically or clinically, and on relevant subgroups if not assessed in all subjects b. i. when outcome assessed, ii. on which intervention groups
Primary outcome is the one used in the a priori power calculation for the trial c. designation of i. primary and ii. secondary outcomes
2. Are the outcomes intermediate or final? Intermediate outcomes may give only weak support to the study's hypothesis a. outcome definitions
The study hypotheses are required to determine whether the outcomes are intermediate b. i. primary and ii. secondary hypotheses
The objective of the study is required to determine whether the outcomes are intermediate c. study objective
3. What side effects, if any, were monitored? Side effects are important for establishing the clinical context of the intervention effect a. side effect definitions
Timing of side effect assessment should make sense pathophysiologically or clinically, and on relevant subgroups if not assessed in all subjects b. i. when side effects assessed, ii. on which intervention groups
4. Were there any changes in the outcome definitions between design and execution? Trial may not be as valid if it actually measured something other than originally intended a. i. outcomes changed, ii. why, iii. to what
 
Competency G: Was there any outcome assessment or measurement bias?
Subcompetency
Justification
Data Requirement
1. How was each outcome assessed? Full description of assessment method is needed to assess presence or absence of detection bias a. description of assessment method
Untrained or improperly trained assessors can introduce detection bias b. description of assessors
2. How accurate was the assessment method? Unreliable or poorly validated measurement may cause detection bias a. i. validity and ii. reproducibility of assessment method
3. Did the outcome assessors have any knowledge that may have led to biased assessment? Lack of assessor blinding can lead to detection bias a. i. method, and ii. efficacy of blinding of assessor(s) to intervention received
Lack of assessor blinding can lead to detection bias  b. i. method, and ii. efficacy of blinding of assessor(s) to interim results
 
Competency H: Was there any follow-up bias?
Subcompetency
Justification
Data Requirement
1. Was there differential follow-up between the intervention and control groups? Lesser follow-up reduces the precision of observed results, and magnifies potential exclusion bias a. proportion of subjects followed up, in each intervention group
Exclusion bias can result if certain subjects are systematically more likely to be lost to follow-up b. clinical characteristics of i. followed and ii. not followed, in each intervention group
Reasons for loss to follow-up may provide information on nature and extent of exclusion bias c. i. reasons for lack of follow-up, and ii. how many for each reason, in each intervention group
2. Were there differential rates of outcome assessment between the intervention and control groups? Missing data can lead to exclusion bias, from incomplete measurement a. % of subjects yielding usable data at each timepoint, in each intervention group
Exclusion bias can result if certain subjects are systematically more likely to be lost to follow-up b. clinical characteristics of i. assessed and ii. not assessed, for each outcome in each intervention group
Reasons for lack of outcome assessment may provide information on nature and extent of exclusion bias c.i. reasons outcome not assessed, and ii. how many for each reason, for each outcome in each intervention group
Duration of follow-up gives information on attrition of subjects over time d. i. mean follow-up, ii. person-years of follow-up for each outcome, in each intervention group
 
Competency I: Were the results analyzed appropriately?
Subcompetency
Justification
Data Requirement
1. What were the raw results of the study? Raw results must be clear, e.g. must have a denominator a. i. numerator and ii. denominator of all raw results
Both the estimate of the effect and its precision (e.g., standard error or confidence interval) are needed b. summary descriptors, with precision
Parameterized summary descriptors can be misleading if done inappropriately c. justification for parameterization, or transformation
The time at which each datum was assessed must be known d. follow-up time per datapoint
2. What perspective(s) were used? Intention-to-treat analysis is less biased than efficacy analysis, but efficacy analysis provides more information on effectiveness of intervention a. intention to treat and/or efficacy analysis?
Many different definitions of intention-to-treat and of efficacy analysis are used b. i. definition of intention to treat analysis, ii. definition of efficacy analysis
3. Were appropriate statistical analyses performed? The statistical method used for each test must be known in order to duplicate it. Software errors may invalidate results a. for each test, i. name of statistical method(s), ii. software used
Inappropriate methods can yield misleading results b. justification for use of these statistical methods
Actual value of test statistic more useful than a declaration of significance c. actual result of test statistic, i. estimate, ii. upper and iii. lower limit of 95% confidence interval
Statistical methods have strong assumptions about nature of data that may be inappropriate (e.g. normality) d. evidence that assumptions were fulfilled or reasonable
4. Were losses to follow-up handled appropriately? Inappropriate handling of losses to follow-up can lead to misleading results a. censoring method
5. Are the results robust to alternative analyses and inferential statistics? Subject-level data needed for reanalysis by other investigators using other methods a. raw results, follow-up time, and completeness, as II.H.2.d, II.I.1.a and d
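
To show why the numerators and denominators in subcompetency 1 and the estimate with confidence limits in subcompetency 3 are requested together, here is a minimal sketch that derives a risk difference and its 95% confidence interval from raw counts. It assumes a binary outcome and uses the simple Wald (normal-approximation) interval, which is only one of several possible methods; the function name and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def risk_difference_ci(events_exp, n_exp, events_ctl, n_ctl, confidence=0.95):
    """Risk difference (experimental minus control) with a Wald interval.

    Returns (estimate, lower limit, upper limit).
    """
    risk_exp = events_exp / n_exp
    risk_ctl = events_ctl / n_ctl
    estimate = risk_exp - risk_ctl
    standard_error = math.sqrt(risk_exp * (1 - risk_exp) / n_exp
                               + risk_ctl * (1 - risk_ctl) / n_ctl)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return estimate, estimate - z * standard_error, estimate + z * standard_error


# Hypothetical raw results: 20/100 events in the experimental group,
# 30/100 events in the control group.
estimate, lower, upper = risk_difference_ci(20, 100, 30, 100)
print(f"risk difference {estimate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

With the raw counts in hand, another investigator can repeat this calculation with a different method or a different analysis population (for example intention-to-treat versus efficacy), which is the kind of reanalysis subcompetency 5 has in mind.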
 
Competency J: What outside biases might have been introduced?
Subcompetency
Justification
Data Requirement
1. Could the source of funding have introduced bias? Commercial or other interests may influence a study's outcome a. funding source i. who, ii. what type
The reporting may be biased if biased sponsors had right to modify or withdraw the manuscript b. funder's role in preparation of manuscript
2. How likely is it that the investigators introduced bias? Particular investigators may have known subject biases a. investigators
Area of specialization may bias design and/or results b. area of specialization of each investigator
If investigators have financial interest in outcome of study, they could introduce bias c. i. amount of money involved, ii. nature of financial conflict
Open access to investigators for questions and clarifications provides accountability for integrity of results d. i. name and ii. contact information for contact person

3. What assurances are there that the trial was conducted with integrity? Any retractions or corrections, due to intentional fraud or unintentional error, may limit internal or external validity a. description of any i. fraud, ii. retraction, iii. correction
Previous history of fraud by an investigator would increase our prior suspicion of fraud in the study b. integrity record of investigators and funders

 
Competency K: Is the trial internally valid?
Subcompetency
Justification
Data Requirement
1. Were the trial's conclusions supported by the data? Requires the authors' interpretation of the trial a. authors' conclusion of the trial
Conclusions should be supported by the results b. all the data requirements for II.A.1.a-b, II.H.2.d, II.I.1.a and d
2. What study limitations were acknowledged? Authors' identification and discussion of study limitations helps to judge the proper strength of the conclusions a. authors' statement of study limitations
3. What were the recommendations for clinical action supported by the trial results? Requires the authors' recommendation for clinical action, if any a. authors' statement of clinical application
 


Last updated on October 18, 2002.
Copyright 2002, UC Board of Regents, sim@medicine.ucsf.edu