1 Description

Trial Sequential Analysis in Java is a sequential meta-analysis that aims to transparently control for type I and type II errors [1]. When conducting a systematic review with meta-analysis transparency is of highest importance. Including a Trial Sequential Analysis to the meta-analysis is no exception. One of the most important tools is providing a public available protocol that describes, in detail, how and when the analysis will be conducted for each chosen outcome. If failing to do so, the analysis will stand out as biased due to the risk that each variable has been chosen to show the results the researchers was aiming to achieve (e.g., nearly significant, or highly significant). This standard operating procedure aims to help and guide future reviewers of interventional trials on how to navigate through the meta-analysis when conducting Trial Sequential Analysis.

2 Predefine Trial Sequential Analysis parameters in the protocol phase - and make them public!

2.0.1 General responsibilities

The entire review team is responsible for making a thoroughly described protocol publicly available before starting the systematic review. This is clearly recommended in The Cochrane Handbook of Systematic Reviews of Interventions [2]. The following sections should be described in full in the protocol with reference to relevant studies that substantiates the decision made during the protocol writing phase.

Remember the certainty of the evidence from a meta-analysis will always depend on more than a Trial Sequential Analysis (TSA) showing that the required information size has been reached.

2.1 Trial Sequential Analysis

Clearly state that TSA will be used in the systematic review meta-analysis and what version of the software that is used. Describe all details about the analysis in the protocol (see below).

If the R version of TSA is used, one can find the version via:

#> [1] '0.2.2'

2.2 Outcome

All outcomes intended to be analysed with TSA should be clearly defined in the protocol. At this point it is advisable to consider if these are continuous or dichotomous outcomes. Please be aware that the TSA cannot be calculated with standardised mean differences (SMD).

For dichotomous and continuous outcomes, the following needs to be defined:

2.2.1 alpha-level

Normally, meta-analyses will operate with the conventional 5% significance limit. Multiple testing of outcomes is at risk of increasing the type I error as seen when performing multiple interim analysis in randomised clinical trials [3]. Furthermore, if systematic reviews are investigating more than one outcome or investigating the same outcome at different time points, the risk of type-I-errors increases [4, 5].

Therefore, it is recommended to adjust the alpha level according to the number of outcomes going to be analysed. There are different ways to do this, with the Bonferroni correction being the most conservative. Jakobsen and colleagues [4] suggests dividing the conventional alpha level with the number between no correction (1) and the Bonferroni correction (number of outcomes). So, when three primary outcomes are defined the conventional alpha level should be divided by 1.5, if four outcomes exist divide by 2 and so on [4].

In RTSA the alpha level is set in the function by the argument alpha. For a significance level of 5%, which is also the default value, set:

RTSA(alpha = 0.05, ...)

The alpha-level chosen for each outcome should be clearly described in the protocol. ### Level of power

Traditionally, systematic reviews are not concerned with the level of power [6]. As new recommendations in the GRADE system emerges, it is evident that an optimal required information size is needed to judge imprecision and the power of the analysis [7–9]. Normally, a power of 80% is accepted but for a more precise estimate a 90% power or higher is recommended.

In RTSA the power is set in the function by the argument beta, where \(Power = 1 - beta\). Hence for 90% power, which is also the default value, set:

RTSA(beta = 0.1, ...)

The level of power should be clearly described for each outcome in the protocol.

2.2.2 Heterogeneity

Heterogeneity can be present in meta-analysis, which means that there is between-trial variation. This changes the interpretation of the meta-analysis and the sample size and trial size requirements for the meta-analysis to reach the wanted level of power.

For traditional meta-analysis it has been custom to use the inconsistency (\(I^2\)) for estimating heterogeneity [2]. This can also be used in the TSA, but we recommend to calculate and adjust for diversity (\(D^2\)) when using the original stand-alone TSA software [10]. Diversity is a more conservative estimate for heterogeneity affecting the required information size if heterogeneity is present. The TSA will then present a diversity adjusted required information size (DARIS) [10]. For meta-analysis where the required information size is not reached and a 0% heterogeneity is found, a predefined sensitivity analysis adjusting for diversity between 25% to 50% could be relevant. Including more future trials will increase the risk of an increase in heterogeneity.

It has been found in [16], that it is not always sufficient to adjust for Diversity as it only increases the number of participants. In RTSA the meta-analysis is per default adjusted by the estimated heterogeneity, if the meta-analysis is a retrospective meta-analysis and the meta-analysis is using a random-effects model.

RTSA(fixed = FALSE, tau2 = 0.05, ...) # adjusted by a specific size of heterogeneity

If one wants to explore different levels of diversity, say 50%, the following can be achieved by:

RTSA(fixed = FALSE, D2 = 0.5, ...) # adjusted by diversity

How heterogeneity is going to be incorporated in the calculation of random-effect meta-analysis should be clearly described in the protocol.

2.3 Dichotomous outcomes

The following variables should be predefined in the protocol before conducting a TSA for each dichotomous outcome.

2.3.1 Proportion of events in the control group (Pc)

For calculating the TSA for a dichotomous outcome, a proportion of events in the control group must be defined. Ideally this is derived from a previous well conducted and large trial or systematic review with low risk of bias [1]. If such a trial does not exist a large observational study at low risk of bias can be used. Data from the control group of the conducted meta-analysis can also be used if the mentioned data are not accessible.

In RTSA the proportion of events in the control group is set using the argument pC:

RTSA(pC = 0.1, ...)

How the probability of event in the control group was decided upon should be predefined in a protocol.

2.3.2 Relative risk reduction (RRR)

The relative risk is an estimate of the effect of reducing the risk in the intervention group over the risk in the control group [6, 11]. If data from previous trials or systematic reviews at low risk of bias exists, these trials’ can be used to estimate the RRR. Usually in health research relatively small benefits are gained from one intervention. But very small RRR requires very large required information sizes (RIS). On the other hand, a large RRR can underestimate the power needed in the meta-analysis and increase the risk of random errors. Nevertheless, RRR should always be a realistic estimate.

In RTSA for relative risks, both the argument RRR or the argument mc can be used. For a relative risk reduction of 20%, set RRR = 0.2 or mc = 0.8 (RR of 80%):

RTSA(RRR = 0.2, ...)
RTSA(mc = 0.8, ...)

How one decides on the size of RRR should be specified in the protocol.

2.3.3 Zero events

In dichotomous outcomes zero events may occur in one or more groups of the included trials. In the protocol it is important to define how these events are handled. RevMan 5.1 [12] does by default replace a zero event with a 0.5 value. In the TSA in Java, higher or lower values can be defined by the authors using continuity adjustment techniques such as constant, reciprocal or empirical [13].

In RTSA the default zero adjustment is 0.5:

RTSA(zero_adj = 0.5, ...)

How to handle zero events should be specified in the protocol.

2.4 Continuous outcomes

For continuous outcomes the following should be predefined in the protocol besides the alpha-level and level of power.

2.4.1 Minimally relevant difference (MIREDIF)

The minimally relevant difference is the change on a continues scale that is clinically relevant for the patient to experience. TSA takes the MIREDIF into account when calculating a RIS for a continuous outcome. MIREDIF should be set to the minimal clinical relevant difference. If this is not of interest, MIREDIF can be derived from previously conducted randomised clinical trials on patients with the same conditions.

In RTSA the argument to specify MIREDIF is mc, for a MIREDIF of 5:

RTSA(mc = 5, ...)

MIREDIF should be specified in the protocol.

2.4.2 Variance

The variance is the spread of numbers in a dataset. Again, the variance of an outcome can be derived from previous studies examining the same outcome in the same group of patients. If this is not possible, variance can be calculated as the squared standard deviation.

In RTSA the standard deviation is used:

RTSA(sd_mc = X, ...)

The expected variance should be specified in the protocol.

2.5 Preparing the TSA

2.5.1 Meta-analytic methods

The type of method used can alter the results of the TSA. Pre-defined methods whether fixed effect or random-effects model and type of model to be used should be described in the protocol. As the TSA test the optimal information size by calculating the DARIS or HARIS (heterogeneity adjusted required information size), the TSA should follow the same methodological approach as the meta-analysis performed. For example, using the random-effects model (DerSirmonian and Laird or the Hartung-Knapp-Sidik-Jonkman adjustment) in the meta-analysis should transfer to the TSA.


RTSA(fixed = FALSE, re_method = "DL") # for DerSimonian-Laird.
RTSA(fixed = FALSE, re_method = "DL_HKSJ") # for the Hartung-Knapp-Sidik-Jonkman adjustment of DerSimonian-Laird.

2.5.2 Calculating the TSA-adjusted confidence interval (CI)

The TSA-adjusted CI gives a wider CI adjusted for the variables mentioned above. TSA-adjusted CI is, along with the DARIS and HARIS tools for examining imprecision in the analysis. State clearly in the protocol if these will be applied.

2.6 Primary, secondary, and sub-group analysis

Trial Sequential Analysis is typically performed on the more decisive outcomes in the systematic review such as primary or secondary outcomes chosen to investigate benefits and harms of an intervention. If predefined sub-groups are planned (e.g. low compared to high risk of bias) and the authors wish to control these outcomes for type I and type II errors, a thorough description of this should be given in the protocol.

3 Decision flowchart for the analysis (the Figure 1)

Description of the process for preparing a Trial Sequential Analysis (TSA).

4 Reporting the analysis - ensure transparency and increase trustworthiness

When reporting the conducted analysis all deviations from the publicly available protocol should be explicitly stated. For transparency, it is highly recommended to add, e.g. as an online supplementary, a Trial Sequential Analysis report showing all details about the analysis.

Provide the information alongside each analysis for relevant variables. For dichotomous outcome state the chosen pc, RRR, alpha level, level of power and the heterogeneity or diversity or inconsistency adjusted required information size, or if no heterogeneity report this explicitly.

For continues outcomes provide the minimally relevant difference, the variance, alpha level, level of power and the diversity or inconsistency adjusted required information size, or if no heterogeneity report this explicitly.

5 References

  1. Wetterslev J, Jakobsen JC, Gluud C. Trial Sequential Analysis in systematic reviews with meta-analysis. BMC Med Res Methodol. 2017;17. doi:10.1186/S12874-017-0315-7.

  2. Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.3 (updated February 2022). Cochrane, 2022. Available from www.training.cochrane.org/handbook.

  3. Brok J, Thorlund K, Wetterslev J, Gluud C. Apparently conclusive meta-analyses may be inconclusive–Trial sequential analysis adjustment of random error risk due to repetitive testing of accumulating data in apparently conclusive neonatal meta-analyses. Int J Epidemiol. 2009;38:287–98. doi:10.1093/ije/dyn188.

  4. Jakobsen JC, Wetterslev J, Winkel P, Lange T, Gluud C. Thresholds for statistical and clinical significance in systematic reviews with meta-analytic methods. BMC Med Res Methodol. 2014;14.

  5. Bender R, Bunce C, Clarke M, Gates S, Lange S, Pace NL, et al. Attention should be given to multiplicity issues in systematic reviews. J Clin Epidemiol. 2008;61:857–65. doi:10.1016/j.jclinepi.2008.03.004.

  6. Turner RM, Bird SM, Higgins JPTT. The impact of study size on meta-analyses: Examination of underpowered studies in Cochrane reviews. PLoS One. 2013;8:e59202. doi:10.1371/journal.pone.0059202.

  7. Guyatt GH, Oxman AD, Kunz R, Brozek J, Alonso-Coello P, Rind D, et al. GRADE guidelines 6. Rating the quality of evidence–imprecision. J Clin Epidemiol. 2011;64:1283–93. doi:10.1016/j.jclinepi.2011.01.012.

  8. Zhang Y, Coello PA, Guyatt GH, Yepes-Nuñez JJ, Akl EA, Hazlewood G, et al. GRADE guidelines: 20. Assessing the certainty of evidence in the importance of outcomes or values and preferences-inconsistency, imprecision, and other domains. J Clin Epidemiol. 2019;111:83–93. doi:10.1016/J.JCLINEPI.2018.05.011.

  9. Zeng L, Brignardello-Petersen R, Hultcrantz M, Mustafa RA, Murad MH, Iorio A, et al. GRADE Guidance 34: update on rating imprecision using a minimally contextualized approach. J Clin Epidemiol. 2022;0. doi:10.1016/j.jclinepi.2022.07.014.

  10. Wetterslev J, Thorlund K, Brok J, Gluud C. Estimating required information size by quantifying diversity in random-effects model meta-analyses. BMC Med Res Methodol. 2009;9:86. doi:10.1186/1471-2288-9-86.

  11. Wetterslev J, Thorlund K, Brok J, Gluud C. Trial sequential analysis may establish when firm evidence is reached in cumulative meta-analysis. J Clin Epidemiol. 2008;61:64–75.

  12. Nordic Cochrane Centre. Review Manager 5 (RevMan 5). https://training.cochrane.org/online-learning/core-software/revman.

  13. Imberger G, Thorlund K, Gluud C, Wetterslev J. False-positive findings in Cochrane meta-analyses with and without application of trial sequential analysis: an empirical review. BMJ Open. 2016;6:e011890. doi:10.1136/BMJOPEN-2016-011890.

  14. Norman GR, Sloan JA, Wyrwich KW. Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation. Med Care. 2003;41:582–92. doi:10.1097/01.MLR.0000062554.74615.4C.

  15. Tsujimoto Y, Fujii T, Tsutsumi Y, Kataoka Y, Tajika A, Okada Y, et al. Minimal important changes in standard deviation units are highly variable and no universally applicable value can be determined. J Clin Epidemiol. 2022;145:92–100. doi:10.1016/J.JCLINEPI.2022.01.017.

  16. Kulinskaya E, Wood J. Trial sequential methods for meta-analysis. Res Synth Methods. 2014 Sep;5(3):212-20. doi: 10.1002/jrsm.1104. Epub 2013 Nov 28. PMID: 26052847.