**A Bayesian interpretation of Pfizer vaccine efficacy**

Pfizer has just announced that their vaccine trial has concluded with an impressive result: 95% efficacy [1]. One does not need any fancy analysis to see that the efficacy is real. Here, I would like to offer a Bayesian interpretation of the efficacy.

Briefly, the trial enrolled 43,661 volunteers, and as of 13/11/2020, 41,135 of them had received a second dose of the vaccine. I assume that half of the 41,135 participants were randomly assigned to the vaccine group and the rest to placebo, i.e. about 20,567 per group. According to the press release, 170 participants were infected with SARS-CoV-2, of whom only 8 were in the vaccinated group compared with 162 in the placebo group. So, the incidence (risk) of infection was:

· vaccinated group: 0.039%

· control group: 0.788%

The relative risk is then: RR = 0.039/0.788 = 0.05, and vaccine efficacy (VE) is 1 - RR = 95%. The company also reported that the P-value for VE was <0.0001.
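The arithmetic above can be reproduced in a few lines of Python. This is only a sketch: the case counts come from the text, while the equal split of the 41,135 participants is the assumption made earlier in this note, not a reported figure, and the variable names are mine.

```python
# Case counts are from the text. The equal split of the 41,135 participants
# is this note's own assumption, not a figure from the press release.
n_per_arm = 41135 / 2
cases_vaccine = 8
cases_placebo = 162

risk_vaccine = cases_vaccine / n_per_arm
risk_placebo = cases_placebo / n_per_arm
relative_risk = risk_vaccine / risk_placebo   # equals 8/162 when the arms are equal
vaccine_efficacy = 1 - relative_risk

print(f"risk, vaccinated group: {risk_vaccine:.3%}")
print(f"risk, placebo group:    {risk_placebo:.3%}")
print(f"RR = {relative_risk:.3f}, VE = {vaccine_efficacy:.1%}")
```

Because the two groups are assumed to be the same size, the group size cancels out and RR depends only on the ratio of case counts.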

Of course, the 95% figure is only a point estimate, and it is subject to sampling variation. If another independent study were carried out (which is very unlikely), its VE estimate could differ from the above figure.

My question is then: *what is the probability that vaccine efficacy is above 90%?* Since VE = 1 - RR, an efficacy above 90% means RR below 0.10. In probability language, I want to find

Pr(RR < 0.10) = ?

That question cannot be answered by traditional frequentist statistical analysis (e.g. P-value, confidence interval). It can, however, be answered by a Bayesian analysis, which is a way of updating knowledge by weighing the existing evidence against prior knowledge.

We are primarily concerned with RR here. However, for mathematical convenience we will work with log(RR). Let us denote theta = log(RR). In Bayesian analysis, the parameter of interest (here theta) is treated as a random, not fixed, variable, and a random variable must have a probability distribution. For vaccine efficacy, it is reasonable to assume that theta follows the Normal distribution with mean *m* and standard deviation *s*, or symbolically: theta ~ N(*m*, *s*). We are going to use this distribution to quantify our knowledge about vaccine efficacy.

What is our prior knowledge about the efficacy of the Pfizer vaccine? None. We have no idea what the possible values of theta are. We can however start off by assuming that on average theta = 0 (i.e. RR = 1, the vaccine is not efficacious), and that it is equally likely to be negative or positive. We can quantify this lack of knowledge by a Normal distribution with mean 0 and a large standard deviation:

theta_prior ~ N(*m*0 = 0, *s*0 = 10)

Plotted as a density, this prior is a very flat bell curve centred on 0. It implies that although on average the vaccine has no efficacy, there is a 50% chance that it has positive efficacy and a 50% chance of negative efficacy (i.e. harm). It also implies that there is a low probability that the vaccine is extremely harmful or extremely efficacious.
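The properties of this prior can be checked numerically with Python's standard library. This is a rough sketch: the cut-off of 20 on the log scale used for "extreme" is my own illustrative choice (two prior standard deviations), not something that appears elsewhere in this note.

```python
from statistics import NormalDist

# Prior on theta = log(RR): mean 0, standard deviation 10.
prior = NormalDist(mu=0.0, sigma=10.0)

# theta < 0 means RR < 1, i.e. positive efficacy; theta > 0 means harm.
p_benefit = prior.cdf(0.0)
p_harm = 1 - p_benefit

# Hypothetical threshold: |theta| > 20 taken as "extreme" for illustration only.
p_extreme = prior.cdf(-20.0) + (1 - prior.cdf(20.0))

print(f"Pr(benefit) = {p_benefit:.2f}, Pr(harm) = {p_harm:.2f}")
print(f"Pr(|theta| > 20) = {p_extreme:.3f}")
```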

OK, let us now turn to the existing evidence reported by Pfizer. The RR is reported as 0.039/0.788 = 0.05. So, the mean value of theta is *m*1 = log(0.039/0.788) = -3.00. The standard deviation of theta can be estimated as *s*1 = sqrt(1/8 + 1/162 - 1/20567 - 1/20567) = 0.36. So, we can quantify the evidence as follows:

theta_data ~ N(*m*1 = -3.0, *s*1 = 0.36)
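As a check on these two numbers, here is a minimal sketch that computes the log relative risk and its approximate standard error from the raw counts. The per-group size of 20,567 is the assumption made earlier in this note, and the variable names are mine. Working from exact counts rather than rounded percentages shifts the mean slightly, but it agrees with the figure above to one decimal place.

```python
import math

cases_vaccine, cases_placebo = 8, 162
n_vaccine = n_placebo = 20567   # assumed equal group sizes, as in the text

# Point estimate of theta = log(RR).
m1 = math.log((cases_vaccine / n_vaccine) / (cases_placebo / n_placebo))

# Large-sample standard error of log(RR).
s1 = math.sqrt(1 / cases_vaccine + 1 / cases_placebo
               - 1 / n_vaccine - 1 / n_placebo)

print(f"m1 = {m1:.2f}, s1 = {s1:.2f}")
```

Note that the standard error is dominated by the 1/8 term: with so few cases in the vaccinated group, that count drives almost all of the uncertainty.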

Now that we have quantified our prior knowledge and current evidence, the next step is to integrate the two pieces of information. The integration can be done via Bayes' theorem, which in essence states that:

Pr(theta | data) ∝ Pr(data | theta) x Pr(theta)

The left-hand side of this relation is called the “posterior probability”, which is proportional to the product of the prior probability Pr(theta) and the likelihood Pr(data | theta). The above relation can be more practically stated as:

N(*m*, *s*) ∝ N(*m*1, *s*1) x N(*m*0, *s*0)

with *m* = (*m*0/*s*0² + *m*1/*s*1²) / (1/*s*0² + 1/*s*1²) and *s* = 1 / sqrt(1/*s*0² + 1/*s*1²).
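The two formulas amount to a precision-weighted average, which can be sketched in Python as follows. The function name `combine_normals` is hypothetical, introduced only for illustration; the inputs are the prior and data summaries given above.

```python
import math

def combine_normals(m0, s0, m1, s1):
    """Posterior mean and SD when a Normal prior N(m0, s0) is combined
    with a Normal likelihood summary N(m1, s1)."""
    w0 = 1 / s0**2          # precision of the prior
    w1 = 1 / s1**2          # precision of the data
    m = (m0 * w0 + m1 * w1) / (w0 + w1)
    s = 1 / math.sqrt(w0 + w1)
    return m, s

# Prior N(0, 10) and data N(-3.0, 0.36), as stated in the text.
m, s = combine_normals(0.0, 10.0, -3.0, 0.36)
print(f"posterior mean = {m:.3f}, posterior SD = {s:.3f}")
```

Because the prior precision (1/10² = 0.01) is tiny compared with the data precision (1/0.36² ≈ 7.7), the posterior is pulled almost entirely toward the data.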

My calculation gives: *m* ≈ -3.00, and *s* ≈ 0.36. Actually, the whole exercise can be done via an online application [2].

In other words, the prior knowledge does not sway the evidence in any significant way. The posterior estimate of vaccine efficacy (strictly, its posterior median) is 1 - exp(-3.00) = 0.95. Based on this posterior distribution, the probability that vaccine efficacy > 90%, i.e. Pr(theta < log(0.10)), is about 0.97. The evidence of efficacy is overwhelming. There is little doubt that the vaccine worked.
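For readers who want to reproduce the final probability, here is a minimal sketch using only the Python standard library. It recomputes the posterior from the prior N(0, 10) and the data summary N(-3.0, 0.36), so the result may differ in the third decimal from figures obtained with other rounding. The 95% credible interval at the end is an extra I have added for illustration; it is not part of the original calculation.

```python
import math
from statistics import NormalDist

# Inputs as stated in the text.
m0, s0 = 0.0, 10.0     # prior on theta = log(RR)
m1, s1 = -3.0, 0.36    # summary of the trial data

# Precision-weighted combination of prior and data.
w0, w1 = 1 / s0**2, 1 / s1**2
posterior = NormalDist(mu=(m0 * w0 + m1 * w1) / (w0 + w1),
                       sigma=1 / math.sqrt(w0 + w1))

# VE > 90% is the same event as RR < 0.10, i.e. theta < log(0.10).
p_ve_above_90 = posterior.cdf(math.log(0.10))

# 95% credible interval for VE (an illustrative addition).
theta_low, theta_high = posterior.inv_cdf(0.025), posterior.inv_cdf(0.975)
ve_low, ve_high = 1 - math.exp(theta_high), 1 - math.exp(theta_low)

print(f"Pr(VE > 90%) = {p_ve_above_90:.3f}")
print(f"95% credible interval for VE: {ve_low:.3f} to {ve_high:.3f}")
```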

**Postscript**: *Since the publication of this note, Pfizer has submitted their data to FDA and a new picture has emerged from those data. Now, the reported efficacy of 95% was derived from the 170 PCR-confirmed infections (8 in the vaccine group + 162 in the placebo group).*

*However, the number of suspected but not PCR-confirmed cases of infection was much higher: 1594 in the vaccine group and 1816 in the placebo group. These cases were not mentioned in the New England Journal of Medicine paper, but were reported in the FDA submission. If these cases were included in the analysis, vaccine efficacy would be only about 19%, since 1 - (1594 + 8)/(1816 + 162) ≈ 0.19 when the two groups are of equal size.*

*Peter Doshi (associate editor of BMJ) also raised an interesting issue of analysis. Apparently, 371 individuals were excluded from the analysis because they had violated the study’s protocol. The problem is that the majority of these exclusions (311) were in the vaccine group, which could seriously bias the efficacy result.*

*Ref: Peter Doshi. Pfizer and Moderna’s “95% effective” vaccines — we need more details and the raw data.*

____

[2] https://stephens999.github.io/fiveMinuteStats/shiny_normal_example.html