A few weeks ago, a randomized trial of remdesivir showed that it seemed to reduce mortality risk for coronavirus patients, from 11.6% to 8%. The p-value for this comparison was p=0.059. But what does that really mean? Does remdesivir work, or not? How likely is it to work? 94.1%?

Pyro is a powerful probabilistic programming library based on PyTorch that can answer questions like this, and much, much more. To start off, consider a hypothetical disease, the Pikachu virus. We can create a disease severity scale to measure how bad each case of Pikachu is, from 0 being no symptoms at all, to 100 being severely ill. Let’s assume the patients we get follow a beta distribution (rescaled), with the center at 50 and a standard deviation of 20. We can then simulate groups of patients getting sick:

```
def pikachu_patients(n):
    mode0 = common.make_tensor(0.5)
    k0 = common.make_tensor(8.1269)
    scaling = common.make_tensor(100.0)

    # beta reparameterized
    # https://bit.ly/2ZqjILG
    patient_dist = common.beta(mode0, k0)

    patients = []
    for i in range(n):
        patient = pyro.sample("severity", patient_dist)
        patients.append(patient * scaling)
    return patients
```

(This is, of course, a very simplified model! More elaborate and realistic models will be explored in later posts.)

This gives us a handy distribution of data points to play with (n = 50):
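The `common.beta` helper used above isn't shown here, but the mode/concentration reparameterization it refers to is the standard one: α = mode·(k−2)+1, β = (1−mode)·(k−2)+1, valid for k > 2. A minimal sketch of the conversion (the function names are illustrative, not the actual `common` module):

```python
def mode_beta_params(mode, k):
    """Convert (mode, concentration k) to Beta's (alpha, beta); requires k > 2."""
    return mode * (k - 2) + 1, (1 - mode) * (k - 2) + 1

# common.beta presumably wraps this, something like:
#   def beta(mode, k):
#       a, b = mode_beta_params(mode, k)
#       return pyro.distributions.Beta(a, b)

a, b = mode_beta_params(0.5, 8.1269)  # the Pikachu severity distribution
# a == b, so the rescaled distribution is symmetric around 50
```

As a sanity check, mode = 0 with k = 10 gives (a = 1, b = 9), which matches the "small effect" prior described later in this post.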

Suppose someone runs a two-arm, n=100, randomized controlled trial for the use of rubber gloves in treating Pikachu. The data shown above is the non-treatment group, with a nice mean severity of 49.052, while the treatment group looks like this:

The mean severity of this group is 42.6573, p=0.0423, which is less than the standard 0.05. The treatment works! (In fact, xkcd-style, this was made by just running the sampler a bunch of times until something suitable came out.) But how likely is it to work, really? And with what effect size?
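The p-value here would come from an ordinary two-sample test on the severity arrays. A sketch of the test statistic, assuming Welch's unequal-variance version (in practice `scipy.stats.ttest_ind(control, treatment, equal_var=False)` returns both the statistic and the two-sided p-value):

```python
from statistics import mean, variance

def welch_t(xs, ys):
    """Welch's two-sample t statistic (unequal variances)."""
    nx, ny = len(xs), len(ys)
    se2 = variance(xs) / nx + variance(ys) / ny
    return (mean(xs) - mean(ys)) / se2 ** 0.5
```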

Before the trial starts, we already know that most treatments don’t work. Let’s assume that there’s an 80% chance of no effect (zero distribution), a 10% chance of a small effect (beta distribution, mode = 0, a = 1, b = 9, rescaled to range 0-30), and a 10% chance of a significant effect of unknown size (beta distribution, mode = 0.5, a = 1, b = 1, rescaled):

```
PRIORS = {
    'p_effect_sizes': [0.8, 0.1, 0.1],
    'mode_effect_size': 0.5,  # before scaling
    'effect_k': 2.0,          # uniform distribution
    'small_effect_k': 10.0,
    'scale': 30.0,
}
```
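To get a feel for what this prior implies, we can draw effect sizes from the mixture directly. Here is a sketch using only the standard library, with the mode/k parameters converted to the (α, β) form given above (α = mode·(k−2)+1, β = (1−mode)·(k−2)+1):

```python
import random

def sample_prior_effect():
    """One draw of the effect size (in severity points) from the mixture prior."""
    u = random.random()
    if u < 0.8:
        return 0.0  # no effect
    if u < 0.9:
        # small effect: mode 0, k = 10, i.e. Beta(1, 9), rescaled to 0-30
        return random.betavariate(1, 9) * 30
    # significant effect of unknown size: mode 0.5, k = 2, i.e. Beta(1, 1) = uniform
    return random.betavariate(1, 1) * 30

draws = [sample_prior_effect() for _ in range(10_000)]
```

Roughly 80% of the draws are exactly zero, and the rest spread over the 0-30 range, matching `p_effect_sizes`.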

Our prior distribution looks like this:

Now, we make a simulation of what might happen to patients under different possible outcomes, and match that against the observed “trial data”:

```
def make_distributions(mode_effect_size, effect_k, small_effect_k):
    return [common.zero_dist(),
            common.beta(0, small_effect_k),
            common.beta(mode_effect_size, effect_k)]

def model(patients):
    patients = patients.clone()
    scaling = common.make_tensor(100.0)
    patients /= scaling

    p_effect_sizes = common.make_tensor(PRIORS['p_effect_sizes'])
    mode_effect_size = common.make_tensor(PRIORS['mode_effect_size'])
    effect_k = common.make_tensor(PRIORS['effect_k'])
    small_effect_k = common.make_tensor(PRIORS['small_effect_k'])
    scale = common.make_tensor(PRIORS['scale']) / scaling

    distributions = make_distributions(mode_effect_size, effect_k, small_effect_k)
    which_world = pyro.sample("which_world", dist.Categorical(p_effect_sizes))
    effect_size_raw = pyro.sample("effect_size", distributions[which_world])
    effect_size = effect_size_raw * scale

    mode0 = common.make_tensor(0.5)
    k0 = common.make_tensor(8.1269)

    # beta reparameterized
    # https://bit.ly/2ZqjILG
    original_dist = common.beta(mode0, k0)
    new_dist = common.beta(mode0 - effect_size, k0)

    with pyro.plate("trial", len(patients)):
        pyro.sample("severity_trial", new_dist,
                    obs=patients)

    with pyro.plate("test", len(patients)):
        pyro.sample("severity_test", original_dist,
                    obs=patients)
```

(Note that in this simple model, since we already know the underlying distribution given no intervention, the “control” data has no effect on the findings. In real life, of course, control data would be used to find other parameters.)

We then create a “variational distribution”, or “guide”, to approximate the posterior:

```
def guide(patients):
    p_effect_sizes = pyro.param(
        "p_effect_sizes", common.make_tensor(PRIORS['p_effect_sizes']),
        constraint=constraints.simplex)

    mode_effect_size = pyro.param(
        "mode_effect_size", common.make_tensor(PRIORS['mode_effect_size']),
        constraint=constraints.interval(0, 1))

    effect_k = pyro.param(
        "effect_k", common.make_tensor(PRIORS['effect_k']),
        constraint=constraints.positive)

    small_effect_k = pyro.param(
        "small_effect_k", common.make_tensor(PRIORS['small_effect_k']),
        constraint=constraints.positive)

    distributions = make_distributions(mode_effect_size, effect_k, small_effect_k)
    which_world = pyro.sample("which_world", dist.Categorical(p_effect_sizes))
    effect_size_raw = pyro.sample("effect_size", distributions[which_world])
```

And run some inference:

```
def find_params(model, guide, data, steps, callable_log):
    # set up the optimizer with a stepped learning-rate schedule
    adam_params = {"lr": 0.002, "betas": (0.90, 0.999)}
    scheduler = MultiStepLR({'optimizer': torch.optim.Adam,
                             'optim_args': adam_params,
                             'milestones': [100, 200], 'gamma': 0.2})

    # set up the inference algorithm
    svi = SVI(model, guide, scheduler, loss=Trace_ELBO())