Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Configurable constraints per-modifier #1829

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

kratsg
Copy link
Contributor

@kratsg kratsg commented Mar 30, 2022

Description

I only added functionality for staterror right now, but this lets you define a modifier with a constraint parameter like so

"modifiers": [ 
    { 
        "data": [ 
            0.4581917541908765
        ],
        "name": "staterror_SR0L_Lnj_Imeff_cuts",
        "type": "staterror",
        "constraint": "poisson"
    },
]

which should get picked up correctly. Should also allow for an unconstrained option in the future too (to turn off the constraint).

#760 is kinda related.

Checklist Before Requesting Reviewer

  • Tests are passing
  • "WIP" removed from the title of the pull request
  • Selected an Assignee for the PR to be responsible for the log summary

Before Merging

For the PR Assignees:

  • Summarize commit messages into a comprehensive review of the PR

@alexander-held
Copy link
Member

For debugging purposes, here are likelihood scans:

import numpy as np
import pyhf
import matplotlib as mpl
import matplotlib.pyplot as plt

spec = {
    "channels": [
        {
            "name": "ch",
            "samples": [
                {
                    "data": [1000.0],
                    "modifiers": [
                        {"data": None, "name": "mu_sig", "type": "normfactor"},
                        {"data": [150.0], "name": "staterror_ch", "type": "staterror"},
                    ],
                    "name": "signal",
                }
            ],
        }
    ],
    "measurements": [{"config": {"parameters": [], "poi": "mu_sig"}, "name": "meas"}],
    "observations": [{"data": [1000], "name": "ch"}],
    "version": "1.0.0",
}

model_gaussian = pyhf.Workspace(spec).model()
print(pyhf.Workspace(spec).data(model_gaussian))

spec["channels"][0]["samples"][0]["modifiers"][1].update({"constraint": "poisson"})
model_poisson = pyhf.Workspace(spec).model()
print(pyhf.Workspace(spec).data(model_poisson))

spec["channels"][0]["samples"][0]["modifiers"][1] = {
    "data": [150.0],
    "name": "shapesys",
    "type": "shapesys",
}
model_shapesys = pyhf.Workspace(spec).model()
print(pyhf.Workspace(spec).data(model_shapesys))

par_range = np.linspace(0.1, 1.9, 19)
ll_gaussian = []
ll_poisson = []
ll_shapesys = []

for par in par_range:
    auxdata_stat = pyhf.tensorlib.astensor([1.0])
    auxdata_shape = pyhf.tensorlib.astensor([(1000 / 150) ** 2])
    parameters = pyhf.tensorlib.astensor([1.0, par])
    ll_gaussian.append(model_gaussian.constraint_logpdf(auxdata_stat, parameters))
    ll_poisson.append(model_poisson.constraint_logpdf(auxdata_stat, parameters))
    ll_shapesys.append(model_shapesys.constraint_logpdf(auxdata_shape, parameters))

mpl.style.use("fivethirtyeight")

plt.plot(par_range, ll_gaussian, label="staterror (Gaussian)")
plt.plot(par_range, ll_poisson, label="staterror (Poisson)")
plt.plot(par_range, ll_shapesys, label="shapesys (Poisson)")
plt.legend()
plt.xlabel("parameter value")
plt.ylabel("log likelihood")
plt.tight_layout()
plt.savefig("scan.pdf")

scan

I am unsure about what the auxdata for staterror with Poisson constraint should be, and whether it should change like in the shapesys. I would expect that a Poisson staterror scan matches shapesys, at least up to some constant offset in case there are some normalization differences.

@alexander-held
Copy link
Member

alexander-held commented Mar 31, 2022

The maximum for the Poisson staterror case in the example above is at 1000/150 = 6.67 (data / staterror), which looks a bit suspicious. Is there something wrong with the test setup? Making the auxdata (1000/150)^2 (like shapesys) is not the solution either, it moves the maximum likelihood over by the same factor.

@alexander-held
Copy link
Member

Looking what actually goes into the final Poisson call in the backend: for a shapesys, the rate is the parameter times auxdata (which itself is (1000/150)^2).
For the staterror with Poisson term as currently implemented, the rate is (staterror/yield) times the parameter, which is 150/1000*parameter for the example above. Together with the (currently default) auxdata of 1, that leads to a maximum in that case at 1000/150.

@matthewfeickert matthewfeickert changed the base branch from master to main September 21, 2022 20:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants