Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow toys to fail #1427

Open
annmwang opened this issue Apr 30, 2021 · 7 comments · May be fixed by #2128
Open

Allow toys to fail #1427

annmwang opened this issue Apr 30, 2021 · 7 comments · May be fixed by #2128
Labels
user request Request coming form a pyhf user

Comments

@annmwang
Copy link

Description

I am also running into some trouble while running toys with minuit. Sometimes one of the fits fails, and the whole script will exit. For me it's not problematic to have a few toys fail, as long as there's some log information that counts the number of failures.

Describe the solution you'd like

It'd be great for pyhf to allow some fraction of the toys to fail without completely exiting.

Thanks!

@matthewfeickert
Copy link
Member

@annmwang Can you provide some sort of reproducible example?

@matthewfeickert matthewfeickert added the user request Request coming form a pyhf user label Apr 30, 2021
@annmwang
Copy link
Author

Hi @matthewfeickert ,

Here's an example workspace. I'm using version 0.6.1. The code snippet I'm running is :

pyhf.infer.hypotest(0.3, data, model, model.config.suggested_init(), model.config.suggested_bounds(), calctype="toybased", ntoys = 10, return_expected_set=True, )

my backend / optimizer combo is

pyhf.set_backend(pyhf.tensor.numpy_backend(),pyhf.optimize.minuit_optimizer())

Thanks!

@matthewfeickert
Copy link
Member

Hm. Seems that this is MINUIT related problem as

import json
from pathlib import Path

import pyhf

if __name__ == "__main__":
    with open(Path("/tmp").joinpath("sample_449312.json")) as serialized:
        spec = json.load(serialized)

    workspace = pyhf.Workspace(spec)
    model = workspace.model()
    data = workspace.data(model)
    test_poi = 0.3

    pyhf.set_backend("numpy")

    cls_obs, cls_exp_band = pyhf.infer.hypotest(
        test_poi,
        data,
        model,
        model.config.suggested_init(),
        model.config.suggested_bounds(),
        calctype="toybased",
        ntoys=10,
        return_expected_set=True,
    )
    print(f"CLs obs: {cls_obs}")
    print(f"CLs,exp band: {cls_exp_band}")

works fine

$ curl -sL https://anwang.web.cern.ch/anwang/dedx/sample_449312.json -o /tmp/sample_449312.json
$ python /tmp/debug.py
/home/feickert/Code/GitHub/pyhf/src/pyhf/tensor/numpy_backend.py:352: RuntimeWarning: invalid value encountered in log
  return n * np.log(lam) - lam - gammaln(n + 1.0)
CLs obs: 0.25                                                                                                                                                                                              
CLs,exp band: [array(0.), array(0.), array(0.225), array(1.), array(1.)]

but changing the optimizer to minuit causes your failure

import json
from pathlib import Path

import pyhf

if __name__ == "__main__":
    with open(Path("/tmp").joinpath("sample_449312.json")) as serialized:
        spec = json.load(serialized)

    workspace = pyhf.Workspace(spec)
    model = workspace.model()
    data = workspace.data(model)
    test_poi = 0.3

    pyhf.set_backend("numpy", "minuit")

    cls_obs, cls_exp_band = pyhf.infer.hypotest(
        test_poi,
        data,
        model,
        model.config.suggested_init(),
        model.config.suggested_bounds(),
        calctype="toybased",
        ntoys=10,
        return_expected_set=True,
    )
    print(f"CLs obs: {cls_obs}")
    print(f"CLs,exp band: {cls_exp_band}")

gives

$ curl -sL https://anwang.web.cern.ch/anwang/dedx/sample_449312.json -o /tmp/sample_449312.json
$ python /tmp/debug.py
/home/feickert/Code/GitHub/pyhf/src/pyhf/tensor/numpy_backend.py:352: RuntimeWarning: invalid value encountered in log
  return n * np.log(lam) - lam - gammaln(n + 1.0)
/home/feickert/Code/GitHub/pyhf/src/pyhf/interpolators/code4p.py:60: RuntimeWarning: invalid value encountered in greater
  alphasets > 1, self.mask_on, self.mask_off
/home/feickert/Code/GitHub/pyhf/src/pyhf/interpolators/code4p.py:64: RuntimeWarning: invalid value encountered in less
  alphasets < -1, self.mask_on, self.mask_off
/home/feickert/Code/GitHub/pyhf/src/pyhf/interpolators/code4.py:174: RuntimeWarning: invalid value encountered in greater_equal
  alphasets >= self.__alpha0, self.mask_on, self.mask_off
/home/feickert/Code/GitHub/pyhf/src/pyhf/interpolators/code4.py:185: RuntimeWarning: invalid value encountered in greater
  alphasets > -self.__alpha0, self.mask_on, self.mask_off
/home/feickert/Code/GitHub/pyhf/src/pyhf/interpolators/code4.py:204: RuntimeWarning: invalid value encountered in greater_equal
  exponents >= self.__alpha0, exponents, self.ones
Signal-like:   0%|                                                                                                                                                                 | 0/10 [00:00<?, ?toy/s]     corr: None
      fun: 105.51081895812985
 hess_inv: None
  message: 'Optimization failed. Estimated distance to minimum too large.'
   minuit: <FMin edm=32.25506407014024 edm_goal=0.0002 errordef=1.0 fval=105.51081895812985 has_accurate_covar=True has_covariance=True has_made_posdef_covar=False has_parameters_at_limit=False has_posdef_covar=True has_reached_call_limit=False has_valid_parameters=True hesse_failed=False is_above_max_edm=True is_valid=False nfcn=736 ngrad=0>
(Param(number=0, name='x0', value=0.3, error=0.0, merror=None, is_const=False, is_fixed=True, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=0.0, upper_limit=10.0), Param(number=1, name='x1', value=0.0, error=0.9931107289828836, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=2, name='x2', value=0.0, error=0.9932910149895315, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=3, name='x3', value=0.0, error=0.9933393318734485, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=4, name='x4', value=0.0, error=0.9940175809982357, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=5, name='x5', value=0.0, error=0.9933518811119759, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=6, name='x6', value=0.0, error=1.000870060883898, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=7, name='x7', value=0.0, error=0.9933481611110668, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=8, name='x8', value=1.0, error=0.01711493433614214, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=1e-10, upper_limit=10.0), Param(number=9, name='x9', value=0.0, error=0.9933273809227003, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=10, name='x10', value=0.0, error=0.9936867160093494, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=11, name='x11', value=1.0, error=0.041452196141164965, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=1e-10, upper_limit=10.0), Param(number=12, name='x12', value=1.0, error=0.045856471293794165, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=1e-10, upper_limit=10.0), Param(number=13, name='x13', value=0.0, error=0.9932442678595423, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=14, name='x14', value=0.0, error=0.99283362423918, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=15, name='x15', value=0.0, error=0.37922418126761137, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=16, name='x16', value=0.0, error=0.9863023032766534, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=17, name='x17', value=1.0, error=0.020016524824436055, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=1e-10, upper_limit=10.0), Param(number=18, name='x18', value=0.0, error=0.9931322407671437, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=-5.0, upper_limit=5.0), Param(number=19, name='x19', value=1.0, error=0.018213408912382767, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=1e-10, upper_limit=10.0), Param(number=20, name='x20', value=1.0, error=0.04148819634376388, merror=None, is_const=False, is_fixed=False, has_limits=True, has_lower_limit=True, has_upper_limit=True, lower_limit=1e-10, upper_limit=10.0))
[[ 0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00]
 [ 0.00000000e+00  9.99518613e-01  9.09768186e-05  7.99537944e-05
  -6.68575247e-06 -2.24288478e-08 -2.13207323e-05 -4.16009176e-07
  -2.48760551e-08  1.21483159e-04  1.33328944e-04  4.62564162e-06
   2.90903899e-05 -4.88509475e-04  2.76114330e-04  1.61096965e-02
   1.79646910e-03 -9.35630056e-07  1.51547041e-04  2.34244091e-06
   1.70872644e-07]
 [ 0.00000000e+00  9.09768186e-05  9.99886462e-01 -2.17774659e-06
  -2.78045351e-04 -9.34580240e-07  6.90122761e-06  1.17469428e-07
  -1.03416651e-06 -1.05879521e-05 -1.59629609e-07 -3.27773704e-08
   6.40660292e-10  6.09712305e-04 -3.44904652e-04  4.15185938e-04
   8.40764779e-05  1.16775643e-06 -2.06456060e-05 -3.23705904e-07
   1.03724400e-09]
 [ 0.00000000e+00  7.99537944e-05 -2.17774659e-06  9.99985058e-01
  -1.15926391e-04 -3.79565605e-07 -8.50752964e-05 -1.17064151e-06
  -4.31230663e-07 -2.05157750e-05  1.35800983e-04  2.95523515e-05
  -1.14872247e-06  1.19606362e-05 -6.71977937e-06 -3.43569121e-03
  -4.09986146e-04  2.29482632e-08 -2.77849392e-05 -4.31715116e-07
  -1.99966508e-08]
 [ 0.00000000e+00 -6.68575247e-06 -2.78045351e-04 -1.15926391e-04
   1.00136963e+00  3.18929854e-03 -2.37665549e-02 -3.55284604e-04
  -1.31213754e-06 -3.58435642e-06  5.80744480e-04  1.15553978e-04
   8.50443122e-06 -1.74841496e-07  9.49440162e-08  1.85642481e-07
   2.08172141e-08 -3.50779699e-10  8.10993923e-09  1.02652085e-10
   1.23429462e-12]
 [ 0.00000000e+00 -2.24288478e-08 -9.34580240e-07 -3.79565605e-07
   3.18929854e-03  1.00001067e+00 -7.92768414e-05 -1.17799911e-06
   1.18627899e-05 -1.90255619e-08  1.92294901e-06  3.85331531e-07
   2.83680937e-08 -7.67241110e-09  2.95237629e-10  3.71426765e-10
  -6.17689707e-09 -2.95868083e-11  7.72213533e-11 -2.24429767e-11
  -6.50798362e-16]
 [ 0.00000000e+00 -2.13207323e-05  6.90122761e-06 -8.50752964e-05
  -2.37665549e-02 -7.92768414e-05  1.01541419e+00  2.15801883e-04
  -8.84009057e-05  1.52993476e-04 -2.48071736e-02 -4.93613036e-03
  -3.63147587e-04  1.52828007e-08 -1.26211333e-08 -5.49723182e-07
  -3.99728181e-08  7.24204445e-11 -6.10283000e-09 -6.77486319e-11
  -1.72810926e-11]
 [ 0.00000000e+00 -4.16009176e-07  1.17469428e-07 -1.17064151e-06
  -3.55284604e-04 -1.17799911e-06  2.15801883e-04  1.00000308e+00
  -1.32150272e-06  2.33806080e-06 -3.76871502e-04 -6.81355123e-05
  -1.40103501e-05  4.09878843e-10  6.87475999e-09 -1.03351199e-08
  -9.23723940e-10  8.01170642e-13  6.99186703e-09 -1.67818369e-12
   5.82189059e-11]
 [ 0.00000000e+00 -2.48760551e-08 -1.03416651e-06 -4.31230663e-07
  -1.31213754e-06  1.18627899e-05 -8.84009057e-05 -1.32150272e-06
   2.92924155e-04 -1.32849261e-08  2.16010850e-06  4.29808876e-07
   3.16331423e-08 -6.44634665e-10  3.53138411e-10  6.96195531e-10
   3.02041115e-11 -1.28200871e-12  3.01635187e-11  4.00577804e-13
   4.56216035e-15]
 [ 0.00000000e+00  1.21483159e-04 -1.05879521e-05 -2.05157750e-05
  -3.58435642e-06 -1.90255619e-08  1.52993476e-04  2.33806080e-06
  -1.32849261e-08  9.99960670e-01 -4.12161533e-04 -4.68360102e-05
  -4.96092878e-05  5.68842035e-05 -3.21439585e-05 -3.27738064e-03
  -2.80314737e-04  1.08948666e-07 -3.29123953e-05 -4.96443941e-07
  -1.00394150e-07]
 [ 0.00000000e+00  1.33328944e-04 -1.59629609e-07  1.35800983e-04
   5.80744480e-04  1.92294901e-06 -2.48071736e-02 -3.76871502e-04
   2.16010850e-06 -4.12161533e-04  1.00069408e+00  1.02624958e-04
   5.15930077e-06 -9.40336769e-08  4.92171858e-08  3.03211308e-06
   3.07189201e-07 -1.95990833e-10  3.00561654e-08  4.35080613e-10
   6.14460857e-11]
 [ 0.00000000e+00  4.62564162e-06 -3.27773704e-08  2.95523515e-05
   1.15553978e-04  3.85331531e-07 -4.93613036e-03 -6.81355123e-05
   4.29808876e-07 -4.68360102e-05  1.02624958e-04  1.71839393e-03
   1.76871520e-06 -4.50129161e-09  2.59688910e-09  1.26552255e-07
   9.36564692e-09 -8.70830181e-12  1.42311251e-09  2.14366093e-11
   4.89862715e-12]
 [ 0.00000000e+00  2.90903899e-05  6.40660292e-10 -1.14872247e-06
   8.50443122e-06  2.83680937e-08 -3.63147587e-04 -1.40103501e-05
   3.16331423e-08 -4.96092878e-05  5.15930077e-06  1.76871520e-06
   2.10297975e-03 -1.71203179e-08  9.64094715e-09  6.35523090e-07
   6.66171297e-08 -3.29405691e-11  6.07699371e-09  9.31030719e-11
   9.97739302e-12]
 [ 0.00000000e+00 -4.88509475e-04  6.09712305e-04  1.19606362e-05
  -1.74841496e-07 -7.67241110e-09  1.52828007e-08  4.09878843e-10
  -6.44634665e-10  5.68842035e-05 -9.40336769e-08 -4.50129161e-09
  -1.71203179e-08  9.99791074e-01  1.85024388e-03 -2.23067035e-03
  -4.49217466e-04 -3.34886402e-06  1.10876779e-04  1.73894192e-06
  -5.31522957e-09]
 [ 0.00000000e+00  2.76114330e-04 -3.44904652e-04 -6.71977937e-06
   9.49440162e-08  2.95237629e-10 -1.26211333e-08  6.87475999e-09
   3.53138411e-10 -3.21439585e-05  4.92171858e-08  2.59688910e-09
   9.64094715e-09  1.85024388e-03  9.98953357e-01  1.26006198e-03
   2.55148298e-04  3.54373672e-06 -6.26363662e-05 -9.82447660e-07
   2.97041167e-09]
 [ 0.00000000e+00  1.61096965e-02  4.15185938e-04 -3.43569121e-03
   1.85642481e-07  3.71426765e-10 -5.49723182e-07 -1.03351199e-08
   6.96195531e-10 -3.27738064e-03  3.03211308e-06  1.26552255e-07
   6.35523090e-07 -2.23067035e-03  1.26006198e-03  1.44087583e-01
  -1.06488475e-01 -4.27233108e-06 -6.35722816e-03 -9.96972470e-05
   2.33683893e-07]
 [ 0.00000000e+00  1.79646910e-03  8.40764779e-05 -4.09986146e-04
   2.08172141e-08 -6.17689707e-09 -3.99728181e-08 -9.23723940e-10
   3.02041115e-11 -2.80314737e-04  3.07189201e-07  9.36564692e-09
   6.66171297e-08 -4.49217466e-04  2.55148298e-04 -1.06488475e-01
   9.85678487e-01 -8.60432038e-07 -7.65749920e-04 -1.21045380e-05
   5.24115454e-07]
 [ 0.00000000e+00 -9.35630056e-07  1.16775643e-06  2.29482632e-08
  -3.50779699e-10 -2.95868083e-11  7.24204445e-11  8.01170642e-13
  -1.28200871e-12  1.08948666e-07 -1.95990833e-10 -8.70830181e-12
  -3.29405691e-11 -3.34886402e-06  3.54373672e-06 -4.27233108e-06
  -8.60432038e-07  4.00667212e-04  2.12373905e-07  3.33054489e-09
  -1.03023117e-11]
 [ 0.00000000e+00  1.51547041e-04 -2.06456060e-05 -2.77849392e-05
   8.10993923e-09  7.72213533e-11 -6.10283000e-09  6.99186703e-09
   3.01635187e-11 -3.29123953e-05  3.00561654e-08  1.42311251e-09
   6.07699371e-09  1.10876779e-04 -6.26363662e-05 -6.35722816e-03
  -7.65749920e-04  2.12373905e-07  9.99562502e-01 -2.50263029e-06
   4.58498441e-08]
 [ 0.00000000e+00  2.34244091e-06 -3.23705904e-07 -4.31715116e-07
   1.02652085e-10 -2.24429767e-11 -6.77486319e-11 -1.67818369e-12
   4.00577804e-13 -4.96443941e-07  4.35080613e-10  2.14366093e-11
   9.31030719e-11  1.73894192e-06 -9.82447660e-07 -9.96972470e-05
  -1.21045380e-05  3.33054489e-09 -2.50263029e-06  3.31732340e-04
  -2.25932163e-10]
 [ 0.00000000e+00  1.70872644e-07  1.03724400e-09 -1.99966508e-08
   1.23429462e-12 -6.50798362e-16 -1.72810926e-11  5.82189059e-11
   4.56216035e-15 -1.00394150e-07  6.14460857e-11  4.89862715e-12
   9.97739302e-12 -5.31522957e-09  2.97041167e-09  2.33683893e-07
   5.24115454e-07 -1.03023117e-11  4.58498441e-08 -2.25932163e-10
   1.72138018e-03]]
     nfev: 736
     njev: 0
  success: False
      unc: None
        x: <ValueView x0=0.3 x1=0.0 x2=0.0 x3=0.0 x4=0.0 x5=0.0 x6=0.0 x7=0.0 x8=1.0 x9=0.0 x10=0.0 x11=1.0 x12=1.0 x13=0.0 x14=0.0 x15=0.0 x16=0.0 x17=1.0 x18=0.0 x19=1.0 x20=1.0>
Traceback (most recent call last):
  File "/home/feickert/Code/GitHub/pyhf/src/pyhf/optimize/mixins.py", line 49, in _internal_minimize
    assert result.success
AssertionError
Traceback (most recent call last):                                                                                                                                                                         
  File "/home/feickert/Code/GitHub/pyhf/src/pyhf/optimize/mixins.py", line 49, in _internal_minimize
    assert result.success
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/debug.py", line 18, in <module>
    cls_obs, cls_exp_band = pyhf.infer.hypotest(
  File "/home/feickert/Code/GitHub/pyhf/src/pyhf/infer/__init__.py", line 160, in hypotest
    sig_plus_bkg_distribution, bkg_only_distribution = calc.distributions(poi_test)
  File "/home/feickert/Code/GitHub/pyhf/src/pyhf/infer/calculators.py", line 726, in distributions
    teststat_func(
  File "/home/feickert/Code/GitHub/pyhf/src/pyhf/infer/test_statistics.py", line 189, in qmu_tilde
    return _qmu_like(mu, data, pdf, init_pars, par_bounds, fixed_params)
  File "/home/feickert/Code/GitHub/pyhf/src/pyhf/infer/test_statistics.py", line 25, in _qmu_like
    tmu_like_stat, (_, muhatbhat) = _tmu_like(
  File "/home/feickert/Code/GitHub/pyhf/src/pyhf/infer/test_statistics.py", line 44, in _tmu_like
    mubhathat, fixed_poi_fit_lhood_val = fixed_poi_fit(
  File "/home/feickert/Code/GitHub/pyhf/src/pyhf/infer/mle.py", line 202, in fixed_poi_fit
    return fit(data, pdf, init_pars, par_bounds, fixed_params, **kwargs)
  File "/home/feickert/Code/GitHub/pyhf/src/pyhf/infer/mle.py", line 131, in fit
    return opt.minimize(
  File "/home/feickert/Code/GitHub/pyhf/src/pyhf/optimize/mixins.py", line 160, in minimize
    result = self._internal_minimize(**minimizer_kwargs, options=kwargs)
  File "/home/feickert/Code/GitHub/pyhf/src/pyhf/optimize/mixins.py", line 52, in _internal_minimize
    raise exceptions.FailedMinimization(result)
pyhf.exceptions.FailedMinimization: Optimization failed. Estimated distance to minimum too large.

@annmwang
Copy link
Author

annmwang commented May 3, 2021

Hi @matthewfeickert ,

Yes- that's also what I see.

My thinking is that the feature request would make it a bit easier for the user to debug and to quantify the fraction of toys which fail. For example for some fits that I'm running 99.5% of the toys are fine if I put in some patchy error handling, which shouldn't bias my result much and a few failed fits is no problem. Sometimes it's more like 50% of the toys which is more of a problem. And in my experience a few failed minuit fits isn't unexpected...?

But let me know if you have any other suggestions about how to debug this and/or configure pyhf.

@christian-hash
Copy link

Hi @matthewfeickert, @annmwang,

I have the same issues. Are there any updates / proposed solutions related to the topic?

Thanks a lot!

@christian-hash
Copy link

Edit: I see similar issues with the scipy optimizer pyhf.optimize.scipy_optimizer. If the minimizer does not find a minimum for one toy, the code crashes:

Background-like:  72%|█████████████████████████████████████████████████████████████████████████████▎                              | 716/1000 [03:08<01:13,  3.86toy/s]     fun: nan
     jac: array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan])
 message: 'Singular matrix E in LSQ subproblem'
    nfev: 392
     nit: 20
    njev: 20
  status: 5
 success: False
       x: array([ 9.93598374e-01,  1.13655082e+00, -5.50204674e-02, -3.99622534e-02,
       -7.15058311e-09,  8.77072704e-02,  3.24876371e-02, -6.16418883e-02,
        3.92078482e-02, -1.33711330e-02, -6.64234379e-02, -6.65690883e-02,
        1.66666667e+00,  9.97339820e-01])
Traceback (most recent call last):
  File "/eos/home-c/cappelt/DHNLLimits/pyhf/pyhf/lib/python3.7/site-packages/pyhf/optimize/mixins.py", line 49, in _internal_minimize
    assert result.success
AssertionError
Traceback (most recent call last):                                                                                                                                    
  File "/eos/home-c/cappelt/DHNLLimits/pyhf/pyhf/lib/python3.7/site-packages/pyhf/optimize/mixins.py", line 49, in _internal_minimize
    assert result.success
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "limit_setting.py", line 129, in <module>
    init_pars=init_pars, par_bounds=par_bounds, fixed_params=fixed_params, ntoys=1000))
  File "/eos/home-c/cappelt/DHNLLimits/pyhf/pyhf/lib/python3.7/site-packages/pyhf/infer/__init__.py", line 160, in hypotest
    sig_plus_bkg_distribution, bkg_only_distribution = calc.distributions(poi_test)
  File "/eos/home-c/cappelt/DHNLLimits/pyhf/pyhf/lib/python3.7/site-packages/pyhf/infer/calculators.py", line 745, in distributions
    self.fixed_params,
  File "/eos/home-c/cappelt/DHNLLimits/pyhf/pyhf/lib/python3.7/site-packages/pyhf/infer/test_statistics.py", line 189, in qmu_tilde
    return _qmu_like(mu, data, pdf, init_pars, par_bounds, fixed_params)
  File "/eos/home-c/cappelt/DHNLLimits/pyhf/pyhf/lib/python3.7/site-packages/pyhf/infer/test_statistics.py", line 26, in _qmu_like
    mu, data, pdf, init_pars, par_bounds, fixed_params, return_fitted_pars=True
  File "/eos/home-c/cappelt/DHNLLimits/pyhf/pyhf/lib/python3.7/site-packages/pyhf/infer/test_statistics.py", line 45, in _tmu_like
    mu, data, pdf, init_pars, par_bounds, fixed_params, return_fitted_val=True
  File "/eos/home-c/cappelt/DHNLLimits/pyhf/pyhf/lib/python3.7/site-packages/pyhf/infer/mle.py", line 202, in fixed_poi_fit
    return fit(data, pdf, init_pars, par_bounds, fixed_params, **kwargs)
  File "/eos/home-c/cappelt/DHNLLimits/pyhf/pyhf/lib/python3.7/site-packages/pyhf/infer/mle.py", line 132, in fit
    twice_nll, data, pdf, init_pars, par_bounds, fixed_vals, **kwargs
  File "/eos/home-c/cappelt/DHNLLimits/pyhf/pyhf/lib/python3.7/site-packages/pyhf/optimize/mixins.py", line 160, in minimize
    result = self._internal_minimize(**minimizer_kwargs, options=kwargs)
  File "/eos/home-c/cappelt/DHNLLimits/pyhf/pyhf/lib/python3.7/site-packages/pyhf/optimize/mixins.py", line 52, in _internal_minimize
    raise exceptions.FailedMinimization(result)
pyhf.exceptions.FailedMinimization: Singular matrix E in LSQ subproblem

@christian-hash
Copy link

christian-hash commented Jul 28, 2021

I found a solution that works for me: I've added a allowed_failures parameter to the ToyCalculator class and modified the code starting at line 723:

        allowed_failures = self.allowed_failures # fraction of toys that are allowed to fail

        failures = 0
        signal_teststat = []
        for sample in tqdm.tqdm(signal_sample, **tqdm_options, desc='Signal-like'):
            try:
                signal_teststat.append(
                    teststat_func(
                        poi_test,
                        sample,
                        self.pdf,
                        self.init_pars,
                        self.par_bounds,
                        self.fixed_params,
                    )
                )
            except KeyboardInterrupt:
                raise KeyboardInterrupt
            except:
                failures = failures + 1
                if failures / self.ntoys > allowed_failures:
                    raise AssertionError(f'more than {allowed_failures * 100} % of the toys failed')
                continue
        if failures > 0: print(f'\n{failures} / {self.ntoys} failed (signal-like)\n')

        failures = 0
        bkg_teststat = []
        for sample in tqdm.tqdm(bkg_sample, **tqdm_options, desc='Background-like'):
            try:
                bkg_teststat.append(
                    teststat_func(
                        poi_test,
                        sample,
                        self.pdf,
                        self.init_pars,
                        self.par_bounds,
                        self.fixed_params,
                    )
                )
            except KeyboardInterrupt:
                raise KeyboardInterrupt
            except:
                failures = failures + 1
                if failures / self.ntoys > allowed_failures:
                    raise AssertionError(f'more than {allowed_failures * 100} % of the toys failed')
                continue
        if failures > 0: print(f'\n{failures} / {self.ntoys} failed (background-like)\n')

I.e., allowed_failures = 0.05 allows 5% of the toys to fail without crashing the code.

This can be used like:

pyhf.infer.hypotest(test_poi, data, model, test_stat="qtilde", calctype='toybased',
                    return_expected_set=True, return_tail_probs=True,
                    init_pars=init_pars, par_bounds=par_bounds, fixed_params=fixed_params, ntoys=ntoys, allowed_failures=0.05)

I'm not sure if this is the best way to do it. I can do an MR if people are interested. The default (allowed_failures=0.0) behaves like before, which means that the code crashes when a minimization error occurs.

@lukasheinrich lukasheinrich linked a pull request Mar 3, 2023 that will close this issue
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
user request Request coming form a pyhf user
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants