
python - Minimal pyhf example fails with 'Inequality constraints incompatible'

Reposted. Author: 行者123. Updated: 2023-12-02 03:53:12

I am trying to build a very small pyhf example: two Gaussians, one signal and one background, but I can't get it to work. My Python code is:

import pyhf.readxml
import os
from ROOT import TH1F, TFile, TF1

mygaus = TF1("mygaus","TMath::Gaus(x,100,.5)",95, 115)
mygaus2 = TF1("mygaus2","TMath::Gaus(x,110,.2)",95, 115)
mygaus_data = TF1("mygaus_data","TMath::Gaus(x,110,.2)+TMath::Gaus(x,100,.5)",95, 115)

bkg_nominal = TH1F('bkg_nominal', '', 80, 95, 115)
bkg_nominal.FillRandom("mygaus", 10000)

sig_nominal = TH1F('sig_nominal', '', 80, 95, 115)
sig_nominal.FillRandom("mygaus2", 5000)

data_nominal = TH1F('data_nominal', '', 80, 95, 115)
data_nominal.FillRandom("mygaus_data", 10000)

meas = TFile('meas.root', 'RECREATE')
bkg_nominal.Write()
sig_nominal.Write()
data_nominal.Write()
meas.Close()

spec = pyhf.readxml.parse('meas.xml', os.getcwd())
workspace = pyhf.Workspace(spec)

pdf = workspace.model(measurement_name='meas')
data = workspace.data(pdf)
workspace.get_measurement(measurement_name='meas')
best_fit = pyhf.infer.mle.fit(data, pdf)

The XML files, which I basically copied from the example in the documentation, look like this:

meas.xml

<!DOCTYPE Combination  SYSTEM 'HistFactorySchema.dtd'>

<Combination OutputFilePrefix="workspace" >


<Input>./meas_channel1.xml</Input>

<Measurement Name="meas" Lumi='1' LumiRelErr='0.1' ExportOnly="False" >
<POI>signorm</POI>
</Measurement>

</Combination>

meas_channel1.xml

<!DOCTYPE Channel  SYSTEM 'HistFactorySchema.dtd'>

<Channel Name="channel1" InputFile="" >

<Data HistoName="data_nominal" InputFile="meas.root" />

<StatErrorConfig RelErrorThreshold="0.05" ConstraintType="Gaussian" />

<Sample Name="bkg" HistoName="bkg_nominal" InputFile="meas.root" NormalizeByTheory="True" >
<NormFactor Name="bkgnorm" Val="1" High="3" Low="0" Const="False" />
</Sample>

<Sample Name="sig" HistoName="sig_nominal" InputFile="meas.root" NormalizeByTheory="True" >
<NormFactor Name="signorm" Val="1" High="3" Low="0" Const="False" />
</Sample>

</Channel>

It looks very simple, and I can plot the histograms. However, I get this error message:

ERROR:pyhf.optimize.opt_scipy:     fun: nan
jac: array([nan, nan, nan])
message: 'Inequality constraints incompatible'
nfev: 5
nit: 1
njev: 1
status: 4
success: False
x: array([1., 1., 1.])
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-14-54e7c2f0a645> in <module>
2 data = workspace.data(pdf)
3 workspace.get_measurement(measurement_name='meas')
----> 4 best_fit = pyhf.infer.mle.fit(data, pdf)

/usr/local/lib/python3.7/site-packages/pyhf/infer/mle.py in fit(data, pdf, init_pars, par_bounds, **kwargs)
34 init_pars = init_pars or pdf.config.suggested_init()
35 par_bounds = par_bounds or pdf.config.suggested_bounds()
---> 36 return opt.minimize(twice_nll, data, pdf, init_pars, par_bounds, **kwargs)
37
38

/usr/local/lib/python3.7/site-packages/pyhf/optimize/opt_scipy.py in minimize(self, objective, data, pdf, init_pars, par_bounds, fixed_vals, return_fitted_val)
45 )
46 try:
---> 47 assert result.success
48 except AssertionError:
49 log.error(result)

AssertionError:

This is strange, because I don't have any inequality constraints. I suspect I'm doing something silly. Can you help? Thanks!

Best answer

Thanks for the good question, @robsol90.

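If we inspect the contents of the model visually (open the ROOT file and look at the histograms in a TBrowser), or just print the contents (after converting the XML+ROOT to JSON)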

>>> import json
>>> with open("meas.json") as spec_file:
...     spec = json.load(spec_file)
...
>>> print(json.dumps(spec, indent=2, sort_keys=True))

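we see that the model contains many bins with zeros. This is a problem because HistFactory is Poisson based, and since the Poisson p.m.f. is strictly defined only for a rate parameter greater than 0, these true-zero bins cause errors (which is exactly what happens here). However, if we simply parse the spec and add a very small offset (epsilon), the fit proceeds without any problem. So this question ends up being very similar to another question (Fit convergence failure in pyhf for small signal model), although that wasn't immediately obvious.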
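To make the zero-rate problem concrete, here is a minimal sketch of the Poisson log-probability for a single bin, written with the standard library only. The function name `poisson_log_pmf` and the choice to return `nan` for a non-positive rate are assumptions made for this illustration; this is not pyhf's implementation.

```python
import math


def poisson_log_pmf(n, rate):
    """Return log Poisson(n | rate) = n*log(rate) - rate - log(n!)."""
    if rate <= 0.0:
        # log(0) is undefined. Returning nan is a choice made for this
        # sketch, mimicking the "fun: nan" the optimizer reported.
        return float("nan")
    return n * math.log(rate) - rate - math.lgamma(n + 1)


# One observed event in a bin whose expected rate is a true zero
log_p_zero_rate = poisson_log_pmf(1, 0.0)
# The same bin after padding the rate with a tiny epsilon
log_p_padded = poisson_log_pmf(1, 1e-20)

print(f"rate = 0     -> {log_p_zero_rate}")
print(f"rate = 1e-20 -> {log_p_padded:.2f}")
```

The padded value is very negative but finite, which is all a minimizer needs in order to keep evaluating the objective and its gradient.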

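We understand that the toy model you set up is meant to be minimal and simple, but in practice you will almost never encounter an analysis region this sparse, and that sparsity is what makes this toy problem hard. In the future we'll work on automatically masking out bins in the model that are true zeros, so that users don't run into this problem.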
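A quick way to see how sparse a model is would be to scan the JSON spec for bins in which every sample is exactly zero. The sketch below assumes the usual `channels` / `samples` / `data` layout of a pyhf spec; the helper name `empty_bins` and the 4-bin `toy_spec` are hypothetical and exist only for this example.

```python
def empty_bins(spec):
    """Map channel name -> indices of bins where every sample is exactly 0."""
    result = {}
    for channel in spec["channels"]:
        per_sample = [sample["data"] for sample in channel["samples"]]
        # zip(*per_sample) walks all samples of the channel bin by bin
        result[channel["name"]] = [
            index
            for index, yields in enumerate(zip(*per_sample))
            if all(value == 0 for value in yields)
        ]
    return result


# Hypothetical 4-bin spec fragment; only the keys used above are filled in
toy_spec = {
    "channels": [
        {
            "name": "channel1",
            "samples": [
                {"name": "bkg", "data": [0.0, 12.0, 3.0, 0.0]},
                {"name": "sig", "data": [0.0, 0.0, 5.0, 0.0]},
            ],
        }
    ]
}

print(empty_bins(toy_spec))
```

For a real spec, the same function could be called on the dictionary returned by `json.load`.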

Below I'll also provide some code that fixes the problem above, along with some additional example code.


First, to be very explicit, let's set up our environment.

Environment

$ "$(which python3)" --version
Python 3.7.5
$ python3 -m venv "${HOME}/.venvs/question"
$ . "${HOME}/.venvs/question/bin/activate"
(question) $ cat requirements.txt
pyhf[xmlio]~=0.4.0
black
(question) $ python -m pip install -r requirements.txt
(question) $ root-config --version
6.18/04

Code

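We'll also break the code up into several steps. First, let's look at the XML_to_ROOT.py snippet, which I've modified so that the observed data samples the models a bit more reasonably (this isn't required, though, since your original code works here as well).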

# XML_to_ROOT.py
from ROOT import TH1F, TFile, TF1


def main():
    left_bound = 95
    right_bound = 115
    n_bins = 80

    # Model makeup
    frac_bkg = 0.95
    frac_sig = round(1.0 - frac_bkg, 2)

    bkg_model = TF1("bkg_model", "TMath::Gaus(x,100,0.5,true)", left_bound, right_bound)
    sig_model = TF1("sig_model", "TMath::Gaus(x,105,0.2,true)", left_bound, right_bound)
    obs_model = TF1(
        "obs_model",
        f"({frac_bkg}*bkg_model)+({frac_sig}*sig_model)",
        left_bound,
        right_bound,
    )

    # Samples from model
    n_sample = 10000
    n_bkg = int(frac_bkg * n_sample)
    n_sig = int(frac_sig * n_sample)

    bkg_nominal = TH1F("bkg_nominal", "", n_bins, left_bound, right_bound)
    bkg_nominal.FillRandom("bkg_model", n_bkg)

    sig_nominal = TH1F("sig_nominal", "", n_bins, left_bound, right_bound)
    sig_nominal.FillRandom("sig_model", n_sig)

    data_nominal = TH1F("data_nominal", "", n_bins, left_bound, right_bound)
    data_nominal.FillRandom("obs_model", n_sample)

    meas = TFile("meas.root", "RECREATE")
    bkg_nominal.Write()
    sig_nominal.Write()
    data_nominal.Write()
    meas.Close()


if __name__ == "__main__":
    main()
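To see why most of the 80 bins end up empty with these shapes, the expected yield per bin can be estimated from the Gaussian CDF without ROOT. This is only a rough sketch: it uses the binning and Gaussian parameters from the script above with 9500 background and 500 signal events, ignores the (negligible) truncation to the histogram range, and the helper names and the `threshold` value are arbitrary choices for illustration.

```python
import math


def normal_cdf(x, mean, sigma):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def expected_counts(n_events, mean, sigma, n_bins=80, low=95.0, high=115.0):
    """Expected events per bin for n_events drawn from a Gaussian."""
    width = (high - low) / n_bins
    edges = [low + i * width for i in range(n_bins + 1)]
    return [
        n_events * (normal_cdf(b, mean, sigma) - normal_cdf(a, mean, sigma))
        for a, b in zip(edges[:-1], edges[1:])
    ]


bkg = expected_counts(9500, mean=100.0, sigma=0.5)
sig = expected_counts(500, mean=105.0, sigma=0.2)

# Arbitrary threshold, far below one event: random sampling will almost
# always leave such a bin empty in both samples.
threshold = 1e-3
n_sparse = sum(1 for b, s in zip(bkg, sig) if b + s < threshold)
print(f"{n_sparse} of {len(bkg)} bins expect fewer than {threshold} events")
```

Both Gaussians are narrow compared with the 20-unit histogram range, so well over half of the bins are expected to be empty in both samples at once.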

Now, to make things easier, let's generate the ROOT file and then convert the XML and ROOT files into a JSON spec.

(question) $ python XML_to_ROOT.py
(question) $ pyhf xml2json --output-file meas.json meas.xml

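Finally, let's adapt the code from the question so that no bin in the model contains a true 0, by padding all bins with an offset of 1e-20 (just to show that the only thing that matters is that they are nonzero).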

# answer.py
import os
import json
import pyhf.readxml
import numpy as np


def main():
    with open("meas.json") as spec_file:
        spec = json.load(spec_file)

    # Pad true zeros to avoid error with evaluating Poisson(x|0)
    epsilon = 1e-20
    bkg = np.asarray(spec["channels"][0]["samples"][0]["data"]) + epsilon
    sig = np.asarray(spec["channels"][0]["samples"][1]["data"]) + epsilon
    spec["channels"][0]["samples"][0]["data"] = bkg.tolist()
    spec["channels"][0]["samples"][1]["data"] = sig.tolist()

    workspace = pyhf.Workspace(spec)

    model = workspace.model(measurement_name="meas")
    data = workspace.data(model)

    best_fit_pars = pyhf.infer.mle.fit(data, model)
    print(f"initialization parameters: {model.config.suggested_init()}")
    print(
        "best fit parameters:"
        f"\n * signal strength: {best_fit_pars[0]}"
        f"\n * nuisance parameters: {best_fit_pars[1:]}"
    )


if __name__ == "__main__":
    main()
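The script above addresses the two samples by position. As a hedged alternative, the padding could be written as a small helper that walks every channel and sample and replaces only exact zeros, leaving the input spec untouched. The name `pad_true_zeros` and the 3-bin `toy_spec` are hypothetical, and the sketch again assumes the `channels` / `samples` / `data` layout used above.

```python
import copy


def pad_true_zeros(spec, epsilon=1e-20):
    """Return a copy of spec with exactly-zero sample yields set to epsilon."""
    padded = copy.deepcopy(spec)  # leave the caller's spec unmodified
    for channel in padded["channels"]:
        for sample in channel["samples"]:
            sample["data"] = [
                epsilon if value == 0 else value for value in sample["data"]
            ]
    return padded


# Hypothetical 3-bin spec fragment used only to exercise the helper
toy_spec = {
    "channels": [
        {
            "name": "channel1",
            "samples": [
                {"name": "bkg", "data": [0.0, 12.0, 3.0]},
                {"name": "sig", "data": [0.0, 0.0, 5.0]},
            ],
        }
    ]
}

padded_spec = pad_true_zeros(toy_spec)
print(padded_spec["channels"][0]["samples"][0]["data"])
print(padded_spec["channels"][0]["samples"][1]["data"])
```

Replacing only the zeros gives the same yields as adding epsilon to every bin, because adding 1e-20 to a yield of order one does not change its floating-point value.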

Running this now gives

(question) $ python answer.py 
initialization parameters: [1.0, 1.0, 1.0]
best fit parameters:
* signal strength: 1.000000316044688
* nuisance parameters: [0.99884051 1.02202245]

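As an additional demonstration that this really is due only to the true zeros, consider the following 2-bin example, which is designed to fail with the error.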

# fail.py
import os
import json
import pyhf.readxml
import numpy as np


def main():
    with open("meas.json") as spec_file:
        spec = json.load(spec_file)

    # Fails
    bkg = np.asarray([0, 0])
    sig = np.asarray([0, 1])
    obs = np.asarray([1, 1])
    # # Fails
    # bkg = np.asarray([1, 0])
    # sig = np.asarray([0, 0])
    # obs = np.asarray([1, 1])
    # # Fails
    # bkg = np.asarray([0, 0])
    # sig = np.asarray([0, 0])
    # obs = np.asarray([1, 1])
    # # Pass
    # bkg = np.asarray([1e-9, 0])
    # sig = np.asarray([0, 1e-9])
    # obs = np.asarray([1, 1])
    spec["channels"][0]["samples"][0]["data"] = bkg.tolist()
    spec["channels"][0]["samples"][1]["data"] = sig.tolist()
    spec["observations"][0]["data"] = obs.tolist()

    workspace = pyhf.Workspace(spec)

    model = workspace.model(measurement_name="meas")
    data = workspace.data(model)

    best_fit_pars = pyhf.infer.mle.fit(data, model)


if __name__ == "__main__":
    main()
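One reading of the four cases listed in fail.py is that the fit breaks whenever some bin has a total expected yield (background plus signal, with both normalization factors at their initial value of 1) of exactly zero. The sketch below checks only that condition. The helper `has_empty_bin` is hypothetical, it is not how pyhf itself decides anything, and it is merely consistent with the pass/fail labels given in the comments of fail.py.

```python
def has_empty_bin(bkg, sig):
    """True if any bin has zero total expected yield (norm factors at 1)."""
    return any(b + s == 0 for b, s in zip(bkg, sig))


# The four (bkg, sig) combinations from fail.py, in the same order
cases = {
    "bkg=[0, 0],    sig=[0, 1]": ([0, 0], [0, 1]),
    "bkg=[1, 0],    sig=[0, 0]": ([1, 0], [0, 0]),
    "bkg=[0, 0],    sig=[0, 0]": ([0, 0], [0, 0]),
    "bkg=[1e-9, 0], sig=[0, 1e-9]": ([1e-9, 0], [0, 1e-9]),
}

for label, (bkg, sig) in cases.items():
    verdict = "expect failure" if has_empty_bin(bkg, sig) else "expect pass"
    print(f"{label}: {verdict}")
```

In the last case each sample still contains a zero, but every bin receives a nonzero contribution from one of the two samples, which matches its "Pass" label.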

Regarding "python - Minimal pyhf example fails with 'Inequality constraints incompatible'", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/60514470/
