matlab - What is the correct order of the prior vector in fitensemble?


Reposted by: 太空宇宙. Updated: 2023-11-03 20:07:06

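When training a classifier with MATLAB's fitensemble, I can specify the parameter prior as well as the parameter classnames.

Do the elements of the two vectors follow the same order? What is the default order for true/false classes?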

More specifically: suppose the prior probability of the true class is 0.6 and that of the false class is 0.4. Should I use:

ens = fitensemble(...,'prior',[0.6 0.4]) or

ens = fitensemble(...,'prior',[0.4 0.6]) or

ens = fitensemble(...,'prior',[0.4 0.6],'classnames',[true false]) or

ens = fitensemble(...,'prior',[0.4 0.6],'classnames',[false,true]) ?

I could not find the answer in the documentation.

The documentation for perfcurve is more specific:

Prior: Either string or array with two elements. It represents prior probabilities for the positive and negative class, respectively. Default is 'empirical', that is, perfcurve derives prior probabilities from class frequencies. If set to 'uniform', perfcurve sets all prior probabilities equal.

Best answer

ens = fitensemble(X,Y,method,nlearn,learners) creates an ensemble model that predicts responses to data. The ensemble consists of the models listed in learners.

Part 1

You must give prior in the alphabetical (sorted) order of the class labels.

So if the labels are ['A','B'], use 'prior',[P(A) P(B)],

or if the labels are ['true','false'], use 'prior',[P(false) P(true)],

or if the labels are [-1 10], use 'prior',[P(-1) P(10)].
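To avoid ordering the vector by hand, the rule above can be sketched in plain MATLAB. This is only an illustration: the variable names (labels, desired, priorVec) are made up here, and the values are the 0.6/0.4 from the question.

```matlab
% Hypothetical labels and the prior wanted for each of them
% (values taken from the question: P(true) = 0.6, P(false) = 0.4).
labels  = {'true','false'};
desired = containers.Map({'true','false'}, [0.6 0.4]);

% sort() on a cell array of strings returns them in alphabetical order.
sortedLabels = sort(labels);

% Look up each label's prior in that sorted order.
priorVec = cellfun(@(c) desired(c), sortedLabels);

disp(sortedLabels)   % 'false' comes before 'true'
disp(priorVec)       % P(false) first, then P(true)
```

With these values priorVec is [0.4 0.6], which matches the second form in the question.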

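Part 2

As for classnames, this option lets you call fitensemble on a subset of the classes present in your data.

Suppose you have four classes A, B, C, D, so your Y looks like:

Y = [A;A;B;D;B;A;C;A;A;A;D, ... ];

Now you can write 'classnames',['A';'B'] if you want fitensemble to use only two of the classes, and it is the same as 'classnames',['B';'A'].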
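As a rough picture of what restricting the classes amounts to, the snippet below filters a toy label vector by hand. It assumes that observations whose class is not listed are simply left out of training; it does not call fitensemble, and the names keep and Ysub are hypothetical.

```matlab
% Toy label vector with four classes, one character per observation.
Y = ['A';'A';'B';'D';'B';'A';'C';'A';'A';'A';'D'];

% Classes we want to keep (the order of this list does not affect the filter).
wanted = ['A';'B'];

% Logical mask of the observations that belong to a wanted class.
keep = ismember(Y, wanted);
Ysub = Y(keep);

fprintf('%d of %d observations kept\n', nnz(keep), numel(Y));
disp(Ysub')
```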

I know this is a late answer; I hope it helps.

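Example

I used the fisheriris dataset, which has three classes (setosa, versicolor, virginica).

Because it has 150 cases, 50 per class, I shuffled the data and selected 100 samples, so the classes are no longer equally represented.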

load fisheriris
rng(12);
idx = randperm(size(meas,1));
meas = meas(idx,:);
species = species(idx,:); 
meas = meas(1 : 100,:);
species = species(1 : 100,:);
trueprior = [ sum(strcmp(species,'setosa')),...
              sum(strcmp(species,'versicolor')),...
              sum(strcmp(species,'virginica'))] / 100;

trueprior = [0.32,0.30,0.38] gives the true prior probabilities.
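The same kind of frequency vector can be computed without typing the class names, since unique returns the labels already sorted. The sketch below uses a small made-up label list rather than the shuffled iris data, and the names toyLabels, classes and empiricalPrior are hypothetical.

```matlab
% Small made-up label list (not the iris sample above).
toyLabels = {'setosa';'virginica';'versicolor';'setosa';'virginica'};

% unique() returns the distinct labels in sorted (alphabetical) order,
% plus an index k mapping every observation to its class.
[classes, ~, k] = unique(toyLabels);

% Count the observations per class and normalise to frequencies.
counts = accumarray(k(:), 1)';
empiricalPrior = counts / numel(toyLabels);

disp(classes')
disp(empiricalPrior)
```

Because classes is sorted, empiricalPrior is already in the alphabetical order described in Part 1.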

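In the code below I train three ensembles with fitensemble. The first uses the default options, so the prior is empirical (equal to trueprior). The second is trained with prior set to trueprior, which gives the same result as the first (because trueprior is in alphabetical order of the class labels). The third is trained with the prior in non-alphabetical order and shows a result that differs from the first two.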

ada1 = fitensemble(meas,species,'AdaBoostM2',20,'tree');
subplot(311)
plot(resubLoss(ada1,'mode','individual'));
title('Resubstitution error for default prior (empirical)');
ada2 = fitensemble(meas,species,'AdaBoostM2',20,'tree','prior',trueprior);
subplot(312)
plot(resubLoss(ada2,'mode','individual'));
title('Resubstitution error for prior with alphabetical order of class labels');
ada3 = fitensemble(meas,species,'AdaBoostM2',20,'tree','prior',trueprior(end:-1:1));
subplot(313)
plot(resubLoss(ada3,'mode','individual'));
title('Resubstitution error for prior in reversed (non-alphabetical) order');

[Figure: resubstitution error of the individual learners for ada1, ada2 and ada3]

I also trained a two-class ensemble with fitensemble by using the classnames option:

ada4 = fitensemble(meas,species,'AdaBoostM1',20,'tree','classnames',...   
       {'versicolor','virginica'});

As proof: AdaBoostM1, which does not support more than two classes, works fine here with only two classes.
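To check which order a trained ensemble actually uses, one option is to read it back from the model itself. The sketch below assumes the Statistics Toolbox is installed and relies on the ClassNames and Prior properties of the returned ensemble. The prior values [0.7 0.3] are made up for illustration, and the class names are deliberately given in alphabetical order so that the result does not depend on how the order is interpreted.

```matlab
load fisheriris

% Illustrative values only: two of the three iris classes, in alphabetical
% order, with a made-up prior for each.
names = {'versicolor','virginica'};
p     = [0.7 0.3];

ens = fitensemble(meas, species, 'AdaBoostM1', 20, 'tree', ...
                  'classnames', names, 'prior', p);

% ClassNames lists the classes the model was trained on;
% Prior holds one probability per entry of ClassNames.
disp(ens.ClassNames')
disp(ens.Prior)
```

Comparing the two printed lines side by side shows which probability was attached to which class.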

Regarding "matlab - What is the correct order of the prior vector in fitensemble?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/18270650/
