python - How to output SHAP values as probabilities and make a force_plot from a binary classifier

Reposted by: 行者123  Updated: 2023-12-05 01:52:51

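I need to plot how each feature affects the predicted probability of each sample in my LightGBM binary classifier. So I need SHAP values on the probability scale rather than the default SHAP values. There does not seem to be any option for probability output.

The sample code below is what I use to build a DataFrame of SHAP values and to draw a force_plot for the first data sample. Does anyone know how I should modify the code to change the output? I am new to SHAP values and the shap package. Thanks a lot.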

import pandas as pd
import numpy as np
import shap
import lightgbm as lgbm
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = lgbm.LGBMClassifier()
model.fit(X_train, y_train)


explainer = shap.TreeExplainer(model)
shap_values = explainer(X_train)

# force plot of first row for class 1
class_idx = 1
row_idx = 0
expected_value = explainer.expected_value[class_idx]
shap_value = shap_values[:,:,class_idx].values[row_idx]

shap.force_plot(
    base_value=expected_value,
    shap_values=shap_value,
    features=X_train.iloc[row_idx, :],
    matplotlib=True,
)

# dataframe of shap values for class 1
shap_df = pd.DataFrame(shap_values[:, :, 1].values, columns=shap_values.feature_names)

Best answer

TL;DR:

You can pass link="logit" to the force_plot method to plot the result in probability space:

import pandas as pd
import numpy as np
import shap
import lightgbm as lgbm
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer
from scipy.special import expit

shap.initjs()

data = load_breast_cancer()

X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = lgbm.LGBMClassifier()
model.fit(X_train, y_train)

explainer_raw = shap.TreeExplainer(model)
shap_values = explainer_raw(X_train)

# force plot of first row for class 1
class_idx = 1
row_idx = 0
expected_value = explainer_raw.expected_value[class_idx]
shap_value = shap_values[:, :, class_idx].values[row_idx]

shap.force_plot(
    base_value=expected_value,
    shap_values=shap_value,
    features=X_train.iloc[row_idx, :],
    link="logit",
)

Expected output (figure not included):
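As a rough illustration of what link="logit" changes: the SHAP values themselves stay in log-odds units, and only the axis labels are passed through the inverse logit (sigmoid). The sketch below uses only the standard library. `log_odds_to_probability` is a hypothetical helper written for this example, not part of SHAP, and the two inputs are the raw base value and the sum of raw SHAP values reported further down in this answer.

```python
import math


def log_odds_to_probability(log_odds: float) -> float:
    """Inverse logit (sigmoid): map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Raw (log-odds) numbers taken from this answer, for class 1 and row 0.
base_value_raw = 2.48397519   # mean raw score over X_train
shap_sum_raw = 3.40619584     # sum of the raw SHAP values for the row
model_output_raw = base_value_raw + shap_sum_raw

# Labels that a probability-scaled axis would show at the base value
# and at the model output for this row.
base_label = log_odds_to_probability(base_value_raw)
output_label = log_odds_to_probability(model_output_raw)

print(f"base value label:   {base_label:.4f}")
print(f"model output label: {output_label:.4f}")
```

The arrows in the plot still have log-odds lengths, which is why equally long arrows can correspond to very different changes in probability.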

Alternatively, you can achieve a similar result by explicitly specifying model_output="probability" as the output you want to explain:

explainer = shap.TreeExplainer(
    model,
    data=X_train,
    feature_perturbation="interventional",
    model_output="probability",
)
shap_values = explainer(X_train)

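# force plot of first row; with model_output="probability" the explainer is
# expected to return a single output (the class 1 probability), so no class
# index is used here
row_idx = 0

# take the base value from the probability explainer, not the raw (log-odds) one
expected_value = shap_values.base_values[row_idx]
shap_value = shap_values.values[row_idx]

shap.force_plot(
    base_value=expected_value,
    shap_values=shap_value,
    features=X_train.iloc[row_idx, :],
)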

Expected output (figure not included):

However, it may be more interesting to understand where these numbers come from:

Target probability for the point of interest:

model_proba= model.predict_proba(X_train.iloc[[row_idx]])
model_proba
# array([[0.00275887, 0.99724113]])

Raw base value of the model with X_train as background (note that LightGBM outputs the raw score for class 1):

model.predict(X_train, raw_score=True).mean()
# 2.4839751932445577

Raw base values from SHAP (note that they are symmetric):

bv = explainer_raw(X_train).base_values[0]
bv
# array([-2.48397519,  2.48397519])
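The symmetry is consistent with the sigmoid identity sigmoid(-x) = 1 - sigmoid(x): if class 1 has log-odds x, class 0 has log-odds -x, so the two probabilities sum to one. A minimal standard-library check, with `sigmoid` defined locally for illustration and the two base values copied from the output above:

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Raw base values for class 0 and class 1, as printed above.
base_values = [-2.48397519, 2.48397519]

probabilities = [sigmoid(v) for v in base_values]

# The two classes mirror each other in log-odds space,
# so their probabilities are complementary.
total = sum(probabilities)
print(probabilities, total)
```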

Raw SHAP values for the point of interest, summed over features:

sv_0 = explainer_raw(X_train).values[row_idx].sum(0)
sv_0
# array([-3.40619584,  3.40619584])

Probabilities inferred from the SHAP values (via the sigmoid):

shap_proba = expit(bv + sv_0)
shap_proba
# array([0.00275887, 0.99724113])

Check:

assert np.allclose(model_proba, shap_proba)

Feel free to ask if anything is unclear.

Side notes

Probabilities can be misleading if you are analyzing the raw effect size of different features, because the sigmoid is non-linear and saturates beyond a certain threshold.

Some people expect to see SHAP values in probability space as well, but simply transforming the raw values is not feasible because:

SHAP values are additive by construction (to be precise, SHapley Additive exPlanations are average marginal contributions over all possible feature coalitions).

The sigmoid, like exp, is not additive: exp(a + b) != exp(a) + exp(b).
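A small numeric sketch of the saturation and non-additivity points above, using only the standard library. The contributions `a` and `b` are made-up log-odds values chosen for illustration, not output from the model in this answer, and `sigmoid` is defined locally.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical log-odds contributions of two features.
a, b = 1.5, 2.0

# Non-additivity: transforming each term separately does not
# reproduce the transform of the sum.
prob_of_sum = sigmoid(a + b)
sum_of_probs = sigmoid(a) + sigmoid(b)
print(f"sigmoid(a + b)          = {prob_of_sum:.4f}")
print(f"sigmoid(a) + sigmoid(b) = {sum_of_probs:.4f}")

# Saturation: the same +1.0 step in log-odds moves the probability
# noticeably near 0 and only slightly near 5.
step_near_zero = sigmoid(1.0) - sigmoid(0.0)
step_near_five = sigmoid(6.0) - sigmoid(5.0)
print(f"step near 0: {step_near_zero:.4f}")
print(f"step near 5: {step_near_five:.4f}")
```

In this example the sum of the separately transformed terms even exceeds 1, so it cannot be read as a probability at all.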

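You may also find these useful:

Feature importance in binary classification and extracting SHAP values for only one of the classes (answer)
How to interpret the base_value of a GBT classifier when using SHAP? (answer)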

Regarding "python - How to output SHAP values as probabilities and make a force_plot from a binary classifier", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/71446065/
