python - Letting OneHotEncoder handle unseen values in the transform step

Reposted by: 行者123 | Updated: 2023-12-01 05:40:18

I am using sklearn.preprocessing.OneHotEncoder to encode categorical data of the form

from numpy import array
A = array([[1, 4, 1], [0, 3, 2]])
B = array([[1, 4, 7], [0, 3, 2]])

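Suppose I use A in the .fit(A) step and later pass B as new data to .transform(B). If B contains values that were not seen in A, this raises a "feature out of bounds" error. Is it possible for B to contain new, unseen values such that the transform step sets all binary indicators for the concerned value to zero?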

ValueError: Feature out of bounds. Try setting n_values.

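I know I can change the feature range at .fit time. But if I use A as training data, I would have to rework my initial encoding every time I get a new set B to predict on.

Thanks.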

Best answer

Is it possible to have B containing new unseen values such that the transform step sets all binaries to zero for the concerned value?

No, but it would be nice if OneHotEncoder did this, so I opened an issue for it. For now, you just have to set n_values a bit higher.
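For readers on a newer scikit-learn: the answer above describes the library as it was when the question was asked. As far as I know, later releases dropped n_values and added a handle_unknown parameter to OneHotEncoder, and handle_unknown="ignore" gives the behavior requested in the question. The snippet below is a minimal sketch assuming such a recent release; the variable names encoder and encoded_B are made up for illustration, and A and B are the arrays from the question.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Training data A and new data B from the question.
# B contains 7 in the last column, which never appears in A.
A = np.array([[1, 4, 1], [0, 3, 2]])
B = np.array([[1, 4, 7], [0, 3, 2]])

# Assumption: a recent scikit-learn where handle_unknown="ignore" is available.
# With this setting, a value unseen during fit is encoded as all zeros
# for that feature instead of raising an error.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(A)

# The default output is a sparse matrix; convert it to a dense array for display.
encoded_B = encoder.transform(B).toarray()
print(encoded_B)
# Row 0: [0, 1, 0, 1, 0, 0]  (the last two columns are zero because 7 was unseen)
# Row 1: [1, 0, 1, 0, 0, 1]
```

The encoder is fitted once on A and reused unchanged for every new B, so the initial encoding does not have to be reworked when new values show up.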

This post, "python - Letting OneHotEncoder handle unseen values in the transform step", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/17715723/
