python - I am trying to balance my data: my target variable is multi-class and I want to oversample it


Reposted by: 行者123 Updated: 2023-11-30 09:03:32


Let x contain the feature variables. Output of print(x):

    Restaurant  Cuisines    Average_Cost    Rating  Votes   Reviews Area
    0   3.526361    0.693147    5.303305    1.504077    2.564949    1.609438    7.214504
    1   1.386294    4.127134    4.615121    1.504077    2.484907    1.609438    5.905362
    2   2.772589    1.386294    5.017280    1.526056    4.605170    3.433987    6.131226
    3   3.912023    2.833213    5.525453    1.547563    5.176150    4.564348    7.643483
    4   3.526361    2.708050    5.303305    1.435085    5.948035    5.046646    6.126869
    ... ... ... ... ... ... ... ...
    11089   3.912023    0.693147    5.525453    1.648659    5.789960    5.046646    3.135494
    11090   1.386294    6.028279    4.615121    1.526056    3.610918    2.833213    7.643483
    11091   1.386294    2.397895    4.615121    1.504077    3.828641    2.944439    5.814131
    11092   1.386294    6.028279    4.615121    1.410987    3.218876    2.302585    5.905362
    11093   1.386294    6.028279    4.615121    1.029619    0.000000    0.000000    5.564520
    11094 rows × 7 columns

And let y be the multi-class target variable. Output of print(y.value_counts()):

    30 minutes     7406
    45 minutes     2665
    65 minutes      923
    120 minutes      62
    20 minutes       20
    80 minutes       14
    10 minutes        4
    Name: Delivery_Time, dtype: int64

Looking at y, we can see that the 30 minutes class has a far higher count than any of the other classes.

To balance the classes, I tried oversampling the data with SMOTETomek, but I got an error:

from imblearn.combine import SMOTETomek
smk = SMOTETomek(ratio = 1)
x_res, y_res = smk.fit_sample(x,y)

Error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-54-426e8b86623d> in <module>()
        1 from imblearn.combine import SMOTETomek
        2 smk = SMOTETomek(ratio = 1)
----> 3 x_res, y_res = smk.fit_sample(x,y)

2 frames
/usr/local/lib/python3.6/dist-packages/imblearn/utils/_validation.py in _sampling_strategy_float(sampling_strategy, y, sampling_type)
    311     if type_y != 'binary':
    312         raise ValueError(
--> 313             '"sampling_strategy" can be a float only when the type '
    314             'of target is binary. For multi-class, use a dict.')
    315     target_stats = _count_class_sample(y)

ValueError: "sampling_strategy" can be a float only when the type of target is binary. For multi-class, use a dict.

Best answer

You can see the actual implementation that SMOTE uses here: https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/imblearn/utils/_validation.py#L355

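You just need to pass a dict, as the error message says. That said, the SMOTE algorithm also handles the multi-class setting internally.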
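As a rough sketch of what such a dict could look like, the snippet below builds one from the class counts shown in the question. The helper name oversampling_targets and its target parameter are hypothetical and only for illustration; they are not part of imbalanced-learn.

```python
from collections import Counter

# Class counts copied from the y.value_counts() output in the question.
counts = Counter({
    "30 minutes": 7406,
    "45 minutes": 2665,
    "65 minutes": 923,
    "120 minutes": 62,
    "20 minutes": 20,
    "80 minutes": 14,
    "10 minutes": 4,
})


def oversampling_targets(counts, target=None):
    """Hypothetical helper: map each class to the size it should reach.

    Classes already at or above `target` keep their current size, because
    an oversampler can only add rows, never remove them.
    """
    if target is None:
        target = max(counts.values())  # default: match the largest class
    return {label: max(n, target) for label, n in counts.items()}


strategy = oversampling_targets(counts)
print(strategy)
```

The resulting dict can then be passed as sampling_strategy=strategy. Passing a smaller target, for example oversampling_targets(counts, 1000), lifts only the rare classes to 1000 rows, which may be preferable to generating thousands of synthetic rows from a class with just 4 real samples.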

Do:

from imblearn.over_sampling import SMOTE

# "minority" resamples only the smallest class
smote = SMOTE(sampling_strategy="minority")
X, Y = smote.fit_resample(x_train, y_train)

The imbalanced-learn documentation describes the dict form of sampling_strategy as follows: "When dict, the keys correspond to the targeted classes. The values correspond to the desired number of samples for each targeted class."

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/58872043/
