python - Sklearn LogisticRegression: solver needs samples of at least 2 classes in the data

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:13:39

I am trying to run a logistic regression with sklearn:

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
import datetime as dt
import pandas as pd
import numpy as np
import talib
import matplotlib.pyplot as plt
import seaborn as sns

col_names = ['dates','prices']
# load dataset
df = pd.read_csv("DJI2.csv", header=None, names=col_names)

df.drop('dates', axis=1, inplace=True)
print(df.shape)
df['3day MA'] = df['prices'].shift(1).rolling(window = 3).mean()
df['10day MA'] = df['prices'].shift(1).rolling(window = 10).mean()
df['30day MA'] = df['prices'].shift(1).rolling(window = 30).mean()
df['Std_dev']= df['prices'].rolling(5).std()
df['RSI'] = talib.RSI(df['prices'].values, timeperiod = 9)
df['Price_Rise'] = np.where(df['prices'].shift(-1) > df['prices'], 1, 0)
df = df.dropna()

xCols = ['3day MA', '10day MA', '30day MA', 'Std_dev', 'RSI', 'prices']
X = df[xCols]
X = X.astype('int')
Y = df['Price_Rise']
Y = Y.astype('int')

logreg = LogisticRegression()

for i in range(len(X)):
    # Without this case below I get: ValueError: Found array with 0 sample(s) (shape=(0, 6)) while a minimum of 1 is required.
    if i == 0:
        continue
    logreg.fit(X[:i], Y[:i])

However, when I try to run this code, I get the following error:

ValueError: 
This solver needs samples of at least 2 classes in the data, but the data contains only one class: 58

The shape of my X data is (27779, 6) and the shape of my Y data is (27779,).

Here is a df.head(3) sample to show what my data looks like:

    prices    3day MA  10day MA   30day MA   Std_dev        RSI  Price_Rise
30   58.11  57.973333    57.277  55.602333  0.247123  81.932338           1
31   58.42  58.043333    57.480  55.718667  0.213542  84.279674           1
32   58.51  58.216667    57.667  55.774000  0.249139  84.919586           0

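I tried searching for the cause of this problem, but I only found these two answers. Both treat the problem as a bug in sklearn, but they are each about two years old, so I don't think I have the same issue.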

Best answer

You should make sure that Y[:i] contains two unique values before calling fit. So, before the loop, add something like this:

starting_i = 0
for i in range(len(X)):
    # np.unique returns the distinct labels; the solver needs two of them
    if len(np.unique(Y[:i])) == 2:
        starting_i = i
        break

Then check that starting_i is not 0 before running the main loop. Or, more simply, you can find the first position where Y[i] != Y[0].
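The simpler alternative mentioned above could be sketched roughly as follows. The helper name first_index_with_two_classes is hypothetical (it does not come from the question or from sklearn), and the sketch assumes the labels can be turned into a plain list. Converting first also sidesteps the fact that Y in the question is a pandas Series whose index starts at 30 after dropna(), so a label-based Y[0] lookup would likely not find the first row.

```python
def first_index_with_two_classes(labels):
    """Return the smallest n such that labels[:n] holds two distinct
    values, or None if every label is identical (or there are none).

    Hypothetical helper for illustration; not part of sklearn.
    """
    values = list(labels)  # works for lists, NumPy arrays and pandas Series
    for position, value in enumerate(values):
        # The first label that differs from the very first one means
        # the prefix ending at this position contains two classes.
        if value != values[0]:
            return position + 1
    return None


# Small made-up label sequence, standing in for Y from the question.
example_labels = [1, 1, 0, 1, 0]
start = first_index_with_two_classes(example_labels)
print(start)
```

With the question's data, one might then start the main loop at that value, for example for i in range(start, len(X)): logreg.fit(X[:i], Y[:i]), after checking that start is not None. This is only a sketch under the assumptions stated above.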

Regarding python - Sklearn LogisticRegression: solver needs samples of at least 2 classes in the data, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54873428/
