
python - How to implement sklearn's StratifiedKFold with fastai?

Reposted. Author: 行者123. Updated: 2023-11-30 08:56:38

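I am taking part in the APTOS 2019 Kaggle competition and trying to ensemble 5 folds, but I am having trouble implementing StratifiedKFold correctly.

I searched the fastai discussions on Google but did not find a solution. I am using the fastai library and have a pretrained model.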

import os

import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold
from fastai.vision import *  # fastai v1: ImageList, FloatList, Learner, ...

def get_df():
    base_image_dir = os.path.join('..', 'input/aptos2019-blindness-detection/')
    train_dir = os.path.join(base_image_dir, 'train_images/')
    df = pd.read_csv(os.path.join(base_image_dir, 'train.csv'))
    # Build the full image path for every id_code
    df['path'] = df['id_code'].map(lambda x: os.path.join(train_dir, '{}.png'.format(x)))
    df = df.drop(columns=['id_code'])
    df = df.sample(frac=1).reset_index(drop=True)  # shuffle dataframe
    test_df = pd.read_csv('../input/aptos2019-blindness-detection/sample_submission.csv')
    return df, test_df

df, test_df = get_df()

np.random.seed(2019)  # seeds NumPy's global RNG; it returns None, so it cannot serve as random_state
skf = StratifiedKFold(n_splits=5, random_state=2019, shuffle=True)

X = df['path']
y = df['diagnosis']

#getting the splits
for train_index, test_index in skf.split(X, y):
    print('##')
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    train = X_train, y_train
    test = X_test, y_test
    train_list = [list(x) for x in train]
    test_list = [list(x) for x in test]


data = (ImageList.from_df(df=df, path='./', cols='path')
        .split_by_rand_pct(0.2)
        .label_from_df(cols='diagnosis', label_cls=FloatList)
        .transform(tfms, size=sz, resize_method=ResizeMethod.SQUISH, padding_mode='zeros')
        .databunch(bs=bs, num_workers=4)
        .normalize(imagenet_stats)
       )

learn = Learner(data,
                md_ef,
                metrics=[qk],
                model_dir="models").to_fp16()
learn.data.add_test(ImageList.from_df(test_df,
                                      '../input/aptos2019-blindness-detection',
                                      folder='test_images',
                                      suffix='.png'))

I want to train my model on the folds obtained from skf.split, but I don't know how to do that.

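Best answer

There are two ways to do this. Both use the train_index and test_index arrays that skf.split yields for each fold.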

  1. Use "split_by_idxs" with the indices:

    data = (ImageList.from_df(df=df, path='./', cols='path')
            .split_by_idxs(train_idx=train_index, valid_idx=test_index)
            .label_from_df(cols='diagnosis', label_cls=FloatList)
            .transform(tfms, size=sz, resize_method=ResizeMethod.SQUISH, padding_mode='zeros')
            .databunch(bs=bs, num_workers=4)
            .normalize(imagenet_stats)
           )

  2. Use "split_by_list":

    il = ImageList.from_df(df=df, path='./', cols='path')

    data = (il.split_by_list(train=il[train_index], valid=il[test_index])
            .label_from_df(cols='diagnosis', label_cls=FloatList)
            .transform(tfms, size=sz, resize_method=ResizeMethod.SQUISH, padding_mode='zeros')
            .databunch(bs=bs, num_workers=4)
            .normalize(imagenet_stats)
           )

Regarding "python - How to implement sklearn's StratifiedKFold with fastai?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57781005/
