python - Randomly split a DataFrame (depending on unique values)

Reposted. Author: 行者123. Updated: 2023-11-30 22:34:31

I have a DataFrame df that looks like this:

|  A    |  B  | ... |
---------------------
| one | ... | ... |
| one | ... | ... |
| one | ... | ... |
| two | ... | ... |
| three | ... | ... |
| three | ... | ... |
| four | ... | ... |
| five | ... | ... |
| five | ... | ... |

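As you can see, column A has 5 unique values. I want to split the DataFrame randomly: for example, 3 of the unique values should end up in DataFrame df1 and the other 2 in DataFrame df2. My problem is that the values in A are repeated across rows, and I don't want rows that share the same value to be split across the two DataFrames.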

So the resulting DataFrames could look like this:

DataFrame df1 with 3 unique values:

|  A    |  B  | ... |
---------------------
| one | ... | ... |
| one | ... | ... |
| one | ... | ... |
| three | ... | ... |
| three | ... | ... |
| five | ... | ... |
| five | ... | ... |

DataFrame df2 with 2 unique values:

|  A    |  B  | ... |
---------------------
| two | ... | ... |
| four | ... | ... |

Is there an easy way to achieve this? I thought about groupby, but I don't know how to split from there...

Best answer

Setup

import pandas as pd

# Sample data: column A holds repeated keys, column B a running counter.
df = pd.DataFrame({
    'A': ['one', 'one', 'one', 'two', 'three', 'three', 'four', 'five', 'five'],
    'B': [0, 1, 2, 3, 4, 5, 6, 7, 8],
})

Solution

# Pick 2 unique keys from column A at random for df1. The split can be controlled
# either by an absolute number of keys (n) or by a fraction (frac); see the docs
# for Series.sample().
df1_keys = df.A.drop_duplicates().sample(2)
# Rows whose key was sampled go to df1.
df1 = df[df.A.isin(df1_keys)]
# Every row whose key is not in df1_keys goes to df2.
df2 = df[~df.A.isin(df1_keys)]

df1_keys
Out[294]:
7    five
0     one
Name: A, dtype: object

df1
Out[295]:
      A  B
0   one  0
1   one  1
2   one  2
7  five  7
8  five  8

df2
Out[296]:
       A  B
3    two  3
4  three  4
5  three  5
6   four  6
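
The run above puts 2 keys in df1, while the question asked for 3, and the result changes on every call. As a minimal sketch (not part of the original answer), the same drop_duplicates / sample / isin approach can be wrapped in a helper. The function name split_by_key and its parameters are hypothetical; n, frac and random_state are simply forwarded to Series.sample(), so pass only one of n or frac.

```python
import pandas as pd


def split_by_key(df, column, n=None, frac=None, random_state=None):
    """Split df in two so that each unique value of `column` lands in exactly one part.

    Hypothetical helper: n is the number of unique keys for the first part,
    frac the fraction of unique keys. Pass only one of the two.
    """
    # Sample from the unique keys, not from the rows.
    keys = df[column].drop_duplicates().sample(
        n=n, frac=frac, random_state=random_state
    )
    mask = df[column].isin(keys)
    # All rows with a sampled key go to the first part, the rest to the second.
    return df[mask], df[~mask]


df = pd.DataFrame({
    'A': ['one', 'one', 'one', 'two', 'three', 'three', 'four', 'five', 'five'],
    'B': list(range(9)),
})

# 3 unique keys in df1 and the remaining 2 in df2, as in the question.
df1, df2 = split_by_key(df, 'A', n=3, random_state=0)

# Alternatively, take 60% of the unique keys (3 of 5 here).
df1_frac, df2_frac = split_by_key(df, 'A', frac=0.6, random_state=0)
```

Fixing random_state makes the split repeatable between runs; leave it as None for a different split each time.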

Regarding "python - Randomly split a DataFrame (depending on unique values)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44821090/
