
python - Split a column based on conditions in Python

Reposted. Author: 行者123. Updated: 2023-12-05 05:33:35

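I have a dataset in which one column holds several values, and some of those values may be missing. I need to create three separate columns from this column, where neither the number of characters nor the positions of the values are fixed.

Data before: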

df=pd.DataFrame({'Date':['2-18-2019','2-18-2019','2-19-2019','2-19-2019','2-20-2019','2-21-2019','2-21-2019','2-22-2019'],'Item':['NY01','Ld01','Du02','Du01','Ps55','L55','Du85','L85'],'SizeAgeQuantity':['13 3/8 5 846','4 1/2 557 85','9 5/8 47 4464','30 58','32 304 304','7 6588 685','4118 587','29']})


Date | Item | SizeAgeQuantity
2-18-2019 | NY01 | 13 3/8 5 846
2-18-2019 | Ld01 | 4 1/2 557 85
2-19-2019 | Du02 | 9 5/8 47 4464
2-19-2019 | Du01 | 30 58
2-20-2019 | Ps55 | 32 304 304
2-21-2019 | L55 | 7 6588 685
2-21-2019 | Du85 | 4118 587
2-22-2019 | L85 | 29

The result I am looking for is this:

   Date    |    Item    |    Size    |    Age   |   Quantity
2-18-2019 | NY01 | 13 3/8 | 5 | 846
2-18-2019 | Ld01 | 4 1/2 | 557 | 85
2-19-2019 | Du02 | 9 5/8 | 47 | 4464
2-19-2019 | Du01 | 30 | 58 |
2-20-2019 | Ps55 | 32 | 304 | 304
2-21-2019 | L55 | 7 | 6588 | 685
2-21-2019 | Du85 | | 4118 | 587
2-22-2019 | L85 | | 29 |

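The only consistent thing is that the Size column should contain only values from the following set: ("4 1/2", "7", "9 5/8", "13 3/8", "18", "30", "32")

I tried the following code: df['Size'], df['FrakS'], df['Age'], df['Quantity'] = df['SizeAgeQuantity'].str.split(' ', 3).str

But the result was as follows: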

   Date    |    Item    |    Size    |   FrakS   |    Age   |   Quantity
2-18-2019 | NY01 | 13 | 3/8 | 5 | 846
2-18-2019 | Ld01 | 4 | 1/2 | 557 | 85
2-19-2019 | Du02 | 9 | 5/8 | 47 | 4464
2-19-2019 | Du01 | 30 | 58 | |
2-20-2019 | Ps55 | 32 | 304 | 304 |
2-21-2019 | L55 | 7 | 6588 | 685 |
2-21-2019 | Du85 | 4118 | 587 | |
2-22-2019 | L85 | 29 | | |

I would be very grateful if someone could help me.

Accepted answer

This should do the trick:
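For reference, the behaviour described in the question can be reproduced on a few of the sample rows. This is only a sketch: the names `s` and `tokens` are illustrative and not from the original post, and it uses `expand=True` instead of the trailing `.str` accessor from the question. It shows that splitting on every space separates a fractional size such as "13 3/8" into two tokens, while shorter rows are padded on the right, so the values end up in the wrong columns.

```python
import pandas as pd

# A few of the sample values from the question (illustrative subset).
s = pd.Series(['13 3/8 5 846', '30 58', '4118 587', '29'])

# Split on single spaces, at most 3 splits, one column per token.
tokens = s.str.split(' ', n=3, expand=True)
print(tokens)
```

Because the split has no notion of which token is a size, any fix has to use the list of valid sizes to decide where each value belongs.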

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Date':['2-18-2019','2-18-2019','2-19-2019','2-19-2019','2-20-2019','2-21-2019','2-21-2019','2-22-2019'],
    'Item':['NY01','Ld01','Du02','Du01','Ps55','L55','Du85','L85'],
    'SizeAgeQuantity':['13 3/8 5 846','4 1/2 557 85','9 5/8 47 4464','30 58','32 304 304','7 6588 685','4118 587','29']})

size_list = ["4 1/2", "7", "9 5/8", "13 3/8", "18", "30", "32"]

columns = []

for row in df.SizeAgeQuantity:
    values = row.split()
    # if there's a "/" in the row,
    # combine values 1 and 2
    if "/" in row:
        size = " ".join(values[:2])
        del values[0:2]
        values.insert(0, size)

    # add nan padding to the values list
    values = values + [np.nan] * (3-len(values))

    # if the 1st value is not size, shift list right
    if values[0] not in size_list:
        values = values[-1:] + values[:-1]

    columns.append(values)

saq = pd.DataFrame(columns, columns=["Size", "Age", "Quantity"])
out = pd.concat([df.drop("SizeAgeQuantity", axis=1), saq], axis=1)
print(out)

Output:

        Date  Item    Size   Age Quantity
0  2-18-2019  NY01  13 3/8     5      846
1  2-18-2019  Ld01   4 1/2   557       85
2  2-19-2019  Du02   9 5/8    47     4464
3  2-19-2019  Du01      30    58      NaN
4  2-20-2019  Ps55      32   304      304
5  2-21-2019   L55       7  6588      685
6  2-21-2019  Du85     NaN  4118      587
7  2-22-2019   L85     NaN    29      NaN
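As a possible alternative to the row-by-row loop, the same split can be sketched with a single regular expression and `Series.str.extract`. This is not part of the original answer: the names `size_alt`, `pattern` and `parts` are hypothetical, and the sketch assumes, as the answer does, that a leading token found in `size_list` is always a size (so "30 58" is read as Size 30, Age 58).

```python
import re
import pandas as pd

df = pd.DataFrame({
    'Date': ['2-18-2019', '2-18-2019', '2-19-2019', '2-19-2019',
             '2-20-2019', '2-21-2019', '2-21-2019', '2-22-2019'],
    'Item': ['NY01', 'Ld01', 'Du02', 'Du01', 'Ps55', 'L55', 'Du85', 'L85'],
    'SizeAgeQuantity': ['13 3/8 5 846', '4 1/2 557 85', '9 5/8 47 4464', '30 58',
                        '32 304 304', '7 6588 685', '4118 587', '29']})

size_list = ["4 1/2", "7", "9 5/8", "13 3/8", "18", "30", "32"]

# Alternation of the valid sizes, longest first, with special characters escaped.
size_alt = "|".join(re.escape(s) for s in sorted(size_list, key=len, reverse=True))

# Optional Size (must be followed by whitespace or end of string, so "30"
# does not match the start of "304"), then optional Age and Quantity tokens.
pattern = (rf"^\s*(?:(?P<Size>{size_alt})(?=\s|$)\s*)?"
           r"(?P<Age>\S+)?\s*(?P<Quantity>\S+)?\s*$")

# Named groups become column names; groups that did not match become NaN.
parts = df["SizeAgeQuantity"].str.extract(pattern)
out = pd.concat([df.drop(columns="SizeAgeQuantity"), parts], axis=1)
print(out)
```

One difference from the loop is worth noting: a row with three tokens whose first token is not a valid size does not fit this pattern and comes back with all three columns empty, whereas the loop's right shift would move the last token into Size. Which behaviour is preferable depends on how such rows should be treated, which the question does not say.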

Regarding "python - Split a column based on conditions in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73785055/
