
python - Pydantic: checking that a list field is unique

Reposted. Author: 行者123. Updated: 2023-12-05 05:42:36

I am currently trying to create a pydantic model for a Pandas DataFrame, and I want to check that a column is unique. This is what I have so far:

import pandas as pd
from typing import List
from pydantic import BaseModel


class CustomerRecord(BaseModel):
    id: int
    name: str
    address: str


class CustomerRecordDF(BaseModel):
    __root__: List[CustomerRecord]


df = pd.DataFrame({'id': [1, 2, 3],
                   'name': ['Bob', 'Joe', 'Justin'],
                   'address': ['123 Fake St', '125 Fake St', '123 Fake St']})

df_dict = df.to_dict(orient='records')

CustomerRecordDF.parse_obj(df_dict)

I now want validation to run here and fail, because the addresses are not unique.

The following gives me what I need:

from pydantic import root_validator


class CustomerRecordDF(BaseModel):
    __root__: List[CustomerRecord]

    @root_validator(pre=True)
    def unique_values(cls, values):
        root_values = values.get('__root__')
        value_set = set()
        for value in root_values:
            print(value['address'])
            if value['address'] in value_set:
                raise ValueError('Duplicate Address')
            else:
                value_set.add(value['address'])
        return values


CustomerRecordDF.parse_obj(df_dict)
>>> ValidationError: 1 validation error for CustomerRecordDF
__root__
  Duplicate Address (type=value_error)

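However, I would like to reuse this validator for the other DataFrames I create, and to apply the uniqueness check to several columns, not just address.

Ideally it would look something like this: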

from pydantic import root_validator


class CustomerRecordDF(BaseModel):
    __root__: List[CustomerRecord]

    _validate_unique_name = root_unique_validator('name')
    _validate_unique_address = root_unique_validator('address')

Best answer

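You can use an inner function together with the allow_reuse parameter. allow_reuse=True is needed because every call to the factory registers an inner function with the same qualified name, which pydantic would otherwise reject as a duplicate validator: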

def root_unique_validator(field):
    def validator(cls, values):
        # Use the field arg to validate a specific field
        ...

    return root_validator(pre=True, allow_reuse=True)(validator)

Full example:

import pandas as pd
from typing import List
from pydantic import BaseModel, root_validator


class CustomerRecord(BaseModel):
    id: int
    name: str
    address: str


def root_unique_validator(field):
    def validator(cls, values):
        root_values = values.get("__root__")
        value_set = set()
        for value in root_values:
            if value[field] in value_set:
                raise ValueError(f"Duplicate {field}")
            else:
                value_set.add(value[field])
        return values

    return root_validator(pre=True, allow_reuse=True)(validator)


class CustomerRecordDF(BaseModel):
    __root__: List[CustomerRecord]

    _validate_unique_name = root_unique_validator("name")
    _validate_unique_address = root_unique_validator("address")


df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "name": ["Bob", "Joe", "Justin"],
        "address": ["123 Fake St", "125 Fake St", "123 Fake St"],
    }
)

df_dict = df.to_dict(orient="records")

CustomerRecordDF.parse_obj(df_dict)

# Output:
# pydantic.error_wrappers.ValidationError: 1 validation error for CustomerRecordDF
# __root__
#   Duplicate address (type=value_error)

If you use duplicate names instead:

# Most of the full example above goes here

df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "name": ["Bob", "Joe", "Bob"],
        "address": ["123 Fake St", "125 Fake St", "127 Fake St"],
    }
)

df_dict = df.to_dict(orient="records")

CustomerRecordDF.parse_obj(df_dict)

# Output:
# pydantic.error_wrappers.ValidationError: 1 validation error for CustomerRecordDF
# __root__
#   Duplicate name (type=value_error)

You could also accept several fields and have a single root validator check all of them. That would probably make the allow_reuse parameter unnecessary.
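The multi-field idea can be sketched without pydantic, as a plain-Python helper that holds only the core check. The name `find_duplicate_fields` and its signature are hypothetical; they are not part of pydantic or of the original answer. The helper assumes that every record is a dict containing each requested field and that the values are hashable.

```python
from typing import Any, Dict, Iterable, List


def find_duplicate_fields(
    records: Iterable[Dict[str, Any]], fields: Iterable[str]
) -> List[str]:
    """Return the names of the fields that contain a repeated value.

    Hypothetical helper: the name and signature are illustrative only.
    Fields are reported in the order their first duplicate is found.
    """
    fields = list(fields)
    seen = {field: set() for field in fields}  # values seen so far, per field
    duplicated: List[str] = []
    for record in records:
        for field in fields:
            value = record[field]
            if value in seen[field] and field not in duplicated:
                duplicated.append(field)
            seen[field].add(value)
    return duplicated


records = [
    {"id": 1, "name": "Bob", "address": "123 Fake St"},
    {"id": 2, "name": "Joe", "address": "125 Fake St"},
    {"id": 3, "name": "Justin", "address": "123 Fake St"},
]

print(find_duplicate_fields(records, ["name", "address"]))  # ['address']
```

Inside a single root_validator(pre=True), one could call such a helper on values.get("__root__") and raise ValueError when the returned list is not empty. Because only one validator function would then be registered on the model, allow_reuse should not be required.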

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/72003987/
