
python - Pandas - CSV to DataFrame (with a Base64-encoded column)

Reposted. Author: 行者123. Updated: 2023-12-01 02:33:48

The following code uses Pandas to load firewall logs from a CSV file into a DataFrame.

df = pd.read_csv('/Users/alistairgillespie/Documents/Projects/COMP5310/Akamai Data/FINAL/data.csv', dtype = {"_time": str, "city": str,"country": str,"lat": str,"long": str,"region": str,"UA": str,"bytes": str,"cliIP": str,"reqHost": str, "reqMethod": str, "reqPath": str,"reqPort": str,"respCT": str,"respLen": str,"status": str,"referer": str,"date": str,"conn": str,"denyData": str,"denyRules": str,"policy": str,"ruleSet": str,"warnRules": str,"warnData": str,"warnSlrs": str,"warnTags": str})

* Please excuse the long line.

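Once the data is in the DataFrame, I want to iterate over each row and decode the "denyData" field (when it is not NaN) using the unquote and b64decode function calls. I am trying to do this with the following code: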

for i, row in df.iterrows():
    print(pd.notnull(row))
    temp = parse.unquote(row['denyData'])
    new = base64.b64decode(temp)
    df2.loc[i, 'denyData'] = new

This produces the following error:

TypeError: argument of type 'float' is not iterable

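What is the correct way to process a CSV column of bytes into a Pandas DataFrame? Is this the right approach to cleaning this kind of data? An example of the data is shown below.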

Example of the column with encoded data

Best answer

You can try an if-else, since the error evidently means that NaN values cannot be handled:

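import base64
from urllib import parse

import numpy as np
import pandas as pd

for i, row in df.iterrows():
    # Test the single cell, not the whole row: `if pd.notnull(row)` is ambiguous for a Series and raises.
    if pd.notnull(row['denyData']):
        df.loc[i, 'denyData'] = base64.b64decode(parse.unquote(row['denyData']))
    else:
        df.loc[i, 'denyData'] = np.nan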
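
The same NaN guard can also be applied without an explicit row loop. The sketch below is one possible variant and is not part of the original answer: the helper name decode_deny_data and the three-row sample DataFrame are hypothetical stand-ins for the real log data. It assumes the TypeError arises because a missing cell arrives as a float NaN, which parse.unquote cannot treat as a string.

```python
import base64
from urllib import parse

import numpy as np
import pandas as pd


def decode_deny_data(value):
    """Percent-decode, then Base64-decode one cell; pass missing values through.

    Hypothetical helper, named here only for illustration.
    """
    if pd.isna(value):
        return np.nan
    return base64.b64decode(parse.unquote(value))


# Hypothetical sample standing in for the real firewall log.
df = pd.DataFrame({"denyData": ["aGVsbG8%3D", np.nan, "d29ybGQ="]})

# Series.map calls the helper once per cell, including the NaN cells.
df["denyData"] = df["denyData"].map(decode_deny_data)
```

Note that base64.b64decode returns bytes, so a further .decode() call (with whatever encoding the logs actually use) would be needed if text is wanted in the column.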

Regarding python - Pandas - CSV to DataFrame (with a Base64-encoded column), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46482422/
