gpt4 book ai didi

python - 不同列名的 WHERE 子句

转载 作者:行者123 更新时间:2023-12-01 06:25:09 27 4
gpt4 key购买 nike

下面的脚本反射(reflect)了我更新、编辑的尝试(遵循下面的建议),用操作数据库中的表中的行填充维度表,前提是 PANDAS DataFrame 中的主键是通过连接 OPDB 中相关表中的 ID 列创建的,不存在于维度表中。

import mysql.connector
import pandas as pd

...

op_cursor = op_connector.cursor
dwh_cursor = dwh_connector.cursor

...

class dimension_table:
def __init__(self, dwh_cols, op_cols, dim_id, dwh_table_name, op_table_name,op_args=None, dwh_args=None):
self.dwh_cols = ('')
self.op_cols = ('')
self.dim_id = dim_id
self.dwh_table_name = dwh_table_name
self.op_table_name = '`*opdb.*`.' + op_table_name
self.op_args = ",".join(op_cols)
self.dwh_args = ",".join(dwh_cols)

...


billing_address_data = dimension_table(("id","address", "alias", "postal_code", "type", "city", "country",
"geolocation"),
("id","address", "alias", "postal_code", "type", "city", "country",
"geolocation"),
billing_address_dim_id,'billing_address_dim', 'billing_address')

...

def load_dim(instance):
sql = """INSERT INTO {dwh} ({dwh_cols})
SELECT {op_cols}
FROM {op}
WHERE {pk} NOT IN
(SELECT {pk} FROM {dwh} WHERE id = %s)
LIMIT 1
"""
for key in instance.dim_id:

try:
# ID APPEND
dwh_cursor.execute(sql.format(dwh = instance.dwh_table_name,
dwh_cols = instance.dwh_args,
op_cols = instance.op_args,
op = instance.op_table_name,
pk = 'id'),

str(key))

dwh_connector.commit()

except mysql.connector.ProgrammingError as err:
# ORDER_ID APPEND
dwh_cursor.execute(sql.format(dwh = instance.dwh_table_name,
dwh_cols = instance.dwh_args,
op_cols = instance.op_args,
op = instance.op_table_name,
pk = 'order_id'),

str(key))

dwh_connector.commit()

billing_profile_op_id = dwh_cursor.lastrowid

...

load_dim(order_items_data)

我的最新问题是由于运行脚本中的最后一行代码而导致的错误,load_dim(order_items_data)。它是带有 order_id PK 的 order_items 表。

ProgrammingError: 1054 (42S22): Unknown column 'id' in 'where clause'

最佳答案

考虑 try/except 并通过使用带有 IN 子句的纯插入选择 SQL 查询来避免所有查询构建和 fetch 检查这反射(reflect)了非重复追加查询的需求。请参阅NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL .

下面使用LIMIT 1替换fetchone(),否则使用TOP 1仅获取前1行 取决于 RDBMS。另外,参数占位符使用 %s,否则使用 ?,具体取决于 Python DB-API。在以后的文章中,始终标记 RDBMS 并使用 import 行显示 DB-API。

def load_dim(instance):
sql = """INSERT INTO {dwh} ({dwh_cols})
SELECT {op_cols}
FROM {op}
WHERE {pk} NOT IN
(SELECT {pk} FROM {dwh} WHERE {pk} = %s)
LIMIT 1
"""
for key in instance.dim_id:

try:
# ID APPEND
dwh_cursor.execute(sql.format(dwh = instance.dwh_table_name,
dwh_cols = instance.dwh_args,
op_cols = instance.op_args,
op = instance.op_table_name,
pk = 'id'),
(str(key),))

dwh_connector.commit()

except Exception as e: # ADJUST TO DB-API SPECIFIC Error
# ORDER_ID APPEND
dwh_cursor.execute(sql.format(dwh = instance.dwh_table_name,
dwh_cols = instance.dwh_args,
op_cols = instance.op_args,
op = instance.op_table_name,
pk = 'order_id'),
(str(key),))

dwh_connector.commit()

billing_profile_op_id = dwh_cursor.lastrowid # RETURNS 0 IF NO DATA APPENDED

关于python - 不同列名的 WHERE 子句,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/60192959/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com