gpt4 book ai didi

sqlalchemy - 急于在 SQLAlchemy 中加载具有显式自连接和 contains_eager 的分层子项

转载 作者:行者123 更新时间:2023-12-02 00:02:16 25 4
gpt4 key购买 nike

给定以下关系:

- 1 MasterProduct parent -> many MasterProduct children
- 1 MasterProduct child -> many StoreProducts
- 1 StoreProduct -> 1 Store

我在 SQLAlchemy 中定义了以下声明式模型:

class MasterProduct(Base):
__tablename__ = 'master_products'
id = Column(Integer, primary_key=True)
pid = Column(Integer, ForeignKey('master_products.id'))
children = relationship('MasterProduct', join_depth=1,
backref=backref('parent', remote_side=[id]))
store_products = relationship('StoreProduct', backref='master_product')


class StoreProduct(Base):
__tablename__ = 'store_products'
id = Column(Integer, primary_key=True)
mid = Column(Integer, ForeignKey('master_products.id'))
sid = Column(Integer, ForeignKey('stores.id'))
timestamp = Column(DateTime)
store = relationship('Store', uselist=False)


class Store(Base):
__tablename__ = 'stores'
id = Column(Integer, primary_key=True)

我的目标是通过预先加载在 SQLAlchemy 中复制以下查询:

SELECT *
FROM master_products mp_parent
INNER JOIN master_products mp_child ON mp_child.pid = mp_parent.id
INNER JOIN store_products sp1 ON sp1.mid = mp_child.id
LEFT JOIN store_products sp2
ON sp1.mid = sp2.mid AND sp1.sid = sp2.sid AND sp1.timestamp < sp2.timestamp
WHERE mp_parent.id = 6752 AND sp2.id IS NULL

查询选择父 6752 的所有 MasterProduct 子代和所有使用 NULL 按最近时间戳分组的相应商店产品自加入(每组最大 n)。有 82 件商店产品从查询,有 14 个主产品 child 。

我试过以下方法都无济于事:

mp_child = aliased(MasterProduct)
sp1 = aliased(StoreProduct)
sp2 = aliased(StoreProduct)

q = db.session.query(MasterProduct).filter_by(id=6752) \
.join(mp_child, MasterProduct.children) \
.join(sp1, mp_child.store_products) \
.outerjoin(sp2, and_(sp1.mid == sp2.mid, sp1.sid == sp2.sid, sp1.timestamp < sp2.timestamp)) \
.filter(sp2.id == None) \
.options(contains_eager(MasterProduct.children, alias=mp_child),
contains_eager(MasterProduct.children, mp_child.store_products, alias=sp1))

>>> mp_parent = q.first() # the query below looks ok!
SELECT <all columns from master_products, master_products_1, and store_products_1>
FROM master_products INNER JOIN master_products AS master_products_1 ON master_products.id = master_products_1.pid INNER JOIN store_products AS store_products_1 ON master_products_1.id = store_products_1.mid LEFT OUTER JOIN store_products AS store_products_2 ON store_products_1.mid = store_products_2.mid AND store_products_1.sid = store_products_2.sid AND store_products_1.timestamp < store_products_2.timestamp
WHERE master_products.id = %s AND store_products_2.id IS NULL
LIMIT %s
>>> mp_parent.children # only *one* child is eagerly loaded (expected 14)
[<app.models.MasterProduct object at 0x2463850>]
>>> mp_parent.children[0].id # this is correct, 6762 is one of the children
6762L
>>> mp_parent.children[0].pid # this is correct
6752L
>>> mp_parent.children[0].store_products # only *one* store product is eagerly loaded (expected 7 for this child)
[<app.models.StoreProduct object at 0x24543d0>]

退后一步,简化查询以急切地只加载 child 也导致只有 1 个 child 被急切加载而不是所有 14 个:

mp_child = aliased(MasterProduct)
q = db.session.query(MasterProduct).filter_by(id=6752) \
.join(mp_child, MasterProduct.children)
.options(contains_eager(MasterProduct.children, alias=mp_child))

但是,当我使用 joinedloadjoinedload_allsubqueryload 时,all14个 child 急切加载,即:

q = db.session.query(MasterProduct).filter_by(id=6752) \ 
.options(joinedload_all('children.store_products', innerjoin=True))

所以问题似乎是从使用 contains_eager 显式加入。

任何人都可以发现我的方式中的错误或帮助我指明正确的方向吗?

最佳答案

好的,您可能会在 SQL 中观察到出现“LIMIT 1”。那是因为您正在使用 first()。我们可以只比较前两个查询,contains eager 和 joinedload:

join() + contains_eager():

SELECT master_products_1.id AS master_products_1_id, master_products_1.pid AS master_products_1_pid, master_products.id AS master_products_id, master_products.pid AS master_products_pid 
FROM master_products JOIN master_products AS master_products_1 ON master_products.id = master_products_1.pid
WHERE master_products.id = ?
LIMIT ? OFFSET ?

加入加载():

SELECT anon_1.master_products_id AS anon_1_master_products_id, anon_1.master_products_pid AS anon_1_master_products_pid, master_products_1.id AS master_products_1_id, master_products_1.pid AS master_products_1_pid 
FROM (SELECT master_products.id AS master_products_id, master_products.pid AS master_products_pid
FROM master_products
WHERE master_products.id = ?
LIMIT ? OFFSET ?) AS anon_1 JOIN master_products AS master_products_1 ON anon_1.master_products_id = master_products_1.pid

你可以看到第二个查询是完全不同的;因为 first() 意味着应用 LIMIT,joinedload() 知道将“条件”查询包装在子查询中,对其应用限制,然后再应用 JOIN。在 join+contains_eager 的情况下,LIMIT 应用于集合本身,您得到错误的行数。

只需将底部的脚本更改为:

for q, query_label in queries:
mp_parent = q.all()[0]

我得到了您期望的输出:

[explicit join with contains_eager] children=3, store_products=27
[joinedload] children=3, store_products=27
[joinedload_all] children=3, store_products=27
[subqueryload] children=3, store_products=27
[subqueryload_all] children=3, store_products=27
[explicit joins with contains_eager, filtered by left-join] children=3, store_products=9

(这就是为什么获得用户创建的示例如此重要的原因)

关于sqlalchemy - 急于在 SQLAlchemy 中加载具有显式自连接和 contains_eager 的分层子项,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/20620393/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com