
php - Using NOT REGEXP together with REGEXP messes up my results

Reposted. Author: 行者123. Updated: 2023-11-29 19:42:53

In my database I have a company table. The table has a field called tags, which contains content like the following:

Furniture Retail E-commerce B2C Home & Furniture Consumer Discretionary Furniture British manufacturer retailer contemporary vintage furniture products Designs

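What I want to do is query these tags and return companies based on whether the field contains any of the keywords or phrases a user might enter.

For example, if a user wants to find companies whose tags contain the word Retail, it generates a query like this: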

SELECT
    company.domain,
    company.company_name,
    CONCAT_WS(',', company.business_sector, company.tags) AS 'tags',
    GROUP_CONCAT(employee.employee_id SEPARATOR ',') AS 'employee_ids',
    COUNT(employee.employee_id) AS 'employees'
FROM
    company
    INNER JOIN employee ON company.domain = employee.domain
WHERE
    company.tags REGEXP '^Retail| Retail |Retail$'
    OR company.business_sector LIKE '%Retail%'
    AND company.domain NOT IN ('@hotmail.com', '@gmail.com', '@aol.com')
GROUP BY
    company.domain

This exact query returns 11424 results, which is great.
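To see what the tags pattern in the query above accepts, here is a minimal sketch using Python's re module rather than MySQL. It assumes that, for a pattern this simple, the two engines agree, and it passes re.IGNORECASE on the assumption that the column uses a case-insensitive collation. The helper name tag_matches and the sample strings are made up for illustration.

```python
import re

# Same pattern as in the query. IGNORECASE is an assumption that
# approximates REGEXP on a column with a case-insensitive collation.
TAG_PATTERN = re.compile(r"^Retail| Retail |Retail$", re.IGNORECASE)


def tag_matches(tags):
    """Return True if the pattern matches anywhere in the tags string."""
    return TAG_PATTERN.search(tags) is not None


# Hypothetical sample values, not taken from the real table.
samples = [
    "Furniture Retail E-commerce",  # " Retail " in the middle
    "Online Retail",                # "Retail" at the end
    "Retailer of shoes",            # "^Retail" has no trailing boundary
    "Big Retailers Inc",            # no alternative applies
]

for sample in samples:
    print(f"{sample!r}: {tag_matches(sample)}")
```

Note that the first alternative, ^Retail, only anchors the start of the string, so a value beginning with "Retailer" also matches. Whether that is intended depends on the data.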

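Now, the part where it fails is when the user enters keywords that should not be in this field.

Say we do not want any Apparel in there; it generates the following query: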

SELECT
    company.domain,
    company.company_name,
    CONCAT_WS(',', company.business_sector, company.tags) AS 'tags',
    GROUP_CONCAT(employee.employee_id SEPARATOR ',') AS 'employee_ids',
    COUNT(employee.employee_id) AS 'employees'
FROM
    company
    INNER JOIN employee ON company.domain = employee.domain
WHERE
    company.tags REGEXP '^Retail| Retail |Retail$'
    OR company.business_sector LIKE '%Retail%'
    AND (
        company.tags NOT REGEXP '^Apparel| Apparel |Apparel$'
        AND company.business_sector NOT LIKE '%Apparel%'
    )
    AND company.domain NOT IN ('@hotmail.com', '@gmail.com', '@aol.com')
GROUP BY
    company.domain

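This exact query returns 112 results, which definitely should not be the case: it would mean that 11312 of the companies in my search results contain the keyword Apparel, and they do not.

Any ideas about what I am doing wrong?

Edit

This is not a duplicate... I can modify my query, but that is not the problem.

For example, let's take those 11424 results for Retail and add a random phrase that we know will never occur in any of them; we should get the same 11424 records: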

SELECT
    company.domain,
    company.company_name,
    CONCAT_WS(',', company.business_sector, company.tags) AS 'tags',
    GROUP_CONCAT(employee.employee_id SEPARATOR ',') AS 'employee_ids',
    COUNT(employee.employee_id) AS 'employees'
FROM
    company
    INNER JOIN employee ON company.domain = employee.domain
WHERE
    (
        company.tags REGEXP '^Retail| Retail |Retail$'
        OR company.business_sector LIKE '%Retail%'
    )
    AND (
        company.tags NOT REGEXP '^This phrase will never occur| This phrase will never occur |This phrase will never occur$'
        AND company.business_sector NOT LIKE '%This phrase will never occur%'
    )
    AND company.domain NOT IN ('@hotmail.com', '@gmail.com', '@aol.com')
GROUP BY
    company.domain

I get 135 records from this instead of 11424. What is going on?
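One possible cause, which nothing in this thread confirms, is NULL handling. In SQL, applying NOT LIKE or NOT REGEXP to a NULL value yields NULL rather than true, so a row whose tags or business_sector is NULL is dropped by the exclusion clause even though the phrase never occurs. The sketch below illustrates this with Python's built-in sqlite3 module instead of MySQL, a made-up three-row table, and LIKE standing in for REGEXP; whether the real columns contain NULLs is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company (domain TEXT, tags TEXT, business_sector TEXT)")
# Hypothetical rows: two of them have a NULL in one of the searched columns.
conn.executemany(
    "INSERT INTO company VALUES (?, ?, ?)",
    [
        ("a.com", "Retail Furniture", "Consumer"),
        ("b.com", "Retail Furniture", None),
        ("c.com", None, "Retail"),
    ],
)


def domains(where_clause):
    """Return the sorted domains of the rows that satisfy where_clause."""
    sql = f"SELECT domain FROM company WHERE {where_clause} ORDER BY domain"
    return [row[0] for row in conn.execute(sql)]


include = "(tags LIKE '%Retail%' OR business_sector LIKE '%Retail%')"

# NOT LIKE on a NULL column evaluates to NULL, so b.com and c.com vanish.
plain = domains(
    f"{include} AND tags NOT LIKE '%never occurs%' "
    "AND business_sector NOT LIKE '%never occurs%'"
)

# COALESCE turns NULL into an empty string before the comparison.
null_safe = domains(
    f"{include} AND COALESCE(tags, '') NOT LIKE '%never occurs%' "
    "AND COALESCE(business_sector, '') NOT LIKE '%never occurs%'"
)

print(plain)      # rows surviving the plain exclusion
print(null_safe)  # rows surviving the NULL-safe exclusion
```

If this is what happens on the real table, wrapping the excluded columns in COALESCE (available in MySQL as well) would be one way to keep those rows.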

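Best answer

You really should normalize your data and store the tags in a separate table, so that you do not have to implement super complicated logic like this.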
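A minimal sketch of what such a normalized layout could look like, using Python's built-in sqlite3 module so it runs without a MySQL server. The company_tag table, its columns, the sample rows, and the find_companies helper are all hypothetical; the answer does not specify a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE company (
        domain       TEXT PRIMARY KEY,
        company_name TEXT
    );
    -- One row per (company, tag) pair instead of a space-separated string.
    CREATE TABLE company_tag (
        domain TEXT NOT NULL REFERENCES company(domain),
        tag    TEXT NOT NULL,
        PRIMARY KEY (domain, tag)
    );
    """
)
conn.executemany(
    "INSERT INTO company VALUES (?, ?)",
    [("a.com", "A Ltd"), ("b.com", "B Ltd"), ("c.com", "C Ltd")],
)
conn.executemany(
    "INSERT INTO company_tag VALUES (?, ?)",
    [
        ("a.com", "Retail"), ("a.com", "Furniture"),
        ("b.com", "Retail"), ("b.com", "Apparel"),
        ("c.com", "Software"),
    ],
)


def find_companies(include_tag, exclude_tag):
    """Domains that have include_tag and do not have exclude_tag."""
    sql = """
        SELECT c.domain
        FROM company AS c
        WHERE EXISTS (
                SELECT 1 FROM company_tag AS t
                WHERE t.domain = c.domain AND t.tag = ?
              )
          AND NOT EXISTS (
                SELECT 1 FROM company_tag AS t
                WHERE t.domain = c.domain AND t.tag = ?
              )
        ORDER BY c.domain
    """
    return [row[0] for row in conn.execute(sql, (include_tag, exclude_tag))]


print(find_companies("Retail", "Apparel"))
```

With one tag per row, inclusion and exclusion become exact comparisons, and NOT EXISTS stays true for companies that have no tags at all, so no pattern matching on a packed string is needed.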

In the meantime, your problem is with the boolean grouping. AND takes precedence over OR, so your query should be

SELECT
    company.domain,
    company.company_name,
    CONCAT_WS(',', company.business_sector, company.tags) AS 'tags',
    GROUP_CONCAT(employee.employee_id SEPARATOR ',') AS 'employee_ids',
    COUNT(employee.employee_id) AS 'employees'
FROM
    company
    INNER JOIN employee ON company.domain = employee.domain
WHERE
    (
        company.tags REGEXP '^Retail| Retail |Retail$'
        OR company.business_sector LIKE '%Retail%'
    )
    AND company.tags NOT REGEXP '^Apparel| Apparel |Apparel$'
    AND company.business_sector NOT LIKE '%Apparel%'
    AND company.domain NOT IN ('@hotmail.com', '@gmail.com', '@aol.com')
GROUP BY
    company.domain

Pay close attention to where the parentheses are placed.
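The effect of the parentheses can be reproduced on a toy table. The sketch below is an illustration only: it uses Python's built-in sqlite3 module instead of MySQL, a made-up four-row company table, and LIKE in place of REGEXP, since SQLite ships no REGEXP implementation by default. Both engines give AND a higher precedence than OR, so the grouping should behave the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company (domain TEXT, tags TEXT, business_sector TEXT)")
# Hypothetical rows covering each combination of Retail and Apparel.
conn.executemany(
    "INSERT INTO company VALUES (?, ?, ?)",
    [
        ("a.com", "Retail Furniture", "Consumer"),
        ("b.com", "Retail Apparel", "Consumer"),
        ("c.com", "Software", "Retail"),
        ("d.com", "Software Apparel", "Retail"),
    ],
)


def domains(where_clause):
    """Return the sorted domains of the rows that satisfy where_clause."""
    sql = f"SELECT domain FROM company WHERE {where_clause} ORDER BY domain"
    return [row[0] for row in conn.execute(sql)]


# Without parentheses this is read as: A OR (B AND C).
ungrouped = domains(
    "tags LIKE '%Retail%' "
    "OR business_sector LIKE '%Retail%' "
    "AND tags NOT LIKE '%Apparel%'"
)

# With parentheses the exclusion applies to both branches: (A OR B) AND C.
grouped = domains(
    "(tags LIKE '%Retail%' OR business_sector LIKE '%Retail%') "
    "AND tags NOT LIKE '%Apparel%'"
)

print(ungrouped)  # b.com is included although its tags contain Apparel
print(grouped)    # only a.com and c.com remain
```

In the ungrouped form the exclusion is attached only to the business_sector branch, so any row matched through tags bypasses it.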

About php - Using NOT REGEXP together with REGEXP messes up my results: a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41242042/
