sqlite - When is likelihood() useful?


Reposted by: IT王子 | Updated: 2023-10-29 06:22:37


While reading the sqlite documentation, I came across the following function:

http://www.sqlite.org/lang_corefunc.html#likelihood

The likelihood(X,Y) function returns argument X unchanged. The value Y in likelihood(X,Y) must be a floating point constant between 0.0 and 1.0, inclusive. The likelihood(X) function is a no-op that the code generator optimizes away so that it consumes no CPU cycles during run-time (that is, during calls to sqlite3_step()). The purpose of the likelihood(X,Y) function is to provide a hint to the query planner that the argument X is a boolean that is true with a probability of approximately Y. The unlikely(X) function is short-hand for likelihood(X,0.0625).
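The pass-through behavior described in the quoted documentation can be checked with a minimal sketch using Python's standard sqlite3 module. It assumes the bundled SQLite library is recent enough to provide likelihood() and unlikely(); the helper name passthrough_values and the literal 42 are made up for illustration.

```python
import sqlite3


def passthrough_values():
    """Return what likelihood() and unlikely() hand back for the value 42."""
    conn = sqlite3.connect(":memory:")
    try:
        # likelihood(X, Y) returns X unchanged; Y is only a hint for the planner.
        hinted = conn.execute("SELECT likelihood(42, 0.75)").fetchone()[0]
        # unlikely(X) is documented as short-hand for likelihood(X, 0.0625).
        short = conn.execute("SELECT unlikely(42)").fetchone()[0]
        return hinted, short
    finally:
        conn.close()


if __name__ == "__main__":
    print(passthrough_values())
```

Both calls hand back the first argument untouched, so wrapping an expression in likelihood() does not change what a query returns.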

Suppose I know that 1 will be returned 75% of the time. How would:

select likelihood(x,.75)

help the query optimizer?

Best answer

The original example goes like this:

Consider the following schema and query:
CREATE TABLE composer(
  cid INTEGER PRIMARY KEY,
  cname TEXT
);
CREATE TABLE album(
  aid INTEGER PRIMARY KEY,
  aname TEXT
);
CREATE TABLE track(
  tid INTEGER PRIMARY KEY,
  cid INTEGER REFERENCES composer,
  aid INTEGER REFERENCES album,
  title TEXT
);
CREATE INDEX track_i1 ON track(cid);
CREATE INDEX track_i2 ON track(aid);

SELECT DISTINCT aname
  FROM album, composer, track
 WHERE cname LIKE '%bach%'
   AND composer.cid=track.cid
   AND album.aid=track.aid;
The schema is for a (simplified) music catalog application, though similar kinds of schemas come up in other situations. There is a large number of albums. Each album contains one or more tracks. Each track has a composer. Each composer might be associated with multiple tracks.

The query asks for the name of every album that contains a track with a composer whose name matches '%bach%'.

The query planner needs to choose among several alternative algorithms for this query. The best choice hinges on how well the expression "cname LIKE '%bach%'" filters the results. Let's give this expression a "filter value" which is a number between 1.0 and 0.0. A value of 1.0 means that cname LIKE '%bach%' is true for every row in the composer table. A value of 0.0 means the expression is never true.

The current query planner (in version 3.8.0) assumes a filter value of 1.0. In other words, it assumes that the expression is always true. The planner is assuming the worst case so that it will pick a plan that minimizes worst case run-time. That's a safe approach, but it is not optimal. The plan chosen for a filter of 1.0 is track-album-composer. That means that the "track" table is in the outer loop. For each row of track, an indexed lookup occurs on album. And then an indexed lookup occurs on composer, then the LIKE expression is run to see if the album name should be output.

A better plan would be track-composer-album. This second plan avoids the album lookup if the LIKE expression is false. The current planner would choose this second algorithm if the filter value was just slightly less than 1.0. Say 0.99. In other words, if the planner thought that the LIKE expression would be false for 1 out of every 100 rows, then it would choose the second plan. That is the correct (fastest) choice for when the filter value is large.

But in the common case of a music library, the filter value is probably much closer to 0.0 than it is to 1.0. In other words, the string "bach" is unlikely to be found in most composer names. And for values near 0.0, the best plan is composer-track-album. The composer-track-album plan is to scan the composer table once looking for entries that match '%bach%' and for each matching entry use indices to look up the track and then the album. The current 3.8.0 query planner chooses this third plan when the filter value is less than about 0.1.

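The likelihood function gives the database a (hopefully) better estimate of the filter's selectivity. Applied to the example query, it looks like this: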

SELECT DISTINCT aname
  FROM album, composer, track
 WHERE likelihood(cname LIKE '%bach%', 0.05)
   AND composer.cid=track.cid
   AND album.aid=track.aid;
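To see whether the hint changes anything on a given SQLite build, the plain and hinted queries can be compared with EXPLAIN QUERY PLAN. The following is a minimal sketch using Python's sqlite3 module. The sample rows and the helper names (build_catalog, query_plan, album_names) are invented for illustration. The plan text depends on the SQLite version and on any collected statistics, so no particular join order is claimed here.

```python
import sqlite3

SCHEMA = """
CREATE TABLE composer(cid INTEGER PRIMARY KEY, cname TEXT);
CREATE TABLE album(aid INTEGER PRIMARY KEY, aname TEXT);
CREATE TABLE track(
  tid INTEGER PRIMARY KEY,
  cid INTEGER REFERENCES composer,
  aid INTEGER REFERENCES album,
  title TEXT
);
CREATE INDEX track_i1 ON track(cid);
CREATE INDEX track_i2 ON track(aid);
"""

PLAIN = """
SELECT DISTINCT aname
  FROM album, composer, track
 WHERE cname LIKE '%bach%'
   AND composer.cid=track.cid
   AND album.aid=track.aid
"""

# Same query, with the LIKE term wrapped in the selectivity hint.
HINTED = PLAIN.replace(
    "cname LIKE '%bach%'", "likelihood(cname LIKE '%bach%', 0.05)"
)


def build_catalog():
    """Create the example schema with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO composer VALUES (?, ?)",
        [(1, "J. S. Bach"), (2, "Mozart"), (3, "Offenbach")],
    )
    conn.executemany(
        "INSERT INTO album VALUES (?, ?)",
        [(1, "Baroque"), (2, "Classical"), (3, "Operetta")],
    )
    conn.executemany(
        "INSERT INTO track VALUES (?, ?, ?, ?)",
        [(1, 1, 1, "Toccata"), (2, 2, 2, "Requiem"),
         (3, 3, 3, "Barcarolle"), (4, 1, 1, "Fugue")],
    )
    return conn


def query_plan(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def album_names(conn, sql):
    """Run the query and return the album names in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))


if __name__ == "__main__":
    conn = build_catalog()
    for label, sql in (("plain", PLAIN), ("hinted", HINTED)):
        print(label, album_names(conn, sql))
        for step in query_plan(conn, sql):
            print("   ", step)
```

The hint only influences which plan is chosen. Both queries return the same albums, so a wrong estimate costs speed, not correctness.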

For "sqlite - When is likelihood() useful?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20981158/
