MySQL "greater than" condition sometimes returns rows with equal values

Reposted. Author: 可可西里. Updated: 2023-11-01 06:39:06

I'm running into a puzzling problem with a basic MySQL query.

This is my table:

id | rating
1 | 1317.17
2 | 1280.59
3 | 995.12
4 | 973.88

Now, I'm trying to find all rows where the rating column is greater than a certain value. If I try the following query:

SELECT * FROM (`users`) WHERE `rating` > '995.12'

it correctly returns 2 rows.

However, if I try

SELECT * FROM (`users`) WHERE `rating` > '973.88'

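it returns 4! So it's as if it thinks the 973.88 in the table is greater than 973.88, yet it doesn't make the same mistake with 995.12. This happens whether I run the query from a PHP script or in phpMyAdmin.

Any ideas?

Best answer

Decisions and consequences

This is what you get for deciding to use the float data type. Floats are not precise. That means: yes, a comparison that looks like a > a can come out true.

For example, your fourth row: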

mysql> SELECT * FROM t WHERE id=4;
+------+--------+
| id   | rating |
+------+--------+
|    4 | 973.88 |
+------+--------+
1 row in set (0.00 sec)

I've left the data type as you posted it: FLOAT. Here we are:

mysql> SELECT rating>973.88 FROM t WHERE id=4;
+---------------+
| rating>973.88 |
+---------------+
|             1 |
+---------------+
1 row in set (0.00 sec)

Oops!
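
The effect can be reproduced outside MySQL. This is only a sketch, not MySQL's actual code path: it assumes the FLOAT column holds an IEEE 754 single-precision value and that the comparison against the literal happens in double precision. The helper name as_float32 is made up for illustration.

```python
import struct

def as_float32(x: float) -> float:
    """Round x to the nearest single-precision value and return it as a Python float (a double)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

ratings = [1317.17, 1280.59, 995.12, 973.88]  # values from the question's table

for r in ratings:
    stored = as_float32(r)  # what a 4-byte FLOAT column would hold (assumption)
    # Compare the stored approximation with the double-precision literal.
    print(f"{r:>8} stored as {stored!r:<20} stored > literal: {stored > r}")
```

Under these assumptions the single-precision neighbour of 973.88 lies slightly above the literal, while the neighbour of 995.12 lies slightly below it, which matches what the question observed.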


Behind the scenes

Why? To understand it, you need to know how a floating-point data type is represented. The full story is long, so here is a brief overview.

Here is how it is represented: [image: floating-point representation formula], where:

  • s is the sign
  • b is the base. Its meaning is the same as radix
  • e is the exponent.

That means the same number can be represented in different ways, depending on which base we choose. The most common is b=2. But not every real number can be represented exactly in this base, even if it looks "good" in decimal. The famous example is 0.1, which cannot be represented precisely with b=2, so it is stored approximately.

The result is: even if a number is exact in decimal radix, it may still be impossible to represent it exactly in binary, and therefore it will be stored approximately. That's how it works and, in fact, it is by design: it follows from the structure of floats themselves.
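
To see the approximation for 0.1 directly, here is a small Python sketch. It relies on the fact that Decimal(float) and Fraction(float) convert the stored binary value exactly, without rounding. The variable names are only illustrative.

```python
from decimal import Decimal
from fractions import Fraction

stored = Decimal(0.1)        # exact decimal expansion of the double nearest to 0.1
as_ratio = Fraction(0.1)     # the same stored value as an exact fraction

print(stored)                # many digits, not exactly 0.1
print(as_ratio)              # the denominator is a power of two

# A binary float can only hold fractions whose denominator is a power of two,
# and 1/10 is not one of them.
print(stored == Decimal("0.1"))        # False
print(Decimal(0.5) == Decimal("0.5"))  # True: 1/2 is exact in base 2
```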


What to do

Fixed precision

Well, first, ask yourself: do you really need float? Note that I said float, because there are also fixed-point numbers, which represent a number with fixed precision. Put simply: with a fixed-point data type you can be sure that you store exactly what you see on the screen. So if it's 973.88, then it's 973.88 and not 973.8800000439234. Down to business:

mysql> ALTER TABLE t CHANGE rating rating DECIMAL(8,2);
Query OK, 4 rows affected, 4 warnings (0.47 sec)
Records: 4  Duplicates: 0  Warnings: 4

and ..

mysql> SELECT rating>973.88 FROM t WHERE id=4;
+---------------+
| rating>973.88 |
+---------------+
|             0 |
+---------------+
1 row in set (0.00 sec)

TADA! Magic happens. Your number is now stored with fixed precision, so the comparison now evaluates to false, as expected.
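
The same idea can be sketched in Python with the decimal module. This is an analogy rather than MySQL's implementation: as_float32 and to_decimal_8_2 are hypothetical helpers, and the single-precision storage of the old FLOAT column is an assumption.

```python
import struct
from decimal import Decimal

def as_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_decimal_8_2(x: float) -> Decimal:
    """Hypothetical stand-in for a conversion to DECIMAL(8,2): keep two decimal places."""
    return Decimal(x).quantize(Decimal("0.01"))

float_value = as_float32(973.88)           # what the FLOAT column is assumed to hold
fixed_value = to_decimal_8_2(float_value)  # the value after switching to a fixed-point type
literal = Decimal("973.88")

print(Decimal(float_value))    # exact digits of the single-precision value
print(fixed_value)             # 973.88
print(fixed_value > literal)   # False
print(fixed_value == literal)  # True
```

Once both sides are exact decimals with two places, "greater than" and "equal" behave the way the numbers look on screen.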

Using float

Then again, maybe there are use cases where you're stuck with floats (although, for a DBMS, I find it hard to recall even one, except perhaps a large volume of calculations where performance matters; see the discussion below). There is still a way to make it work. You have to decide what precision is acceptable for you, that is, at what point you will treat two numbers as equal.

You're storing only two digits after the decimal point, so I assume a tolerance of 1E-5 is enough for the value compared here. Then your query will look like:

mysql> set @eps=1E-5;
Query OK, 0 rows affected (0.00 sec)

and use it with:

SELECT * FROM t WHERE rating>973.88+@eps

This results in:

+------+---------+
| id   | rating  |
+------+---------+
|    1 | 1317.17 |
|    2 | 1280.59 |
|    3 |  995.12 |
+------+---------+
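
One caveat about choosing the tolerance, checked with a Python sketch under the same assumptions as before (single-precision storage, comparison in double precision; as_float32 and greater_with_eps are made-up helper names). The rounding error of a float grows with the magnitude of the value, so a tolerance that works for 973.88 is not automatically enough for larger ratings.

```python
import struct

def as_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def greater_with_eps(stored: float, literal: float, eps: float) -> bool:
    """Mimic `rating > literal + @eps` from the query above."""
    return stored > literal + eps

ratings = [1317.17, 1280.59, 995.12, 973.88]  # values from the question's table

for eps in (1e-5, 1e-3):
    # A "false hit" is a row that still compares greater than its own value.
    false_hits = [r for r in ratings if greater_with_eps(as_float32(r), r, eps)]
    print(f"eps={eps}: rows still greater than themselves: {false_hits}")
```

In this simulation 1E-5 is enough for 973.88 but not for 1317.17, whose single-precision error is larger. A tolerance such as 1E-3 is still far below the 0.01 spacing of the data and covers every value in the table. Whether a real MySQL server behaves identically is worth checking against your own data.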

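Which is better?

To decide, you need to look behind the scenes again. I've already given a brief overview of what the float data type is and why it is imprecise. However, fixed-point data types have their weaknesses too. In the context of a DBMS this is probably not something to worry about, but I'll mention it anyway: fixed-point data types usually have a performance cost. How much depends on how many calculations you do inside the DBMS.

In MySQL, fixed-point data types (such as DECIMAL) are implemented as BCD strings (long story short; the Wikipedia article on binary-coded decimal has the details). That means they are slower to compute with than floats. But if you don't plan to do calculations in the DBMS very often, the impact won't even be noticeable. I mention it only because both floating-point and fixed-point types have their own issues.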


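Conclusion

A DBMS, like any other computing system, is not perfect. It just uses some internal machinery to get the job done. That means that in some cases you have to understand how that machinery works to understand why you get strange results.

In particular, floats are not precise. Yes, there are tons of similar answers on the Internet, but I'll repeat it: they are not precise. You should not rely on precision when floats are involved. Almost every DBMS has fixed-point data types, and in a case like yours you should use them. They will do the same job, but with them you can be sure of the precision you chose.

However, you may want to use floats if you are going to do a lot of calculations in the DBMS. But then the question is: why would you do that? Why not use the application to perform those calculations (and so avoid both the performance impact of fixed-point data types and the predictability problems of floats, since fixed-point with a moderate amount of calculation is fine)?

Regarding MySQL "greater than" condition sometimes returning rows with equal values, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/23341763/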
