Python Yaml parse inf as float

Reposted by: bug小助手 | Updated: 2023-10-25 21:55:04

In PyYAML or ruamel.yaml, I'm wondering if there is a way to customise how specific strings are parsed. Specifically, I'd like "[inf, nan]" to parse as [float('inf'), float('nan')]. I'd also like "['inf', 'nan']" to continue to parse as ['inf', 'nan'], so it's only the unquoted variant whose current behavior I want to intercept and change.

I'm aware that currently I could use "[.inf, .nan]" or "[!!float inf, !!float nan]", but I'm curious if I could extend the Loader to allow for the syntax that I expected would have worked (but doesn't).

Perhaps I'm just making a footgun by allowing "nan" and "inf" to be parsed as floats rather than strings, and I'm interested in hearing compelling reasons why I should not allow this type of parsing. But I'm not too worried about the case where other parsers would parse my configs incorrectly (though maybe I'm underestimating the pain that will cause in the future). I plan to use this as a one-way convenience for parsing arguments on the command line, and I don't expect actual config files to be written like this.

In any case I'd still be interested in how it could be done, even if the conclusion is that it shouldn't be done.

Recommended answer

Based on the confusion that I have seen caused by Yes, On, No and Off being
interpreted as boolean values in YAML 1.1, I don't think this is a good idea.

But it is possible to do this in both ruamel.yaml and PyYAML, by changing the regex
that recognises floats (i.e. that assigns the implicit tag tag:yaml.org,2002:float to the scalar)
and then making sure the routine that constructs a float from a scalar handles these additional
scalars. The three main improvements (with regard to this) in ruamel.yaml over PyYAML are that
it has different regexes for YAML 1.1 and YAML 1.2 parsing (the latter being the default,
the former having to be specified either by a directive, or by setting .version on the YAML() instance);
that the various Resolvers each have a copy of these regexes instead of sharing
one (as in PyYAML, which makes having multiple, differently behaving parsers in one program difficult);
and that regex compilation is delayed until the regexes are actually needed.
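
The two steps described above (recognise the scalar with a regex, then construct a float from it) can be tried out without any YAML library. The following is only a minimal sketch using the standard library: the pattern mirrors the float regex used in this answer, while resolve_plain_scalar is a hypothetical helper name and not part of ruamel.yaml or PyYAML. It assumes the caller passes only plain (unquoted) scalars, since quoted scalars never go through implicit tag resolution.

```python
import math
import re

# Float pattern with an optional leading dot before inf/nan, so that both
# ".inf" and "inf" are recognised as floats.
FLOAT_RE = re.compile(r"""^(?:
     [-+]?(?:[0-9][0-9_]*)\.[0-9_]*(?:[eE][-+]?[0-9]+)?
    |[-+]?(?:[0-9][0-9_]*)(?:[eE][-+]?[0-9]+)
    |[-+]?\.[0-9_]+(?:[eE][-+][0-9]+)?
    |[-+]?\.?(?:inf|Inf|INF)
    |\.?(?:nan|NaN|NAN))$""", re.X)


def resolve_plain_scalar(text):
    """Hypothetical helper: return a float if the plain scalar matches
    the float pattern, otherwise return the string unchanged."""
    if not FLOAT_RE.match(text):
        return text
    value = text.lower().replace("_", "")
    # Remember the sign, then strip it before comparing.
    sign = -1.0 if value.startswith("-") else 1.0
    value = value.lstrip("+-")
    if value in ("inf", ".inf"):
        return sign * math.inf
    if value in ("nan", ".nan"):
        return math.nan
    return sign * float(value)


for scalar in ["nano", "1.0", ".NaN", "inf", "-inf", "nan"]:
    print(repr(scalar), "->", repr(resolve_plain_scalar(scalar)))
```

Because the pattern is anchored with ^ and $, a word such as nano that merely starts with nan stays a string.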

Given the differences, the following applies only to ruamel.yaml.

You need to create a resolver, and replace its regex recognition for all floats,
and then create a constructor that constructs the floats based on the
recognised scalars:

import re, sys
import ruamel.yaml

class NanInfResolver(ruamel.yaml.resolver.VersionedResolver):
    pass

# difference with the regex in resolver.py is the ? after \\.
# as well as recognising N and I as starting chars
# no delayed compile of the regex here
NanInfResolver.add_implicit_resolver(
    'tag:yaml.org,2002:float',
    re.compile('''^(?:
     [-+]?(?:[0-9][0-9_]*)\\.[0-9_]*(?:[eE][-+]?[0-9]+)?
    |[-+]?(?:[0-9][0-9_]*)(?:[eE][-+]?[0-9]+)
    |[-+]?\\.[0-9_]+(?:[eE][-+][0-9]+)?
    |[-+]?\\.?(?:inf|Inf|INF)       
    |\\.?(?:nan|NaN|NAN))$''', re.X),
    list('-+0123456789.niNI')
)

class NanInfConstructor(ruamel.yaml.constructor.RoundTripConstructor):
    def construct_yaml_float(self, node):
        value = self.construct_scalar(node).lower()
        # remember the sign, then strip it before comparing
        sign = +1
        if value[0] == '-':
            sign = -1
        if value[0] in '+-':
            value = value[1:]
        # the dot-less forms are handled here
        if value == 'inf':
            return sign * self.inf_value
        if value == 'nan':
            return self.nan_value
        # everything else (including .inf and .nan) goes to the parent constructor
        return super().construct_yaml_float(node)

NanInfConstructor.add_constructor(
    'tag:yaml.org,2002:float', NanInfConstructor.construct_yaml_float
)



yaml_str = """\
[nano, 1.0, .NaN, inf, nan]  # some extra values to test
"""
    
yaml = ruamel.yaml.YAML()
yaml.Resolver = NanInfResolver
yaml.Constructor = NanInfConstructor

data = yaml.load(yaml_str)
for x in data:
    print(type(x), x)
print()
yaml.dump(data, sys.stdout)

which gives:

<class 'str'> nano
<class 'ruamel.yaml.scalarfloat.ScalarFloat'> 1.0
<class 'float'> nan
<class 'float'> inf
<class 'float'> nan

[nano, 1.0, .nan, .inf, .nan] # some extra values to test

Loading 1.0 as a ScalarFloat is necessary to preserve its formatting when
dumping. It is possible to preserve the different ways of writing .nan, .inf, nan and inf in a similar way, but you would
have to make a special representer and either extend ScalarFloat or make one
or more explicit types that keep the original scalar string value. Either way you
would lose the possibility to test with x is float('nan'), which may be a problem
in real programs (this is also the
reason why ruamel.yaml doesn't preserve the different forms of null during round-trip).
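
To illustrate the trade-off of keeping the original scalar text, here is a minimal standard-library sketch. TextFloat and dump_scalar are hypothetical names made up for this example and are not ruamel.yaml classes or functions; the sketch only assumes that the preserving type would be a float subclass carrying the original string.

```python
import math


class TextFloat(float):
    """Hypothetical float subclass that remembers how the value was written."""

    def __new__(cls, value, text):
        self = super().__new__(cls, value)
        self.text = text  # original scalar, e.g. "inf" or ".NaN"
        return self


def dump_scalar(x):
    """Hypothetical dumper: reuse the original text when it is available."""
    return x.text if isinstance(x, TextFloat) else repr(x)


values = [TextFloat(math.inf, "inf"), TextFloat(math.nan, ".NaN"), 1.5]
for v in values:
    print(type(v).__name__, isinstance(v, float), type(v) is float, dump_scalar(v))
```

With such a subclass, isinstance(v, float) and math.isnan(v) keep working, but an exact type check like type(v) is float no longer holds for the loaded values. The sketch uses math.isnan for NaN checks, since NaN never compares equal to itself.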
