Python Yaml parse inf as float

Reposted by: bug小助手 | Updated: 2023-10-27 20:53:30

In PyYAML or ruamel.yaml, I'm wondering if there is a way to customize how specific strings are parsed. Specifically, I'd like to be able to parse "[inf, nan]" as [float('inf'), float('nan')]. I'll also note that I would like "['inf', 'nan']" to continue to parse as ['inf', 'nan'], so it's only the unquoted variant whose current behavior I'd like to intercept and change.

I'm aware that currently I could use "[.inf, .nan]" or "[!!float inf, !!float nan]", but I'm curious if I could extend the Loader to allow for the syntax that I expected would have worked (but doesn't).

Perhaps I'm just making a footgun by allowing "nan" and "inf" to be parsed as floats rather than strings, and I'm interested in hearing compelling reasons why I should not allow this type of parsing. But I'm not too worried about the case where other parsers would parse my configs incorrectly (though maybe I'm underestimating the pain that will cause in the future). I plan to use this as a one-way convenience for parsing arguments on the command line, and I don't expect actual config files to be written like this.

In any case I'd still be interested in how it could be done, even if the conclusion is that it shouldn't be done.

Recommended answer

Based on the confusion that I have seen caused by Yes, On, No and Off being
interpreted as boolean values in YAML 1.1, I don't think this is a good idea.
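
To make the comparison concrete, here is a small sketch of the YAML 1.1 behaviour referred to above. It assumes PyYAML is installed (PyYAML resolves plain scalars using YAML 1.1 rules); the sample document is made up for illustration.

```python
import yaml  # PyYAML, assumed to be installed

# Unquoted yes/no/on/off are resolved as booleans under YAML 1.1 rules;
# quoting the scalar keeps it a string.
doc = """[yes, No, on, OFF, 'yes', "off"]"""
data = yaml.safe_load(doc)
print(data)
```

Treating a bare inf or nan as a float introduces the same kind of surprise for anyone who meant the literal word.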

But it is possible to do this both in ruamel.yaml and PyYAML, by changing the regex
that recognises floats (i.e. that assigns the implicit tag tag:yaml.org,2002:float to the scalar)
and then to make sure the routine constructing a float from a scalar handles these additional
scalars. The three main improvements (with regard to this) in ruamel.yaml are that
it has different regexes for YAML 1.1 and YAML 1.2 parsing (the latter being the default,
the former having to be specified either by a directive, or by setting .version on the YAML() instance);
that the various Resolvers each have a copy of these regexes instead of sharing
one (as in PyYAML, which makes having multiple, differently behaving parsers in one program difficult);
and that regex compilation is delayed until they are actually needed.
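
For PyYAML, the same idea can be sketched as follows. This is a minimal illustration rather than a tested recipe: the class name NanInfLoader and the sample document are assumptions made for this example. It adds an extra implicit resolver on a SafeLoader subclass instead of replacing the float regex, and it relies on PyYAML's stock float constructor falling back to Python's float(), which already accepts 'inf' and 'nan' (this is why "!!float inf" works without changes).

```python
import math
import re

import yaml  # PyYAML, assumed to be installed


class NanInfLoader(yaml.SafeLoader):
    """SafeLoader subclass; the name is hypothetical, chosen for this sketch."""


# Extra implicit resolver: bare inf (optionally signed) and nan get the float tag.
# add_implicit_resolver copies the resolver table onto the subclass first,
# so yaml.SafeLoader itself keeps its default behaviour.
NanInfLoader.add_implicit_resolver(
    'tag:yaml.org,2002:float',
    re.compile(r'^(?:[-+]?(?:inf|Inf|INF)|nan|NaN|NAN)$'),
    list('-+inIN'),
)

doc = "[nano, 1.0, .NaN, inf, -inf, nan, 'inf']"
plain = yaml.load(doc, Loader=yaml.SafeLoader)
extended = yaml.load(doc, Loader=NanInfLoader)

print(plain)
print(extended)
print(math.isinf(extended[3]), math.isnan(extended[5]))
```

With the default loader the unquoted inf, -inf and nan stay strings; with the subclass they become floats, while nano and the quoted 'inf' remain strings.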

Given these differences, the following applies only to ruamel.yaml.

You need to create a resolver, and replace its regex recognition for all floats,
and then create a constructor that constructs the floats based on the
recognised scalars:

import re, sys
import ruamel.yaml

class NanInfResolver(ruamel.yaml.resolver.VersionedResolver):
    pass

# difference with the regex in resolver.py is the ? after \\.
# as well as recognising N and I as starting chars
# no delayed compile of the regex here
NanInfResolver.add_implicit_resolver(
    'tag:yaml.org,2002:float',
    re.compile('''^(?:
     [-+]?(?:[0-9][0-9_]*)\\.[0-9_]*(?:[eE][-+]?[0-9]+)?
    |[-+]?(?:[0-9][0-9_]*)(?:[eE][-+]?[0-9]+)
    |[-+]?\\.[0-9_]+(?:[eE][-+][0-9]+)?
    |[-+]?\\.?(?:inf|Inf|INF)       
    |\\.?(?:nan|NaN|NAN))$''', re.X),
    list('-+0123456789.niNI')
)

class NanInfConstructor(ruamel.yaml.constructor.RoundTripConstructor):
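    def construct_yaml_float(self, node):
        value = self.construct_scalar(node).lower()
        sign = +1
        if value[0] == '-':
            sign = -1
        if value[0] in '+-':
            value = value[1:]  # strip the sign before comparing
        # bare inf/nan are handled here; everything else (including
        # .inf and .nan) is left to the stock round-trip constructor
        if value == 'inf':
            return sign * self.inf_value
        if value == 'nan':
            return self.nan_value
        return super().construct_yaml_float(node)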

NanInfConstructor.add_constructor(
    'tag:yaml.org,2002:float', NanInfConstructor.construct_yaml_float
)



yaml_str = """\
[nano, 1.0, .NaN, inf, nan]  # some extra values to test
"""
    
yaml = ruamel.yaml.YAML()
yaml.Resolver = NanInfResolver
yaml.Constructor = NanInfConstructor

data = yaml.load(yaml_str)
for x in data:
    print(type(x), x)
print()
yaml.dump(data, sys.stdout)

which gives:

<class 'str'> nano
<class 'ruamel.yaml.scalarfloat.ScalarFloat'> 1.0
<class 'float'> nan
<class 'float'> inf
<class 'float'> nan

[nano, 1.0, .nan, .inf, .nan] # some extra values to test

Loading 1.0 as a ScalarFloat is necessary to preserve its formatting when
dumping. It is possible to preserve the different ways of writing .nan, .inf, nan and inf in a similar way, but you would
have to make a special representer and either extend ScalarFloat or make one
or more explicit types that keep the original scalar string value. Either way the
loaded values would no longer be plain float objects, so you would lose the possibility
to test them with a simple check such as type(x) is float, which may be a problem
in real programs (this is also the
reason why ruamel.yaml doesn't preserve the different forms of null during round-trip).
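
When checking loaded values for NaN, it helps to know which tests are reliable. The sketch below uses only the standard library; TaggedFloat is a hypothetical stand-in for a format-preserving float subclass and is not part of ruamel.yaml.

```python
import math

x = float('nan')

# NaN never compares equal to anything, including itself.
equal_to_itself = (x == x)
# Each float('nan') call builds a new object, so an identity test fails too.
same_object = (x is float('nan'))
# math.isnan is the reliable check.
is_nan = math.isnan(x)
print(equal_to_itself, same_object, is_nan)


class TaggedFloat(float):
    """Hypothetical float subclass standing in for a format-preserving type."""


y = TaggedFloat('nan')
# math.isnan and isinstance still work for subclasses; an exact type check does not.
print(math.isnan(y), isinstance(y, float), type(y) is float)
```

In other words, math.isnan(x) and isinstance(x, float) keep working if a float subclass is introduced, while exact-type checks do not.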
