python - Can I yield from an instance method?

Reposted by: 太空宇宙. Updated: 2023-11-03 11:00:29

Is it OK to use the yield statement in an instance method of a class? For example,

# Similar to itertools.islice
class Nth(object):
    def __init__(self, n):
        self.n = n
        self.i = 0
        self.nout = 0

    def itervalues(self, x):
        for xi in x:
            self.i += 1
            if self.i == self.n:
                self.i = 0
                self.nout += 1
                yield self.nout, xi

Python doesn't complain about this, and simple cases seem to work. However, I've only ever seen examples of yield in regular functions.
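
A quick way to check this claim is the sketch below. It is written for Python 3, and the Counter class and its seen attribute are hypothetical names introduced only for illustration. It shows that a method containing yield is an ordinary generator function, and that anything stored on self survives from one call to the next.

```python
import inspect


class Counter(object):
    """Hypothetical example class; not part of the original post."""

    def __init__(self):
        self.seen = 0  # state that outlives any single generator

    def itervalues(self, x):
        for xi in x:
            self.seen += 1
            yield self.seen, xi


c = Counter()
is_genfunc = inspect.isgeneratorfunction(Counter.itervalues)
first = list(c.itervalues("ab"))   # numbering starts at 1
second = list(c.itervalues("cd"))  # numbering continues from the first call
print(is_genfunc, first, second)
```

The counter keeps running across the two calls because it lives on the instance, not in the generator's local variables.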

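I started running into problems when I tried to use it with itertools functions. For example, suppose I have two large data streams X and Y that are stored across multiple files, and I want to compute their sum and difference with only one pass through the data. I could use itertools.tee to split each stream and itertools.izip to pair the results up again.

In code it would look something like this (sorry, it's long)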

from itertools import izip_longest, izip, tee
import random

def add(x,y):
    for xi,yi in izip(x,y):
        yield xi + yi

def sub(x,y):
    for xi,yi in izip(x,y):
        yield xi - yi

class NthSumDiff(object):
    def __init__(self, n):
        self.nthsum = Nth(n)
        self.nthdiff = Nth(n)

    def itervalues(self, x, y):
        xadd, xsub = tee(x)
        yadd, ysub = tee(y)
        gen_sum = self.nthsum.itervalues(add(xadd, yadd))
        gen_diff = self.nthdiff.itervalues(sub(xsub, ysub))
        # Have to use izip_longest here, but why?
        #for (i,nthsum), (j,nthdiff) in izip_longest(gen_sum, gen_diff):
        for (i,nthsum), (j,nthdiff) in izip(gen_sum, gen_diff):
            assert i==j, "sum row %d != diff row %d" % (i,j)
            yield nthsum, nthdiff

nskip = 12
ns = Nth(nskip)
nd = Nth(nskip)
nsd = NthSumDiff(nskip)
nfiles = 10
for i in range(nfiles):
    # Generate some data.
    # If the block length is a multiple of nskip there's no problem.
    #n = random.randint(5000, 10000) * nskip
    n = random.randint(50000, 100000)
    print 'file %d n=%d' % (i, n)
    x = range(n)
    y = range(100,n+100)
    # Independent processing is no problem but requires two loops.
    for i, nthsum in ns.itervalues(add(x,y)):
        pass
    for j, nthdiff in nd.itervalues(sub(x,y)):
        pass
    assert i==j
    # Trying to do both with one loops causes problems.
    for nthsum, nthdiff in nsd.itervalues(x,y):
        # If izip_longest is necessary, why don't I ever get a fillvalue?
        assert nthsum is not None
        assert nthdiff is not None
    # After each block of data the two iterators should have the same state.
    assert nsd.nthsum.nout == nsd.nthdiff.nout, \
           "sum nout %d != diff nout %d" % (nsd.nthsum.nout, nsd.nthdiff.nout)

But this fails unless I swap itertools.izip for itertools.izip_longest, even though the iterators have the same length. It's the last assert that gets hit, with output like this

file 0 n=58581
file 1 n=87978
Traceback (most recent call last):
  File "test.py", line 71, in <module>
    "sum nout %d != diff nout %d" % (nsd.nthsum.nout, nsd.nthdiff.nout)
AssertionError: sum nout 12213 != diff nout 12212

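Edit: I guess it's not obvious from the example I wrote, but the input data X and Y are only available in blocks (in my real problem they're chunked in files). This is important because I need to maintain state between blocks. In the toy example above, this means Nth needs to yield the equivalent of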

>>> x1 = range(0,10)
>>> x2 = range(10,20)
>>> (x1 + x2)[::3]
[0, 3, 6, 9, 12, 15, 18]

and not the equivalent of

>>> x1[::3] + x2[::3]
[0, 3, 6, 9, 10, 13, 16, 19]

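I could use itertools.chain to join the blocks ahead of time and then make a single call to Nth.itervalues, but I'd like to understand what's wrong with maintaining state in the Nth class between calls (my real application is image processing that involves more saved state, not a simple Nth/add/subtract).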
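
The itertools.chain alternative mentioned above can be sketched as follows. This is a minimal Python 3 illustration using the same toy blocks x1 and x2; the variable names joined and per_block are made up for the example.

```python
from itertools import chain, islice

x1 = range(0, 10)
x2 = range(10, 20)

# Join the blocks lazily, then take every 3rd element of the combined stream.
joined = list(islice(chain(x1, x2), 0, None, 3))

# Slicing each block separately restarts the stride at every block boundary.
per_block = list(x1[::3]) + list(x2[::3])

print(joined)     # [0, 3, 6, 9, 12, 15, 18]
print(per_block)  # [0, 3, 6, 9, 10, 13, 16, 19]
```

Because chain is lazy, the blocks are never concatenated in memory, so this also works when each block is read from a file as it is needed.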

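I don't understand how my Nth instances end up in different states when their inputs have the same length. For example, if I give izip two strings of equal length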

>>> [''.join(x) for x in izip('ABCD','abcd')]
['Aa', 'Bb', 'Cc', 'Dd']

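I get a result of the same length. So why do my Nth.itervalues generators seem to get unequal numbers of next() calls, even though each one yields the same number of results?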

Best answer

Gist repo with revisions | Quick link to solution

Quick answer

You never reset self.i and self.nout in class Nth. Also, you should have used something like this:

# Similar to itertools.islice
from itertools import islice

class Nth(object):
    def __init__(self, n):
        self.n = n

    def itervalues(self, x):
        # islice picks elements n-1, 2n-1, ...; enumerate numbers them from 0
        for a,b in enumerate(islice(x, self.n - 1, None, self.n)):
            self.nout = a
            yield a,b

But since you don't even need nout, you should use this:

import itertools

def Nth(iterable, step):
    return enumerate(itertools.islice(iterable, step - 1, None, step))

Long answer

Your code smells of an off-by-one inconsistency, which led me to this line in NthSumDiff.itervalues():

for (i,nthsum), (j,nthdiff) in izip(gen_sum, gen_diff):

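If you swap gen_sum and gen_diff, you'll see that gen_diff becomes the one whose nout is one greater. This is because izip() pulls from gen_sum before it pulls from gen_diff. On the last iteration, gen_sum raises StopIteration before gen_diff is even tried.

For example, say you pick N samples where N % step == 7. At the end of each iteration, self.i of each Nth instance should be 0. But on the very last iteration, self.i in gen_sum is incremented up to 7 and then there are no more elements in x, so it raises StopIteration. gen_diff, however, is still sitting at self.i equal to 0.

If you add self.i = 0 and self.nout = 0 to the beginning of Nth.itervalues(), the problem goes away.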

Lesson

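You only ran into this problem because your code is too complicated and not Pythonic. If you find yourself using lots of counters and indexes in loops, that's a good sign (in Python) to take a step back and see whether you can simplify your code. I have a long history of C programming, so I still catch myself doing the same thing in Python from time to time.

A simpler implementation

Putting my money where my mouth is...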

from itertools import izip, islice
import random

def sumdiff(x,y,step):
    # filter for the Nth values of x and y now
    x = islice(x, step-1, None, step)
    y = islice(y, step-1, None, step)
    return ((xi + yi, xi - yi) for xi, yi in izip(x,y))

nskip = 12
nfiles = 10
for i in range(nfiles):
    # Generate some data.
    n = random.randint(50000, 100000)
    print 'file %d n=%d' % (i, n)
    x = range(n)
    y = range(100,n+100)
    for nthsum, nthdiff in sumdiff(x,y,nskip):
        assert nthsum is not None
        assert nthdiff is not None
    assert len(list(sumdiff(x,y,nskip))) == n/nskip

More explanation of the problem

In response to Brian's comment:

This doesn't do the same thing. Not resetting i and nout is intentional. I've basically got a continuous data stream X that's split across several files. Slicing the blocks gives a different result than slicing the concatenated stream (I commented earlier about possibly using itertools.chain). Also my actual program is more complicated than mere slicing; it's just a working example. I don't understand the explanation about the order of StopIteration. If izip('ABCD','abcd') --> Aa Bb Cc Dd then it seems like equal-length generators should get an equal number of next calls, no? – Brian Hawkins 6 hours ago

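Your question was so long that I missed the part about the stream coming from multiple files. Let's look at the code itself. First, we need to be really clear about how itervalues(x) actually works.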

# Similar to itertools.islice
class Nth(object):
    def __init__(self, n):
        self.n = n
        self.i = 0
        self.nout = 0

    def itervalues(self, x):
        for xi in x:
            # We increment self.i by self.n on every next()
            # call to this generator method unless the
            # number of objects remaining in x is less than
            # self.n. In that case, we increment by that amount
            # before the for loop exits normally.
            self.i += 1
            if self.i == self.n:
                self.i = 0
                self.nout += 1
                # We're yielding, so we're a generator
                yield self.nout, xi
        # Python helpfully raises StopIteration to fulfill the 
        # contract of an iterable. That's how for loops and
        # others know when to stop.

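In itervalues(x) above, each next() call does one of two things. Either it increments self.i until it reaches self.n, resets it to 0, and yields; or it increments self.i by the number of objects remaining in x, exits the for loop, and then exits the generator (itervalues() is a generator because it yields). When the itervalues() generator exits, Python raises a StopIteration exception.

So, for every instance of class Nth initialized with N, the value of self.i after exhausting all elements in itervalues(X) will be:

self.i = (value_of_self_i_before_itervalues(X) + len(X)) % N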
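
The relation for the final value of self.i can be checked empirically. The sketch below (Python 3) reuses the Nth class from the question. The helper final_i is a hypothetical name added for this check, and it assumes the starting value of self.i is less than N, which always holds for the question's class. The whole sum is taken modulo N because self.i wraps to 0 every time it reaches N.

```python
class Nth(object):
    """The Nth class from the question, unchanged apart from this docstring."""

    def __init__(self, n):
        self.n = n
        self.i = 0
        self.nout = 0

    def itervalues(self, x):
        for xi in x:
            self.i += 1
            if self.i == self.n:
                self.i = 0
                self.nout += 1
                yield self.nout, xi


def final_i(n, start_i, length):
    """Hypothetical helper: exhaust one block and report the leftover self.i."""
    nth = Nth(n)
    nth.i = start_i  # pretend an earlier block left this state behind
    for _ in nth.itervalues(range(length)):
        pass
    return nth.i


print(final_i(10, 0, 13))     # 3
print(final_i(12, 9, 87978))  # 3, because (9 + 87978) % 12 == 3
```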

Now, when you iterate over izip(Nth_1, Nth_2), it does something like this:

def izip(A, B):
    try:
        while True:
            a = A.next()
            b = B.next()
            yield a,b
    except StopIteration:
        pass

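So, imagine N=10 and len(X)=13. On the very last next() call to izip(), both A and B are in the state self.i==0. A.next() is called, increments self.i three times (to 3), runs out of elements in X, exits the for loop and returns, and then Python raises StopIteration. Now, inside izip(), we go straight to the except block and skip B.next() entirely. So at the end, A.i==3 and B.i==0.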
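
The N=10, len(X)=13 scenario can be reproduced directly. The sketch below is written for Python 3, where the built-in zip plays the role of izip and itertools.zip_longest replaces izip_longest. The Nth class is the one from the question, and the names a, b, c and d are made up for the example.

```python
from itertools import zip_longest


class Nth(object):
    """The Nth class from the question."""

    def __init__(self, n):
        self.n = n
        self.i = 0
        self.nout = 0

    def itervalues(self, x):
        for xi in x:
            self.i += 1
            if self.i == self.n:
                self.i = 0
                self.nout += 1
                yield self.nout, xi


N = 10
X = range(13)

# zip stops as soon as the first generator is exhausted, so the second
# generator never gets the final next() call that would drain its input.
a, b = Nth(N), Nth(N)
pairs = list(zip(a.itervalues(X), b.itervalues(X)))
print(a.i, b.i)  # 3 0

# zip_longest keeps calling next() until every generator is exhausted,
# so both instances finish the block in the same state.
c, d = Nth(N), Nth(N)
pairs_longest = list(zip_longest(c.itervalues(X), d.itervalues(X)))
print(c.i, d.i)  # 3 3
```

This also explains why no fill value ever shows up with izip_longest: both generators yield the same number of items, so the extra round of next() calls only drains the leftover input and produces no additional tuple.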

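Second try at simplification (with the correct requirements)

Here is another simplified version that treats all of the file data as one continuous stream. It uses small, reusable generators chained together. I would highly, highly recommend watching this PyCon '14 talk about generators by David Beazley. Judging from your problem description, it should be 100% applicable.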

from itertools import izip, islice
import random

def sumdiff(data):
    return ((x + y, x - y) for x, y in data)

def combined_file_data(files):
    for i,n in files:
        # Generate some data.
        x = range(n)
        y = range(100,n+100)
        for data in izip(x,y):
            yield data

def filelist(nfiles):
    for i in range(nfiles):
        # Generate some data.
        n = random.randint(50000, 100000)
        print 'file %d n=%d' % (i, n)
        yield i, n

def Nth(iterable, step):
    return islice(iterable, step-1, None, step)

nskip = 12
nfiles = 10
filedata = combined_file_data(filelist(nfiles))
nth_data = Nth(filedata, nskip)
for nthsum, nthdiff in sumdiff(nth_data):
    assert nthsum is not None
    assert nthdiff is not None
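
For readers on Python 3, where izip no longer exists, the same pipeline might look like the sketch below. The block lengths are fixed, made-up values chosen so that no block is a multiple of the step, and block_pairs and nth are hypothetical names standing in for combined_file_data and Nth.

```python
from itertools import islice


def block_pairs(lengths):
    """Yield (x, y) pairs block by block; each length stands in for one file."""
    for n in lengths:
        x = range(n)
        y = range(100, n + 100)
        yield from zip(x, y)


def nth(iterable, step):
    """Keep every step-th item of one continuous stream."""
    return islice(iterable, step - 1, None, step)


def sumdiff(pairs):
    return ((x + y, x - y) for x, y in pairs)


lengths = [13, 13, 10]  # hypothetical block sizes, 36 items in total
results = list(sumdiff(nth(block_pairs(lengths), 12)))
print(results)  # [(122, -100), (120, -100), (118, -100)]
```

The stride is applied after the blocks have been joined into one stream, so there is only a single position to track and nothing can get out of step at a block boundary.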

Regarding "python - Can I yield from an instance method?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33818422/
