python - Alternative to itertools.tee in Python

Reposted by: 太空宇宙. Updated: 2023-11-04 05:43:21

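I'm processing a large amount of input data that is split across multiple files. To keep the processing algorithms separate from the I/O, I've set everything up using generators. This works well, except when I want to do some intermediate manipulation of the data passing through the generators. Here's an example that hits the main points: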

import numpy as np
from itertools import izip, tee

# Have two input matrices.  In reality they're very large, so data is provided
# one row at a time via generators.
M, N = 100, 3
def gen_data_rows(m,n):
    for i in range(m):
        yield np.random.normal(size=n)

rows1 = gen_data_rows(M,N)
rows2 = gen_data_rows(M,N)

# Signal processing operates on chunks of the input, e.g. blocks of rows and
# yields results at a reduced rate.  Here's a simple example.
def foo_rows(rows):
    i = 0
    for row in rows:
        if i % 5 == 0:
            yield row
        i += 1

# But what if we want to do some transformations between the raw input data
# and the processing?
def fun1(x, y):
    return x + y

def fun2(x, y):
    return (x + y)**2

def foo_transformed_rows(rows1, rows2):
    # Define a generator that consumes both inputs at the same time and
    # produces two streams of output I'd like to send to foo_rows().
    def gen_transformed_rows(rows1, rows2):
        for x, y in izip(rows1, rows2):
            yield fun1(x,y), fun2(x,y)

    # Do I really need to tee the above and define separate generators to pick
    # off each result?
    def pick_generator_idx(gen, i):
        for vals in gen:
            yield vals[i]

    gen_xformed_rows, dupe = tee(gen_transformed_rows(rows1, rows2))
    gen_foo_fun1 = foo_rows(pick_generator_idx(gen_xformed_rows, 0))
    gen_foo_fun2 = foo_rows(pick_generator_idx(dupe, 1))
    for foo1, foo2 in izip(gen_foo_fun1, gen_foo_fun2):
        yield foo1, foo2


for foo1, foo2 in foo_transformed_rows(rows1, rows2):
    print foo1, foo2

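I think the main complication here is that I have two inputs that I want to merge into two intermediate generators (I/O is the bottleneck, so I really don't want to run through the data twice). Is there a better way to implement the foo_transformed_rows() function? Having to tee() the data and define generators just to pick items out of tuples seems like overkill.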

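Edit: I modified the example slightly based on the comments, but unfortunately it is still rather long in order to stay complete. The basic problem is processing multi-input multi-output (MIMO) data streams. I think what I want is a yield-like statement that produces multiple generators, e.g.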

def two_streams(gen_a, gen_b):
    "Consumes two generators, produces two results."
    for a, b in itertools.izip(gen_a, gen_b):
        c, d = foo(a, b)
        yield c, d

# This doesn't work.  You get one generator of tuples instead of
# two generators of singletons.
gen_c, gen_d = two_streams(gen_a, gen_b)

I thought there might be some itertools magic that does the same thing.

Best answer

I agree with @ShadowRanger's comment: I don't see why you want to avoid tee. It is well suited to this purpose.
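As a minimal sketch of how tee covers the "one generator of tuples into two generators" case from the question, the helper below tees the tuple stream and picks one field from each copy. The name `split_pairs` is hypothetical (it is not part of itertools), and the sum/product body of `two_streams` is only a stand-in for the question's undefined `foo(a, b)`. The sketch targets Python 3, where the built-in `zip` and `map` are lazy; under Python 2 the lazy equivalents would be `itertools.izip` and `itertools.imap`.

```python
from itertools import tee
from operator import itemgetter


def split_pairs(pairs):
    """Turn one iterator of 2-tuples into two iterators of single values.

    Hypothetical helper for illustration; it is not part of itertools.
    """
    first, second = tee(pairs)
    # map() is lazy in Python 3, so nothing is consumed until iteration.
    return map(itemgetter(0), first), map(itemgetter(1), second)


def two_streams(gen_a, gen_b):
    # Stand-in for the question's foo(a, b): yields the sum and the product.
    for a, b in zip(gen_a, gen_b):
        yield a + b, a * b


gen_c, gen_d = split_pairs(two_streams(iter([1, 2, 3]), iter([10, 20, 30])))
result = list(zip(gen_c, gen_d))
```

tee has to buffer every item that one copy has consumed and the other has not, so this stays cheap only while both outputs are consumed at roughly the same pace, as they are when zipped together.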

However, to me it seems simpler and more intuitive to tee the original generators:

def transform_rows(fun, rows1, rows2):
    for x, y in izip(rows1, rows2):
        yield fun(x,y)

rows1a, rows1b = tee(rows1)
rows2a, rows2b = tee(rows2)
gen_foo_fun1 = foo_rows(transform_rows(fun1, rows1a, rows2a))
gen_foo_fun2 = foo_rows(transform_rows(fun2, rows1b, rows2b))
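For completeness, here is a self-contained sketch of that approach run end to end. It makes several assumptions that are not in the original: it targets Python 3 (built-in `zip` instead of `izip`), it uses plain floats in place of NumPy rows so it has no dependencies, and `gen_data_rows` takes a hypothetical `log` list that records each read so the "each input is read only once" property can be checked.

```python
from itertools import tee


def gen_data_rows(m, log):
    # Stand-in for the question's file-backed generator; records each read
    # in `log` so we can check that every row is produced only once.
    for i in range(m):
        log.append(i)
        yield float(i)


def foo_rows(rows):
    # Same decimation as in the question: keep every fifth row.
    for i, row in enumerate(rows):
        if i % 5 == 0:
            yield row


def fun1(x, y):
    return x + y


def fun2(x, y):
    return (x + y) ** 2


def transform_rows(fun, rows1, rows2):
    for x, y in zip(rows1, rows2):
        yield fun(x, y)


reads1, reads2 = [], []
rows1a, rows1b = tee(gen_data_rows(10, reads1))
rows2a, rows2b = tee(gen_data_rows(10, reads2))
gen_foo_fun1 = foo_rows(transform_rows(fun1, rows1a, rows2a))
gen_foo_fun2 = foo_rows(transform_rows(fun2, rows1b, rows2b))

# Consume both output streams in lockstep, as the question's final loop does.
results = list(zip(gen_foo_fun1, gen_foo_fun2))
```

Because the two output streams are zipped, one pipeline runs at most about one block of rows ahead of the other, so each tee only has to hold roughly that many rows in memory rather than the whole input.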

Regarding python - Alternative to itertools.tee in Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33136957/
