python - Saving a 3D 32-bit float array as a 48-bit integer PNG in Python to match the KITTI ground truth format

Reposted by: 行者123 · Updated: 2023-12-02 17:28:39


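KITTI has an optical flow benchmark. It requires flow estimates to be submitted as 48-bit PNG files that match the format of its ground truth files.

The ground truth PNG images are available to download here.

KITTI provides a Matlab DevKit for comparing estimates against the ground truth.

I want to write the flow output of my network as 48-bit integer PNG files so that my flow estimates can be compared with other KITTI benchmark flow estimates.

The scaled flow file from the network, in numpy format, is downloadable from here.

However, I am unable to convert the float32 3D flow array into a 3-channel 48-bit file (16 bits per channel) in Python, either because the image libraries don't seem to support this or because I am doing something wrong in my code. Can anyone help?

I have tried many different libraries and read many posts.

Unfortunately, Scipy outputs a PNG with only 24 bits.
The output flow estimate PNG generated with scipy is available here.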

# Numpy Flow to 48bit PNG with 16bits per channel

import scipy as sp
from scipy import misc
import numpy as np
import png
import imageio
import cv2
from PIL import Image
from matplotlib import image

"""From Kitti DevKit:-

Optical flow maps are saved as 3-channel uint16 PNG images: The first channel
contains the u-component, the second channel the v-component and the third
channel denotes if the pixel is valid or not (1 if true, 0 otherwise). To convert
the u-/v-flow into floating point values, convert the value to float,
subtract 2^15 and divide the result by 64.0:"""

Scaled_Flow = np.load('Scaled_Flow.npy') # This is a 32bit float
# This is the very first Kitti Test Flow Output from image_2 testing folder  
# passed through DVF
# The network that produced this flow is only trained to 51 steps, so it 
# won't provide an accurate correspondence
# But the Estimated Flow PNG should look green

# Kitti devkit readme says that third channel is 1 if flow is valid for that pixel
# 2 for batch size, 375 for height, 1242 for width, 1 for this extra layer of ones.
ones = np.float32(np.ones((2,375,1242,1)))
with_ones = np.concatenate((Scaled_Flow, ones), axis=3)

im = sp.misc.toimage(with_ones[-1,:,:,:], cmin=-1.0, cmax=1.0) # saves image object
im.save("Scipy_24bit.png", dtype="uint48") # Outputs 24bit only.

Flow = np.int16(with_ones) # An attempt at converting the format from float 32 to 16 bit integers
f512 = Flow * 512 # Kitti instructs that the flows are scaled by 512.

x = np.array(Scaled_Flow)
# another attempt at converting it to unsigned 16 bit integers
# (the result of astype is not assigned, so x itself is unchanged)
x.astype(np.uint16)

try: # try PyPNG
    with open('PyPNGuint48bit.png', 'wb') as f:
        writer = png.Writer(width=375, height=1242, bitdepth=16)
        # Convert z to the Python list of lists expected by
        # the png writer.
        #z2list = x.reshape(-1, x.shape[1]*x.shape[2]).tolist()
        writer.write(f, x)
except:
    print("png lib approach didn't work, it might be to do with the sizing")

try: # try imageio
    imageio.imwrite('imageio_Flow_48bit.png', x, format='PNG-FI')
except:
    print("imageio approach didn't work, it probably couldn't handle the datatype")

try: # try OpenCV
    cv2.imwrite('OpenCVFlow_48bit_.png',x )
except:
    print("OpenCV approach didn't work, it probably couldn't handle the datatype")

try: # try PIL
    im = Image.fromarray(x)
    im.save("PILLOW_Flow_48bit.png", format="PNG")
except:
    print("PILLOW approach didn't work, it probably couldn't handle the datatype")

try: # try Matplotlib
    image.imsave('MatplotLib_Flow_48bit.png', x)
except:
    print("Matplotlib approach didn't work, ValueError: object too deep for desired array")

I would like to obtain a 48-bit PNG file that matches the KITTI ground truth, i.e. one that looks green. Currently Scipy outputs a 24-bit PNG file that looks blue and white.

Best answer

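Here is my understanding of what you want to do:

1. Load the data from Scaled_Flow.npy. This is a 32-bit floating point numpy array with shape (2, 375, 1242, 2).

2. Convert Scaled_Flow[1] (an array with shape (375, 1242, 2)) to 16-bit unsigned integers: multiply by 64, add 2**15, and cast the values to np.uint16. This is the inverse of the description you quoted: "To convert the u-/v-flow into floating point values, convert the value to float, subtract 2^15 and divide the result by 64.0".

3. Extend the length of the third dimension from 2 to 3 by concatenating an array of all 1s.

4. Save the result to a PNG file.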

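Here is one way you can do that. To create the PNG file, I'll use numpngw, a library I wrote for creating PNG and animated PNG files from numpy arrays. If you give numpngw.write_png a numpy array with data type np.uint16, it will create a PNG file with 16 bits per channel (i.e. a 48-bit image in this case).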

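import numpy as np
from numpngw import write_png

# Load the float32 flow, shape (2, 375, 1242, 2).
Scaled_Flow = np.load('Scaled_Flow.npy')
# Take the last batch item and apply the inverse of the devkit decoding.
sf16 = (64*Scaled_Flow[-1] + 2**15).astype(np.uint16)
# Append a third channel of ones that marks every pixel as valid.
imgdata = np.concatenate((sf16, np.ones(sf16.shape[:2] + (1,), dtype=sf16.dtype)), axis=2)

# A uint16 array with 3 channels is written as a 48-bit PNG.
write_png('sf48.png', imgdata)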
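
The conversion and its inverse from the devkit description can also be written as a pair of functions, which makes it easy to check that a round trip recovers the original flow. This is only a sketch: the names encode_kitti_flow and decode_kitti_flow are hypothetical, and the rounding and clipping steps are additions that the script above does not perform. They are there on the assumption that you want out-of-range flows to saturate rather than wrap around in uint16.

```python
import numpy as np


def encode_kitti_flow(flow, valid=None):
    """Pack a float (H, W, 2) flow field into a (H, W, 3) uint16 array."""
    flow = np.asarray(flow, dtype=np.float64)
    # Inverse of the devkit decoding: raw = flow * 64 + 2**15.
    raw = np.rint(flow * 64.0 + 2**15)
    # Assumption: saturate instead of letting the uint16 cast wrap around.
    raw = np.clip(raw, 0, 2**16 - 1).astype(np.uint16)
    if valid is None:
        # Assumption: every pixel is valid when no mask is given.
        valid = np.ones(flow.shape[:2], dtype=bool)
    valid16 = np.asarray(valid, dtype=bool).astype(np.uint16)[..., np.newaxis]
    return np.concatenate((raw, valid16), axis=2)


def decode_kitti_flow(img16):
    """Recover (flow, valid) from a (H, W, 3) uint16 array."""
    img16 = np.asarray(img16)
    # Devkit description: convert to float, subtract 2^15, divide by 64.0.
    flow = (img16[..., :2].astype(np.float64) - 2**15) / 64.0
    valid = img16[..., 2] > 0
    return flow, valid


# Small synthetic example (hypothetical values, not real network output).
demo_flow = np.array([[[0.0, 1.5], [-2.25, 100.0]]], dtype=np.float32)
demo_img = encode_kitti_flow(demo_flow)
demo_back, demo_valid = decode_kitti_flow(demo_img)
```

The array returned by encode_kitti_flow has dtype np.uint16 and three channels, so it can be passed to write_png in the same way as imgdata.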

Here is the image created by the numpngw script above.
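
One way to confirm that an output file really has 16 bits per channel, rather than the 24-bit result described in the question, is to read the PNG header directly. The sketch below uses only the standard library and relies on the PNG layout: an 8-byte signature followed by the IHDR chunk, whose data holds the width, height, bit depth and colour type. The function names png_header_info and is_48bit_rgb are hypothetical helpers, not part of any library.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_header_info(data):
    """Return (width, height, bit_depth, color_type) from the start of a PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Bytes 8..16 hold the first chunk's length and type, which must be IHDR.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing IHDR chunk")
    # The first 10 bytes of IHDR data: width, height, bit depth, colour type.
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, bit_depth, color_type


def is_48bit_rgb(data):
    """True for truecolour (colour type 2) with 16 bits per channel."""
    _, _, bit_depth, color_type = png_header_info(data)
    return bit_depth == 16 and color_type == 2


# Usage on a file such as sf48.png (only the first 26 bytes are needed):
# with open('sf48.png', 'rb') as f:
#     print(png_header_info(f.read(26)))
```

A 24-bit RGB file reports a bit depth of 8 with colour type 2, so is_48bit_rgb returns False for it.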

Regarding python - Saving a 3D 32-bit float array as a 48-bit integer PNG in Python to match the KITTI ground truth format, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57313640/
