
rust - How can I store a Chars iterator in the same struct as the String it is iterating on?

Reposted. Author: 行者123. Updated: 2023-11-29 07:54:22

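I am just beginning to learn Rust and I am struggling with lifetimes.

I'd like to have a struct containing a String that buffers lines from stdin. Then I'd like a method on the struct that returns the next character from the buffer or, if all of the characters in the line have been consumed, reads the next line from stdin.

The documentation says that Rust strings can't be indexed by character because doing so is inefficient with UTF-8. Since I'm accessing the characters sequentially, using an iterator should be fine. However, as far as I understand, iterators in Rust are tied to the lifetime of the thing they iterate over, and I can't work out how to store this iterator in the struct alongside the String.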
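To make the UTF-8 point concrete, here is a small sketch (the helper names `char_layout` and `nth_char` are made up for illustration). It shows that characters occupy a varying number of bytes, so finding the n-th character means walking the string from the start rather than jumping to an index.

```rust
/// Lists every character of `s` with its byte offset and its width in bytes.
/// (Hypothetical helper, only for illustration.)
fn char_layout(s: &str) -> Vec<(usize, char, usize)> {
    s.char_indices()
        .map(|(offset, c)| (offset, c, c.len_utf8()))
        .collect()
}

/// Returns the character at character position `n`.
/// This is O(n): the iterator has to decode every earlier character first.
fn nth_char(s: &str, n: usize) -> Option<char> {
    s.chars().nth(n)
}
```

For a string such as "a" followed by U+00E9 and U+20AC, the widths are 1, 2 and 3 bytes, which is why sequential iteration is the natural access pattern.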

Here is the pseudo-Rust I would like to achieve. Obviously it doesn't compile.

struct CharGetter {
    /* Buffer containing one line of input at a time */
    input_buf: String,
    /* The position within input_buf of the next character to
     * return. This needs a lifetime parameter. */
    input_pos: std::str::Chars
}

impl CharGetter {
    fn next(&mut self) -> Result<char, io::Error> {
        loop {
            match self.input_pos.next() {
                /* If there is still a character left in the input
                 * buffer then we can just return it immediately. */
                Some(n) => return Ok(n),
                /* Otherwise get the next line */
                None => {
                    io::stdin().read_line(&mut self.input_buf)?;
                    /* Reset the iterator to the beginning of the
                     * line. Obviously this doesn’t work because it’s
                     * not obeying the lifetime of input_buf */
                    self.input_pos = self.input_buf.chars();
                }
            }
        }
    }
}

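I am trying to do the Synacor challenge. This involves implementing a virtual machine in which one of the opcodes reads a character from stdin and stores it in a register. I have this part working fine. The documentation states that whenever the program inside the VM reads a character, it keeps reading until it has read a whole line. I want to take advantage of this to add a "save" command to my implementation. That means that whenever the program asks for a character, I read a line from the input. If the line is "save", I save the state of the VM and then go on to get another line to feed to the VM. Each time the VM executes the input opcode, I need to be able to give it one character at a time from the buffered line until the buffer is depleted.

My current implementation is here. My plan is to add input_buf and input_pos to the Machine struct that represents the state of the VM.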

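Best answer

As thoroughly described in Why can't I store a value and a reference to that value in the same struct?, in general you can't do this because it truly is unsafe. When you move memory, you invalidate references to it. This is why a lot of people use Rust: to not have invalid references that lead to program crashes!

Let's look at your code: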

io::stdin().read_line(&mut self.input_buf)?;
self.input_pos = self.input_buf.chars();

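Between these two lines, you've left self.input_pos in a bad state: read_line may have reallocated input_buf, while input_pos still refers to the old buffer. If a panic occurs, the destructor of the object has the opportunity to access invalid memory! Rust is protecting you from an issue that most people never think about.

As also described in that answer: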

There is a special case where the lifetime tracking is overzealous: when you have something placed on the heap. This occurs when you use a Box<T>, for example. In this case, the structure that is moved contains a pointer into the heap. The pointed-at value will remain stable, but the address of the pointer itself will move. In practice, this doesn't matter, as you always follow the pointer.

Some crates provide ways of representing this case, but they require that the base address never move. This rules out mutating vectors, which may cause a reallocation and a move of the heap-allocated values.
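The distinction drawn in the quote can be observed directly. The following is only an illustrative sketch (the `Holder` struct and both function names are hypothetical): moving a struct that owns a String leaves the heap buffer where it is, whereas growing the String is allowed to reallocate that buffer.

```rust
/// Hypothetical wrapper that owns a String.
struct Holder {
    text: String,
}

/// Returns the heap address of the string data before and after the
/// owning struct is moved. Only the (pointer, length, capacity) triple
/// moves; the heap buffer itself stays put.
fn heap_pointer_across_move() -> (*const u8, *const u8) {
    let holder = Holder { text: String::from("hello") };
    let before = holder.text.as_ptr();
    let boxed = Box::new(holder); // moves `holder` to a new location
    let after = boxed.text.as_ptr();
    (before, after)
}

/// Returns the heap address before and after growing the string.
/// The two MAY differ, because growing can reallocate the buffer.
fn heap_pointer_across_growth() -> (*const u8, *const u8) {
    let mut s = String::from("hello");
    let before = s.as_ptr();
    s.push_str(&"x".repeat(4096));
    (before, s.as_ptr())
}
```

The first pair is always equal. The second pair carries no such guarantee, which is exactly why a buffer that is mutated cannot safely be paired with an iterator into it.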

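Remember that a String is just a vector of bytes with extra preconditions added.

Instead of using one of those crates, we can also roll our own solution, which means we (read: you) get to accept all the responsibility for ensuring that we aren't doing anything wrong.

The trick here is to ensure that the data inside the String never moves and that no accidental references are taken.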

use std::{mem, str::Chars};

/// I believe this struct to be safe because the String is
/// heap-allocated (stable address) and will never be modified
/// (stable address). `chars` will not outlive the struct, so
/// lying about the lifetime should be fine.
///
/// TODO: What about during destruction?
///       `Chars` shouldn't have a destructor...
struct OwningChars {
    _s: String,
    chars: Chars<'static>,
}

impl OwningChars {
    fn new(s: String) -> Self {
        let chars = unsafe { mem::transmute(s.chars()) };
        OwningChars { _s: s, chars }
    }
}

impl Iterator for OwningChars {
    type Item = char;
    fn next(&mut self) -> Option<Self::Item> {
        self.chars.next()
    }
}

You might even think about putting just this code into a module so that you can't accidentally muck about with the innards.
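As a rough sketch of that suggestion (the module name `owning_chars` and the helper `collect_chars` are assumptions, not part of the answer), the struct can be wrapped in a module with private fields, so that code outside the module can neither replace the String nor reach the iterator with its false `'static` lifetime:

```rust
mod owning_chars {
    use std::{mem, str::Chars};

    /// Fields are private: outside this module nobody can mutate `_s`
    /// or copy `chars` out with its made-up `'static` lifetime.
    pub struct OwningChars {
        _s: String,
        chars: Chars<'static>,
    }

    impl OwningChars {
        pub fn new(s: String) -> Self {
            // SAFETY (following the answer's reasoning, not a formal proof):
            // the string data lives on the heap, is never modified after
            // this point, and `chars` never escapes the struct.
            let chars = unsafe { mem::transmute::<Chars<'_>, Chars<'static>>(s.chars()) };
            OwningChars { _s: s, chars }
        }
    }

    impl Iterator for OwningChars {
        type Item = char;
        fn next(&mut self) -> Option<Self::Item> {
            self.chars.next()
        }
    }
}

use owning_chars::OwningChars;

/// Hypothetical usage: round-trips a line through the owning iterator.
fn collect_chars(line: &str) -> String {
    OwningChars::new(line.to_owned()).collect()
}
```

The explicit type parameters on `transmute` are a stylistic choice here; they make it visible that only the lifetime is being changed.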


Here's the same code using the ouroboros crate to create a self-referential struct containing the String and a Chars iterator:

use ouroboros::self_referencing; // 0.4.1
use std::str::Chars;

#[self_referencing]
pub struct IntoChars {
    string: String,
    #[borrows(string)]
    chars: Chars<'this>,
}

// All these implementations are based on what `Chars` implements itself

impl Iterator for IntoChars {
    type Item = char;

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        self.with_mut(|me| me.chars.next())
    }

    #[inline]
    fn count(mut self) -> usize {
        self.with_mut(|me| me.chars.count())
    }

    #[inline]
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.with(|me| me.chars.size_hint())
    }

    #[inline]
    fn last(mut self) -> Option<Self::Item> {
        self.with_mut(|me| me.chars.last())
    }
}

impl DoubleEndedIterator for IntoChars {
    #[inline]
    fn next_back(&mut self) -> Option<Self::Item> {
        self.with_mut(|me| me.chars.next_back())
    }
}

impl std::iter::FusedIterator for IntoChars {}

// And an extension trait for convenience

trait IntoCharsExt {
    fn into_chars(self) -> IntoChars;
}

impl IntoCharsExt for String {
    fn into_chars(self) -> IntoChars {
        IntoCharsBuilder {
            string: self,
            chars_builder: |s| s.chars(),
        }
        .build()
    }
}

Here's the same code using the rental crate to create a self-referential struct containing the String and a Chars iterator:

#[macro_use]
extern crate rental; // 0.5.5

rental! {
    mod into_chars {
        pub use std::str::Chars;

        #[rental]
        pub struct IntoChars {
            string: String,
            chars: Chars<'string>,
        }
    }
}

use into_chars::IntoChars;

// All these implementations are based on what `Chars` implements itself

impl Iterator for IntoChars {
    type Item = char;

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        self.rent_mut(|chars| chars.next())
    }

    #[inline]
    fn count(mut self) -> usize {
        self.rent_mut(|chars| chars.count())
    }

    #[inline]
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.rent(|chars| chars.size_hint())
    }

    #[inline]
    fn last(mut self) -> Option<Self::Item> {
        self.rent_mut(|chars| chars.last())
    }
}

impl DoubleEndedIterator for IntoChars {
    #[inline]
    fn next_back(&mut self) -> Option<Self::Item> {
        self.rent_mut(|chars| chars.next_back())
    }
}

impl std::iter::FusedIterator for IntoChars {}

// And an extension trait for convenience

trait IntoCharsExt {
    fn into_chars(self) -> IntoChars;
}

impl IntoCharsExt for String {
    fn into_chars(self) -> IntoChars {
        IntoChars::new(self, |s| s.chars())
    }
}
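
For the use case in the question, it may also be possible to avoid the self-reference altogether. The sketch below is not part of the answer above and swaps in a different technique: instead of storing a Chars iterator, it stores a byte offset into the buffer and re-creates a short-lived iterator on each call. The generic reader parameter `R`, the constructor, and the method name `next_char` are assumptions made so that the example can be exercised without a terminal; passing `io::stdin().lock()` would correspond to the question's setup.

```rust
use std::io::{self, BufRead};

/// Hands out characters one at a time from line-buffered input.
/// (Hypothetical variant of the question's `CharGetter`.)
struct CharGetter<R> {
    reader: R,
    /// Buffer containing one line of input at a time.
    input_buf: String,
    /// Byte offset within `input_buf` of the next character to return.
    /// Always lies on a character boundary.
    input_pos: usize,
}

impl<R: BufRead> CharGetter<R> {
    fn new(reader: R) -> Self {
        CharGetter { reader, input_buf: String::new(), input_pos: 0 }
    }

    /// Returns the next character, reading a new line when the buffer
    /// is exhausted. `Ok(None)` signals end of input.
    fn next_char(&mut self) -> io::Result<Option<char>> {
        loop {
            // A temporary iterator over the unread tail; it borrows the
            // buffer only for the duration of this statement.
            if let Some(c) = self.input_buf[self.input_pos..].chars().next() {
                self.input_pos += c.len_utf8();
                return Ok(Some(c));
            }
            // `read_line` appends, so the old line must be cleared first.
            self.input_buf.clear();
            self.input_pos = 0;
            if self.reader.read_line(&mut self.input_buf)? == 0 {
                return Ok(None);
            }
        }
    }
}
```

Because the offset is a plain `usize`, the struct needs no lifetime parameter and no unsafe code, at the cost of one boundary-checked slice per character.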

Regarding "rust - How can I store a Chars iterator in the same struct as the String it is iterating on?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43952104/
