gpt4 book ai didi

list - 列表与向量上的 Scala 内存问题

转载 作者:行者123 更新时间:2023-12-04 09:57:40 24 4
gpt4 key购买 nike

我写了一个解决方案来解决 Scala 中的 Euler 问题 #59,我不明白为什么在 Vector 和 List 之间切换会增加我认为的内存泄漏。

这是一个使用 Vectors 的有效的蛮力解决方案。

val code = scala.io.Source.fromFile("e59.txt").getLines()
.flatMap(l => l.split(',')).map(_.toInt).toVector

val commonWords = scala.io.Source.fromFile("common_words.txt").getLines().toVector

def decode(k: Int)(code: Vector[Int])(pswd: Vector[Int]): Vector[Int] = {
code.grouped(k).flatMap(cs => cs.toVector.zip(pswd).map(t => t._1 ^ t._2)).toVector
}

def scoreText(text: Vector[Int]): Int = {
if (text.contains((c: Int) => (c < 0 || c > 128))) -1
else {
val words = text.map(_.toChar).mkString.toLowerCase.split(' ')
words.length - words.diff(commonWords).length
}
}

lazy val psswds = for {
a <- (97 to 122);
b <- (97 to 122);
c <- (97 to 122)
} yield Vector(a, b, c)

val ans = psswds.toStream.map(decode(3)(code))
.map(text => (text, scoreText(text)))
.maxBy(_._2)._1.sum

println(ans)

我将原始代码(有序整数的集合)、每个密码和一些常见的英文单词存储为 Vector s。

但是,如果我用 Vector 替换 List ,我的程序会随着每个检查的密码变慢并最终崩溃:
val code = scala.io.Source.fromFile("e59.txt").getLines()
.flatMap(l => l.split(',')).map(_.toInt).toList

val commonWords = scala.io.Source.fromFile("common_words.txt").getLines().toList

def decode(k: Int)(code: List[Int])(pswd: List[Int]): List[Int] = {
println(pswd)
code.grouped(k).flatMap(cs => cs.toList.zip(pswd).map(t => t._1 ^ t._2)).toList
}

def scoreText(text: List[Int]): Int = {
if (text.contains((c: Int) => (c < 0 || c > 128))) -1
else {
val words = text.map(_.toChar).mkString.toLowerCase.split(' ')
words.length - words.diff(commonWords).length
}
}

lazy val psswds = for {
a <- (97 to 122);
b <- (97 to 122);
c <- (97 to 122)
} yield List(a, b, c)

val ans = psswds.toStream.map(decode(3)(code))
.map(text => (text, scoreText(text)))
.maxBy(_._2)._1.sum

println(ans)

错误:
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.lang.String.valueOf(String.java:2861)
at java.lang.Character.toString(Character.java:4439)
at java.lang.String.valueOf(String.java:2847)
at scala.collection.mutable.StringBuilder.append(StringBuilder.scala:200)
at scala.collection.TraversableOnce$$anonfun$addString$1.apply(TraversableOnce.scala:349)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableOnce$class.addString(TraversableOnce.scala:342)
at scala.collection.AbstractTraversable.addString(Traversable.scala:104)
at scala.collection.TraversableOnce$class.mkString(TraversableOnce.scala:308)
at scala.collection.AbstractTraversable.mkString(Traversable.scala:104)
at scala.collection.TraversableOnce$class.mkString(TraversableOnce.scala:310)
at scala.collection.AbstractTraversable.mkString(Traversable.scala:104)
at scala.collection.TraversableOnce$class.mkString(TraversableOnce.scala:312)
at scala.collection.AbstractTraversable.mkString(Traversable.scala:104)
at Main$$anon$1.Main$$anon$$scoreText(e59_list.scala:14)
at Main$$anon$1$$anonfun$5.apply(e59_list.scala:26)
at Main$$anon$1$$anonfun$5.apply(e59_list.scala:26)
at scala.collection.immutable.Stream$$anonfun$map$1.apply(Stream.scala:418)
at scala.collection.immutable.Stream$$anonfun$map$1.apply(Stream.scala:418)
at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1222)
at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1212)
at scala.collection.immutable.Stream.foreach(Stream.scala:595)
at scala.collection.TraversableOnce$class.maxBy(TraversableOnce.scala:227)
at scala.collection.AbstractTraversable.maxBy(Traversable.scala:104)
at Main$$anon$1.<init>(e59_list.scala:27)
at Main$.main(e59_list.scala:1)
at Main.main(e59_list.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at scala.reflect.internal.util.ScalaClassLoader$$anonfun$run$1.apply(ScalaClassLoader.scala:70)

使用的文件:
common_words.txt
a
able
about
across
after
all
almost
also
am
among
an
and
any
are
as
at
be
because
been
but
by
can
cannot
could
dear
did
do
does
either
else
ever
every
for
from
get
got
had
has
have
he
her
hers
him
his
how
however
i
if
in
into
is
it
its
just
least
let
like
likely
may
me
might
most
must
my
neither
no
nor
not
of
off
often
on
only
or
other
our
own
rather
said
say
says
she
should
since
so
some
than
that
the
their
them
then
there
these
they
this
tis
to
too
twas
us
wants
was
we
were
what
when
where
which
while
who
whom
why
will
with
would
yet
you
your

e59.txt
79,59,12,2,79,35,8,28,20,2,3,68,8,9,68,45,0,12,9,67,68,4,7,5,23,27,1,21,79,85,78,79,85,71,38,10,71,27,12,2,79,6,2,8,13,9,1,13,9,8,68,19,7,1,71,56,11,21,11,68,6,3,22,2,14,0,30,79,1,31,6,23,19,10,0,73,79,44,2,79,19,6,28,68,16,6,16,15,79,35,8,11,72,71,14,10,3,79,12,2,79,19,6,28,68,32,0,0,73,79,86,71,39,1,71,24,5,20,79,13,9,79,16,15,10,68,5,10,3,14,1,10,14,1,3,71,24,13,19,7,68,32,0,0,73,79,87,71,39,1,71,12,22,2,14,16,2,11,68,2,25,1,21,22,16,15,6,10,0,79,16,15,10,22,2,79,13,20,65,68,41,0,16,15,6,10,0,79,1,31,6,23,19,28,68,19,7,5,19,79,12,2,79,0,14,11,10,64,27,68,10,14,15,2,65,68,83,79,40,14,9,1,71,6,16,20,10,8,1,79,19,6,28,68,14,1,68,15,6,9,75,79,5,9,11,68,19,7,13,20,79,8,14,9,1,71,8,13,17,10,23,71,3,13,0,7,16,71,27,11,71,10,18,2,29,29,8,1,1,73,79,81,71,59,12,2,79,8,14,8,12,19,79,23,15,6,10,2,28,68,19,7,22,8,26,3,15,79,16,15,10,68,3,14,22,12,1,1,20,28,72,71,14,10,3,79,16,15,10,68,3,14,22,12,1,1,20,28,68,4,14,10,71,1,1,17,10,22,71,10,28,19,6,10,0,26,13,20,7,68,14,27,74,71,89,68,32,0,0,71,28,1,9,27,68,45,0,12,9,79,16,15,10,68,37,14,20,19,6,23,19,79,83,71,27,11,71,27,1,11,3,68,2,25,1,21,22,11,9,10,68,6,13,11,18,27,68,19,7,1,71,3,13,0,7,16,71,28,11,71,27,12,6,27,68,2,25,1,21,22,11,9,10,68,10,6,3,15,27,68,5,10,8,14,10,18,2,79,6,2,12,5,18,28,1,71,0,2,71,7,13,20,79,16,2,28,16,14,2,11,9,22,74,71,87,68,45,0,12,9,79,12,14,2,23,2,3,2,71,24,5,20,79,10,8,27,68,19,7,1,71,3,13,0,7,16,92,79,12,2,79,19,6,28,68,8,1,8,30,79,5,71,24,13,19,1,1,20,28,68,19,0,68,19,7,1,71,3,13,0,7,16,73,79,93,71,59,12,2,79,11,9,10,68,16,7,11,71,6,23,71,27,12,2,79,16,21,26,1,71,3,13,0,7,16,75,79,19,15,0,68,0,6,18,2,28,68,11,6,3,15,27,68,19,0,68,2,25,1,21,22,11,9,10,72,71,24,5,20,79,3,8,6,10,0,79,16,8,79,7,8,2,1,71,6,10,19,0,68,19,7,1,71,24,11,21,3,0,73,79,85,87,79,38,18,27,68,6,3,16,15,0,17,0,7,68,19,7,1,71,24,11,21,3,0,71,24,5,20,79,9,6,11,1,71,27,12,21,0,17,0,7,68,15,6,9,75,79,16,15,10,68,16,0,22,11,11,68,3,6,0,9,72,16,71,29,1,4,0,3,9,6,30,2,79,12,14,2,68,16,7,1,9,79,12,2,79,7,6,2,1,73,79,85,86,79,33,17,10,10,71,6,10,71,7,13,20,79,11,16,1,68,11,14,10,3,79,5,9,11,68,6,2,11,9,8,68,15,6,23,71,0,19,9,79,20,2,0,20,11,10,72,71,7,1,71,24,5,20,79,10,8,27,68,6,12,7,2,31,16,2,11,74,71,94,86,71,45,17,19,79,16,8,79,5,11,3,68,16,7,11,71,13,1,11,6,1,17,10,0,71,7,13,10,79,5,9,11,68,6,12,7,2,31,16,2,11,68,15,6,9,75,79,12,2,79,3,6,25,1,71,27,12,2,79,22,14,8,12,19,79,16,8,79,6,2,12,11,10,10,68,4,7,13,11,11,22,2,1,68,8,9,68,32,0,0,73,79,85,84,79,48,15,10,29,71,14,22,2,79,22,2,13,11,21,1,69,71,59,12,14,28,68,14,28,68,9,0,16,71,14,68,23,7,29,20,6,7,6,3,68,5,6,22,19,7,68,21,10,23,18,3,16,14,1,3,71,9,22,8,2,68,15,26,9,6,1,68,23,14,23,20,6,11,9,79,11,21,79,20,11,14,10,75,79,16,15,6,23,71,29,1,5,6,22,19,7,68,4,0,9,2,28,68,1,29,11,10,79,35,8,11,74,86,91,68,52,0,68,19,7,1,71,56,11,21,11,68,5,10,7,6,2,1,71,7,17,10,14,10,71,14,10,3,79,8,14,25,1,3,79,12,2,29,1,71,0,10,71,10,5,21,27,12,71,14,9,8,1,3,71,26,23,73,79,44,2,79,19,6,28,68,1,26,8,11,79,11,1,79,17,9,9,5,14,3,13,9,8,68,11,0,18,2,79,5,9,11,68,1,14,13,19,7,2,18,3,10,2,28,23,73,79,37,9,11,68,16,10,68,15,14,18,2,79,23,2,10,10,71,7,13,20,79,3,11,0,22,30,67,68,19,7,1,71,8,8,8,29,29,71,0,2,71,27,12,2,79,11,9,3,29,71,60,11,9,79,11,1,79,16,15,10,68,33,14,16,15,10,22,73

最佳答案

与相同的 Lists 相比,大量的 Vectors 会在 GC 上产生更多的负载。但是您的问题不是关于集合的正确选择,而是关于 Stream 的错误使用。

如果使用不当,Scala 的流可能是 very memory inefficient。在你的情况下,我假设你试图使用 Stream 来避免对转换后的 passwds 集合的急切计算,但你实际上让事情变得更糟(因为 Stream 不仅记住了你的元素,它还使用这些元素的 Stream 包装器创造了额外的开销) .

您所要做的只是将 toStream 替换为 view 。它将创建集合包装器,这使得几乎所有的转换都变得懒惰(基本上是你试图实现的)。

val ans = psswds.view.map(decode(3)(code))
.map(text => (text, scoreText(text)))
.maxBy(_._2)._1.sum

在这个微小的修复之后,即使使用 -Xmx5m(我检查过),您的程序也能正常运行。

在你的程序中还有很多其他的事情需要优化(尽量避免创建过多的集合),但我会把它留给你。

关于list - 列表与向量上的 Scala 内存问题,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/32544869/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com