- html - 出于某种原因,IE8 对我的 Sass 文件中继承的 html5 CSS 不友好?
- JMeter 在响应断言中使用 span 标签的问题
- html - 在 :hover and :active? 上具有不同效果的 CSS 动画
- html - 相对于居中的 html 内容固定的 CSS 重复背景?
我正在尝试用 Java 创建一个快速的素数生成器。人们(或多或少)接受了最快的方法是 Eratosthenes 的分段筛:https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes .可以进一步实现许多优化以使其更快。 截至目前,我的实现生成 50847534
下面的素数10^9
在关于 1.6
秒 ,但我希望让它更快,至少打破 1
第二个障碍。为了增加获得好的回复的机会,我将包括算法和代码的演练。
尽管如此,作为 TL;DR
,我希望在代码中包含多线程
出于这个问题的目的,我想将 Eratosthenes 的“分段”筛和“传统”筛区分开来。传统筛子需要O(n)
空间,因此输入范围非常有限(它的限制)。然而,分段筛只需要 O(n^0.5)
空间,并且可以在更大的范围内运行。 (一个主要的加速是使用缓存友好的分段,考虑到特定计算机的 L1 & L2
缓存大小)。最后,与我的问题有关的主要区别是传统的筛选是顺序的,这意味着它只能在前面的步骤完成后继续。然而,分段筛不是。每个段都是独立的,并根据筛选素数(不大于 n^0.5
的素数)单独“处理”。这意味着理论上,一旦我有了筛选素数,我就可以在多台计算机之间分配工作,每台计算机处理不同的部分。彼此的工作相互独立。假设(错误地)每个段需要相同的时间 t
完成,还有k
段,一台计算机需要的总时间为 T = k * t
, 而 k
计算机,每台计算机在不同的段上工作将需要总时间 T = t
来完成整个过程。 (实际上,这是错误的,但为了示例的简单起见)。
这让我开始阅读有关多线程的内容 - 将工作划分为几个线程,每个线程处理少量工作以更好地使用 CPU。据我所知,传统的筛子不能完全是多线程的,因为它是顺序的。每个线程都依赖于前一个线程,从而使整个想法变得不可行。但是分段筛可能确实(我认为)是多线程的。
与其直接进入我的问题,我认为首先介绍我的代码很重要,所以我在此包括我目前最快的分段筛实现。我为此付出了很大的努力。花了相当长的时间,慢慢地调整和添加优化。代码并不简单。我会说,这相当复杂。因此,我假设读者熟悉我介绍的概念,例如轮分解、素数、分段等。我已经包含了注释,以便更容易理解。
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
public class primeGen {
public static long x = (long)Math.pow(10, 9); //limit
public static int sqrtx;
public static boolean [] sievingPrimes; //the sieving primes, <= sqrtx
public static int [] wheels = new int [] {2,3,5,7,11,13,17,19}; // base wheel primes
public static int [] gaps; //the gaps, according to the wheel. will enable skipping multiples of the wheel primes
public static int nextp; // the first prime > wheel primes
public static int l; // the amount of gaps in the wheel
public static void main(String[] args)
{
long startTime = System.currentTimeMillis();
preCalc(); // creating the sieving primes and calculating the list of gaps
int segSize = Math.max(sqrtx, 32768*8); //size of each segment
long u = nextp; // 'u' is the running index of the program. will continue from one segment to the next
int wh = 0; // the will be the gap index, indicating by how much we increment 'u' each time, skipping the multiples of the wheel primes
long pi = pisqrtx(); // the primes count. initialize with the number of primes <= sqrtx
for (long low = 0 ; low < x ; low += segSize) //the heart of the code. enumerating the primes through segmentation. enumeration will begin at p > sqrtx
{
long high = Math.min(x, low + segSize);
boolean [] segment = new boolean [(int) (high - low + 1)];
int g = -1;
for (int i = nextp ; i <= sqrtx ; i += gaps[g])
{
if (sievingPrimes[(i + 1) / 2])
{
long firstMultiple = (long) (low / i * i);
if (firstMultiple < low)
firstMultiple += i;
if (firstMultiple % 2 == 0) //start with the first odd multiple of the current prime in the segment
firstMultiple += i;
for (long j = firstMultiple ; j < high ; j += i * 2)
segment[(int) (j - low)] = true;
}
g++;
//if (g == l) //due to segment size, the full list of gaps is never used **within just one segment** , and therefore this check is redundant.
//should be used with bigger segment sizes or smaller lists of gaps
//g = 0;
}
while (u <= high)
{
if (!segment[(int) (u - low)])
pi++;
u += gaps[wh];
wh++;
if (wh == l)
wh = 0;
}
}
System.out.println(pi);
long endTime = System.currentTimeMillis();
System.out.println("Solution took "+(endTime - startTime) + " ms");
}
public static boolean [] simpleSieve (int l)
{
long sqrtl = (long)Math.sqrt(l);
boolean [] primes = new boolean [l/2+2];
Arrays.fill(primes, true);
int g = -1;
for (int i = nextp ; i <= sqrtl ; i += gaps[g])
{
if (primes[(i + 1) / 2])
for (int j = i * i ; j <= l ; j += i * 2)
primes[(j + 1) / 2]=false;
g++;
if (g == l)
g=0;
}
return primes;
}
public static long pisqrtx ()
{
int pi = wheels.length;
if (x < wheels[wheels.length-1])
{
if (x < 2)
return 0;
int k = 0;
while (wheels[k] <= x)
k++;
return k;
}
int g = -1;
for (int i = nextp ; i <= sqrtx ; i += gaps[g])
{
if(sievingPrimes[( i + 1 ) / 2])
pi++;
g++;
if (g == l)
g=0;
}
return pi;
}
public static void preCalc ()
{
sqrtx = (int) Math.sqrt(x);
int prod = 1;
for (long p : wheels)
prod *= p; // primorial
nextp = BigInteger.valueOf(wheels[wheels.length-1]).nextProbablePrime().intValue(); //the first prime that comes after the wheel
int lim = prod + nextp; // circumference of the wheel
boolean [] marks = new boolean [lim + 1];
Arrays.fill(marks, true);
for (int j = 2 * 2 ;j <= lim ; j += 2)
marks[j] = false;
for (int i = 1 ; i < wheels.length ; i++)
{
int p = wheels[i];
for (int j = p * p ; j <= lim ; j += 2 * p)
marks[j]=false; // removing all integers that are NOT comprime with the base wheel primes
}
ArrayList <Integer> gs = new ArrayList <Integer>(); //list of the gaps between the integers that are coprime with the base wheel primes
int d = nextp;
for (int p = d + 2 ; p < marks.length ; p += 2)
{
if (marks[p]) //d is prime. if p is also prime, then a gap is identified, and is noted.
{
gs.add(p - d);
d = p;
}
}
gaps = new int [gs.size()];
for (int i = 0 ; i < gs.size() ; i++)
gaps[i] = gs.get(i); // Arrays are faster than lists, so moving the list of gaps to an array
l = gaps.length;
sievingPrimes = simpleSieve(sqrtx); //initializing the sieving primes
}
}
50847534
下面的素数10^9
在关于 1.6
秒。 这非常令人印象深刻,至少以我的标准来看,但我希望让它更快,可能会打破
1
第二个障碍。即便如此,我相信它仍然可以做得更快。
19
.
public static int [] wheels = new int [] {2,3,5,7,11,13,17,19}; // base wheel primes
preCalc
中计算我们需要采取的数字之间的差距。方法。如果我们在搜索范围内的数字之间进行这些跳跃,我们将跳过基素数的倍数。
public static void preCalc ()
{
sqrtx = (int) Math.sqrt(x);
int prod = 1;
for (long p : wheels)
prod *= p; // primorial
nextp = BigInteger.valueOf(wheels[wheels.length-1]).nextProbablePrime().intValue(); //the first prime that comes after the wheel
int lim = prod + nextp; // circumference of the wheel
boolean [] marks = new boolean [lim + 1];
Arrays.fill(marks, true);
for (int j = 2 * 2 ;j <= lim ; j += 2)
marks[j] = false;
for (int i = 1 ; i < wheels.length ; i++)
{
int p = wheels[i];
for (int j = p * p ; j <= lim ; j += 2 * p)
marks[j]=false; // removing all integers that are NOT comprime with the base wheel primes
}
ArrayList <Integer> gs = new ArrayList <Integer>(); //list of the gaps between the integers that are coprime with the base wheel primes
int d = nextp;
for (int p = d + 2 ; p < marks.length ; p += 2)
{
if (marks[p]) //d is prime. if p is also prime, then a gap is identified, and is noted.
{
gs.add(p - d);
d = p;
}
}
gaps = new int [gs.size()];
for (int i = 0 ; i < gs.size() ; i++)
gaps[i] = gs.get(i); // Arrays are faster than lists, so moving the list of gaps to an array
l = gaps.length;
sievingPrimes = simpleSieve(sqrtx); //initializing the sieving primes
}
preCalc
方法,
simpleSieve
方法被调用,有效地筛选出前面提到的所有筛选素数,素数
<= sqrtx
.这是一个简单的 Eratosthenes 筛子,而不是分段的,但它仍然基于
车轮分解 , 以前计算过。
public static boolean [] simpleSieve (int l)
{
long sqrtl = (long)Math.sqrt(l);
boolean [] primes = new boolean [l/2+2];
Arrays.fill(primes, true);
int g = -1;
for (int i = nextp ; i <= sqrtl ; i += gaps[g])
{
if (primes[(i + 1) / 2])
for (int j = i * i ; j <= l ; j += i * 2)
primes[(j + 1) / 2]=false;
g++;
if (g == l)
g=0;
}
return primes;
}
<= sqrtx
,具有以下调用:
long pi = pisqrtx();`
public static long pisqrtx ()
{
int pi = wheels.length;
if (x < wheels[wheels.length-1])
{
if (x < 2)
return 0;
int k = 0;
while (wheels[k] <= x)
k++;
return k;
}
int g = -1;
for (int i = nextp ; i <= sqrtx ; i += gaps[g])
{
if(sievingPrimes[( i + 1 ) / 2])
pi++;
g++;
if (g == l)
g=0;
}
return pi;
}
pi
之后跟踪素数枚举的变量,我们执行上述分段,从第一个素数
> sqrtx
开始枚举。 :
int segSize = Math.max(sqrtx, 32768*8); //size of each segment
long u = nextp; // 'u' is the running index of the program. will continue from one segment to the next
int wh = 0; // the will be the gap index, indicating by how much we increment 'u' each time, skipping the multiples of the wheel primes
long pi = pisqrtx(); // the primes count. initialize with the number of primes <= sqrtx
for (long low = 0 ; low < x ; low += segSize) //the heart of the code. enumerating the primes through segmentation. enumeration will begin at p > sqrtx
{
long high = Math.min(x, low + segSize);
boolean [] segment = new boolean [(int) (high - low + 1)];
int g = -1;
for (int i = nextp ; i <= sqrtx ; i += gaps[g])
{
if (sievingPrimes[(i + 1) / 2])
{
long firstMultiple = (long) (low / i * i);
if (firstMultiple < low)
firstMultiple += i;
if (firstMultiple % 2 == 0) //start with the first odd multiple of the current prime in the segment
firstMultiple += i;
for (long j = firstMultiple ; j < high ; j += i * 2)
segment[(int) (j - low)] = true;
}
g++;
//if (g == l) //due to segment size, the full list of gaps is never used **within just one segment** , and therefore this check is redundant.
//should be used with bigger segment sizes or smaller lists of gaps
//g = 0;
}
while (u <= high)
{
if (!segment[(int) (u - low)])
pi++;
u += gaps[wh];
wh++;
if (wh == l)
wh = 0;
}
}
19-wheel
)。但是在更广泛的程序概述中,我们将使用整个间隙数组,因此变量
u
必须遵循它,而不是意外超越它:
while (u <= high)
{
if (!segment[(int) (u - low)])
pi++;
u += gaps[wh];
wh++;
if (wh == l)
wh = 0;
}
wheel
素数基数可能对程序产生这种影响。不过,切换到位筛选可以在很大程度上改善段限制。
L1 & L2
缓存大小考虑在内。我得到32,768 * 8 = 262,144 = 2^18
段大小的最快结果.我不确定我的计算机的缓存大小是多少,但我知道<= 32,768
.4
,使用4个线程(对应4个portions
.划分n
进入 4
相等的部分 - 线程,每个线程依次执行n/4
的分段它负责的元素,使用最佳答案
安 example like this应该可以帮助您入门。
解决方案概述:
synchronized
下的可变共享输出结构,可能(也可能不会)实现额外的加速。 ,取决于加入步骤涉及多少工作。
关于java - Java中Eratosthenes的多线程分段筛分,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/57203706/
我正在尝试编写一个函数,使用 "Sieve of Sundaram" algorithm 从 1..n 计算所有奇数素数. 这是我的尝试: sSund :: Integer -> [Integer]
我是 Haskell 的新手,对于我正在实现的事情,我需要一个素数列表。我试着写一个,但它太慢了。 这是我尝试过的。 primeList = primes 1000 primes :: Int ->
我是一名优秀的程序员,十分优秀!