parallel-processing - PETSc solving linear system with ksp guide


Reposted by: 行者123. Last updated: 2023-12-04 08:05:26

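I have started using the PETSc library to solve linear systems of equations in parallel. I have installed all the packages, and I have built and successfully run the examples in the petsc/src/ksp/ksp/examples/tutorials/ folder, for example ex.c

However, I do not understand how to fill the matrix A and the vectors X and B by reading them from a file.

Here is the code from the ex2.c file: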

/* Program usage:  mpiexec -n <procs> ex2 [-help] [all PETSc options] */ 

static char help[] = "Solves a linear system in parallel with KSP.\n\
Input parameters include:\n\
  -random_exact_sol : use a random exact solution vector\n\
  -view_exact_sol   : write exact solution vector to stdout\n\
  -m <mesh_x>       : number of mesh points in x-direction\n\
  -n <mesh_n>       : number of mesh points in y-direction\n\n";

/*T
   Concepts: KSP^basic parallel example;
   Concepts: KSP^Laplacian, 2d
   Concepts: Laplacian, 2d
   Processors: n
T*/

/* 
  Include "petscksp.h" so that we can use KSP solvers.  Note that this file
  automatically includes:
     petscsys.h       - base PETSc routines   petscvec.h - vectors
     petscmat.h - matrices
     petscis.h     - index sets            petscksp.h - Krylov subspace methods
     petscviewer.h - viewers               petscpc.h  - preconditioners
*/
#include <petscksp.h>   /* add the PETSc include directory (e.g. C:\PETSC\include) to the compiler's include path */

#undef __FUNCT__
#define __FUNCT__ "main"
int main(int argc,char **args)
{
  Vec            x,b,u;  /* approx solution, RHS, exact solution */
  Mat            A;        /* linear system matrix */
  KSP            ksp;     /* linear solver context */
  PetscRandom    rctx;     /* random number generator context */
  PetscReal      norm;     /* norm of solution error */
  PetscInt       i,j,Ii,J,Istart,Iend,m = 8,n = 7,its;
  PetscErrorCode ierr;
  PetscBool      flg = PETSC_FALSE;
  PetscScalar    v;
#if defined(PETSC_USE_LOG)
  PetscLogStage  stage;
#endif

  PetscInitialize(&argc,&args,(char *)0,help);
  ierr = PetscOptionsGetInt(PETSC_NULL,"-m",&m,PETSC_NULL);CHKERRQ(ierr);
  ierr = PetscOptionsGetInt(PETSC_NULL,"-n",&n,PETSC_NULL);CHKERRQ(ierr);
  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
         Compute the matrix and right-hand-side vector that define
         the linear system, Ax = b.
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
  /* 
     Create parallel matrix, specifying only its global dimensions.
     When using MatCreate(), the matrix format can be specified at
     runtime. Also, the parallel partitioning of the matrix is
     determined by PETSc at runtime.

     Performance tuning note:  For problems of substantial size,
     preallocation of matrix memory is crucial for attaining good 
     performance. See the matrix chapter of the users manual for details.
  */
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,m*n,m*n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(A,5,PETSC_NULL,5,PETSC_NULL);CHKERRQ(ierr);
  ierr = MatSeqAIJSetPreallocation(A,5,PETSC_NULL);CHKERRQ(ierr);

  /* 
     Currently, all PETSc parallel matrix formats are partitioned by
     contiguous chunks of rows across the processors.  Determine which
     rows of the matrix are locally owned. 
  */
  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);

  /* 
     Set matrix elements for the 2-D, five-point stencil in parallel.
      - Each processor needs to insert only elements that it owns
        locally (but any non-local elements will be sent to the
        appropriate processor during matrix assembly). 
      - Always specify global rows and columns of matrix entries.

     Note: this uses the less common natural ordering that orders first
     all the unknowns for x = h then for x = 2h etc; Hence you see J = Ii +- n
     instead of J = I +- m as you might expect. The more standard ordering
     would first do all variables for y = h, then y = 2h etc.

   */
  ierr = PetscLogStageRegister("Assembly", &stage);CHKERRQ(ierr);
  ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
  for (Ii=Istart; Ii<Iend; Ii++) { 
    v = -1.0; i = Ii/n; j = Ii - i*n;  
    if (i>0)   {J = Ii - n; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
    if (i<m-1) {J = Ii + n; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
    if (j>0)   {J = Ii - 1; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
    if (j<n-1) {J = Ii + 1; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
    v = 4.0; ierr = MatSetValues(A,1,&Ii,1,&Ii,&v,INSERT_VALUES);CHKERRQ(ierr);
  }

  /* 
     Assemble matrix, using the 2-step process:
       MatAssemblyBegin(), MatAssemblyEnd()
     Computations can be done while messages are in transition
     by placing code between these two statements.
  */
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  /* A is symmetric. Set symmetric flag to enable ICC/Cholesky preconditioner */
  ierr = MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE);CHKERRQ(ierr);

  /* 
     Create parallel vectors.
      - We form 1 vector from scratch and then duplicate as needed.
      - When using VecCreate(), VecSetSizes and VecSetFromOptions()
        in this example, we specify only the
        vector's global dimension; the parallel partitioning is determined
        at runtime. 
      - When solving a linear system, the vectors and matrices MUST
        be partitioned accordingly.  PETSc automatically generates
        appropriately partitioned matrices and vectors when MatCreate()
        and VecCreate() are used with the same communicator.  
      - The user can alternatively specify the local vector and matrix
        dimensions when more sophisticated partitioning is needed
        (replacing the PETSC_DECIDE argument in the VecSetSizes() statement
        below).
  */
  ierr = VecCreate(PETSC_COMM_WORLD,&u);CHKERRQ(ierr);
  ierr = VecSetSizes(u,PETSC_DECIDE,m*n);CHKERRQ(ierr);
  ierr = VecSetFromOptions(u);CHKERRQ(ierr);
  ierr = VecDuplicate(u,&b);CHKERRQ(ierr); 
  ierr = VecDuplicate(b,&x);CHKERRQ(ierr);

  /* 
     Set exact solution; then compute right-hand-side vector.
     By default we use an exact solution of a vector with all
     elements of 1.0;  Alternatively, using the runtime option
     -random_exact_sol forms a solution vector with random components.
  */
  ierr = PetscOptionsGetBool(PETSC_NULL,"-random_exact_sol",&flg,PETSC_NULL);CHKERRQ(ierr);
  if (flg) {
    ierr = PetscRandomCreate(PETSC_COMM_WORLD,&rctx);CHKERRQ(ierr);
    ierr = PetscRandomSetFromOptions(rctx);CHKERRQ(ierr);
    ierr = VecSetRandom(u,rctx);CHKERRQ(ierr);
    ierr = PetscRandomDestroy(&rctx);CHKERRQ(ierr);
  } else {
    ierr = VecSet(u,1.0);CHKERRQ(ierr);
  }
  ierr = MatMult(A,u,b);CHKERRQ(ierr);

  /*
     View the exact solution vector if desired
  */
  flg  = PETSC_FALSE;
  ierr = PetscOptionsGetBool(PETSC_NULL,"-view_exact_sol",&flg,PETSC_NULL);CHKERRQ(ierr);
  if (flg) {ierr = VecView(u,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);}

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
                Create the linear solver and set various options
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */

  /* 
     Create linear solver context
  */
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);

  /* 
     Set operators. Here the matrix that defines the linear system
     also serves as the preconditioning matrix.
  */
  ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);

  /* 
     Set linear solver defaults for this problem (optional).
     - By extracting the KSP and PC contexts from the KSP context,
       we can then directly call any KSP and PC routines to set
       various options.
     - The following two statements are optional; all of these
       parameters could alternatively be specified at runtime via
       KSPSetFromOptions().  All of these defaults can be
       overridden at runtime, as indicated below.
  */
  ierr = KSPSetTolerances(ksp,1.e-2/((m+1)*(n+1)),1.e-50,PETSC_DEFAULT,
                          PETSC_DEFAULT);CHKERRQ(ierr);

  /* 
    Set runtime options, e.g.,
        -ksp_type <type> -pc_type <type> -ksp_monitor -ksp_rtol <rtol>
    These options will override those specified above as long as
    KSPSetFromOptions() is called _after_ any other customization
    routines.
  */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
                      Solve the linear system
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */

  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
                      Check solution and clean up
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */

  /* 
     Check the error
  */
  ierr = VecAXPY(x,-1.0,u);CHKERRQ(ierr);
  ierr = VecNorm(x,NORM_2,&norm);CHKERRQ(ierr);
  ierr = KSPGetIterationNumber(ksp,&its);CHKERRQ(ierr);
  /* Scale the norm */
  /*  norm *= sqrt(1.0/((m+1)*(n+1))); */

  /*
     Print convergence information.  PetscPrintf() produces a single 
     print statement from all processes that share a communicator.
     An alternative is PetscFPrintf(), which prints to a file.
  */
  ierr = PetscPrintf(PETSC_COMM_WORLD,"Norm of error %A iterations %D\n",
                     norm,its);CHKERRQ(ierr);

  /*
     Free work space.  All PETSc objects should be destroyed when they
     are no longer needed.
  */
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&u);CHKERRQ(ierr);  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);  ierr = MatDestroy(&A);CHKERRQ(ierr);

  /*
     Always call PetscFinalize() before exiting a program.  This routine
       - finalizes the PETSc libraries as well as MPI
       - provides summary and diagnostic information if certain runtime
         options are chosen (e.g., -log_summary). 
  */
  ierr = PetscFinalize();
  return 0;
}

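Does anyone know how to fill in my own matrices in this example?

Best answer

Yes, this can be a little daunting when you are getting started. There is a good walk-through of the process in an ACTS tutorial from 2006, and the tutorials listed on the PETSc web page are generally quite good.

The key parts here are: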

  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);

actually creates the PETSc matrix object, Mat A;

  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,m*n,m*n);CHKERRQ(ierr);

sets the sizes; here the matrix is m*n x m*n, because it is a stencil operating on an m x n 2D grid.

  ierr = MatSetFromOptions(A);CHKERRQ(ierr);

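This call simply takes any PETSc command-line options you supplied at run time and applies them to the matrix, in case you want to control how A is set up. Otherwise, you could have used, for example, MatCreateMPIAIJ() to create it as an AIJ-format matrix (the default), or MatCreateMPIDense() if it were going to be a dense matrix.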

  ierr = MatMPIAIJSetPreallocation(A,5,PETSC_NULL,5,PETSC_NULL);CHKERRQ(ierr);
  ierr = MatSeqAIJSetPreallocation(A,5,PETSC_NULL);CHKERRQ(ierr);

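Now that we have an AIJ matrix, these calls just preallocate the sparse matrix, assuming 5 non-zeros per row. This is for performance. Note that both the MPI and the Seq functions must be called so that the code works on 1 processor as well as on several; the call that does not match the matrix's actual type is ignored. This has always seemed strange, but there you go.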
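If a single guess of 5 per row is too coarse, the preallocation calls also accept per-row counts in place of PETSC_NULL. The following is a minimal sketch in plain C (no PETSc calls) of how such counts could be computed for this particular stencil. The helper name count_row_nonzeros is hypothetical, and the sketch assumes the matrix is square and that the locally owned columns are the same range [Istart, Iend) as the locally owned rows.

```c
/* Hypothetical helper, not part of PETSc: count the non-zeros in global row
   Ii of the 5-point stencil on an m x n grid. Entries whose column lies in
   [Istart, Iend) are counted in *d_nz (the "diagonal" block); all other
   entries are counted in *o_nz (the "off-diagonal" block). */
static void count_row_nonzeros(int Ii, int m, int n, int Istart, int Iend,
                               int *d_nz, int *o_nz)
{
  int i = Ii / n, j = Ii % n;   /* grid position of this row */
  int cols[5], ncols = 0, k;

  cols[ncols++] = Ii;                       /* diagonal entry, always there */
  if (i > 0)     cols[ncols++] = Ii - n;    /* neighbour (i-1, j) */
  if (i < m - 1) cols[ncols++] = Ii + n;    /* neighbour (i+1, j) */
  if (j > 0)     cols[ncols++] = Ii - 1;    /* neighbour (i, j-1) */
  if (j < n - 1) cols[ncols++] = Ii + 1;    /* neighbour (i, j+1) */

  *d_nz = 0;
  *o_nz = 0;
  for (k = 0; k < ncols; k++) {
    if (cols[k] >= Istart && cols[k] < Iend) (*d_nz)++;
    else                                     (*o_nz)++;
  }
}
```

Calling this for every locally owned row would give two arrays of counts, which could then be passed as the d_nnz and o_nnz arguments of MatMPIAIJSetPreallocation(). For a matrix read from a file, the same idea applies, but the counts would come from the file contents rather than from the stencil.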

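OK, now that the matrix is all set up, this is where we get into the actual meat of the matter.

First, we find out which rows this particular process owns. The distribution is by rows, which is a good distribution for typical sparse matrices.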

  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);

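After this call, each processor has its own values of Istart and Iend, and it is that processor's job to update the rows starting at Istart and ending just before Iend, as you can see in this for loop: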

  for (Ii=Istart; Ii<Iend; Ii++) { 
    v = -1.0; i = Ii/n; j = Ii - i*n;

So if we are operating on row Ii, this corresponds to grid location (i,j), where i = Ii/n and j = Ii % n. For example, grid location (i,j) corresponds to row Ii = i*n + j. Make sense?
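The mapping between rows and grid locations can be written out as two small functions. This is only an illustrative sketch in plain C; the names GridPos, row_to_grid, and grid_to_row are made up for this example and do not appear in ex2.c.

```c
/* Grid position of one unknown on an m x n grid (hypothetical type). */
typedef struct { int i, j; } GridPos;

/* Global row index -> grid position. n is the second grid dimension. */
static GridPos row_to_grid(int Ii, int n)
{
  GridPos p;
  p.i = Ii / n;   /* integer division */
  p.j = Ii % n;   /* same value as Ii - p.i*n in ex2.c */
  return p;
}

/* Grid position -> global row index (the inverse of row_to_grid). */
static int grid_to_row(int i, int j, int n)
{
  return i * n + j;
}
```

With the default n = 7 from the example, row 23 maps to grid location (3,2), and grid location (3,2) maps back to row 3*7 + 2 = 23.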

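I am going to strip out the if statements here. They are important, but they only deal with the boundary values, and they make things more complicated.

In this row there will be a +4 on the diagonal, and -1s in the columns corresponding to (i-1,j), (i+1,j), (i,j-1), and (i,j+1). Assuming we have not gone off the edge of the grid for any of these (i.e., 0 < i < m-1 and 0 < j < n-1), that means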

    J = Ii - n; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);
    J = Ii + n; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);
    J = Ii - 1; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);
    J = Ii + 1; ierr = MatSetValues(A,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);

    v = 4.0; ierr = MatSetValues(A,1,&Ii,1,&Ii,&v,INSERT_VALUES);CHKERRQ(ierr);
  }

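The if statements I took out just avoid setting those values when they do not exist, and the CHKERRQ macro simply prints a useful error and returns if ierr != 0, for example if the set-values call failed (because we tried to set an invalid value).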
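To make the role of the boundary checks concrete, here is a sketch in plain C that gathers all entries of one row into arrays, with the if statements put back in. The function name stencil_row is hypothetical; plain int and double stand in for PetscInt and PetscScalar, which is an assumption about how PETSc was configured.

```c
/* Hypothetical helper: fill cols[] and vals[] (each with room for at least
   5 entries) with the non-zeros of global row Ii of the 5-point stencil on
   an m x n grid. Returns the number of entries written (3, 4, or 5). */
static int stencil_row(int Ii, int m, int n, int cols[], double vals[])
{
  int i = Ii / n, j = Ii % n, k = 0;

  /* Each check skips a neighbour that would fall outside the grid. */
  if (i > 0)     { cols[k] = Ii - n; vals[k++] = -1.0; }
  if (i < m - 1) { cols[k] = Ii + n; vals[k++] = -1.0; }
  if (j > 0)     { cols[k] = Ii - 1; vals[k++] = -1.0; }
  if (j < n - 1) { cols[k] = Ii + 1; vals[k++] = -1.0; }

  cols[k] = Ii; vals[k++] = 4.0;   /* diagonal entry */
  return k;
}
```

In a PETSc program, the returned count and the two arrays could be handed to a single call such as MatSetValues(A,1,&Ii,k,cols,vals,INSERT_VALUES) instead of five separate calls, which is the "chunk of values" variant of the same step.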

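Now we have set the local values; the MatAssembly calls start the communication that ensures any necessary values are exchanged between processors. If you have any unrelated work to do, it can be placed between the Begin and End calls to try to overlap communication and computation: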

  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

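Now you are done, and you can call your solvers.

So a typical workflow is:

Create your matrix (MatCreate)

Set its size (MatSetSizes)

Set various matrix options (MatSetFromOptions is a good choice, rather than hard-coding things)

For sparse matrices, set the preallocation to a reasonable guess of the number of non-zeros per row; you can do this with a single value (as here), or with an array giving the number of non-zeros for each row (here filled in with PETSC_NULL): (MatMPIAIJSetPreallocation, MatSeqAIJSetPreallocation)

Find out which rows are your responsibility: (MatGetOwnershipRange)

Set the values (either call MatSetValues once per value, or pass in a chunk of values; INSERT_VALUES sets new elements, ADD_VALUES adds to any existing elements)

Then do the assembly (MatAssemblyBegin, MatAssemblyEnd).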

Other, more complicated use cases are possible too.
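For a matrix that comes from a file rather than from a stencil, the "find your rows" and "set the values" steps could look roughly like the sketch below. It is plain C with no PETSc calls, and everything in it is an assumption made for illustration: the text format (one "row col value" triplet per entry, zero-based global indices), the Entry type, and the function names. owned_range only mimics an even contiguous split of rows; in a real program the range would come from MatGetOwnershipRange().

```c
#include <stdio.h>

/* One matrix entry in the assumed "row col value" text format. */
typedef struct { int row, col; double val; } Entry;

/* Contiguous row range [*start, *end) for process `rank` out of `size`
   processes when N rows are split as evenly as possible. */
static void owned_range(int N, int size, int rank, int *start, int *end)
{
  int base = N / size, extra = N % size;   /* first `extra` ranks get +1 row */
  *start = rank * base + (rank < extra ? rank : extra);
  *end   = *start + base + (rank < extra ? 1 : 0);
}

/* Parse triplets from `text` and keep those whose row lies in [start, end).
   Returns the number of entries stored in out[] (at most max_out). */
static int parse_owned_entries(const char *text, int start, int end,
                               Entry out[], int max_out)
{
  int count = 0, used = 0;
  Entry e;

  while (count < max_out &&
         sscanf(text, "%d %d %lf%n", &e.row, &e.col, &e.val, &used) == 3) {
    text += used;                          /* advance past this triplet */
    if (e.row >= start && e.row < end) {
      /* In a PETSc program this is where the entry would be inserted,
         e.g. with MatSetValues(A,1,&row,1,&col,&val,INSERT_VALUES). */
      out[count++] = e;
    }
  }
  return count;
}
```

Filtering by owned rows is a design choice rather than a requirement: as the comments in ex2.c note, non-local entries are sent to the appropriate processor during assembly, so the filter mainly avoids extra communication. The same pattern works for the vectors X and B, with one index and one value per line.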

Regarding "parallel-processing - PETSc solving linear system with ksp guide", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/10815450/
