- html - 出于某种原因,IE8 对我的 Sass 文件中继承的 html5 CSS 不友好?
- JMeter 在响应断言中使用 span 标签的问题
- html - 在 :hover and :active? 上具有不同效果的 CSS 动画
- html - 相对于居中的 html 内容固定的 CSS 重复背景?
我已经使用 NVBLAS 编译了 jBLAS,解决方案有点老套,因为配置脚本没有正确找到库。我像这样手动编辑了 jBLAS 的 configure.out
文件,以包含 NVBLAS 库。
BUILD_TYPE=nvblas
CC=gcc
CCC=c99
CFLAGS=-fPIC -DHAS_CPUID
F77=gfortran
FOUND_JAVA=true
FOUND_NM=true
INCDIRS=-Iinclude -I/usr/lib/jvm/java-11-openjdk-amd64/include -I/usr/lib/jvm/java-11-openjdk-amd64/include/linux
JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
LAPACK_HOME=./lapack-lite-3.1.1
LD=gcc
LDFLAGS=-shared
LIB=lib
LINKAGE_TYPE=static
LOADLIBES=-Wl,-z,muldefs /home/linyi/jblas/lapack-lite-3.1.1/lapack_LINUX.a /usr/local/cuda-11.0/lib64/libnvblas.so.11 /home/linyi/jblas/lapack-lite-3.1.1/blas_LINUX.a -lgfortran
MAKE=make
NM=nm
OS_ARCH=amd64
OS_ARCH_WITH_FLAVOR=amd64/sse3
OS_NAME=Linux
RUBY=ruby
SO=so
然后我按照记录 here 运行命令 make clean all
和 mvn clean package
.测试成功通过,但程序在退出时导致段错误。
-------------------------------------------------------
T E S T S
-------------------------------------------------------
Running org.jblas.TestEigen
[NVBLAS] NVBLAS_CONFIG_FILE environment variable is set to '/home/linyi/nvblas.conf'
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.569 sec
Running org.jblas.TestComplexFloat
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec
Running org.jblas.TestDecompose
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running org.jblas.TestBlasDouble
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec
Running org.jblas.TestBlasDoubleComplex
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec
Running org.jblas.TestSingular
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running org.jblas.TestDoubleMatrix
Tests run: 37, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.022 sec
Running org.jblas.TestSolve
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec
Running org.jblas.TestBlasFloat
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec
Running org.jblas.TestFloatMatrix
Tests run: 37, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.012 sec
Running org.jblas.SimpleBlasTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
Running org.jblas.ranges.RangeTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec
Running org.jblas.TestGeometry
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec
Running org.jblas.ComplexDoubleMatrixTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
-- org.jblas INFO Deleting /tmp/jblas4383455253907276334/libjblas.so
-- org.jblas INFO Deleting /tmp/jblas4383455253907276334/libjblas_arch_flavor.so
-- org.jblas INFO Deleting /tmp/jblas4383455253907276334
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fa1f6bb96b1, pid=8063, tid=8072
#
# JRE version: OpenJDK Runtime Environment (11.0.8+10) (build 11.0.8+10-post-Ubuntu-0ubuntu118.04.1)
# Java VM: OpenJDK 64-Bit Server VM (11.0.8+10-post-Ubuntu-0ubuntu118.04.1, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C [libcublas.so.11+0xa096b1]
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to /home/linyi/jblas/jblas/core.8063)
#
# An error report file with more information is saved as:
# /home/linyi/jblas/jblas/hs_err_pid8063.log
#
# If you would like to submit a bug report, please visit:
# https://bugs.launchpad.net/ubuntu/+source/openjdk-lts
#
Aborted (core dumped)
Results :
Tests run: 122, Failures: 0, Errors: 0, Skipped: 0
我决定运行 mvn clean package -DskipTests
因为测试似乎通过得很好,只是程序在终止时导致了段错误。然而,当我在我的 Java 项目中使用该库时,nvblas.log
显示尽管 NVBLAS 拦截了对 BLAS 例程的调用,但它们实际上是在 CPU 而不是 GPU 上执行的。用我的程序运行 nvprof --print-gpu-summary
也得到了相同的结论。
#
==7711== Profiling result:
Type Time(%) Time Calls Avg Min Max Name
GPU activities: 100.00% 1.8240us 1 1.8240us 1.8240us 1.8240us [CUDA memcpy HtoD]
======== Error: Application received signal 134
而nvblas.log
的内容如下:
[NVBLAS] Using devices :0
[NVBLAS] Config parsed
[NVBLAS] dgemm[cpu]: ta=N, tb=N, m=1, n=1, k=1
[NVBLAS] dsyr2k[cpu]: up=U, ta=N, n=24, k=28
[NVBLAS] dsyr2k[cpu]: up=U, ta=N, n=32, k=28
[NVBLAS] dsyr2k[cpu]: up=U, ta=N, n=26, k=28
[NVBLAS] dsyr2k[cpu]: up=U, ta=N, n=22, k=28
[NVBLAS] dsyr2k[cpu]: up=U, ta=N, n=20, k=28
[NVBLAS] dsyr2k[cpu]: up=U, ta=N, n=26, k=28
[NVBLAS] dtrmm[cpu]: si=R, up=U, ta=N, di=U, m=52, n=31
[NVBLAS] dtrmm[cpu]: si=R, up=U, ta=N, di=U, m=54, n=31
[NVBLAS] dtrmm[cpu]: si=R, up=L, ta=T, di=N, m=52, n=31
[NVBLAS] dtrmm[cpu]: si=R, up=L, ta=T, di=N, m=54, n=31
[NVBLAS] dtrmm[cpu]: si=R, up=U, ta=N, di=U, m=60, n=31
[NVBLAS] dtrmm[cpu]: si=R, up=U, ta=T, di=U, m=54, n=31
[NVBLAS] dtrmm[cpu]: si=R, up=U, ta=T, di=U, m=52, n=31
[NVBLAS] dtrmm[cpu]: si=R, up=L, ta=T, di=N, m=60, n=31
[NVBLAS] dsyr2k[cpu]: up=U, ta=N, n=22, k=28
[NVBLAS] dtrmm[cpu]: si=R, up=U, ta=T, di=U, m=60, n=31
[NVBLAS] dtrmm[cpu]: si=R, up=U, ta=N, di=U, m=54, n=22
[NVBLAS] dgemm[cpu]: ta=T, tb=N, m=54, n=22, k=31
[NVBLAS] dtrmm[cpu]: si=R, up=U, ta=N, di=U, m=60, n=28
[NVBLAS] dtrmm[cpu]: si=R, up=U, ta=N, di=U, m=52, n=20
[NVBLAS] dtrmm[cpu]: si=R, up=L, ta=T, di=N, m=54, n=22
[NVBLAS] dgemm[cpu]: ta=T, tb=N, m=60, n=28, k=31
[NVBLAS] dgemm[cpu]: ta=T, tb=N, m=52, n=20, k=31
[NVBLAS] dgemm[cpu]: ta=N, tb=T, m=31, n=54, k=22
. . .
我真的不知道该怎么办,我希望有人能提供任何建议,这似乎记录得很差。
最佳答案
我相信我已经弄明白了。
nm libjblas.so | grep -is dgemm
U cblas_dgemm
U dgemm_@@libnvblas.so.11
这表明该库确实链接正确。然后我继续运行 jBlas 的内置基准测试,只需运行 java -jar jblas.jar
,其中 jblas.jar
是已编译的库,显然 GPU 卸载只发生对于大型矩阵,因为在 n=10
或 n=100
时没有进行任何 gpu 调用,但 nvblas.log
在 处记录了 GPU 计算n=1000
。这让我很困惑,我希望这能帮助其他正在努力解决这个问题的人。
关于java - 将 jBLAS 与 NVBLAS 结合使用,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/63460550/
TLDR; for the ones that wants to avoid reading the whole story: Is there a way to interface RcppArma
我已经使用 NVBLAS 编译了 jBLAS,解决方案有点老套,因为配置脚本没有正确找到库。我像这样手动编辑了 jBLAS 的 configure.out 文件,以包含 NVBLAS 库。 BUILD
我通过 CRAN 在我的 Mac 上安装了 R .我也通过 homebrew 安装了 openblas .我可以按如下方式在 BLAS 实现之间切换: 引用 blas(我认为是 netlib): ln
我是一名优秀的程序员,十分优秀!