
What the HPL test configuration file HPL.dat contains

article 2025/1/20 3:47:44

1. Contents and overall structure of HPL.dat

After the build completes, bin/$(arch)/HPL.dat contains the following:

$ cat HPL.dat

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
file         device out (6=stdout,7=stderr,file)
4            # of problems sizes (N)
29 30 34 35  Ns
4            # of NBs
1 2 3 4      NBs
0            PMAP process mapping (0=Row-,1=Column-major)
3            # of process grids (P x Q)
2 1 4        Ps
2 4 1        Qs
16.0         threshold
3            # of panel fact
0 1 2        PFACTs (0=left, 1=Crout, 2=Right)
2            # of recursive stopping criterium
2 4          NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
3            # of recursive panel fact.
0 1 2        RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
0            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
0            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)

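The first two lines of this file are descriptive. From line 3 to the end, the text that fills most of the right-hand side is help text and can be deleted; only the values on the left, most of them numeric, are actual configuration.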
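The layout described above can be illustrated with a small Python sketch that drops the help text and keeps only the value column. This is a heuristic written for this article, not how HPL itself reads the file: it assumes the values on every line after line 4 are a leading run of numeric tokens, which holds for the sample file but is not guaranteed in general. The function names are made up for the example.

```python
def is_number(token):
    """Return True if the token parses as an int or float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def strip_help_text(lines):
    """Keep only the value column of each HPL.dat line.

    Lines 1-2 are free-form descriptions and are kept unchanged.
    Lines 3-4 (output file name, output device) keep their first token.
    Every later line keeps its leading run of numeric tokens (assumption).
    """
    kept = []
    for lineno, line in enumerate(lines, start=1):
        tokens = line.split()
        if lineno <= 2 or not tokens:
            kept.append(line.rstrip())
        elif lineno <= 4:
            kept.append(tokens[0])
        else:
            values = []
            for tok in tokens:
                if not is_number(tok):
                    break
                values.append(tok)
            kept.append(" ".join(values))
    return kept


# The first eight lines of the sample file shown above.
sample = [
    "HPLinpack benchmark input file",
    "Innovative Computing Laboratory, University of Tennessee",
    "HPL.out      output file name (if any)",
    "file         device out (6=stdout,7=stderr,file)",
    "4            # of problems sizes (N)",
    "29 30 34 35  Ns",
    "4            # of NBs",
    "1 2 3 4      NBs",
]
for row in strip_help_text(sample):
    print(row)
```

Keeping the help text in the real file is usually the safer choice, since it documents which line is which; the sketch only shows which part of each line carries configuration.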

2. Detailed description of HPL.dat

The text in the file itself already gives a brief explanation; for details see:

https://www.netlib.org/benchmark/hpl/HPL_pdinfo.html

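1. Descriptive text, ignored.
2. Descriptive text, ignored.
3. Name of the output file. Any name will do, but whether it is used is decided by line 4.
4. Where the test output goes. The value can be 6, 7, or anything else:
     6 writes to standard output,
     7 writes to standard error,
     any other value (such as "file") writes to the file named on line 3.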
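5. Number of problem sizes to test, i.e. how many matrix orders N are listed on line 6.
6. The order N of each matrix; the number of entries matches the count on line 5.
7. Number of block sizes NB to test.
8. The list of block sizes NB (the edge length of each block).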
9. Ordering of processes in the grid: 0 = row-major, 1 = column-major. From the HPL_pdinfo documentation:
        On exit, PMAPPIN specifies the process mapping onto the nodes
        of the MPI machine configuration. PMAPPIN defaults to
        row-major ordering.
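10. Number of process grids, i.e. how many (P, Q) combinations are listed. Performance is usually best when P*Q equals the total number of physical CPU cores in the cluster; on Intel Xeon CPUs, for example, HPC performance is better with Hyper-Threading disabled.
     Note: in HPL, the column-vector communication of the L factorization uses binary exchange; it is worth checking whether rocHPL still does so. This affects the choice of P, for example whether P = 2^n should be preferred.
11. The list of P values.
12. The list of Q values.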
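13. Threshold for the accuracy check: the scaled residual of the computed solution of Ax = b is compared against this value; below it the run passes, above it the run fails. Correct results give very small residuals and wrong ones very large residuals, so this value does not need to be changed.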
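The process-grid settings on lines 9 to 12 can be made concrete with a short Python sketch. It shows how a rank lands in a P x Q grid under each PMAP value and lists the factor pairs available for a given process count. The helper names, the 2 x 4 grid, and the 64-process count are hypothetical examples, and restricting the list to P <= Q is an assumption based on the common guideline of keeping the grid close to square with Q no smaller than P.

```python
def grid_coords(rank, p, q, pmap=0):
    """Map a process rank to (row, col) in a P x Q grid.

    pmap=0 is row-major, pmap=1 is column-major (line 9 of HPL.dat).
    """
    if not 0 <= rank < p * q:
        raise ValueError("rank outside the P x Q grid")
    if pmap == 0:
        return rank // q, rank % q
    return rank % p, rank // p


def candidate_grids(nprocs):
    """List every (P, Q) with P * Q == nprocs and P <= Q."""
    grids = []
    p = 1
    while p * p <= nprocs:
        if nprocs % p == 0:
            grids.append((p, nprocs // p))
        p += 1
    return grids


def is_power_of_two(n):
    """Return True when n is 1, 2, 4, 8, ..."""
    return n > 0 and n & (n - 1) == 0


# Rank 5 in a hypothetical 2 x 4 grid, under both mappings.
print(grid_coords(5, 2, 4, pmap=0))  # (1, 1)
print(grid_coords(5, 2, 4, pmap=1))  # (1, 2)

# Possible grids for a hypothetical 64-process run.
for p, q in candidate_grids(64):
    note = "P is a power of two" if is_power_of_two(p) else ""
    print(p, q, note)
```

Each pair printed by the last loop would go into HPL.dat as one entry in the Ps list and the matching entry in the Qs list.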

14. Number of panel factorization algorithms to test.
15. The list of panel factorization algorithms (PFACTs), chosen from these three:
     0: left-looking LU factorization variant
     1: Crout LU factorization variant
     2: right-looking LU factorization variant

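16. Number of recursive stopping criteria, i.e. how many NBMIN values are listed on line 17.
17. The list of NBMIN values (>= 1); the recursive panel factorization stops once the current panel has NBMIN columns or fewer.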
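18. Number of NDIV values listed on line 19 ("# of panels in recursion").
19. The list of NDIV values, i.e. the number of subpanels a panel is split into at each recursion step.
20. Number of recursive panel factorization algorithms to test.
21. The list of recursive panel factorization algorithms (RFACTs): 0 = left, 1 = Crout, 2 = right.
22. Number of broadcast algorithms to test.
23. The list of broadcast algorithms (BCASTs): 0 = 1rg, 1 = 1rM, 2 = 2rg, 3 = 2rM, 4 = Lng, 5 = LnM.
24. Number of lookahead depths to test.
25. The list of lookahead depths (DEPTHs, >= 0).
26. Swapping algorithm (SWAP): 0 = bin-exch, 1 = long, 2 = mix.
27. Swapping threshold, used by the mix option on line 26.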
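28. Storage form of L1: 0 = transposed, 1 = not transposed.
29. Storage form of U: same values as line 28.
30. Whether to enable equilibration (0 = no, 1 = yes).
31. Memory alignment for allocations, in units of doubles rather than bytes (8 doubles = 64 bytes).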
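Because most settings are given as a count followed by a list, one HPL.dat describes a whole sweep of runs. As far as I understand, HPL runs every combination of the listed values, so the sample file above should produce the number of tests computed in the sketch below. The dictionary keys are just labels chosen for this example; the values are copied from the sample file.

```python
from itertools import product
from math import prod

# Value lists copied from the sample HPL.dat shown earlier.
# Single-valued settings (SWAP, threshold, L1/U form, ...) do not multiply the count.
params = {
    "N": [29, 30, 34, 35],
    "NB": [1, 2, 3, 4],
    "grid": [(2, 2), (1, 4), (4, 1)],  # (P, Q) pairs from the Ps and Qs lines
    "PFACT": [0, 1, 2],
    "NBMIN": [2, 4],
    "NDIV": [2],
    "RFACT": [0, 1, 2],
    "BCAST": [0],
    "DEPTH": [0],
}

# Total number of parameter combinations.
total = prod(len(values) for values in params.values())
print(total)  # 864

# Enumerate the combinations lazily and show the first one.
first = next(product(*params.values()))
print(dict(zip(params.keys(), first)))
```

For a real benchmark run it is common to cut each list down to one or two values, since the count grows multiplicatively with every extra entry.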