
Forum: help

RE: simple problem when running a distributed example
By: Nobody on 2012-11-16 14:54
[forum:110243]
Thanks, OK.
I will do the following:

1. compile PaStiX with complete debug support
2. run it and send you the output
3. run the code with a debugger, valgrind and DDT, which should be available on the Cray (see the sketch after this list);
that may take some time, I will try to do it over the weekend.
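For step 3, a minimal sketch of a valgrind run, assuming Open MPI's mpirun and a test binary named pastixsolvertest (the source file name that appears in the logs below); on the Cray XE6 the launcher would typically be aprun instead:

> mpirun -np 8 valgrind --leak-check=full --track-origins=yes ./pastixsolvertest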

Greetings, Benedikt

RE: simple problem when running a distributed example
By: Xavier Lacoste on 2012-11-16 14:51
[forum:110242]
Hello,

For the citation part, you can use this: http://www.labri.fr/perso/ramet/bib/Year/2002.complete.html#A:LaBRI::HRR01a or this: http://www.labri.fr/perso/ramet/bib/Year/2012.complete.html#lacoste:hal-00700066, but the second one is not yet published...

For the double free corruption, could you get more information? It would be great if we knew where it happens.
Maybe by compiling PaStiX and your code with the debug options? Maybe by running with debugging tools, if available on the machine (valgrind, a debugger), and if the test case is not too big?

XL.

RE: simple problem when running a distributed example
By: Nobody on 2012-11-16 10:57
[forum:110240]

Attachment: mie-scattering-rev-10-gold-debug-extinction-ncore-16.err
Hello again, I have now installed PaStiX 5.2.0 (3923) on the Cray XE6 and I am
testing it. Unfortunately, it crashes when using more than 8 processes. The complete
error output is attached to this post as a file.

Could you have a look? I would really like to use PaStiX with up to several hundred
processes, and possibly even more.

Thanks and greetings, Benedikt

RE: simple problem when running a distributed example
By: Nobody on 2012-11-14 13:44
[forum:110238]
Using the debug output activated with your suggested flags got me to the problem quite
quickly... PaStiX now solves systems and I will go on testing larger and larger models.
The next step will be to transfer it to the Cray XE6 at CSCS (www.cscs.ch) in order to run
production models.

How would one acknowledge using PaStiX? Are there specific publications that I can
cite? Greetings, Benedikt

RE: simple problem when running a distributed example
By: Xavier Lacoste on 2012-11-14 12:20
[forum:110237]
I would like to know in which files there are NaN(s), to know when they appear; that could help a bit.

RE: simple problem when running a distributed example
By: Nobody on 2012-11-14 11:13
[forum:110236]
OK, done, thanks.

There are now many files; which in particular
would you need?

Benedikt

RE: simple problem when running a distributed example
By: Xavier Lacoste on 2012-11-14 10:49
[forum:110235]
Hello,

You can add -DPASTIX_DUMP_FACTO -DPASTIX_DUMP_SOLV to the CCTYPES variable in your config.in to write the matrix and the right-hand side to disk.

With that we can find where the NaNs appear, or whether there is a problem with the matrix or RHS.

Note that if you do several factorizations/solves, the files will be overwritten. It will also slow down the computation considerably.
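A minimal sketch of the config.in change; the flag names and the CCTYPES variable are as stated above, while the append pattern simply mirrors the other config.in lines quoted later in this thread:

CCTYPES := $(CCTYPES) -DPASTIX_DUMP_FACTO -DPASTIX_DUMP_SOLV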

XL.

RE: simple problem when running a distributed example
By: Nobody on 2012-11-14 10:24
[forum:110234]
Hello, I am trying to solve a real system now, i.e. a matrix that originates from
a Discontinuous Galerkin discretization of the electric curl-curl equation in the
frequency domain. I think I still have something wrong in the way I set up
the PaStiX z_dpastix solver, but I am a bit at a loss. Here's the output from
PaStiX, when compiled with maximum debug output.

Do you have a suggestion where I could look? Thanks, Benedikt


AUTOSPLIT_COMM : global rank : 0, inter node rank 0, intra node rank 0, threads 1
+--------------------------------------------------------------------+
+ PaStiX : Parallel Sparse matriX package +
+--------------------------------------------------------------------+
Matrix size 11500 x 11500
Number of nonzeros in A 1049200
+--------------------------------------------------------------------+
+ Options +
+--------------------------------------------------------------------+
Version : exported
SMP_SOPALIN : Defined
VERSION MPI : Defined
PASTIX_DYNSCHED : Not defined
STATS_SOPALIN : Defined
NAPA_SOPALIN : Defined
TEST_IRECV : Not defined
TEST_ISEND : Defined
TAG : Exact Thread
FORCE_CONSO : Not defined
RECV_FANIN_OR_BLOCK : Not defined
OUT_OF_CORE : Not defined
DISTRIBUTED : Defined
METIS : Not defined
WITH_SCOTCH : Defined
INTEGER TYPE : int
FLOAT TYPE : double complex
+--------------------------------------------------------------------+
Check : Numbering OK
Check : Sort CSC OK
Check : Duplicates OK
Check : Graph symmetry OK
Ordering :
> dpastix_order_prepare_cscd
< dpastix_order_prepare_cscd

WARNING: PaStiX works only with PT-Scotch default strategy

Time to compute ordering 0.161 s
> Initiating ordering
Symbolic Factorization :
Analyse :
Number of cluster 1
Number of processor per cluster 1
Number of thread number per MPI process 1
Building elimination graph
Building cost matrix
Building elimination tree
Total cost of the elimination tree 0.326948
Spliting initial partition
Using proportionnal mapping
Total cost of the elimination tree 0.299756
** New Partition: cblknbr= 419 bloknbr= 2318 ratio=5.532219 **
Factorization of the new symbol matrix by Crout blok algo takes : 1.71081e+09
Re-Building elimination graph
Building task graph
Number of tasks 419
Distributing partition
0 : Genering final SolverMatrix
NUMBER of THREAD 1
NUMBER of BUBBLE 1
COEFMAX 318000 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0
** End of Partition & Distribution phase **
Time to analyze 0.00146 s
Number of nonzeros in factorized matrice 5851300
Fill-in 5.57692
Number of operations (LU) 1.36371e+10
Prediction Time to factorize (AMD 6180 MKL) 0.6 s
0 : SolverMatrix size (without coefficients) 205 Ko
0 : Number of nonzeros (local block structure) 3274140
Maximum coeftab size (cefficients) 99.9 Mo
Numerical Factorization (LU) :
Time to fill internal csc 0.137 s
--- Sopalin : Allocation de la structure globale ---
--- Fin Sopalin Init ---
--- Initialisation des tableaux globaux ---
Launching 1 threads (1 commputation, 0 communication, 0 out-of-core)
--- Sopalin : Local structure allocation ---
--- Sopalin : Threads are binded ---
--- Sopalin Begin ---
0 - Local number of terms allocated Cblk+Ftgt : 3274140, Cblk : 3274140, Overhead : 1.00 (0.00%)
Maximum number of terms allocated Cblk+Ftgt : 3274140, Cblk : 3274140, Overhead : 1.00 (0.00%)
Total number of terms allocated Cblk+Ftgt : 3274140, Cblk : 3274140, Overhead : 1.00 (0.00%)
--- Sopalin End ---
[0][0] Factorization communication time : 0 s
--- Fin Sopalin Init ---
GMRES :
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.19617e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 1 :
time to solve 0.0545 s
total iteration time 0.08 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.29153e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 2 :
time to solve 0.0542 s
total iteration time 0.0673 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.29153e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 3 :
time to solve 0.0546 s
total iteration time 0.0676 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.3869e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 4 :
time to solve 0.0548 s
total iteration time 0.068 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.29153e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 5 :
time to solve 0.0527 s
total iteration time 0.0661 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.69685e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 6 :
time to solve 0.0538 s
total iteration time 0.0673 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.3869e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 7 :
time to solve 0.0548 s
total iteration time 0.0692 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.41074e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 8 :
time to solve 0.0548 s
total iteration time 0.0688 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.31538e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 9 :
time to solve 0.0548 s
total iteration time 0.0688 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.31538e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 10 :
time to solve 0.0546 s
total iteration time 0.0688 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.29153e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 11 :
time to solve 0.0545 s
total iteration time 0.0689 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.50611e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 12 :
time to solve 0.0531 s
total iteration time 0.0676 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.29153e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 13 :
time to solve 0.0527 s
total iteration time 0.0678 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.3869e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 14 :
time to solve 0.0544 s
total iteration time 0.0693 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
0:0 up_down_smp
--- Sopalin : Local structure allocation ---
[0][0] Solve initialization time : 4.50611e-05 s
--- Down Step ---
--- Up Step ---
[0][0] Solve communication time : 0 s
- iteration 15 :
time to solve 0.0551 s
total iteration time 0.0702 s
error nan
||r|| nan
||b|| 4.1086e+12
||r||/||b|| nan
Max memory used after factorization 147 Mo
Memory used after factorization 120 Mo
Static pivoting 0
Time to factorize 1.67 s
Time to solve 0.055 s
Refinement 15 iterations, norm=nan
Time for refinement 1.04 s
Max memory used after clean 147 Mo
Memory used after clean 0 o

RE: simple problem when running a distributed example
By: Xavier Lacoste on 2012-11-12 12:00
[forum:110227]
Hello,

Your logs show that your job probably crashed in Scotch.

Maybe it doesn't like a purely diagonal matrix?

Can you add an extra diagonal (with zeros on it if you want) to that matrix?

Can you rerun it with valgrind (if it's not too long) to get more information about the crash? If you can't, maybe you can still get the errors with a smaller problem that doesn't crash?
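A minimal sketch of that suggestion, written against the code snippet further down in this thread: pad the purely diagonal test matrix with explicit zeros just above and below the diagonal, so the pattern is no longer trivial while the solution stays the same. The names rank, nlocalcol, colptr, row and values come from that snippet; obtaining nprocs via MPI_Comm_size is an assumption for this illustration.

int nprocs = 1;
MPI_Comm_size(MPI_COMM_WORLD, &nprocs);            /* assumption: global size = nprocs * nlocalcol */
PaStiX::pastix_int_t nglobalcol = static_cast<PaStiX::pastix_int_t>(nprocs) * nlocalcol;

colptr = new PaStiX::pastix_int_t[nlocalcol + 1];
values = new std::complex<double>[3 * nlocalcol];  /* upper bound: at most 3 entries per column */
row    = new PaStiX::pastix_int_t[3 * nlocalcol];

PaStiX::pastix_int_t k = 0;
for (PaStiX::pastix_int_t i = 0; i < nlocalcol; i++)
{
  PaStiX::pastix_int_t gcol = rank * nlocalcol + i + 1;  /* global column index, Fortran numbering */
  colptr[i] = k + 1;                                     /* 1-based column pointers */
  if (gcol > 1)                                          /* explicit zero just above the diagonal */
  {
    row[k] = gcol - 1;
    values[k] = std::complex<double>(0.0, 0.0);
    k++;
  }
  row[k] = gcol;                                         /* diagonal entry, same value as before */
  values[k] = std::complex<double>(static_cast<double>(gcol), 0.0);
  k++;
  if (gcol < nglobalcol)                                 /* explicit zero just below the diagonal, keeps the pattern symmetric */
  {
    row[k] = gcol + 1;
    values[k] = std::complex<double>(0.0, 0.0);
    k++;
  }
}
colptr[nlocalcol] = k + 1;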

XL.

RE: simple problem when running a distributed example
By: Nobody on 2012-11-12 11:28
[forum:110226]

Attachment: log.np.2.out
Hello Xavier, the above example with the purely diagonal matrix now runs better.

Unfortunately, when testing with different numbers of cores and larger matrix
sizes, I experience a problem. I have attached the output from jobs with 2
and 8 processes. Greetings, Benedikt

MPI_Init_thread level = MPI_THREAD_MULTIPLE

2012-Nov-12 12:01:23.968815 ::: pastixsolvertest.cc: 142 ::: PRODUCTION MPI rank =0
2012-Nov-12 12:01:23.968882 ::: pastixsolvertest.cc: 145 ::: PRODUCTION MPI size =8
2012-Nov-12 12:01:23.968940 ::: pastixsolvertest.cc: 212 ::: PRODUCTION [[[ PASTIX ::: initializing operational parameters to default values...
2012-Nov-12 12:01:23.969014 ::: pastixsolvertest.cc: 240 ::: PRODUCTION ...PASTIX ::: initialized operational parameters to default values ]]]
2012-Nov-12 12:01:23.978099 ::: pastixsolvertest.cc: 288 ::: PRODUCTION [[[ PASTIX ::: initializing matrix values...
2012-Nov-12 12:01:23.981046 ::: pastixsolvertest.cc: 298 ::: PRODUCTION ...PASTIX ::: initialized matrix values ]]]
2012-Nov-12 12:01:23.981161 ::: pastixsolvertest.cc: 313 ::: PRODUCTION [[[ PASTIX ::: initializing column pointer array...
2012-Nov-12 12:01:23.983118 ::: pastixsolvertest.cc: 325 ::: PRODUCTION ...PASTIX ::: initialized column pointer array ]]]
2012-Nov-12 12:01:23.983232 ::: pastixsolvertest.cc: 336 ::: PRODUCTION [[[ PASTIX ::: initializing local2global array...
2012-Nov-12 12:01:23.985167 ::: pastixsolvertest.cc: 347 ::: PRODUCTION ...PASTIX ::: initialized local2global array ]]]
2012-Nov-12 12:01:23.985261 ::: pastixsolvertest.cc: 360 ::: PRODUCTION [[[ PASTIX ::: initializing row array...
2012-Nov-12 12:01:23.987165 ::: pastixsolvertest.cc: 373 ::: PRODUCTION ...PASTIX ::: initialized row array ]]]
2012-Nov-12 12:01:23.987273 ::: pastixsolvertest.cc: 382 ::: PRODUCTION [[[ PASTIX ::: initializing r.h.s. array...
2012-Nov-12 12:01:23.998808 ::: pastixsolvertest.cc: 393 ::: PRODUCTION ...PASTIX ::: initialized r.h.s array ]]]
2012-Nov-12 12:01:23.998939 ::: pastixsolvertest.cc: 403 ::: PRODUCTION [[[ PASTIX ::: initializing permutation array...
2012-Nov-12 12:01:23.999034 ::: pastixsolvertest.cc: 410 ::: PRODUCTION ...PASTIX ::: initialized permutation array ]]]
2012-Nov-12 12:01:23.999105 ::: pastixsolvertest.cc: 418 ::: PRODUCTION [[[ ================================================================
2012-Nov-12 12:01:23.999168 ::: pastixsolvertest.cc: 422 ::: PRODUCTION solving via distributed interface pastix [z_dpastix] ...
AUTOSPLIT_COMM : global rank : 0, inter node rank 0, intra node rank 0, threads 1
AUTOSPLIT_COMM : global rank : 1, inter node rank 1, intra node rank 0, threads 1
AUTOSPLIT_COMM : global rank : 2, inter node rank 2, intra node rank 0, threads 1
AUTOSPLIT_COMM : global rank : 3, inter node rank 3, intra node rank 0, threads 1
AUTOSPLIT_COMM : global rank : 5, inter node rank 5, intra node rank 0, threads 1
AUTOSPLIT_COMM : global rank : 7, inter node rank 7, intra node rank 0, threads 1
AUTOSPLIT_COMM : global rank : 4, inter node rank 4, intra node rank 0, threads 1
AUTOSPLIT_COMM : global rank : 6, inter node rank 6, intra node rank 0, threads 1
+--------------------------------------------------------------------+
+ PaStiX : Parallel Sparse matriX package +
+--------------------------------------------------------------------+
Matrix size 4000000 x 4000000
Number of nonzeros in A 4000000
+--------------------------------------------------------------------+
+ Options +
+--------------------------------------------------------------------+
Version : exported
SMP_SOPALIN : Defined
VERSION MPI : Defined
PASTIX_DYNSCHED : Not defined
STATS_SOPALIN : Defined
NAPA_SOPALIN : Defined
TEST_IRECV : Not defined
TEST_ISEND : Defined
TAG : Exact Thread
FORCE_CONSO : Not defined
RECV_FANIN_OR_BLOCK : Not defined
OUT_OF_CORE : Not defined
DISTRIBUTED : Defined
METIS : Not defined
WITH_SCOTCH : Defined
INTEGER TYPE : int
FLOAT TYPE : double complex
+--------------------------------------------------------------------+
Check : Numbering OK
Check : Sort CSC OK
Check : Duplicates OK
Ordering :

WARNING: PaStiX works only with PT-Scotch default strategy

[odysseus:94484] *** Process received signal ***
[odysseus:94484] Signal: Segmentation fault: 11 (11)
[odysseus:94484] Signal code: Address not mapped (1)
[odysseus:94484] Failing at address: 0x7fc9c86a00bc
[odysseus:94484] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 94484 on node odysseus.psi.ch exited on signal 11 (Segmentation fault: 11).

RE: simple problem when running a distributed example
By: Nobody on 2012-11-09 11:04
[forum:110225]
Thanks! That solves the problem. I will now integrate PaStiX into the main code.
I am looking forward to it. Greetings, Benedikt

RE: simple problem when running a distributed example
By: Xavier Lacoste on 2012-11-09 10:46
[forum:110219]
Did you do make scotch; make ptscotch?
In this version this is not possible anymore: you have to do make ptscotch only (or make scotch if you want Scotch alone, but not both).
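A minimal sketch of that build step, assuming the Scotch 6.0.0rc17 tarball attached further down this thread has been unpacked and its Makefile.inc already set up (the directory name is an assumption):

> cd scotch_6.0.0rc17/src
> make ptscotch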

XL.

RE: simple problem when running a distributed example
By: Nobody on 2012-11-09 10:18
[forum:110218]
Hello, thanks a lot. I have installed the new Scotch and tried to recompile
PaStiX 5.2.0 release 3923, which unfortunately fails at compilation:


gcc -O3 -Wall -DCUDA_SM_VERSION=20 -DFORCE_NO_CUDA -DMEMORY_USAGE -DSTATS_SOPALIN -I/Users/oswald/extlib/scotch/6.0.0.rc17/openmpi/1.4.4/gcc/4.6.2/include -DDISTRIBUTED -DWITH_SCOTCH -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -I./common/src -I./order/src -I./symbol/src -I./fax/src -I./perf/src -I./blend/src -I./kass/src -I./sopalin/src -I./utils/src -I./matrix_drivers/src -I./wrapper/src -I./sparse-matrix/src -Imurge/include -DPREC_DOUBLE -c symbol/src/symbol_levf.c -o symbol/obj/i686_mac/symbol_levf.o
gcc -O3 -Wall -DCUDA_SM_VERSION=20 -DFORCE_NO_CUDA -DMEMORY_USAGE -DSTATS_SOPALIN -I/Users/oswald/extlib/scotch/6.0.0.rc17/openmpi/1.4.4/gcc/4.6.2/include -DDISTRIBUTED -DWITH_SCOTCH -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -I./common/src -I./order/src -I./symbol/src -I./fax/src -I./perf/src -I./blend/src -I./kass/src -I./sopalin/src -I./utils/src -I./matrix_drivers/src -I./wrapper/src -I./sparse-matrix/src -Imurge/include -DPREC_DOUBLE -c symbol/src/symbol_nonzeros.c -o symbol/obj/i686_mac/symbol_nonzeros.o
gcc -O3 -Wall -DCUDA_SM_VERSION=20 -DFORCE_NO_CUDA -DMEMORY_USAGE -DSTATS_SOPALIN -I/Users/oswald/extlib/scotch/6.0.0.rc17/openmpi/1.4.4/gcc/4.6.2/include -DDISTRIBUTED -DWITH_SCOTCH -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -I./common/src -I./order/src -I./symbol/src -I./fax/src -I./perf/src -I./blend/src -I./kass/src -I./sopalin/src -I./utils/src -I./matrix_drivers/src -I./wrapper/src -I./sparse-matrix/src -Imurge/include -DPREC_DOUBLE -c symbol/src/symbol_tree.c -o symbol/obj/i686_mac/symbol_tree.o
gcc -O3 -Wall -DCUDA_SM_VERSION=20 -DFORCE_NO_CUDA -DMEMORY_USAGE -DSTATS_SOPALIN -I/Users/oswald/extlib/scotch/6.0.0.rc17/openmpi/1.4.4/gcc/4.6.2/include -DDISTRIBUTED -DWITH_SCOTCH -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -I./common/src -I./order/src -I./symbol/src -I./fax/src -I./perf/src -I./blend/src -I./kass/src -I./sopalin/src -I./utils/src -I./matrix_drivers/src -I./wrapper/src -I./sparse-matrix/src -Imurge/include -DPREC_DOUBLE -c fax/src/symbol_compact.c -o fax/obj/i686_mac/symbol_compact.o
gcc -O3 -Wall -DCUDA_SM_VERSION=20 -DFORCE_NO_CUDA -DMEMORY_USAGE -DSTATS_SOPALIN -I/Users/oswald/extlib/scotch/6.0.0.rc17/openmpi/1.4.4/gcc/4.6.2/include -DDISTRIBUTED -DWITH_SCOTCH -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -I./common/src -I./order/src -I./symbol/src -I./fax/src -I./perf/src -I./blend/src -I./kass/src -I./sopalin/src -I./utils/src -I./matrix_drivers/src -I./wrapper/src -I./sparse-matrix/src -Imurge/include -DPREC_DOUBLE -c fax/src/symbol_costi.c -o fax/obj/i686_mac/symbol_costi.o
gcc -O3 -Wall -DCUDA_SM_VERSION=20 -DFORCE_NO_CUDA -DMEMORY_USAGE -DSTATS_SOPALIN -I/Users/oswald/extlib/scotch/6.0.0.rc17/openmpi/1.4.4/gcc/4.6.2/include -DDISTRIBUTED -DWITH_SCOTCH -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -I./common/src -I./order/src -I./symbol/src -I./fax/src -I./perf/src -I./blend/src -I./kass/src -I./sopalin/src -I./utils/src -I./matrix_drivers/src -I./wrapper/src -I./sparse-matrix/src -Imurge/include -DPREC_DOUBLE -c fax/src/symbol_fax.c -o fax/obj/i686_mac/symbol_fax.o
mpicc -O3 -Wall -DCUDA_SM_VERSION=20 -DFORCE_NO_CUDA -DMEMORY_USAGE -DSTATS_SOPALIN -I/Users/oswald/extlib/scotch/6.0.0.rc17/openmpi/1.4.4/gcc/4.6.2/include -DDISTRIBUTED -DWITH_SCOTCH -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -DVERSION='"exported"' -DX_ARCHi686_mac -DDOF_CONSTANT -I./common/src -I./order/src -I./symbol/src -I./fax/src -I./perf/src -I./blend/src -I./kass/src -I./sopalin/src -I./utils/src -I./matrix_drivers/src -I./wrapper/src -I./sparse-matrix/src -Imurge/include -c fax/src/symbol_fax_graph.c -o fax/obj/i686_mac/symbol_fax_graph.o
fax/src/symbol_fax_graph.c:149:21: error: unknown type name ‘SCOTCH_Dgraph’
make: *** [fax/obj/i686_mac/symbol_fax_graph.o] Error 1

I think a type is missing ? Greetings, Benedikt




RE: simple problem when running a distributed example
By: Xavier Lacoste on 2012-11-08 13:57
[forum:110217]

Attachment: scotch_6.0.0rc17.tar.gz
OK, I ran your test and it seems the problem comes from Scotch 5.1.12b.
Scotch 6.0.0rc17 (attached) seems to solve the problem.

RE: simple problem when running a distributed example
By: Nobody on 2012-11-08 11:54
[forum:110216]
Hi, following the C++ example, I have implemented a very simple test program
for testing the distributed interface in C++ using z_dpastix.
Interestingly, it works for small matrices, even in parallel, but going to larger matrix
sizes, the code crashes with the following report:


+ PaStiX : Parallel Sparse matriX package +
+--------------------------------------------------------------------+
Matrix size 600 x 600
Number of nonzeros in A 600
+--------------------------------------------------------------------+
+ Options +
+--------------------------------------------------------------------+
Version : exported
SMP_SOPALIN : Defined
VERSION MPI : Defined
PASTIX_DYNSCHED : Not defined
STATS_SOPALIN : Defined
NAPA_SOPALIN : Defined
TEST_IRECV : Not defined
TEST_ISEND : Defined
TAG : Exact Thread
FORCE_CONSO : Not defined
RECV_FANIN_OR_BLOCK : Not defined
OUT_OF_CORE : Not defined
DISTRIBUTED : Defined
METIS : Not defined
WITH_SCOTCH : Defined
INTEGER TYPE : int
FLOAT TYPE : double complex
+--------------------------------------------------------------------+
Check : Numbering OK
Check : Sort CSC OK
Check : Duplicates OK
[Benedikts-MacBook-Pro:56012] *** Process received signal ***
[Benedikts-MacBook-Pro:56012] Signal: Floating point exception: 8 (8)
[Benedikts-MacBook-Pro:56012] Signal code: Integer divide-by-zero (7)
[Benedikts-MacBook-Pro:56012] Failing at address: 0x10c34dc35
[Benedikts-MacBook-Pro:56012] *** End of error message ***
[Benedikts-MacBook-Pro:56013] *** Process received signal ***
[Benedikts-MacBook-Pro:56013] Signal: Floating point exception: 8 (8)
[Benedikts-MacBook-Pro:56013] Signal code: Integer divide-by-zero (7)
[Benedikts-MacBook-Pro:56013] Failing at address: 0x10bd37c35
[Benedikts-MacBook-Pro:56013] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 56012 on node Benedikts-MacBook-Pro.local exited on signal 8 (Floating point exception: 8).
--------------------------------------------------------------------------

Here's the code snippet that builds the distributed matrix: a purely diagonal
matrix where the elements correspond to the column indices and the right-hand side
is just 1*I, so that the solution can be verified easily:

PaStiX::pastix_data_t *pastix_data = NULL; /* Pointer to a storage structure needed by pastix */
PaStiX::pastix_int_t nlocalcol; /* Size of the matrix */
PaStiX::pastix_int_t *colptr = NULL; /* Indexes of first element of each column in row and values */
PaStiX::pastix_int_t *row = NULL; /* Row of each element of the matrix */
PaStiX::pastix_int_t *local2global = NULL; /* Local to local column correspondance */

std::complex<double> *values = NULL; /* Value of each element of the matrix */
std::complex<double> *rhs = NULL; /* right hand side */
std::complex<double> *rhssaved = NULL; /* right hand side (save) */
std::complex<double> *ax = NULL; /* A times X product */

PaStiX::pastix_int_t iparm[PaStiX::IPARM_SIZE]; /* integer parameters for pastix */
double dparm[PaStiX::DPARM_SIZE]; /* floating parameters for pastix */


PaStiX::pastix_int_t *perm = NULL; /* Permutation tabular */
PaStiX::pastix_int_t *invp = NULL; /* Reverse permutation tabular */

int nbrhs = 1;

/*
char *type = NULL; type of the matrix
char *rhstype = NULL; type of the right hand side


PaStiX::driver_type_t *driver_type; Matrix driver(s) requested by user
char **filename; Filename(s) given by user
int nbmatrices; Number of matrices given by user
int nbthread; Number of thread wanted by user
int verbosemode; Level of verbose mode (0, 1, 2)
int ordering; Ordering to use

int incomplete; Indicate if we want to use incomplete factorisation
int level_of_fill; Level of fill for incomplete factorisation
int amalgamation; Level of amalgamation for Kass
int ooc; OOC limit (Mo/percent depending on compilation options)
PaStiX::pastix_int_t mat_type;
long i;
double norme1, norme2;
*/



/***********************************************************/
/** \brief initialize parameters to default values **/
/***********************************************************/

loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("[[[ PASTIX ::: initializing operational parameters to default values...")
);

iparm[PaStiX::IPARM_MODIFY_PARAMETER] = PaStiX::API_NO;

PaStiX::z_dpastix(&pastix_data,
MPI_COMM_WORLD,
nlocalcol,
colptr,
row,
values,
local2global,
perm,
invp,
rhs,
nbrhs,
iparm,
dparm
);

loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("...PASTIX ::: initialized operational parameters to default values ]]]")
);


/****************************************************************/
/** \brief initialize a sample matrix to be used in Pastix. **/
/** nota bene: matrix element index numbers follows **/
/** the Fortran convention, i.e. starts at 1 **/
/****************************************************************/


/************************************************************************/
/** \brief define the size of the local matrix !!! **/
/************************************************************************/
nlocalcol = 300;
PaStiX::pastix_int_t nlocalrow = nlocalcol;


/************************************************************************/
/** \brief dynamically allocate memory for the matrix elements; **/
/** in order to verify the correct solution we design a matrix **/
/** that is purely diagnonal: **/
/** **/
/** therefore, the number matrix elements equals the number of **/
/** columns or rows, which is equivalent. **/
/************************************************************************/

PaStiX::pastix_int_t nmatrixelem = nlocalcol;

values = new std::complex<double>[nmatrixelem];

/************************************************************/
/** \brief insert values into the values array **/
/** **/
/** matrix = [ 1 0 0 0 0 0 0 0 0 0 ] **/
/** 0 2 0 0 0 0 0 0 0 0 ] **/
/** 0 0 3 0 0 0 0 0 0 0 ] **/
/** 0 0 0 4 0 0 0 0 0 0 ] **/
/** 0 0 0 0 5 0 0 0 0 0 ] **/
/** 0 0 0 0 0 6 0 0 0 0 ] **/
/** 0 0 0 0 0 0 7 0 0 0 ] **/
/** 0 0 0 0 0 0 0 8 0 0 ] **/
/** 0 0 0 0 0 0 0 0 9 0 ] **/
/** 0 0 0 0 0 0 0 0 0 10 ] **/
/** **/
/************************************************************/

loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("[[[ PASTIX ::: initializing matrix values...")
);

for(unsigned int i=0;i < nmatrixelem;i++)
{
values[i] = static_cast<double>( rank * nlocalcol ) + static_cast<double>(i+1);
}


loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("...PASTIX ::: initialized matrix values ]]]")
);


/******************************************************************************************/
/** \brief allocate the array to store the column pointers, **/
/** i.e. the array that says at which index position in the matrix **/
/** array a new column starts! **/
/** **/
/** nota bene: pastix uses the FORTRAN convention for array indexing !!! **/
/******************************************************************************************/


/** \brief since we use a diagonal matrix, there are as many elements as columns + 1 in this array **/
loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("[[[ PASTIX ::: initializing column pointer array...")
);

PaStiX::pastix_int_t ncolptr = nlocalcol + 1;
colptr = new PaStiX::pastix_int_t[ncolptr];

for(unsigned int i=0;i < ncolptr;i++)
{
colptr[i] = i + 1;
}

loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("...PASTIX ::: initialized column pointer array ]]]")
);


/******************************************************************************/
/** \brief allocate the array to store the local to global column indices **/
/** i.e. the array that, given the local index, gives the global **/
/** column index; this array has as many elements as there are **/
/** local columns. **/
/******************************************************************************/
loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("[[[ PASTIX ::: initializing local2global array...")
);

local2global = new PaStiX::pastix_int_t[nlocalcol];

for(unsigned int i=0;i < nlocalcol;i++)
{
local2global[i] = (rank * nlocalcol) + (i + 1); /** \brief nota bene, we use the FORTRAN convention **/
}

loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("...PASTIX ::: initialized local2global array ]]]")
);


/******************************************************************************************/
/** \brief allocate the array to store the row indices of every matrix element **/
/** therefore this array has as many elements as has the matrix element array. **/
/** **/
/** nota bene: pastix uses the FORTRAN convention for array indexing !!! **/
/******************************************************************************************/


loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("[[[ PASTIX ::: initializing row array...")
);


row = new PaStiX::pastix_int_t[nmatrixelem];

for(unsigned int i=0;i < nmatrixelem;i++)
{
row[i] = (rank * nlocalcol) + (i + 1); /** \brief nota bene, we use the FORTRAN convention **/
}


loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("...PASTIX ::: initialized row array ]]]")
);


/******************************************************************************************/
/** \brief allocate the array to store the right hand side of the linear system **/
/******************************************************************************************/

loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("[[[ PASTIX ::: initializing r.h.s. array...")
);

rhs = new std::complex<double>[nlocalrow];

for(unsigned int i=0;i < nlocalrow;i++)
{
rhs[i] = std::complex<double>(0.0,1.0);
}

loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("...PASTIX ::: initialized r.h.s array ]]]")
);



/******************************************************/
/** \brief call distributed interface pastix **/
/******************************************************/

loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("[[[ PASTIX ::: initializing permutation array...")
);

perm = new PaStiX::pastix_int_t [nlocalcol];
invp = new PaStiX::pastix_int_t [nlocalcol];

loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("...PASTIX ::: initialized permutation array ]]]")
);


// PRINT_RHS_CPLX("RHS", rhs, ncol, mpid, iparm[PaStiX::IPARM_VERBOSE]);


loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string("[[[ ================================================================")
);

loggingmessage<PRODUCTION_PHASE, TIME>(std::string(__FILE__), __LINE__,
std::string(" solving via distributed interface pastix [z_dpastix] ...")
);


PaStiX::z_dpastix(&pastix_data,
MPI_COMM_WORLD,
nlocalcol,
colptr,
row,
values,
local2global,
perm,
invp,
rhs,
nbrhs,
iparm,
dparm
);



Do you see an obvious bug? I am puzzled that it works on 2, 4 and 8 processes
for small matrix sizes, e.g. 10 or 20, but crashes otherwise.

Thanks again, Benedikt


RE: simple problem when running a distributed example
By: Xavier Lacoste on 2012-11-05 15:54
[forum:110213]
Hello again,

Yes, I haven't written the distributed example yet, but it should be similar.

XL.

RE: simple problem when running a distributed example
By: Nobody on 2012-11-05 15:34
[forum:110212]
Hi, thank you very much!!!

I have just installed the new release and am now studying the C++ example.
I guess I can just use the distributed version analogously?

Thanks again! Benedikt

RE: simple problem when running a distributed example
By: Xavier Lacoste on 2012-11-05 14:58
[forum:110211]
Hello,

I created a C++ example for PaStiX.
I had to make a few modifications to PaStiX to be able to make it work.
We produced a pre-release version of PaStiX; you can download it here: http://www.labri.fr/perso/ramet/pastix_release_3923.tar.bz2.

In this release you can find a C++ example in src/examples/src/cppsimple.cpp.

This example uses pastix_float_t, but you can use std::complex<double> with z_dpastix().

You can generate the example using std::complex<double> with z_dpastix() by doing:

> cd examples/src
> make zcppsimple.cpp

Have a nice day,

XL.

RE: simple problem when running a distributed example
By: Xavier Lacoste on 2012-11-03 10:21
[forum:110206]
Hi,

You should have this for double complex :


###################################################################
# FLOAT TYPE #
###################################################################
CCTYPESFLT =
# Uncomment the following lines for double precision support
VERSIONPRC = _double
CCTYPESFLT := $(CCTYPESFLT) -DFORCE_DOUBLE -DPREC_DOUBLE

# Uncomment the following lines for float=complex support
VERSIONFLT = _complex
CCTYPESFLT := $(CCTYPESFLT) -DFORCE_COMPLEX -DTYPE_COMPLEX

The first part is for double precision, the second one for complex.

(or else use z_dpastix)

Inside C++ code I think you can use C++ complex; it should be compatible.
And you have to include pastix.h inside extern "C" {}.
We have to write a C++ example to help C++ users... ==> TODO list.
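A minimal sketch of the include, assuming the C++ code wraps the header in a namespace the way Benedikt's snippet above does (the namespace is optional; the essential part is the extern "C" block, and the header path depends on your installation):

#include <complex>

namespace PaStiX {     /* optional, mirrors the usage in the snippet above */
extern "C" {
#include "pastix.h"    /* C header, so it needs C linkage when compiled as C++ */
}
}
/* afterwards, call PaStiX::z_dpastix(...) as in the example above */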

XL.

RE: simple problem when running a distributed example
By: Benedikt Oswald on 2012-11-01 16:02
[forum:110205]
Thanks. Still, unfortunately, I am a bit stuck, since the place where I want to use
PaStiX is inside C++ code. When I include <complex.h> and try to use the C99 macros
for complex numbers, gcc 4.6.2 reports that things such as creal, cimag and
_Complex_I are not defined. I suppose this is because I (need to) use g++ for
compiling.

On the other hand, how exactly should I set the flags in config.in for compiling
PaStiX? More to the point, how should this passage be set:

###################################################################
# FLOAT TYPE #
###################################################################
CCTYPESFLT =
# Uncomment the following lines for double precision support
VERSIONPRC = _double
CCTYPESFLT := $(CCTYPESFLT) -DFORCE_DOUBLE -DPREC_DOUBLE

# Uncomment the following lines for float=complex support
#VERSIONFLT = _complex
#CCTYPESFLT := $(CCTYPESFLT) -DFORCE_COMPLEX -DTYPE_COMPLEX

in order to have pastix_float_t defined as complex, and which
complex type is it then?

Thanks a lot, Benedikt






RE: simple problem when running a distributed example
By: Xavier Lacoste on 2012-10-31 15:03
[forum:110200]
Hello,

Those ComplexDouble_ types are only there for the C++ interface.

You can use the standard C99 "double complex" with z_dpastix().

Or you can build PaStiX with -DPREC_DOUBLE and -DTYPE_COMPLEX uncommented in config.in and use pastix_float_t.

XL.

RE: simple problem when running a distributed example
By: Nobody on 2012-10-31 14:29
[forum:110199]
Hello, another basic question:

I need to declare the non-zero values for the matrix and I would like
to use complex numbers with double precision.

What data type should I use?

I used ComplexDouble_ but this obviously was not liked
by the compiler.

Thanks for a quick answer, Benedikt

RE: simple problem when running a distributed example
By: Nobody on 2012-10-30 09:05
[forum:110194]
The Cray XT6 is in fact the foreseen target/production machine
where I need a direct solver most. I will get back as soon as
I have news on using PaStiX there. Benedikt

RE: simple problem when running a distributed example
By: Xavier Lacoste on 2012-10-30 09:04
[forum:110193]
OK,

then I have to perform some tests with Open MPI without thread support.
Thanks for the report.

I have never worked on a Cray machine, so I don't know how PaStiX behaves on it.

XL.

RE: simple problem when running a distributed example
By: Nobody on 2012-10-30 08:40
[forum:110192]
Hi Xavier, in fact none of the examples works with these flags.

Therefore, I went ahead and reinstalled my Open MPI
with --enable-mpi-threads, and then recompiled everything.
Now the examples work.
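For reference, a minimal sketch of such an Open MPI rebuild; only the --enable-mpi-threads flag comes from this thread, the version and install prefix are assumptions to adapt to your setup:

> ./configure --enable-mpi-threads --prefix=$HOME/extlib/openmpi/1.4.4
> make all install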

I wonder what will happen on the Cray XT6 architecture?

Do you have experience with this ?

Thanks, Benedikt



