[#18859] scotch for Openfoam not decomposing with 384 partitions

Date:
2015-04-15 10:26
Priority:
3
State:
Open
Submitted by:
matteo lombardi (matteolombardi)
Assigned to:
Francois PELLEGRINI (pelegrin)
Category:
none
Group:
none
Resolution:
none
Summary:
scotch for Openfoam not decomposing with 384 partitions

Detailed description
Hello,
I am trying to decompose an OpenFOAM case with Scotch, and it works fine when selecting 4, 8, 128, 256, or 512 partitions.

But it fails when I ask for 384 partitions (unfortunately, this is the value I need for my cluster; long story).
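
One pattern that may be worth noting (an observation, not a diagnosis): every partition count reported as working is a power of two, while 384 = 3 × 128 is not. A minimal Python check, where `is_power_of_two` is a hypothetical helper name introduced only for illustration:

```python
def is_power_of_two(n: int) -> bool:
    """Return True if n is a positive power of two (exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0


# Partition counts taken from the report above.
working = [4, 8, 128, 256, 512]
failing = [384]

for count in working + failing:
    status = "power of two" if is_power_of_two(count) else "not a power of two"
    print(f"{count:4d}: {status}")
```

Whether this has anything to do with the failure is an open question; it only suggests trying other non-power-of-two counts (for example 96 or 192) to narrow the problem down.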

Here is the error:

Selecting decompositionMethod ptscotch
(192): (272): ERROR: dgraphCheck: inconsistent communication data (5)
(304): ERROR: dgraphCheck: inconsistent communication data (5)
(336): ERROR: dgraphCheck: inconsistent communication data (5)
(368): ERROR: dgraphCheck: inconsistent communication data (5)
(202): (200): ERROR: dgraphCheck: inconsistent communication data (5)
(206): ERROR: dgraphCheck: inconsistent communication data (5)
(198): (201): (194): (196): ERROR: dgraphCheck: inconsistent communication data (5)
(236): (250): ERROR: dgraphCheck: inconsistent communication data (5)
(203): ERROR: dgraphCheck: inconsistent communication data (5)
(216): (207): ERROR: dgraphCheck: inconsistent communication data (5)
(238): (230): ERROR: dgraphCheck: inconsistent communication data (5)
(233): ERROR: dgraphCheck: inconsistent communication data (5)
(197): ERROR: dgraphCheck: inconsistent communication data (5)
(274): ERROR: dgraphCheck: inconsistent communication data (5)
(195): ERROR: dgraphCheck: inconsistent communication data (5)
(193): (199): ERROR: dgraphCheck: inconsistent communication data (5)


Any idea why this happens? Is it a known bug? Is there a fix?

Thank you very much,
Matteo
Messages
Date: 2019-05-07 18:40
Sender: Francois PELLEGRINI

Dear Matteo,
Sorry for having been silent for so long.
In the coming weeks, I plan to devote some time to fixing this issue once and for all.
Can you please get in touch with me by e-mail, at my academic address (francois.pellegrini@u-bordeaux.fr), so that we can devise ways for me to get the debugging info that I need, without requiring you to export sensitive data?
Sorry for the inconvenience,
Regards,
f.p.

Date: 2019-03-14 14:52
Sender: matteo lombardi

Hello,
Sorry, I had missed your reply.
Thanks for looking into this.

Some time has passed, and we still occasionally get errors like the one reported above. More often, we now get this one:
dgraphGatherAll2: out of memory.

These errors are quite random.
Often, just changing the number of partitions or slightly modifying the mesh is enough for the decomposition to work.

Some background:
We use OpenFOAM and run cases with ~500+ million cells generated by snappyHexMesh (the OpenFOAM mesher). We decompose the domain into anywhere from ~100 to ~400 partitions, calling the ptscotch library.
We used to run with Scotch 6.0.3, but I have now also tested Scotch 6.0.6, and my test case failed there as well.

Unfortunately, I cannot share my test case because I work in an F1 team and we cannot send any geometry out.

I have checked my nodes during execution, and I am sure they are not running out of memory.

Any suggestions?
Is there a hard-coded limit on max_label for cells that we are reaching? Is there any flag we can add, also to help you debug?
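
On the question of a limit, one thing that might be worth ruling out is integer width. Both OpenFOAM (label size, set by `WM_LABEL_SIZE`) and Scotch (its `SCOTCH_Num` integer type) fix their integer width at build time, and the thread does not say which width is in use here. The rough Python estimate below is only a sketch: the figure of 6 neighbours per cell is an assumption (mostly hexahedral cells), not a number from the actual mesh.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold

cells = 500_000_000      # ~500 million cells, as stated in the message
neighbours_per_cell = 6  # ASSUMPTION: mostly hexahedral cells, boundaries ignored

# In an adjacency (arc) list each cell-to-cell connection is stored once per
# endpoint, so the total arc count is roughly cells * neighbours_per_cell.
arcs = cells * neighbours_per_cell

print(f"cell count {cells:,} fits in int32: {cells <= INT32_MAX}")
print(f"arc count  {arcs:,} fits in int32: {arcs <= INT32_MAX}")
```

Under these assumptions the cell count fits in a signed 32-bit integer but the estimated arc count does not, so it may be useful to confirm that both OpenFOAM and Scotch were built with 64-bit integers before looking further.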

Thank you very much,
Matteo

Date: 2018-02-12 07:30
Sender: Francois PELLEGRINI

Moved from Support Requests to Bugs

Date: 2018-02-12 07:30
Sender: Francois PELLEGRINI

Dear Matteo,
Sorry to hear that. Can you check whether the new Scotch 6.0.5 fixes the issue?
If not, can you send me a reproducer?
Regards and sorry again,
f.p.

Field       | Old Value        | Date             | By
type        | Support Requests | 2018-02-12 07:30 | pelegrin
assigned_to | none             | 2018-02-12 07:30 | pelegrin