Forum: open-discussion

RE: Call for feedback and discussion about the next Polyphemus 2.0
By: Pierre Tran on 2011-11-09 12:43
[forum:108375]
A presentation of what was eventually developed as an alpha version can be found at
https://gforge.inria.fr/docman/view.php/1173/7673/IO_presentation_20111108.pdf



Call for feedback and discussion about the next Polyphemus 2.0
By: Pierre Tran on 2010-12-06 12:44
[forum:105950]
INTRODUCTION
=========
When Polyphemus 1.7 was released a few months ago, we announced a big overhaul
for the version to be released this autumn. At the time, we thought of
changes allowing Polyphemus to write its output results in a self-describing
format (for example, netCDF) in order to replace the raw binary format.
However, after some exploratory work and further thought, we came to the
conclusion that a far broader overhaul was necessary to keep Polyphemus in
line with its global objectives of:
- flexibility and versatility: users should be able to easily adapt Polyphemus
output to whatever they need;
- and modularity: the new I/O architecture should be as modular as possible,
so that developed modules might be reusable elsewhere and external modules
might be plugged in with little difficulty.
Such an overhaul was also a good opportunity to make the Polyphemus simulation
chain easier to use for beginners while staying as flexible as before for
advanced users.
Here is the description of the Polyphemus HMI (Human Machine Interface)
overhaul we are envisioning for now. Even though, as you will see, it is rather
mature, it is still a draft, and because it is a draft, every Polyphemus user
is invited to contribute their specific needs, points of view and
ideas. As backward compatibility with Polyphemus 1.x will be dropped,
keep in mind that ALL users will be impacted by the future Polyphemus 2.0.

A/ HMI
====
From the user's point of view, this HMI overhaul will be fourfold:

1/ Global changes in the structure and syntax of Polyphemus configuration files
-------------------------------------------------------------------------------
The treatment of configuration files will rely on a new C++ library called Ops
( http://gitorious.org/libops/pages/Home ) that can deal with Lua files
(http://www.lua.org/). The Ops-Lua pair will make it possible to overcome some
of the limitations encountered with Talos, such as its flat structure. For
instance, preprocessing/emissions/emissions.cfg might be
replaced by a new file emissions.lua. Such a file would support:

- inclusion of other files through the instruction dofile(). For instance,

dofile("../general.lua")

- global variables available within the scope of emissions.lua and its
included files. For instance, look at the global variable "year",

year = 2001

emissions = {

   options = {
      divide_by_heights = true,
   },

   input = {
      emep = {
         directory = "/u/cergrene/A/quelo/EMEP-emissions/" .. year .. "/",
         species = {"SOX", "NOX", "NMVOC", "CO", "NH3", "PM2.5", "PMcoarse"},
         (...)
      }
      (...)
   }

}

where "(...)" is not an Ops-Lua symbol but stands for "text not included
here"!

- a structured way to read and write information that improves the
organization within the configuration files. In the example above, the object
emissions has two members called emissions.options and emissions.input;
emissions.options has one member, emissions.options.divide_by_heights, and so on.

- types like boolean (for instance, emissions.options.divide_by_heights),
integer (for instance, year), string (for instance,
emissions.input.emep.directory: note that ".." concatenates strings),
float, double and also vectors of every supported type (for instance,
emissions.input.emep.species).
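
To make the nesting and the types more concrete, here is a minimal sketch in plain Lua. It does not use Ops at all: it only evaluates the example above (with the elided "(...)" parts left out) and shows how the members are addressed afterwards. Lua itself only knows the type "number"; the assumption here is that the distinction between integer, float and double is made on the C++ side when the value is read.

```lua
-- Plain Lua sketch; Ops itself is not involved here.
year = 2001

emissions = {
   options = {
      divide_by_heights = true,
   },
   input = {
      emep = {
         -- ".." concatenates strings; the number is converted automatically.
         directory = "/u/cergrene/A/quelo/EMEP-emissions/" .. year .. "/",
         species = {"SOX", "NOX", "NMVOC", "CO", "NH3", "PM2.5", "PMcoarse"},
      },
   },
}

print(type(emissions.options.divide_by_heights))  -- boolean
print(type(year))                                  -- number
print(emissions.input.emep.directory)              -- path ending in "2001/"
print(#emissions.input.emep.species)               -- length of the list
print(emissions.input.emep.species[1])             -- Lua lists start at index 1
```

Saved as a file, this can be run with the standalone "lua" interpreter to check a configuration for syntax errors before handing it to a Polyphemus program.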

2/ A new homogeneous output configuration
------------------------------------------------------------------
Over the years, output configuration has become tricky. Each preprocessing
program had its own rigid rules to configure output, leaving few
degrees of freedom to a user facing a specific use case. Moreover, output
systems differed between preprocessing and processing programs.
With Polyphemus 2.0, we aim to offer unrivaled flexibility and
versatility to the user within a single framework.
For instance, we would get for the example of emissions.lua:

emissions = {
   (...)
   output = {
      {
         species = {"all"},
         variable = {"SurfaceEmissions", "VolumeEmissions"},
         file = directory_computed_fields ..
            "/emissions/${variable}/${species}.bin",
         format = "binary",

         -- Units for output emissions: mass / number
         -- - mass: microgram m^{-2} s^{-1} & microgram m^{-3} s^{-1}
         -- - number: molecule cm^{-2} s^{-1} & molecule cm^{-3} s^{-1}
         -- Only needed if Output_unit is set to 'number'.
         unit = "mass"
      }
   }
   (...)
}
where -- starts a comment line.

That would give an output completely equivalent to the one obtained with
preprocessing/emissions/emissions.cfg in Polyphemus 1.x, i.e. save every type
of emissions in a specific directory (${variable} is replaced by the items of
the list variable) on the basis of one file per species (${species} is
replaced by the names of the species available in the chemical model).
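
The substitution of ${variable} and ${species} can be pictured with the following plain Lua sketch. The helper "expand" is hypothetical and only illustrates the idea; how Polyphemus 2.0 actually performs the replacement is not specified here, and the two species names stand in for the list that the chemical model would provide.

```lua
-- Hypothetical helper: replaces every ${name} in a template by values[name].
-- Placeholders without a value are left untouched.
local function expand(template, values)
   return (template:gsub("%${([%w_%-]+)}", function(name)
      return values[name]  -- returning nil keeps the original text
   end))
end

local template = "/emissions/${variable}/${species}.bin"
local variables = {"SurfaceEmissions", "VolumeEmissions"}
local species = {"NOX", "CO"}  -- stand-in for the species of the chemical model

-- One file per (variable, species) pair.
for _, v in ipairs(variables) do
   for _, s in ipairs(species) do
      print(expand(template, {variable = v, species = s}))
   end
end
```

With these stand-in lists the loop prints four paths, one per variable and species, such as /emissions/SurfaceEmissions/NOX.bin.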
As you might guess, the new syntax offers numerous new possibilities. For
instance,

emissions = {
   (...)
   output = {
      {
         species = {"all"},
         variable = {"SurfaceEmissions", "VolumeEmissions"},
         file = directory_computed_fields ..
            "/emissions/${yyyy-mm-dd}.nc",
         format = "netCDF-4",

         -- Units for output emissions: mass / number
         -- - mass: microgram m^{-2} s^{-1} & microgram m^{-3} s^{-1}
         -- - number: molecule cm^{-2} s^{-1} & molecule cm^{-3} s^{-1}
         -- Only needed if Output_unit is set to 'number'.
         unit = "mass"
      }
   }
   (...)
}

would put the results in daily self-describing files (in the netCDF-4 format)
named after the date.
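
As an illustration of the daily naming, here is a small plain Lua sketch. The helper "expand_date" is hypothetical, and the three dates are arbitrary examples; the sketch only assumes that ${yyyy-mm-dd} is replaced by the zero-padded simulation date.

```lua
-- Hypothetical helper: expands ${yyyy-mm-dd} for a given simulation date.
local function expand_date(template, date)
   local stamp = string.format("%04d-%02d-%02d", date.year, date.month, date.day)
   -- "-" is a pattern quantifier in Lua, so it has to be escaped with "%".
   return (template:gsub("%${yyyy%-mm%-dd}", stamp))
end

-- One output file per simulated day.
for day = 1, 3 do
   print(expand_date("/emissions/${yyyy-mm-dd}.nc",
                     {year = 2001, month = 1, day = day}))
end
```

Each simulated day thus maps to its own file, for example /emissions/2001-01-01.nc for the first day.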
As with the former "[save]" sections of Polyphemus 1.x, several output
requests can be given in the emissions.output object, allowing sophisticated
combinations:

emissions = {
   ...
   output = {
      -- first output request
      {
         species = ...
      },
      -- second output request
      {
         ...
      },
      ...
   }
}


3/ The possibility to control the whole simulation chain from a unique configuration file
-----------------------------------------------------------------------------------------

This simplifying possibility results from the hierarchical structure of the
Lua configuration files. The configuration information pertaining to each
program is now included within its own object. One might also take advantage
of the dofile function to lighten the structure of a unified configuration file.
For instance:

dofile("../general.lua") -- For the domain object.

luc_glcf = { -- "luc-glcf" is not a valid Lua identifier, hence the underscore.
...
}

roughness = {
...
}

ic = {
...
}

bc = {
...
}

meteo = {
...
}

emissions = {
...
}

bio = {
...
}

processing = {
...
}

...


4/ The possibility to set and launch the whole simulation chain with a unique configuration file and even from a unique script file
-----------------------------------------------------------------------------------------------------------------------------------

Now that all the configuration information can be found in a single
configuration file, one might launch each link of the simulation chain with
it. For instance,

luc-glcf myconfig.lua
roughness myconfig.lua
ic myconfig.lua
polair3d myconfig.lua

The last step might be to use a script that launches every processing step
configured in myconfig.lua:

polyphemus myconfig.lua
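
What such a script could do is sketched below in plain Lua. Everything in it is an assumption made for illustration: the table "chain", the function "plan", the existence of a "bc" program and the mapping of the "processing" object to the polair3d program are not part of any announced Polyphemus 2.0 interface. The sketch only builds the command lines and does not run them.

```lua
-- Hypothetical ordered list of (configuration object, program) pairs.
local chain = {
   {object = "roughness",  program = "roughness"},
   {object = "ic",         program = "ic"},
   {object = "bc",         program = "bc"},
   {object = "processing", program = "polair3d"},
}

-- Returns the command lines for the steps that are configured in `config`,
-- in the order given by `chain`.
local function plan(config, config_file)
   local commands = {}
   for _, step in ipairs(chain) do
      if config[step.object] ~= nil then
         commands[#commands + 1] = step.program .. " " .. config_file
      end
   end
   return commands
end

-- Stand-in for the objects that myconfig.lua would define.
local config = {roughness = {}, ic = {}, processing = {}}

for _, command in ipairs(plan(config, "myconfig.lua")) do
   print(command)  -- a real driver would call os.execute(command) instead
end
```

Steps whose object is absent from the configuration (here "bc") are simply skipped, so the same driver would serve both partial and complete simulation chains.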

B/ IO formats, metadata conventions and layout
============================
Let us not forget that the initial purpose of this overhaul was to allow the
writing of self-describing output files. Still, there are many different
formats that users might need. The challenge is then to allow each user to use
the format they want with as little work as possible. The targeted formats in
Polyphemus 2.0 are:
- raw binary (same as before to keep compatibility with former results),
- netCDF (different versions),
- HDF5,
- ascii text.
Other formats such as GRIB might be added in the near future, as the I/O
architecture will define a clear and limited interface for formats.

Metadata conventions will also be separated from format implementation and
users will be able to add their own or adapt existing ones. The targeted
metadata conventions for the release of Polyphemus 2.0 are:
- COARDS
- CF
We might add the ones of WRF and IRSN (V1.0).

For a given metadata convention, some formats allow various layouts. For
instance, the raw text or binary formats are indeed so raw that users might
want to define their own output layout. So layout implementation will be
separated from format and metadata implementations.

Finally, a user might be able to save the same results in different formats,
using different metadata standards and various layouts.
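
In terms of configuration, this could look like the fragment below. It is purely illustrative: the keys "convention" and "layout" and the value "one-file-per-species" are hypothetical names invented for this sketch, not announced Polyphemus 2.0 syntax. Only "species", "variable", "file" and "format" follow the draft examples given earlier in this post.

```lua
-- Hypothetical fragment: "convention" and "layout" are illustrative keys only.
emissions = {
   output = {
      -- Raw binary, with a user-chosen layout.
      {
         species = {"all"},
         variable = {"SurfaceEmissions"},
         file = "/emissions/${variable}/${species}.bin",
         format = "binary",
         layout = "one-file-per-species",
      },
      -- The same fields again, self-describing, with CF metadata.
      {
         species = {"all"},
         variable = {"SurfaceEmissions"},
         file = "/emissions/${yyyy-mm-dd}.nc",
         format = "netCDF-4",
         convention = "CF",
      },
   },
}
```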

This whole new I/O architecture will give an enriched meaning to the Polyphemus
name, which roughly translates into "multiple speeches".