Technical Specifications

Introduction

The LogolMatch software suite is composed of three components:

* LogolExec, the main program
* LogolDesigner, a graphical interface to create models (optional)
* LogolAnalyser, a basic web interface to submit jobs and to analyse the results.

Features

The list of features with their technical description is available here: LogolFeatures

LogolDesigner

LogolDesigner is made of JavaScript and XML files. It is based on the mxGraph library (http://www.jgraph.com/mxgraph.html), which requires a license to develop and use. This license is free for academics.

Configuration files are in the javascript/designer/editors/config directory. Properties and components are defined in wftoolbar-logol.xml; the application is defined by logoleditor.xml.

The library is in javascript/designer/editors/js/logol.js. It holds all the customization and event management for the graph. It is based on the mxGraph library, which is hosted by the mxGraph company and downloaded by the browser.

The web page is located at javascript/designer/editors/logoleditor.html. This file contains the license key and specifies the JavaScript files to load.

A Java application allows models to be loaded into the interface via an upload mechanism. It is managed by the ModelLoad Java class.

An Ant build file is available to package the application. It should be deployed in a web container such as Tomcat.

LogolAnalyser

LogolAnalyser is a GWT (Google Web Toolkit) based web application. With GWT, Java classes are transformed into JavaScript by the GWT compiler to provide Ajax-based applications.

The application is made of client and server classes. Client classes relate to the web interface, while server classes manage the server-side work (file uploads, XML manipulation, etc.).

The entry point is the class LogolAnalyser.java. The interface is based on Panels; each Panel is defined in its own class, depending on its features.
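
As a rough illustration, a GWT entry point has the following shape. This is only a sketch: the package name and panel contents below are hypothetical, not the actual LogolAnalyser classes.

    // Minimal sketch of a GWT entry point; panel contents are placeholders.
    package org.irisa.genouest.logol.analyser.client; // hypothetical package

    import com.google.gwt.core.client.EntryPoint;
    import com.google.gwt.user.client.ui.Label;
    import com.google.gwt.user.client.ui.RootPanel;
    import com.google.gwt.user.client.ui.TabPanel;

    public class LogolAnalyser implements EntryPoint {
        // Called by GWT once the compiled JavaScript module is loaded.
        public void onModuleLoad() {
            TabPanel tabs = new TabPanel();
            // Each feature lives in its own Panel class (placeholders here).
            tabs.add(new Label("job submission form"), "Submit");
            tabs.add(new Label("result browser"), "Results");
            tabs.selectTab(0);
            RootPanel.get().add(tabs);
        }
    }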

The server classes provide a job submission interface that allows both local jobs and submission to DRMAA-compliant systems (SGE, for example).
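
A minimal sketch of a DRMAA submission in Java, assuming the standard org.ggf.drmaa binding; the command path and arguments are placeholders:

    // Sketch of a DRMAA job submission (e.g. to an SGE cluster).
    import java.util.Arrays;

    import org.ggf.drmaa.DrmaaException;
    import org.ggf.drmaa.JobTemplate;
    import org.ggf.drmaa.Session;
    import org.ggf.drmaa.SessionFactory;

    public class DrmaaSubmitSketch {
        public static void main(String[] args) throws DrmaaException {
            Session session = SessionFactory.getFactory().getSession();
            session.init(null); // default DRM contact
            JobTemplate jt = session.createJobTemplate();
            jt.setRemoteCommand("/path/to/LogolExec.sh");       // placeholder
            jt.setArgs(Arrays.asList("mymodel.xml", "seq.fa")); // placeholder
            String jobId = session.runJob(jt);
            session.wait(jobId, Session.TIMEOUT_WAIT_FOREVER); // block until done
            session.deleteJobTemplate(jt);
            session.exit();
        }
    }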

The application also uses some Gwtext graphical components.

An Ant build file is available to package the application. It should be deployed in a web container such as Tomcat.

LogolExec

LogolExec is the most complex component. It is made of several executables written in several languages.

From an executable point of view we have:

* LogolMultiExec.sh: a wrapper to handle multiple input sequences. It dispatches the jobs (threads or cluster jobs) and tries to split the input sequences too, if possible, according to the model and sequence size. It is in charge of merging the results into a zip file (calls LogolExec.jar). A sketch of the overall call chain follows this list.
* LogolExec.sh: called by LogolMultiExec. It interprets the input grammar, generates a Prolog file and calls logol.exe on a single input sequence (or part of a sequence). It generates an XML result file (calls LogolExec.jar).
* logol.exe: a Prolog-based program. It is in charge of the model search on the sequence, which it performs with the help of the Prolog file generated by LogolExec and a library included in the executable (logol.pro). It is compiled with the SICStus 4 Prolog implementation.
* Vmatch: an external tool developed by S. Kurtz. Logol uses this program via external calls to perform some specific searches within the sequence. It is a suffix array based tool. It requires a license from S. Kurtz and is not shipped with Logol. It is called via the wrapper suffixsearch.sh.
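
The chain of calls can be pictured with the following sketch. It is illustrative only: the arguments passed to the scripts are placeholders, not the actual command-line interface.

    // Illustrative sketch of the LogolExec call chain (arguments are placeholders).
    import java.io.IOException;

    public class PipelineSketch {
        static void run(String... cmd) throws IOException, InterruptedException {
            // inheritIO() forwards the child's stdout/stderr to ours.
            new ProcessBuilder(cmd).inheritIO().start().waitFor();
        }

        public static void main(String[] args) throws Exception {
            // The wrapper dispatches/splits and merges results into a zip...
            run("sh", "LogolMultiExec.sh", "mymodel.xml", "sequences.fasta");
            // ...and, for each (sub)sequence, ends up doing roughly:
            run("sh", "LogolExec.sh", "mymodel.xml", "chunk_1.fasta"); // grammar -> Prolog
            run("./logol.exe", "generated_model.pro");                // search -> XML result
        }
    }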

Logol.pro

The main Prolog library. It contains a large number of functions (predicates) to match the models: conversion/morphism functions, search functions with or without errors, etc.

Some results can also be saved in memory for post-treatment (filters). Each result is "saved" as an object with all its data and "sub" data, for example: [name, startpos, endpos, ..., [[subname1, startpos1, ...], [subname2, startpos2, ...]]].
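
In Java terms, such a nested result could be modelled as below. This is a hypothetical mirror of the structure, not a class from the code base.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical mirror of the saved result term:
    // [name, startpos, endpos, ..., [[subname1, startpos1, ...], ...]]
    public class MatchResult {
        String name;
        int startPos;
        int endPos;
        // Each sub-result has the same shape, recursively.
        List<MatchResult> subResults = new ArrayList<>();
    }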

The library accesses the sequence file with random access.
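
Since the sequence file is prepared as single-line data (see the splitting step below), a position in the sequence maps directly to a byte offset. A minimal Java illustration:

    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class SeekSketch {
        // Read `length` residues starting at 0-based `position` from a
        // single-line sequence file (no newlines inside the sequence).
        static String readRegion(String path, long position, int length) throws IOException {
            try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
                raf.seek(position); // jump straight to the offset
                byte[] buf = new byte[length];
                raf.readFully(buf);
                return new String(buf);
            }
        }
    }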

LogolExec.jar

Behavior

This Java program analyses the model and the sequence to determine what needs to be done. If the model allows sequence splitting and the file is large enough, it will split the sequence, execute several jobs, then merge the results. These jobs can be local or submitted to a cluster via the DRMAA Java library.

Locally, jobs are managed by threads.
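
A minimal sketch of such local dispatch with a thread pool; runJob is a hypothetical stand-in for the call to LogolExec.sh on one chunk:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class LocalDispatchSketch {
        public static void main(String[] args) throws Exception {
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);
            List<Future<?>> jobs = new ArrayList<>();
            for (String chunk : new String[] {"chunk_1.fasta", "chunk_2.fasta"}) {
                // Each thread runs one job on one sequence chunk.
                jobs.add(pool.submit(() -> runJob(chunk)));
            }
            for (Future<?> f : jobs) f.get(); // wait for all jobs before merging
            pool.shutdown();
            // ... merge the per-chunk XML results here ...
        }

        static void runJob(String chunk) { /* placeholder for the real call */ }
    }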

Sequence splitting is done based on configuration (minimum size, number of CPU cores, ...). Configuration is held in a properties file (logol.properties), but many options can be overridden via the command line.
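
For example, reading the configuration and deriving a chunk size might look like this; the property names are hypothetical, not the actual keys of logol.properties:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;

    public class SplitConfigSketch {
        public static void main(String[] args) throws IOException {
            Properties conf = new Properties();
            try (FileInputStream in = new FileInputStream("logol.properties")) {
                conf.load(in);
            }
            // Hypothetical keys; command-line options would override them.
            long minSplitSize = Long.parseLong(conf.getProperty("split.minsize", "1000000"));
            int cores = Integer.parseInt(conf.getProperty("cpu.cores",
                    String.valueOf(Runtime.getRuntime().availableProcessors())));
            long seqLength = 50_000_000L; // example sequence length
            // Split only if each chunk stays above the minimum size.
            long chunkSize = Math.max(minSplitSize, seqLength / cores);
            System.out.println("chunk size: " + chunkSize);
        }
    }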

Once the sequence to be analysed by the local job is ready (possibly split, then copied as single-line data, which is required for later random access by the tool), the sequence is indexed in a suffix array index with the help of the external suffix search tool.

Once ready for a job, the input grammar is analysed. If it is a model (from the graphical interface), it is first converted to a Logol grammar file (via org.irisa.genouest.logol.utils.model.ModelConverter). The grammar file is then analysed using ANTLR with a DSL. The grammar is in fact parsed 3 times to generate a Prolog file that will be used by the logol.exe program. This Prolog file describes, in the Prolog language, what must be done to find the expected model using the functions available in logol.pro. It also specifies the path of the (partial) sequence file, the result file name, etc., and makes use of unique identifiers to differentiate the different jobs.
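
To give a feel for this step, the sketch below writes a tiny Prolog driver file from Java. The predicate names are invented for illustration and do not come from logol.pro.

    import java.io.IOException;
    import java.io.PrintWriter;

    public class PrologEmitSketch {
        public static void main(String[] args) throws IOException {
            String runId = "job_" + System.currentTimeMillis(); // unique identifier
            try (PrintWriter out = new PrintWriter("model_" + runId + ".pro")) {
                // Invented predicates, for illustration only.
                out.println(":- ensure_loaded('logol.pro').");
                out.println("sequence_file('chunk_1.txt').");
                out.println("result_file('result_" + runId + ".xml').");
                out.println("% search entry point generated from the grammar");
                out.println("run :- find_model(model1), write_results.");
            }
        }
    }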

Grammar parser

As said previously, the grammar is parsed 3 times (see the sketch after the list):

* The first run stores the different variables used, with their constraints, in Java objects.
* The second run creates a temporary Prolog file to check whether right-to-left constraints are present in the grammar. If this is the case, additional information for those variables is searched for in the grammar, if available, and those variables are marked to be managed differently (preanalyse.exe will be called instead of logol.exe).
* The last run creates the final Prolog file based on the previous knowledge. If, for example, a variable was flagged as constrained during the second run, some constraints are relaxed at search time and checked only later, when all information is available.
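
Schematically, the driver for these passes has the following shape. All names are invented; the real passes are driven by the ANTLR-generated parser classes.

    import java.util.HashMap;
    import java.util.Map;

    // Structure-only sketch of the three grammar passes (names invented).
    public class ThreePassSketch {
        static Map<String, String> variables = new HashMap<>(); // name -> constraints

        public static void main(String[] args) {
            String grammar = "..."; // the grammar text would be loaded here
            passOneCollectVariables(grammar);
            boolean rightToLeft = passTwoDetectRightToLeft(grammar);
            passThreeEmitProlog(grammar, rightToLeft);
        }

        static void passOneCollectVariables(String g) {
            // Store each variable with its constraints in Java objects.
        }

        static boolean passTwoDetectRightToLeft(String g) {
            // Build a temporary Prolog file; if right-to-left constraints
            // exist, mark those variables (preanalyse.exe instead of logol.exe).
            return false;
        }

        static void passThreeEmitProlog(String g, boolean rtl) {
            // Emit the final Prolog file; constraints that cannot be checked
            // yet are relaxed and verified later, once all data is available.
        }
    }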

ANTLR is in charge of generating the Java classes for the parser. The org.irisa.genouest.logol.Treatment class is then linked with those classes. During parsing, the parser classes call Treatment functions, and Treatment calls the appropriate classes to store the data and the data flow.

Each type of data is managed via a different class defined in org.irisa.genouest.logol.types.

Treatment calls the content function of the type class, which generates the Prolog code for this variable. Depending on the type (variable, repeat, etc.), data are managed differently. As an example, a set of variables may be analysed to generate some code; this code is then refactored because the set of variables is included in a view. In this case, a new predicate is generated with this code, and only the predicate call is added to the current treatment. The mechanism is sketched below.
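
The class and method shapes below are invented to mirror the description; they are not the actual org.irisa.genouest.logol.types classes.

    // Invented sketch of the type-class dispatch described above.
    interface LogolType {
        // Generates the Prolog code for one element of the grammar.
        String content(String uniqueId);
    }

    class VariableType implements LogolType {
        public String content(String uniqueId) {
            return "match_variable(" + uniqueId + ")"; // invented predicate
        }
    }

    class ViewType implements LogolType {
        private final LogolType[] inner;

        ViewType(LogolType... inner) { this.inner = inner; }

        public String content(String uniqueId) {
            // Generate the code of the inner variables...
            StringBuilder body = new StringBuilder();
            for (int i = 0; i < inner.length; i++) {
                if (i > 0) body.append(", ");
                body.append(inner[i].content(uniqueId + "_" + i));
            }
            // ...then refactor it into a new predicate; a definition such as
            // "view_<id> :- <body>." would be emitted elsewhere, and the
            // current treatment receives only the call below.
            return "view_" + uniqueId;
        }
    }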