Docking@GRID

Docking@GRID is a software package dedicated to flexible conformational sampling and docking on the computational grid. Its goal is to let users carry out these tasks in a user-friendly way. To that end, it provides a web portal for remote job submission, import and preparation of proteins, access to the Protein Data Bank, visualization, and efficient sampling and docking. The project could later be integrated into the larger platform of cheminformatics tools under construction around the site of the "Chimiothèque Nationale" project of the CNRS (Prof. Hibert, Strasbourg). This platform, designed as a portal for displaying the collections of molecules synthesized in French academic labs, might offer predicted affinities of these compounds for various biologically interesting targets, in order to facilitate compound selection.

Docking@GRID is currently available online on the Lille Genopole server at http://docking.futurs.inria.fr. The current version covers only the sampling and visualization of conformations. A short registration step, requiring only a small amount of information, is needed to access the provided resources. The software organizes work hierarchically, so that different tasks can be grouped into projects. A new project is created from the Ligands/ActiveSite section; once created, the project is displayed as a hierarchy. The user can then create a new molecule file with the MSketch application (ChemAxon), which is provided as a Java applet, and apply a conformational sampling step to that file.

The sampling process relies on a hybrid hierarchical genetic algorithm that is executed in a distributed environment and makes use of different parallelization strategies. The underlying framework is ParadisEO-G, a Globus-based version of the ParadisEO framework. The algorithm is parallelized transparently through the ParadisEO models: the asynchronous island model, the parallel evaluation of the population, and the parallel synchronous multi-start model. An MPICH-based distribution of MPI is used as middleware, and execution takes place on a dedicated set of machines. The results are displayed at the end of the sampling process, and a notification e-mail is sent if the processing step takes longer than 5 seconds. The resulting conformations can be visualized with the MView tool (ChemAxon). Each conformation can be further subjected to rigid transformations (translations, rotations), animations can be constructed, etc.
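To make the island model mentioned above more concrete, the following minimal Python sketch runs several small populations side by side and periodically copies the best individual of each island to its neighbour. It is only an illustration and does not use the ParadisEO or ParadisEO-G API: the function names, the parameter values, and the toy `energy` function (a sum of squares standing in for a conformational energy) are all assumptions. Migration is also synchronous and sequential here for simplicity, whereas the model used by the software is asynchronous and distributed.

```python
import random

def energy(x):
    # Toy stand-in for a conformational energy (assumption): minimum at the origin.
    return sum(v * v for v in x)

def evolve(pop, rng, sigma=0.3):
    """One generation on one island: elitism, binary tournament, Gaussian mutation."""
    best = min(pop, key=energy)
    children = [best]  # keep the island's best individual unchanged
    while len(children) < len(pop):
        a, b = rng.sample(pop, 2)
        parent = a if energy(a) < energy(b) else b
        children.append([v + rng.gauss(0.0, sigma) for v in parent])
    return children

def migrate(islands):
    """Ring migration: the best of island i-1 replaces the worst of island i."""
    bests = [min(pop, key=energy) for pop in islands]
    for i, pop in enumerate(islands):
        incoming = bests[(i - 1) % len(islands)]
        worst = max(range(len(pop)), key=lambda j: energy(pop[j]))
        pop[worst] = list(incoming)

def island_model(n_islands=4, pop_size=10, dim=5,
                 generations=40, period=5, seed=0):
    """Return the best energy before and after the run."""
    rng = random.Random(seed)
    islands = [[[rng.uniform(-5, 5) for _ in range(dim)]
                for _ in range(pop_size)]
               for _ in range(n_islands)]
    start = min(energy(x) for pop in islands for x in pop)
    for g in range(1, generations + 1):
        islands = [evolve(pop, rng) for pop in islands]
        if g % period == 0:  # exchange individuals every `period` generations
            migrate(islands)
    end = min(energy(x) for pop in islands for x in pop)
    return start, end

if __name__ == "__main__":
    start, end = island_model()
    print(f"best energy before: {start:.3f}, after: {end:.3f}")
```

In a grid setting each island would run on its own node and exchange migrants through message passing; the sequential loop above only mimics that structure.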

This year, starting from a first new model, we designed and tested seven other models for the flexible docking problem. All of these models were validated on instances of the CCDC-Astex dataset using a multi-objective genetic algorithm based on IBEA (Indicator-Based Evolutionary Algorithm), which proved to perform better than NSGA-II (Non-dominated Sorting Genetic Algorithm II). We used twelve different configurations of the GA for each model. The configurations that use local search as the mutation operator and perform fully flexible docking give the best results. According to the results obtained, the surface and robustness objectives improve docking efficiency.
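As an illustration of how an indicator-based algorithm such as IBEA ranks solutions, the sketch below computes the standard IBEA fitness from the additive epsilon indicator. It is a generic textbook-style sketch, not the implementation used in this work: the objective vectors, the scaling factor `kappa`, and the assumption that all objectives are minimised are illustrative choices.

```python
import math

def eps_indicator(a, b):
    """Additive epsilon indicator (minimisation): the smallest shift
    that makes `a` weakly dominate `b`."""
    return max(ai - bi for ai, bi in zip(a, b))

def ibea_fitness(points, kappa=0.05):
    """IBEA fitness: F(x) = sum over y != x of -exp(-I(y, x) / (c * kappa)),
    where c is the largest absolute indicator value in the population."""
    n = len(points)
    ind = [[eps_indicator(points[i], points[j]) for j in range(n)]
           for i in range(n)]
    c = max(abs(v) for row in ind for v in row) or 1.0
    return [sum(-math.exp(-ind[j][i] / (c * kappa))
                for j in range(n) if j != i)
            for i in range(n)]

if __name__ == "__main__":
    # Hypothetical two-objective values (both minimised).
    population = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
    fitness = ibea_fitness(population)
    worst = fitness.index(min(fitness))
    print("lowest fitness:", population[worst])
```

Here the point (3, 3) is dominated by (2, 2) and therefore receives by far the lowest fitness. In IBEA, environmental selection repeatedly removes the lowest-fitness individual and updates the remaining fitness values.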

All this work has been funded by the ANR Dock project and the "PPF Bio-Informatique" of the USTL.