Prerequisites
Automatic installation
Dependency
To run the automatic installation, conda with Python 3 must be installed. The installation of conda with Python 3 is described here.
Run the installation
Download the installation script from here. The installation can then be run with:
chmod +x script_irsom.sh
./script_irsom.sh ENV_NAME
where ENV_NAME is the name of the conda environment that will be created.
Manual installation
Get sources
The IRSOM tools can be downloaded as follows:
git clone https://forge.ibisc.univ-evry.fr/lplaton/IRSOM.git
IRSOM has been developed with Python 3.
Python package
The required packages can be installed by running the following command:
pip install -r ${path_IRSOM}/pip_package.txt
where ${path_IRSOM} is the path to the directory of IRSOM.
For better performance, we recommend rebuilding TensorFlow from source.
Compiling Featurer
The repository contains a Featurer binary, compiled on Linux, in the bin folder. If needed, the binary can be recompiled with the Qt5 tools:
cd ${path_IRSOM}/bin/
qmake ${path_IRSOM}/Featurer/Featurer.pro
make
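Before training or predicting, it can help to confirm that the Featurer binary is present and executable. The following is a minimal sketch, assuming the binary sits at bin/Featurer inside the IRSOM directory as described above; the function name featurer_is_usable is hypothetical and not part of IRSOM.

```python
import os
from pathlib import Path


def featurer_is_usable(path_irsom):
    """Return True if <path_irsom>/bin/Featurer exists and is executable.

    Hypothetical helper: it only checks the file, it does not run it.
    """
    featurer = Path(path_irsom) / "bin" / "Featurer"
    return featurer.is_file() and os.access(featurer, os.X_OK)
```

If the check fails after cloning, rebuilding the binary with qmake and make as shown above is the likely remedy.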
Datasets
An archive containing all the datasets can be downloaded here. To download a specific dataset, browse the folder containing all the datasets here.
Basic usage
Train a model
Default usage:
python ${path_IRSOM}/scripts/train.py --featurer=${path_IRSOM}/bin/Featurer -c coding.fasta -n noncoding.fasta --output=output_dir_of_model
Multiple FASTA files can be passed to this script by repeating the -c and -n options. For example, a model can be created from 2 coding FASTA files and 3 non-coding FASTA files with the following command:
python ${path_IRSOM}/scripts/train.py --featurer=${path_IRSOM}/bin/Featurer -c coding1.fasta -c coding2.fasta -n noncoding1.fasta -n noncoding2.fasta -n noncoding3.fasta
The model parameters can be set with the following command-line options:
- --dim0= SOM dimension 0 (default: 3).
- --dim1= SOM dimension 1 (default: 3).
- --batch_size= size of the batch given at each iteration (default: 10).
- --penality= coefficient of the regularization term (default: 0.001).
By default, the computed features are removed from the output directory. To keep these files, use the parameter --keep_features.
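When training is launched from another Python program, the command line described above can be assembled programmatically. The following is a minimal sketch, not part of IRSOM: the function name build_train_command is hypothetical, and it only uses the options documented in this section, with their documented defaults.

```python
from pathlib import Path


def build_train_command(path_irsom, coding, noncoding, output,
                        dim0=3, dim1=3, batch_size=10, penality=0.001,
                        keep_features=False):
    """Build the argument list for scripts/train.py (hypothetical helper)."""
    root = Path(path_irsom)
    cmd = [
        "python",
        str(root / "scripts" / "train.py"),
        f"--featurer={root / 'bin' / 'Featurer'}",
    ]
    # -c and -n are repeated once per FASTA file.
    for fasta in coding:
        cmd += ["-c", str(fasta)]
    for fasta in noncoding:
        cmd += ["-n", str(fasta)]
    cmd += [
        f"--output={output}",
        f"--dim0={dim0}",
        f"--dim1={dim1}",
        f"--batch_size={batch_size}",
        f"--penality={penality}",
    ]
    # Computed features are removed unless this flag is given.
    if keep_features:
        cmd.append("--keep_features")
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the standard library. Building a list instead of a single string avoids shell quoting problems when file names contain spaces.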
Predict
Default usage:
python ${path_IRSOM}/scripts/predict.py --featurer=${path_IRSOM}/bin/Featurer --file=fasta_file.fasta --model=${path_IRSOM}/model/species/ --output=output_dir_of_result [--reject=${rejection_threshold}]
The rejection threshold can be set with the option --reject. By default, there is no rejection.
As with the training script, the features are removed by default. To keep them, use the parameter --keep_features.
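The prediction command can be assembled in the same way, with the rejection threshold added only when one is requested. The following is a minimal sketch under that assumption: the function name build_predict_command is hypothetical, and the threshold value in the usage example is purely illustrative.

```python
from pathlib import Path


def build_predict_command(path_irsom, fasta_file, model_dir, output,
                          reject=None, keep_features=False):
    """Build the argument list for scripts/predict.py (hypothetical helper)."""
    root = Path(path_irsom)
    cmd = [
        "python",
        str(root / "scripts" / "predict.py"),
        f"--featurer={root / 'bin' / 'Featurer'}",
        f"--file={fasta_file}",
        f"--model={model_dir}",
        f"--output={output}",
    ]
    # No --reject option means no rejection, which is the default behavior.
    if reject is not None:
        cmd.append(f"--reject={reject}")
    if keep_features:
        cmd.append("--keep_features")
    return cmd


# Illustrative call; the threshold 0.5 is an arbitrary example value.
example = build_predict_command(
    "/opt/IRSOM", "fasta_file.fasta", "/opt/IRSOM/model/species/",
    "output_dir_of_result", reject=0.5,
)
```

Leaving reject at None omits the option entirely, which matches the default of no rejection.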