Prerequisites
Get sources
The IRSOM tools can be downloaded as follows:
git clone https://forge.ibisc.univ-evry.fr/lplaton/IRSOM.git
Python package
The required Python packages can be installed by running the following command:
pip install -r ${path_IRSOM}/pip_package.txt
where ${path_IRSOM} is the path to the directory of IRSOM.
For better performance, we recommend rebuilding TensorFlow from source.
Compiling Featurer
The repository ships a precompiled Featurer binary, built on Linux, in the bin folder. If needed, the binary can be recompiled using the Qt5 tools:
cd ${path_IRSOM}/bin/
qmake ${path_IRSOM}/Featurer/Featurer.pro
make
Basic usage
Train a model
Default usage:
python scripts/train.py --featurer=${path_IRSOM}/bin/featurer -c coding.fasta -n noncoding.fasta --output=output_dir_of_model
Multiple FASTA files can be passed to this script. For example, the following command creates a model from 2 coding FASTA files and 3 non-coding FASTA files:
python scripts/train.py --featurer=${path_IRSOM}/bin/featurer -c coding1.fasta -c coding2.fasta -n noncoding1.fasta -n noncoding2.fasta -n noncoding3.fasta
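When the FASTA file lists come from another program, the same command can be assembled programmatically. The following is a minimal sketch, not part of IRSOM: the helper name build_train_command is hypothetical, and the relative paths assume the command is run from the IRSOM directory. It uses only the options shown above and repeats -c and -n once per file.

```python
import shlex


def build_train_command(coding, noncoding, featurer="bin/featurer", output=None):
    """Return the train.py argument list (hypothetical helper, not part of IRSOM)."""
    cmd = ["python", "scripts/train.py", f"--featurer={featurer}"]
    # Each coding FASTA file gets its own -c flag.
    for path in coding:
        cmd += ["-c", path]
    # Each non-coding FASTA file gets its own -n flag.
    for path in noncoding:
        cmd += ["-n", path]
    # --output is only appended when a directory is given.
    if output is not None:
        cmd.append(f"--output={output}")
    return cmd


cmd = build_train_command(
    coding=["coding1.fasta", "coding2.fasta"],
    noncoding=["noncoding1.fasta", "noncoding2.fasta", "noncoding3.fasta"],
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to start the training.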
The model parameters can be set with the following command-line options:
- --dim0= SOM dimension 0 (default: 10).
- --dim1= SOM dimension 1 (default: 10).
- --batch_size= size of the batch given at each iteration (default: 10).
- --penality= coefficient of the regularization term (default: 0.001).
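To compare several parameter settings, one training command can be generated per combination. The sketch below is illustrative only: the function name sweep_commands, the candidate values, and the output directory names are assumptions, and square SOM grids (dim0 equal to dim1) are used purely to keep the example short. Only the options listed above are used.

```python
from itertools import product

# Common part of the command, taken from the default usage above
# (relative paths assume the IRSOM directory is the working directory).
BASE = [
    "python", "scripts/train.py",
    "--featurer=bin/featurer",
    "-c", "coding.fasta",
    "-n", "noncoding.fasta",
]


def sweep_commands(dims=(5, 10), penalities=(0.001, 0.01), batch_size=10):
    """Build one train.py command per (SOM size, penality) pair (hypothetical helper)."""
    commands = []
    for dim, pen in product(dims, penalities):
        commands.append(BASE + [
            f"--dim0={dim}",
            f"--dim1={dim}",
            f"--batch_size={batch_size}",
            f"--penality={pen}",
            # Hypothetical naming scheme: one output directory per setting.
            f"--output=model_dim{dim}_pen{pen}",
        ])
    return commands


commands = sweep_commands()
for cmd in commands:
    print(" ".join(cmd))
```

Each generated list can be run with subprocess.run(cmd, check=True); writing every model to its own output directory keeps the runs from overwriting one another.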
By default, the computed features are removed from the output directory. To keep these files, use the parameter --keep_features.
Predict
Default usage:
python scripts/predict.py --featurer=${path_IRSOM}/bin/featurer --file=fasta_file.fasta --model=${path_IRSOM}/model/species/ --output=output_dir_of_result
As with the training script, the features are removed by default. To keep them, use the parameter --keep_features.
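The predict command above takes a single FASTA file through --file. To process several files, one possible approach is to build one command per file, as in the following sketch. The helper build_predict_commands, the sample file names, and the result_ prefix for output directories are hypothetical. The options themselves are those documented above.

```python
from pathlib import PurePosixPath


def build_predict_commands(fasta_files, model_dir,
                           featurer="bin/featurer", keep_features=False):
    """Return one predict.py argument list per FASTA file (hypothetical helper)."""
    commands = []
    for fasta in fasta_files:
        # Derive a separate output directory from the file name without extension.
        stem = PurePosixPath(fasta).stem
        cmd = [
            "python", "scripts/predict.py",
            f"--featurer={featurer}",
            f"--file={fasta}",
            f"--model={model_dir}",
            f"--output=result_{stem}",
        ]
        # Keep the computed features only when requested.
        if keep_features:
            cmd.append("--keep_features")
        commands.append(cmd)
    return commands


commands = build_predict_commands(
    ["sample_a.fasta", "sample_b.fasta"],
    model_dir="model/species/",
    keep_features=True,
)
for cmd in commands:
    print(" ".join(cmd))
```

As with training, each list can be executed with subprocess.run(cmd, check=True) from the IRSOM directory.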