
Prerequisites

Get sources

The IRSOM tools can be downloaded as follows:

git clone https://forge.ibisc.univ-evry.fr/lplaton/IRSOM.git

IRSOM has been developed with Python 3.

Python packages

  1. Matplotlib
  2. Pandas
  3. Plotnine
  4. Numpy
  5. TensorFlow
  6. Docopt

These packages can be installed by running the following command:

pip install -r ${path_IRSOM}/pip_package.txt

where ${path_IRSOM} is the path to the directory of IRSOM.
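
After installation, it can be useful to confirm that every package is visible to the interpreter. The sketch below is not part of IRSOM; it assumes that the import names are the lowercase forms of the package names listed above, and the helper name missing_modules is hypothetical.

```python
import importlib.util

# Import names assumed to match the packages listed in this README.
REQUIRED_MODULES = ["matplotlib", "pandas", "plotnine", "numpy", "tensorflow", "docopt"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

The check only looks the modules up without importing them, so it does not pay the start-up cost of TensorFlow.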

For better performance, we recommend rebuilding TensorFlow from source.

Compiling Featurer

The bin folder of the repository contains a Featurer binary compiled on Linux. If needed, the binary can be recompiled with the Qt5 tools:

cd ${path_IRSOM}/bin/
qmake ${path_IRSOM}/Featurer/Featurer.pro
make

Basic usage

Train a model

Default usage:

python scripts/train.py --featurer=${path_IRSOM}/bin/Featurer -c coding.fasta -n noncoding.fasta --output=output_dir_of_model

This script accepts multiple FASTA files. For example, the following command creates a model from 2 coding FASTA files and 3 non-coding FASTA files:

python scripts/train.py --featurer=${path_IRSOM}/bin/Featurer -c coding1.fasta -c coding2.fasta -n noncoding1.fasta -n noncoding2.fasta -n noncoding3.fasta
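
When the list of FASTA files comes from a pipeline, the command can be assembled programmatically. The following is a minimal sketch, not part of IRSOM: the helper build_train_command is hypothetical, and the path /path/to/IRSOM is a placeholder. It only uses the options shown in this README (--featurer, -c, -n, --output).

```python
def build_train_command(featurer, coding, noncoding, output=None):
    """Build the argument list for scripts/train.py.

    Each coding file is passed with its own -c flag and each
    non-coding file with its own -n flag, as in the command above.
    """
    cmd = ["python", "scripts/train.py", f"--featurer={featurer}"]
    for path in coding:
        cmd += ["-c", path]
    for path in noncoding:
        cmd += ["-n", path]
    if output is not None:
        cmd.append(f"--output={output}")
    return cmd


# Placeholder location of the IRSOM directory (assumption).
path_irsom = "/path/to/IRSOM"

cmd = build_train_command(
    featurer=f"{path_irsom}/bin/Featurer",
    coding=["coding1.fasta", "coding2.fasta"],
    noncoding=["noncoding1.fasta", "noncoding2.fasta", "noncoding3.fasta"],
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the IRSOM directory to launch the training.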

The model parameters can be set with the following command-line options:

  • --dim0= SOM dimension 0 (default: 3).
  • --dim1= SOM dimension 1 (default: 3).
  • --batch_size= Size of the batch given at each iteration (default: 10).
  • --penality= Coefficient of the regularization term (default: 0.001).
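
As an illustration of how these options combine, the sketch below turns a set of overrides into command-line arguments. It is not part of IRSOM: the helper model_options is hypothetical, and the default values are copied from the list above. The option name --penality is spelled as the script expects it.

```python
# Defaults as documented in this README.
DEFAULTS = {"dim0": 3, "dim1": 3, "batch_size": 10, "penality": 0.001}


def model_options(**overrides):
    """Return command-line arguments for the given model parameters.

    Unknown parameter names are rejected so that a typo does not
    silently produce an option the script does not document.
    """
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"Unknown model parameter(s): {', '.join(unknown)}")
    return [f"--{name}={value}" for name, value in overrides.items()]


# Example: a 5 x 5 map with larger batches (values chosen arbitrarily).
print(model_options(dim0=5, dim1=5, batch_size=20))
```

These arguments can be appended to the training command shown earlier.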

By default, the computed features are removed from the output directory. To keep these files, use the option --keep_features.

Predict

Default usage:

python scripts/predict.py --featurer=${path_IRSOM}/bin/Featurer --file=fasta_file.fasta --model=${path_IRSOM}/model/species/ --output=output_dir_of_result [--reject=${rejection_threshold}]

The rejection threshold can be set with the option --reject. By default there is no rejection.

As with the training script, the features are removed by default. To keep them, use the option --keep_features.
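
The optional arguments of the prediction script can be handled in the same way from Python. This is a minimal sketch, not part of IRSOM: the helper build_predict_command is hypothetical, the path /path/to/IRSOM is a placeholder, and the threshold 0.5 is an arbitrary example value, since this README does not state the valid range of --reject.

```python
def build_predict_command(featurer, fasta, model, output,
                          reject=None, keep_features=False):
    """Build the argument list for scripts/predict.py.

    --reject is added only when a threshold is given, which matches
    the documented default of no rejection.
    """
    cmd = [
        "python", "scripts/predict.py",
        f"--featurer={featurer}",
        f"--file={fasta}",
        f"--model={model}",
        f"--output={output}",
    ]
    if reject is not None:
        cmd.append(f"--reject={reject}")
    if keep_features:
        cmd.append("--keep_features")
    return cmd


# Placeholder location of the IRSOM directory (assumption).
path_irsom = "/path/to/IRSOM"

cmd = build_predict_command(
    featurer=f"{path_irsom}/bin/Featurer",
    fasta="fasta_file.fasta",
    model=f"{path_irsom}/model/species/",
    output="output_dir_of_result",
    reject=0.5,          # arbitrary example threshold
    keep_features=True,  # keep the computed feature files
)
print(" ".join(cmd))
```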