
Prerequisites

Get sources

The IRSOM tools can be downloaded as follows:

git clone https://forge.ibisc.univ-evry.fr/lplaton/IRSOM.git

IRSOM has been tested with Python 3.

Python packages

  1. Matplotlib
  2. Pandas
  3. Plotnine
  4. Numpy
  5. TensorFlow

These packages can be installed by running the following command:

pip install -r ${path_IRSOM}/pip_package.txt

where ${path_IRSOM} is the path to the IRSOM directory.
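After installation, a quick check that the listed packages can be found in the current environment may help. The sketch below is not part of IRSOM; it assumes the usual lowercase import names (matplotlib, pandas, plotnine, numpy, tensorflow), and the helper name missing_packages is hypothetical.

```python
import importlib.util

# Import names assumed for the packages listed above (assumption: the
# standard lowercase module names; adjust if your environment differs).
REQUIRED = ["matplotlib", "pandas", "plotnine", "numpy", "tensorflow"]


def missing_packages(names=REQUIRED):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only looks the modules up; it does not import them, so it stays fast even for TensorFlow.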

For better performance, we recommend building TensorFlow from source.

Compiling Featurer

The repository contains a Featurer binary, compiled on Linux, in the bin folder. If needed, the binary can be recompiled with the Qt5 tools as follows:

cd ${path_IRSOM}/bin/
qmake ${path_IRSOM}/Featurer/Featurer.pro
make

Basic usage

Train a model

Default usage:

python scripts/train.py --featurer=${path_IRSOM}/bin/Featurer -c coding.fasta -n noncoding.fasta --output=output_dir_of_model

This script accepts multiple FASTA files. For example, a model can be created from 2 coding FASTA files and 3 non-coding FASTA files with the following command:

python scripts/train.py --featurer=${path_IRSOM}/bin/Featurer -c coding1.fasta -c coding2.fasta -n noncoding1.fasta -n noncoding2.fasta -n noncoding3.fasta
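When the list of FASTA files is long or generated by another program, the same command line can be assembled in Python. The following is a minimal sketch, not part of IRSOM: the helper name build_train_command is hypothetical, and it uses only the options shown above (--featurer=, -c, -n, --output=).

```python
def build_train_command(featurer, coding, noncoding, output=None):
    """Build the argument list for scripts/train.py.

    Each coding file gets its own -c flag and each non-coding file its
    own -n flag, mirroring the command shown above.
    """
    cmd = ["python", "scripts/train.py", f"--featurer={featurer}"]
    for path in coding:
        cmd += ["-c", path]
    for path in noncoding:
        cmd += ["-n", path]
    if output is not None:
        cmd.append(f"--output={output}")
    return cmd


# Example: 2 coding and 3 non-coding FASTA files.
command = build_train_command(
    "bin/Featurer",
    ["coding1.fasta", "coding2.fasta"],
    ["noncoding1.fasta", "noncoding2.fasta", "noncoding3.fasta"],
)
print(" ".join(command))
```

The returned list can be passed to subprocess.run(command, check=True) from the IRSOM directory to launch the training.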

The model parameters can be set with the following command-line options:

  • --dim0= SOM dimension 0 (default: 3).
  • --dim1= SOM dimension 1 (default: 3).
  • --batch_size= size of the batch given at each iteration (default: 10).
  • --penality= coefficient of the regularization term (default: 0.001).

By default, the computed features are removed from the output directory. To keep these files, use the --keep_features option.
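As an illustration of how these options combine, the sketch below formats them as command-line arguments, using the defaults listed above. It is not part of IRSOM; the helper name train_options is hypothetical, and --keep_features is assumed to be a flag that takes no value, as the text suggests.

```python
def train_options(dim0=3, dim1=3, batch_size=10, penality=0.001,
                  keep_features=False):
    """Format the train.py model options as a list of arguments.

    Defaults mirror the documented ones; "penality" follows the spelling
    of the actual option name.
    """
    opts = [
        f"--dim0={dim0}",
        f"--dim1={dim1}",
        f"--batch_size={batch_size}",
        f"--penality={penality}",
    ]
    if keep_features:
        # Assumption: the option is a bare flag without a value.
        opts.append("--keep_features")
    return opts


# Example: a 5x5 SOM with a larger batch, keeping the computed features.
print(" ".join(train_options(dim0=5, dim1=5, batch_size=32, keep_features=True)))
```

These arguments would be appended to the train.py command after the -c and -n files.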

Predict

Default usage:

python scripts/predict.py --featurer=${path_IRSOM}/bin/Featurer --file=fasta_file.fasta --model=${path_IRSOM}/model/species/ --output=output_dir_of_result

As with the training script, the computed features are removed by default. To keep them, use the --keep_features option.
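The predict command above takes a single FASTA file through --file=. To classify several files with the same model, one option is to generate one command per file. This is a minimal sketch, not part of IRSOM: the helper name build_predict_commands is hypothetical, and writing each result to its own subdirectory named after the input file is an assumption made here for illustration.

```python
from pathlib import Path


def build_predict_commands(featurer, model_dir, fasta_files, output_root):
    """Build one scripts/predict.py argument list per FASTA file."""
    commands = []
    for fasta in fasta_files:
        # Assumption: one output directory per input, named after the file.
        out_dir = Path(output_root) / Path(fasta).stem
        commands.append([
            "python", "scripts/predict.py",
            f"--featurer={featurer}",
            f"--file={fasta}",
            f"--model={model_dir}",
            f"--output={out_dir.as_posix()}",
        ])
    return commands


for cmd in build_predict_commands(
    "bin/Featurer", "model/species/", ["sample1.fasta", "sample2.fasta"], "results"
):
    print(" ".join(cmd))
```

Each list can then be run in turn, for example with subprocess.run(cmd, check=True).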