wiki.ziemers.de

ziemer's informatik Wiki


wiki:software:beuthbot:rasa:training

Differences

This page shows the differences between two versions.

wiki:software:beuthbot:rasa:training [22.07.2020 19:24]
Lukas Danckwerth (created)
wiki:software:beuthbot:rasa:training [22.07.2020 19:39] (current)
Lukas Danckwerth
Line 1: Line 1:
 ===== Rasa Training =====
  
 +==== Overview ====
  
 +== Training Input Data ==
 +
 +Training input data files are `.chatito` files and are placed in the `training/app/input/` directory.
 +
 +> This is where you add new functionality for the BeuthBot.
 +
 +== Training Dataset ==
 +
 +Training dataset files are `.json` files and are placed in the `training/app/data/` directory.
 +
 +== Training Model ==
 +
 +Training model files generated by Rasa are `.tar.gz` archives and are placed in the `training/app/model/` directory. This directory contains all generated model archives. The Rasa service of the BeuthBot uses the most recent model. There is a `Makefile` target that installs it, so you can simply run the following command in the root directory of Rasa.
 +
 +<code>
 +$ make update-model
 +</code>
 +
 +
 +
 +==== Step-by-Step Guide ====
 +
 +This guide explains how to create a new training model for Rasa. The following diagram gives an overview of the files and steps involved. `*.chatito` files placed in the `training/app/input/` directory are used by [[#Chatito|Chatito]] to create `JSON` files in the `training/app/data/` directory. Rasa then uses these `JSON` files to create the training model, which is placed in the `training/app/model/` directory.
 +
 +<uml>
 +@startuml
 +package Rasa {
 +
 +folder training/app/input as TI {
 +artifact FILE1.chatito
 +artifact FILE2.chatito
 +}
 +
 +card "docker-compose -f docker-compose.generate-data.yml up" as C1
 +
 +folder training/app/data as TD {
 +artifact FILE1.json
 +artifact FILE2.json
 +}
 +
 +card "docker-compose -f docker-compose.train-model.yml up" as C2
 +
 +folder training/app/model as M {
 +artifact "nlu-YYYYMMDD-HHMMSS.tar.gz"
 +}
 +
 +TI --> C1
 +C1 --> TD
 +TD --> C2
 +C2 --> M
 +
 +}
 +@enduml
 +</uml>
 +
 +=== 1 -  Provide new input training data ===
 +
 +Modify or add `*.chatito` files in the `training/app/input/` directory to provide new functionality. For more information about `Chatito`, have a look at the [[https://github.com/rodrigopivi/Chatito|Chatito]] project.
 +
 +== 1.1 - Generate training datasets ==
 +
 +We created a docker-compose file, `docker-compose.generate-data.yml`, which defines a container that generates the datasets. To use it, run the following command in the `training` directory.
 +
 +<code>
 +# change into the `training` directory (if you are not already there)
 +$ cd training
 +
 +# runs the dataset generation container
 +$ docker-compose -f docker-compose.generate-data.yml up
 +
 +# .. or use the convenient make target
 +$ make generate-data
 +</code>
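
Before moving on to training, it can help to confirm that the generation step actually produced datasets. The following is a minimal sketch and not part of the BeuthBot repository: the helper name `check_datasets` is hypothetical, and it only assumes that the datasets are `.json` files in the directory you pass in.

```shell
# Hypothetical helper: succeed only if DIR holds at least one .json file
# and none of the .json files is empty.
check_datasets() {
    dir="$1"
    found=0
    for f in "$dir"/*.json; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        if [ ! -s "$f" ]; then
            echo "empty dataset: $f" >&2
            return 1
        fi
        found=$((found + 1))
    done
    if [ "$found" -eq 0 ]; then
        echo "no .json datasets in $dir" >&2
        return 1
    fi
    echo "$found dataset file(s) in $dir"
}
```

From the `training` directory this could be called as `check_datasets app/data`. It does not validate the JSON content, only that files exist and are non-empty.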
 +
 +=== 2 - Create model with Rasa ===
 +
 +There are two ways of generating models from training data: either with a local Rasa installation or within a Docker container. The preferred way is to use the Docker container.
 +
 +Rasa NLU is configured through pipelines. A pipeline defines how models are generated from the training data and which entities are extracted. Here, the preconfigured `supervised_embeddings` pipeline is used. It does not rely on pretrained, language-specific word vectors, so it can be used with any language that can be tokenized.
 +
 +> Check `config.yml` for the configuration of the Rasa pipeline (how the trained model is generated).
 +
 +== 2.1 - Create model with local Rasa installation ==
 +
 +Create the training model with the local `rasa` command. For further information and an installation guide for a local Rasa installation, see the [[https://github.com/beuthbot/rasa/tree/database-understanding#local-rasa-installation|local Rasa installation]] section of the repository.
 +
 +<code>
 +$ rasa train nlu
 +</code>
 +
 +== 2.2 - Create model with Rasa Docker container ==
 +
 +Build and run the training Docker container which generates the model file.
 +
 +<code>
 +# runs the train model container
 +$ docker-compose -f docker-compose.train-model.yml up
 +
 +# .. or use the convenient make target
 +$ make train-model
 +</code>
 +
 +> For a fast, convenient way to [[#1.1---Generate-training-datasets|generate training datasets]] and [[#2---Create-model-with-Rasa|create a model]] in one go, simply use the `docker-compose.yml` file in the `training` directory.
 +<code>
 +# runs both the dataset generation and train model containers
 +$ docker-compose -f docker-compose.yml up
 +
 +# .. or simply
 +$ docker-compose up
 +</code>
 +
 +=== 3 - Check generated file ===
 +
 +Both ways create a new training model in the `training/app/models` directory. The name of the model file has a format like `nlu-YYYYMMDD-HHMMSS.tar.gz`.
 +
 +<code>
 +# check file existence
 +$ ls -la app/models
 +</code>
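
The check above can also be scripted. The sketch below is an illustration under stated assumptions: the function name `newest_model` is hypothetical, and it relies only on the `nlu-YYYYMMDD-HHMMSS.tar.gz` naming format described above. Because the timestamp is part of the name, sorting by name also sorts by creation time.

```shell
# Hypothetical helper: print the newest archive in DIR whose name matches
# nlu-YYYYMMDD-HHMMSS.tar.gz, or fail if there is none.
newest_model() {
    dir="$1"
    # Keep only well-formed names; a plain sort orders them chronologically.
    newest=$(ls "$dir" 2>/dev/null \
        | grep -E '^nlu-[0-9]{8}-[0-9]{6}\.tar\.gz$' \
        | sort | tail -n 1)
    if [ -z "$newest" ]; then
        echo "no archive matching nlu-YYYYMMDD-HHMMSS.tar.gz in $dir" >&2
        return 1
    fi
    echo "$newest"
}
```

From the `training` directory, `newest_model app/models` would print the name of the most recent archive, or exit with a non-zero status if training produced nothing.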
 +
 +=== 4 - Replace existing model file ===
 +
 +The model file used by Rasa in production is placed in the `app/models` directory (one level above `training`). Replace this file with the newly generated model file.
 +
 +<code>
 +# delete the existing model (if you are still in the `training` directory)
 +$ rm -f ../app/models/nlu-*.tar.gz
 +
 +# then copy the newest model archive
 +$ cp "./app/models/$(ls -Art "./app/models" | tail -n 1)" ../app/models
 +
 +# .. or use the convenient make target
 +$ make update-model
 +</code>
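
The manual commands above delete the production model before copying the new one, so a failed copy leaves Rasa without a model. A slightly more defensive variant is sketched here. It is not the implementation of `make update-model`: the function name `update_model` is hypothetical, and unlike the `ls -Art` call above it picks the newest archive by name rather than by modification time, which assumes the timestamped naming format.

```shell
# Hypothetical helper: copy the newest archive from SRC into DST,
# then remove every other nlu-*.tar.gz from DST.
update_model() {
    src="$1"
    dst="$2"
    # Timestamped names sort chronologically, so the last one is the newest.
    newest=$(ls "$src" 2>/dev/null | grep -E '^nlu-.*\.tar\.gz$' | sort | tail -n 1)
    if [ -z "$newest" ]; then
        echo "no model archive in $src" >&2
        return 1
    fi
    # Copy first so DST is never left without a model if the copy fails.
    cp "$src/$newest" "$dst/" || return 1
    for old in "$dst"/nlu-*.tar.gz; do
        [ -f "$old" ] || continue
        [ "$(basename "$old")" = "$newest" ] || rm -f "$old"
    done
    echo "$newest"
}
```

From the `training` directory, the equivalent call would be `update_model ./app/models ../app/models`.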
 +
 +> Restart the Rasa container or the complete BeuthBot container and you are done. Rasa now runs with your new model.
 +
 +=== 5 - Shorthand ===
 +
 +There is a shorthand for all the steps listed above. Simply use the `Makefile` target below to run all commands from step 1 through step 4. Providing new input data is still up to you.
 +
 +<code>
 +# assuming you are in the `training` main directory
 +$ make train
 +</code>
 +
 +Lean back and wait until the training is done.
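
For orientation, the commands that the shorthand covers can be chained as follows. This is a sketch, not the actual `Makefile` recipe: the function names `run` and `train_all` and the `DRY_RUN` switch are hypothetical, and it assumes you are in the `training` directory and that `make update-model` can be run from there.

```shell
# Hypothetical wrapper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Chain the documented steps; stop at the first one that fails.
train_all() {
    # step 1.1: generate the datasets from the .chatito input files
    run docker-compose -f docker-compose.generate-data.yml up || return 1
    # step 2.2: train the model inside the Docker container
    run docker-compose -f docker-compose.train-model.yml up || return 1
    # step 4: replace the production model with the newest archive
    run make update-model
}
```

Calling `DRY_RUN=1 train_all` prints the three commands without executing them, which is a cheap way to review the sequence before starting a long training run.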
wiki/software/beuthbot/rasa/training.1595438664.txt.gz · Last modified: 22.07.2020 19:24 by Lukas Danckwerth