# Get started

## Installation
Install with pipx if you only need the command-line client:

```bash
pipx install zeldarose
```
If you plan to use Zelda Rose together with other Python programs, install it in your project's environment instead:

```bash
pip install zeldarose
```
If the model you want to use relies on sentencepiece, install that as well.
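For example (a sketch; `sentencepiece` here is the PyPI package of the same name, and the pipx variant assumes you installed the command-line client with pipx as above):

```bash
# In a pip-managed environment
pip install sentencepiece

# If you installed the CLI with pipx, inject the dependency
# into zeldarose's isolated environment instead
pipx inject zeldarose sentencepiece
```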
## Train a tokenizer
```bash
TOKENIZERS_PARALLELISM=true zeldarose tokenizer --vocab-size 4096 --out-path {tokenizer path} --model-name "my-muppet" tests/fixtures/raw.txt
```
The input format is raw text files, with one sample (e.g. a sentence) per line.
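For instance, a toy input file could look like this (hypothetical contents; any plain text corpus with one sample per line will do):

```text
The quick brown fox jumps over the lazy dog.
A second training sample sits on its own line.
And a third one closes this toy corpus.
```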
## Train an MLM model
```bash
zeldarose transformer --tokenizer {tokenizer path} --pretrained-model flaubert/flaubert_small_cased --out-dir local/muppet --val-text tests/fixtures/raw.txt tests/fixtures/raw.txt
```
- The input format here is raw text files as well.
- The arguments to `--tokenizer` and `--pretrained-model` can be either paths to local directories or 🤗 hub model identifiers (see the example below).
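For instance, both of the following invocations are valid (a sketch; `local/tokenizer` and `local/checkpoint` are hypothetical directories, e.g. the outputs of previous runs):

```bash
# Tokenizer and pretrained model both named by 🤗 hub identifiers
zeldarose transformer --tokenizer flaubert/flaubert_small_cased --pretrained-model flaubert/flaubert_small_cased --out-dir local/muppet --val-text tests/fixtures/raw.txt tests/fixtures/raw.txt

# Tokenizer and pretrained model loaded from local directories
zeldarose transformer --tokenizer local/tokenizer --pretrained-model local/checkpoint --out-dir local/muppet --val-text tests/fixtures/raw.txt tests/fixtures/raw.txt
```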
## Next steps
Now that you know how to train a default model on a default task with default settings, the next step is to configure all of these.
- Parameters that have to do with Zelda Rose's behaviour towards the host machine (training on CPU or GPU, the batch size per device…) are set by command-line arguments; the full list is available from `zeldarose transformer --help`.
- Other settings are passed via a config file, which the next section describes.
## Transformers config
If you want to train a model on another task, or if you want to configure hyperparameters, you will need to pass a configuration to `zeldarose transformer` via its `--config` option. It takes a path to a local TOML file that provides a training configuration (optimizer hyperparameters such as batch size, learning rate, etc.) and a task configuration (masked language modelling, replaced token detection, etc.) with its hyperparameters.
Such a configuration file looks like this:
```toml
type = "mlm"  # The name of the task

[task]
change_ratio = 0.15  # The proportion of tokens to modify
mask_ratio = 0.8  # The proportion of modified tokens to mask
switch_ratio = 0.1  # The proportion of modified tokens to change to a random token

[tuning]
batch_size = 64
betas = [0.9, 0.98]
epsilon = 1e-8
learning_rate = 1e-4
```
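Concretely, with these values 15% of the input tokens are selected for modification; of those, 80% are replaced by the mask token and 10% by a random token, leaving 10% of the selected tokens (1 − 0.8 − 0.1 = 0.1) neither masked nor switched.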
There are example configurations for every task in the `examples/` directory of the development repository. The options of the task section are documented on their respective pages under “Tasks”, and the tuning options under “Training configurations”.
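For instance, assuming the configuration above is saved as `mlm.toml` (a hypothetical file name), the training command from earlier becomes:

```bash
zeldarose transformer --config mlm.toml --tokenizer {tokenizer path} --pretrained-model flaubert/flaubert_small_cased --out-dir local/muppet --val-text tests/fixtures/raw.txt tests/fixtures/raw.txt
```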