Language Model

The language model is an agent that trains an RNN on a language modeling task. It is adapted from the word-level language model in PyTorch's examples repository.
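To make the architecture concrete, here is a minimal pure-Python sketch of one step of a word-level RNN language model: an embedding lookup, a tanh recurrence, and output logits over the vocabulary. This is an illustrative Elman-style (RNN_TANH) sketch, not ParlAI's actual implementation (which defaults to an LSTM and uses PyTorch tensors); all function names here are hypothetical.

```python
import math
import random

random.seed(0)

def make_matrix(rows, cols):
    # Small random weights; real initializers use tuned uniform ranges.
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(M, v):
    # Plain matrix-vector product.
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def rnn_lm_step(token_id, hidden, emb, W_ih, W_hh, W_out):
    """One step of a word-level RNN language model (tanh flavor):
    embedding lookup -> recurrent update -> logits over the vocabulary."""
    x = emb[token_id]                         # token embedding
    pre = [a + b for a, b in zip(matvec(W_ih, x), matvec(W_hh, hidden))]
    hidden = [math.tanh(p) for p in pre]      # new hidden state
    logits = matvec(W_out, hidden)            # unnormalized next-token scores
    return hidden, logits

# Example shapes: vocab of 5, embedding size 4, hidden size 3.
emb = make_matrix(5, 4)
W_ih = make_matrix(3, 4)
W_hh = make_matrix(3, 3)
W_out = make_matrix(5, 3)
h, logits = rnn_lm_step(2, [0.0] * 3, emb, W_ih, W_hh, W_out)
```

At each step the new hidden state and the logits for the next token are produced; training minimizes cross-entropy between those logits and the actual next token.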

Basic Examples

Train a language model with embedding size 128 and sequence length 30 on the Persona-Chat task.

python examples/ -m language_model -t personachat -esz 128 -sl 30 -mf /tmp/LM_personachat_test.mdl

After training, load and evaluate that model on the Persona-Chat test set.

python examples/ -m language_model -t personachat -mf /tmp/LM_personachat_test.mdl -dt test

LanguageModelAgent Options

Language Model Arguments


Load dict/features/weights/opts from this file

-hs, --hiddensize

Size of the hidden layers

Default: 200.

-esz, --embeddingsize

Size of the token embeddings

Default: 200.

-nl, --numlayers

Number of hidden layers

Default: 2.

-dr, --dropout

Dropout rate

Default: 0.2.

-clip, --gradient-clip

Gradient clipping

Default: 0.25.
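The clipping applied here rescales the gradient vector so its global L2 norm never exceeds the threshold, which keeps RNN training stable. A pure-Python sketch of that rescaling (mirroring the behavior of PyTorch's gradient-norm clipping; the helper name is hypothetical):

```python
import math

def clip_gradients(grads, clip):
    """Rescale gradients so their global L2 norm is at most `clip`.
    Gradients below the threshold are left unchanged."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > clip:
        scale = clip / total_norm
        grads = [g * scale for g in grads]
    return grads
```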


Disable GPUs even if available

Default: False.

-rnn, --rnn-class

Type of recurrent net (RNN_TANH, RNN_RELU, LSTM, GRU)

Default: LSTM.

-sl, --seq-len

Sequence length

Default: 35.
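The sequence length controls how the token stream is cut into fixed-size training chunks for truncated backpropagation through time, with targets being the inputs shifted by one position. An illustrative helper (not ParlAI code) showing that chunking:

```python
def chunk_stream(token_ids, seq_len):
    """Split a flat token stream into (input, target) pairs of at most
    seq_len tokens, where targets are inputs shifted by one position,
    as in standard next-token language modeling."""
    pairs = []
    for start in range(0, len(token_ids) - 1, seq_len):
        inp = token_ids[start:start + seq_len]
        tgt = token_ids[start + 1:start + 1 + seq_len]
        pairs.append((inp, tgt[:len(inp)]))
    return pairs
```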

-tied, --emb-tied

Tie the word embedding and softmax weights

Default: False.

-seed, --random-seed

Random seed

Default: 1111.


Which GPU device to use

Default: -1.

-tr, --truncate-pred

Truncate predictions to this many tokens

Default: 50.

-rf, --report-freq

Frequency with which to report predictions during evaluation

Default: 0.1.

-pt, --person-tokens

Append person1 and person2 tokens to text

Default: True.

-lr, --learningrate

Initial learning rate

Default: 20.

-lrf, --lr-factor

Multiply the learning rate by this factor when the validation loss does not decrease

Default: 1.0.

-lrp, --lr-patience

How long to wait, in validation checks without improvement, before decreasing the learning rate

Default: 10.

-lrm, --lr-minimum

Minimum learning rate

Default: 0.1.
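The three flags above work together: when validation loss has not improved for `--lr-patience` checks, the learning rate is multiplied by `--lr-factor`, but never drops below `--lr-minimum` (and the default factor of 1.0 disables decay entirely). A sketch of that interaction under those assumptions; this is illustrative, not ParlAI's scheduler code:

```python
def update_lr(lr, val_losses, factor, patience, minimum):
    """Decay the learning rate when validation loss has stalled.
    If the best (lowest) loss is more than `patience` checks old,
    multiply lr by `factor`, flooring at `minimum`."""
    best_idx = min(range(len(val_losses)), key=lambda i: val_losses[i])
    checks_since_best = len(val_losses) - 1 - best_idx
    if checks_since_best >= patience and factor < 1.0:
        lr = max(lr * factor, minimum)
    return lr
```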

-sm, --sampling-mode

Sample from the output distribution when generating tokens instead of taking the argmax, and never produce the UNK token (only when batch size is 1)

Default: False.
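Sampling mode replaces greedy argmax decoding with a draw from the softmax distribution, with the UNK token masked out so it can never be produced. A pure-Python sketch of both decoding modes (the function and the `unk_id` parameter are hypothetical, not ParlAI's API):

```python
import math
import random

def sample_token(logits, unk_id=None, sample=False):
    """Pick the next token id from a list of logits.
    sample=False: argmax. sample=True: draw from the softmax
    distribution. If unk_id is given, that token is masked out."""
    if unk_id is not None:
        logits = list(logits)
        logits[unk_id] = float("-inf")   # UNK can never be chosen
    if not sample:
        return max(range(len(logits)), key=lambda i: logits[i])
    m = max(logits)                      # subtract max for stability
    exps = [math.exp(l - m) for l in logits]
    r = random.random() * sum(exps)
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e
        if acc >= r:
            return i
    return len(logits) - 1
```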