Advanced Scripts

These are the more obscure and advanced scripts in ParlAI.

build_candidates

Short description: Build the candidate responses for a retrieval model

Build the candidate responses for a retrieval model.

Examples

parlai build_candidates --task convai2 --outfile /tmp/cands.txt

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:evalmode.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--num-examples, --n

Total number of exs to convert, -1 to convert all examples
Default: -1.

--outfile, --of

Output file where to save, by default will be created in /tmp

--log-every-n-secs, --ltim

Default: 2.


build_dict

Short description: Build a dictionary.

Generates a dictionary file from the training data.

Examples

# learn the vocabulary from one task, then train on another task.
parlai build_dict --task convai2 --dict-file premade.dict
parlai train_model --task squad --dict-file premade.dict --model seq2seq

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--dict-maxexs

Max number of examples to build dict on
Default: -1.

--dict-include-valid

Include validation set in dictionary building for task.

--dict-include-test

Include test set in dictionary building for task.

--log-every-n-secs, --ltim

Default: -1.

--bpe-vocab

Path to pre-trained tokenizer vocab

--bpe-merge

Path to pre-trained tokenizer merge

--bpe-dropout

Use BPE dropout during training.


convert_to_json

Short description: Convert data to json format

Converts data used in a task to JSON format (same as the "Conversation" class; i.e., for use in ACUTE-Eval).

Specify the task with -t. By default, this code will save to a file with prefix “tmp”. To change the prefix, set --world-logs.
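
Examples

A minimal sketch (assuming the subcommand matches this section's name; the task and output prefix are illustrative, and --task, --datatype and --world-logs are documented below):

parlai convert_to_json --task convai2 --datatype valid --world-logs /tmp/convai2_valid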

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: valid.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)
Default: repeat_label.

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--report-filename, --rf

Saves a json file of the evaluation report either as an extension to the model-file (if begins with a “.”) or a whole file path. Set to the empty string to not save at all.

--world-logs

Saves a jsonl file of the world logs. Set to the empty string to not save at all.
Default: tmp.

--save-format

Choices: conversations, parlai
Default: conversations.

--area-under-curve-digits, --auc

A positive value enables calculating the area under the ROC curve and determines how many decimal digits of the predictions to keep (higher numbers -> more precise); it is also used to decide whether or not to compute the AUC metric.
Default: -1.

--area-under-curve-class, --auclass

The name(s) of the class to calculate the auc for

--num-examples, --ne

Default: -1.

--display-examples, --d

--log-every-n-secs, --ltim

Default: 10.

--metrics, --mcs

List of metrics to show/compute, e.g. all, default, or a comma-separated list like ppl,f1,accuracy,hits@1,rouge,bleu. The rouge metrics will be computed as rouge-1, rouge-2 and rouge-l.
Default: default.

--aggregate-micro, --micro

Report micro-averaged metrics instead of macro averaged metrics.

--log-keep-fields

Fields to keep when logging. Should be a comma separated list
Default: all.

--tensorboard-log, --tblog

Tensorboard logging of metrics

--tensorboard-logdir, --tblogdir

Tensorboard logging directory, defaults to model_file.tensorboard


convert_to_parlai

Short description: Dump a task to a standardized format

Convert a dataset into the ParlAI text format.

Examples

parlai convert_data_to_parlai_format --task babi:task1k:1 --outfile /tmp/dump

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:stream.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--num-examples, --n

Total number of exs to convert, -1 to convert all examples
Default: -1.

--outfile, --of

Output file where to save, by default will be created in tmp

--ignore-fields, --if

Ignore these fields from the message (returned with .act() )
Default: id.

--log-every-n-secs, --ltim

Default: 2.


convo_render

Short description: Render data as HTML
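
Examples

A minimal sketch (the input conversations file and output path are illustrative; --input and --output are documented below):

parlai convo_render --input /tmp/conversations.jsonl --output /tmp/conversations.html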

CLI Arguments

Argument

Description

--init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--input, --i

Input file to read conversations from

--output, --o

Output file to write conversations to. One of [.pdf, .png, .html] only

--width, --wd

Width of output file
Default: 8.

--height, --ht

Height of output file
Default: 10.

--user-icon, --uic

Absolute Path/URL to user image icon
Default: https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/160/apple/76/woman_1f469.png.

--alt-icon, --aic

Absolute Path/URL to alternate image icon
Default: https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/160/facebook/230/parrot_1f99c.png.

--num-examples, --ne

Number of conversations to render
Default: 10.


data_stats

Short description: Compute data statistics

Count and display statistics of the data.

Examples

parlai data_stats --task convai2

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:ordered.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--num-examples, --n, --ne

Default: -1.

--log-every-n-secs, --ltim

Default: 10.

--agent

Use teacher (agent 0) or model (agent 1)
Choices: 0, 1

--new-line-new-utt

Treat newline-separated substrings as separate utterances.

--ignore-tokens

Ignore tokens containing these substrings (comma-separated)

--bpe-vocab

Path to pre-trained tokenizer vocab

--bpe-merge

Path to pre-trained tokenizer merge

--bpe-dropout

Use BPE dropout during training.


detect_offensive

Short description: Check task for offensive language

Basic example which iterates through the tasks specified and checks them for offensive language.

Examples

parlai detect_offensive_language --task "convai_chitchat" --display-examples True

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:ordered.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)
Default: repeat_query.

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--log-every-n-secs, --ltim

Default: 2.

--display-examples, --d

--safety

Type of safety detector to apply to messages
Choices: classifier, string_matcher, all
Default: all.


eval_wordstat

Short description: Compute statistics from model predictions

This helper script can be used on its own with a model file and task: the output will contain the word statistics of the model outputs. The function defined here can also be used elsewhere to obtain such statistics for any agent, given the agent object (with its corresponding dict) and a sequence.

Additionally provides the function get_word_stats, which can be used in other parts of runtime code since it depends only on the agent object. For example:

from parlai.scripts.eval_wordstat import get_word_stats

# compute word statistics over the predicted token ids, using the agent's dict
reqs, cnt = get_word_stats(predictions.tolist(), self.dict)

Examples

parlai eval_wordstat --model-file /path/to/model_file --task convai2:self --freq-bins 10,100,1000

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: valid.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--bpe-vocab

Path to pre-trained tokenizer vocab

--bpe-merge

Path to pre-trained tokenizer merge

--bpe-dropout

Use BPE dropout during training.

--num-examples, --ne

Default: -1.

--log-every-n-secs, --ltim

Default: 2.

--external-dict, --ed

External dictionary for stat computation

--freq-bins, --fb

Bins boundaries for rare words stat
Default: 0,100,1000,10000.

--dump-predictions-path, --dup

Dump predictions into file

--compute-unique, --cun

Compute % of unique responses from the model
Default: True.

--tensorboard-log, --tblog

Tensorboard logging of metrics

--tensorboard-logdir, --tblogdir

Tensorboard logging directory, defaults to model_file.tensorboard


extract_image_feature

Short description: Load/extract image features

Basic example which iterates through the tasks specified and loads/extracts the image features.

For more options, check parlai.core.image_featurizers

Examples

To extract the image feature of COCO images:

parlai extract_image_feature --task vqa_v1 --image-mode resnet152

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data


flask

Example Flask server which hosts a model.

Examples

Serving the model

parlai flask -m repeat_query
parlai flask -mf zoo:blender/blender_90M/model

Hitting the API

curl -k http://localhost:5000/response -H "Content-Type: application/json" -d '{"text": "foobar"}'
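
The same request can be made from Python; a minimal sketch using the requests library, mirroring the endpoint and payload of the curl call above:

import requests

# POST a single message to the running server and print the model's reply
resp = requests.post("http://localhost:5000/response", json={"text": "foobar"})
print(resp.json())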

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file


interactive_web

Short description: Interactive chat with a model in a web browser

Aliases: iweb

Talk with a model using a web UI.

Examples

parlai interactive_web --model-file "zoo:tutorial_transformer_generator/model"
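
To serve the UI on a different port or accept connections from other machines, the --port and --host options documented below can be set explicitly, for example:

parlai interactive_web --model-file "zoo:tutorial_transformer_generator/model" --port 8080 --host 0.0.0.0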

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”
Default: interactive.

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--display-examples, --d

--display-prettify

Set to use a prettytable when displaying examples with text candidates

--display-add-fields

Display these fields when verbose is off (e.g., "--display-add-fields label_candidates,beam_texts")

--interactive-task, --it

Create interactive version of task
Default: True.

--outfile

Saves a jsonl file containing all of the task examples and model replies. Set to the empty string to not save at all

--save-format

Format to save logs in. conversations is a jsonl format, parlai is a text format.
Choices: conversations, parlai
Default: conversations.

--local-human-candidates-file, --fixedCands

File of label_candidates to send to other agent

--single-turn

If on, assumes single turn episodes.

--log-keep-fields

Fields to keep when logging. Should be a comma separated list
Default: all.

--port

Port to listen on.
Default: 8080.

--host

Host from which allow requests, use 0.0.0.0 to allow all IPs
Default: localhost.


multiprocessing_eval

Short description: Evaluate a model

Aliases: mp_eval

Main launch script for single-host, multi-GPU evaluation.

This is a drop-in replacement for eval_model. This script will launch N subprocesses, each of which runs the full eval loop independently.

Uses torch.nn.parallel.DistributedDataParallel under the hood. Agents must specifically support the DistributedDataParallel wrapper, but all TorchRankerAgents and TorchGeneratorAgents support this.

Examples

parlai multiprocessing_eval --model-file "zoo:tutorial_transformer_generator/model" --batchsize 16 --task convai2

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: valid.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--report-filename, --rf

Saves a json file of the evaluation report either as an extension to the model-file (if begins with a “.”) or a whole file path. Set to the empty string to not save at all.

--world-logs

Saves a jsonl file of the world logs. Set to the empty string to not save at all.

--save-format

Choices: conversations, parlai
Default: conversations.

--area-under-curve-digits, --auc

A positive value enables calculating the area under the ROC curve and determines how many decimal digits of the predictions to keep (higher numbers -> more precise); it is also used to decide whether or not to compute the AUC metric.
Default: -1.

--area-under-curve-class, --auclass

The name(s) of the class to calculate the auc for

--num-examples, --ne

Default: -1.

--display-examples, --d

--log-every-n-secs, --ltim

Default: 10.

--metrics, --mcs

List of metrics to show/compute, e.g. all, default, or a comma-separated list like ppl,f1,accuracy,hits@1,rouge,bleu. The rouge metrics will be computed as rouge-1, rouge-2 and rouge-l.
Default: default.

--aggregate-micro, --micro

Report micro-averaged metrics instead of macro averaged metrics.

--log-keep-fields

Fields to keep when logging. Should be a comma separated list
Default: all.

--tensorboard-log, --tblog

Tensorboard logging of metrics

--tensorboard-logdir, --tblogdir

Tensorboard logging directory, defaults to model_file.tensorboard

--distributed-world-size

Number of workers.

--ddp-backend

Distributed backend. Zero2 can be faster but is more experimental. Zero3 significantly reduces memory pressure. DDP is the most tested.
Choices: ddp, zero2, zero3
Default: ddp.


multiprocessing_train

Short description: Train a model

Aliases: mp_train

Main launch script for single-host, multi-GPU training.

This is a drop-in replacement for train_model. This script will launch N subprocesses, each of which runs the full training loop independently.

Uses torch.nn.parallel.DistributedDataParallel under the hood. Agents must specifically support the DistributedDataParallel wrapper, but all TorchRankerAgents and TorchGeneratorAgents support this.

Examples

parlai multiprocessing_train -m transformer/generator --batchsize 16 --task convai2 --model-file /tmp/mymodel

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--evaltask, --et

Task to use for valid/test (defaults to the one used for training)

--final-extra-opt

A ‘.opt’ file that is used for final eval. Useful for setting skip-generation to false. ‘datatype’ must be included as part of the opt.

--eval-dynamic-batching

Set dynamic batching at evaluation time. Set to off for train-only dynamic batching. Set to none (default) to use the same setting as --dynamic-batching.
Choices: full, off, batchsort, None

--num-workers

Number of background workers (training only)

--num-epochs, --eps

Default: -1.

--max-train-time, --ttim

Default: -1.

--max-train-steps, --max-lr-steps, --tstep

End training after n model updates
Default: -1.

--log-every-n-steps, --lstep

Log every n training steps
Default: 50.

--validation-every-n-secs, --vtim

Validate every n seconds. Saves model to model_file (if set) whenever best val metric is found
Default: -1.

--validation-every-n-steps, --vstep

Validate every n training steps. Saves model to model_file (if set) whenever best val metric is found
Default: -1.

--save-every-n-secs, --stim

Saves the model to model_file.checkpoint after every n seconds (default -1, never).
Default: -1.

--save-after-valid, --sval

Saves the model to model_file.checkpoint after every validation (default False).

--validation-every-n-epochs, --veps

Validate every n epochs. Saves model to model_file (if set) whenever best val metric is found
Default: -1.

--validation-patience, --vp

Number of iterations of validation where result does not improve before we stop training
Default: 10.

--validation-metric, --vmt

Key into report table for selecting best validation
Default: accuracy.

--validation-metric-mode, --vmm

The direction in which to optimize the validation metric, i.e. maximize or minimize
Choices: max, min

--metrics, --mcs

List of metrics to show/compute, e.g. all, default, or a comma-separated list like ppl,f1,accuracy,hits@1,rouge,bleu. The rouge metrics will be computed as rouge-1, rouge-2 and rouge-l.
Default: default.

--aggregate-micro, --micro

Report micro-averaged metrics instead of macro averaged metrics.

--world-logs

Saves a jsonl file of the world logs. Set to the empty string to not save at all.

--save-format

Choices: conversations, parlai
Default: conversations.

--seed

--log-keep-fields

Fields to keep when logging. Should be a comma separated list
Default: all.

--tensorboard-log, --tblog

Tensorboard logging of metrics

--tensorboard-logdir, --tblogdir

Tensorboard logging directory, defaults to model_file.tensorboard

--wandb-log, --wblog

Enable W&B logging of metrics

--wandb-project

W&B project name. Defaults to timestamp. Usually the name of the sweep.

--wandb-entity

W&B entity name.

--wandb-log-model

Enable logging of model artifacts to Weights & Biases

--clearml-log, --clearmllog

Creates a ClearML Task. Default: False. If True, ClearML logging will be enabled.

--clearml-project-name, --clearmlproject

ClearML Project Name. All the logs will be stored under this project in the ClearML WebUI. If not set, defaults to ParlAI.
Default: ParlAI.

--clearml-task-name, --clearmltask

ClearML Task Name. All the logs will be stored under this task in the ClearML WebUI. If not set, defaults to "Default Task".
Default: Default Task.

--bpe-vocab

Path to pre-trained tokenizer vocab

--bpe-merge

Path to pre-trained tokenizer merge

--bpe-dropout

Use BPE dropout during training.

--distributed-world-size

Number of workers.

--ddp-backend

Distributed backend. Zero2 can be faster but is more experimental. Zero3 significantly reduces memory pressure. DDP is the most tested.
Choices: ddp, zero2, zero3
Default: ddp.

--port


party

Short description: Throw a party!

Aliases: parrot

Throw a party.

Examples

parlai party

CLI Arguments

Argument

Description

--seconds, --n

Number of seconds to party
Default: -1.


profile_interactive

Short description: Interactive chat with a model

Basic script to profile interaction with a model, using repeat_query in place of a human so the interaction can be timed.
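
Examples

A minimal sketch (the model file is illustrative; --model-file and --num-examples are documented below):

parlai profile_interactive --model-file zoo:blender/blender_90M/model --num-examples 10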

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”
Default: interactive.

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--display-examples, --d

Default: True.

--num-examples, --ne

Default: 5.

--display-prettify

Set to use a prettytable when displaying examples with text candidates

--display-add-fields

Display these fields when verbose is off (e.g., "--display-add-fields label_candidates,beam_texts")

--interactive-task, --it

Create interactive version of task
Default: True.


profile_train

Short description: cProfile a training run

Runs the Python or PyTorch profiler and prints the results.

Examples

For example, to profile a training run of a seq2seq model on bAbI task 1 (1k examples):

parlai profile_train --task babi:task1k:1 --model seq2seq --dict-file /tmp/dict
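
The --torch flag documented below switches from cProfile to the PyTorch profiler; a sketch (assuming ParlAI's usual true/false syntax for boolean flags):

parlai profile_train --task babi:task1k:1 --model seq2seq --dict-file /tmp/dict --torch true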

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--evaltask, --et

Task to use for valid/test (defaults to the one used for training)

--final-extra-opt

A ‘.opt’ file that is used for final eval. Useful for setting skip-generation to false. ‘datatype’ must be included as part of the opt.

--eval-dynamic-batching

Set dynamic batching at evaluation time. Set to off for train-only dynamic batching. Set to none (default) to use the same setting as --dynamic-batching.
Choices: full, off, batchsort, None

--num-workers

Number of background workers (training only)

--num-epochs, --eps

Default: 1.

--max-train-time, --ttim

Default: -1.

--max-train-steps, --max-lr-steps, --tstep

End training after n model updates
Default: -1.

--log-every-n-steps, --lstep

Log every n training steps
Default: 50.

--validation-every-n-secs, --vtim

Validate every n seconds. Saves model to model_file (if set) whenever best val metric is found
Default: -1.

--validation-every-n-steps, --vstep

Validate every n training steps. Saves model to model_file (if set) whenever best val metric is found
Default: -1.

--save-every-n-secs, --stim

Saves the model to model_file.checkpoint after every n seconds (default -1, never).
Default: -1.

--save-after-valid, --sval

Saves the model to model_file.checkpoint after every validation (default False).

--validation-every-n-epochs, --veps

Validate every n epochs. Saves model to model_file (if set) whenever best val metric is found
Default: -1.

--validation-patience, --vp

Number of iterations of validation where result does not improve before we stop training
Default: 10.

--validation-metric, --vmt

Key into report table for selecting best validation
Default: accuracy.

--validation-metric-mode, --vmm

The direction in which to optimize the validation metric, i.e. maximize or minimize
Choices: max, min

--metrics, --mcs

List of metrics to show/compute, e.g. all, default, or a comma-separated list like ppl,f1,accuracy,hits@1,rouge,bleu. The rouge metrics will be computed as rouge-1, rouge-2 and rouge-l.
Default: default.

--aggregate-micro, --micro

Report micro-averaged metrics instead of macro averaged metrics.

--world-logs

Saves a jsonl file of the world logs. Set to the empty string to not save at all.

--save-format

Choices: conversations, parlai
Default: conversations.

--seed

--log-keep-fields

Fields to keep when logging. Should be a comma separated list
Default: all.

--tensorboard-log, --tblog

Tensorboard logging of metrics

--tensorboard-logdir, --tblogdir

Tensorboard logging directory, defaults to model_file.tensorboard

--wandb-log, --wblog

Enable W&B logging of metrics

--wandb-project

W&B project name. Defaults to timestamp. Usually the name of the sweep.

--wandb-entity

W&B entity name.

--wandb-log-model

Enable logging of model artifacts to Weights & Biases

--clearml-log, --clearmllog

Creates a ClearML Task. Default: False. If True, ClearML logging will be enabled.

--clearml-project-name, --clearmlproject

ClearML Project Name. All the logs will be stored under this project in the ClearML WebUI. If not set, defaults to ParlAI.
Default: ParlAI.

--clearml-task-name, --clearmltask

ClearML Task Name. All the logs will be stored under this task in the ClearML WebUI. If not set, defaults to "Default Task".
Default: Default Task.

--bpe-vocab

Path to pre-trained tokenizer vocab

--bpe-merge

Path to pre-trained tokenizer merge

--bpe-dropout

Use BPE dropout during training.

--torch

If true, use the torch profiler. Otherwise use cProfile.

--torch-cuda

If true, use the torch cuda profiler. Otherwise use cProfile.

--debug

If true, enter debugger at end of run.


token_stats

Short description: Compute tokenized stats.
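
Examples

A minimal sketch (assuming the subcommand matches this section's name; --task and --field are documented below):

parlai token_stats --task convai2 --field text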

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:stream:ordered.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)
Default: test_agents/null.

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--num-examples, --n

Default: -1.

--log-every-n-secs, --ltim

Default: 10.

--field

Default: text.

--final-only


torchscript
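
Examples

A minimal sketch (the model and output paths are illustrative; --model-file and --scripted-model-file are documented below):

parlai torchscript --model-file zoo:blender/blender_90M/model --scripted-model-file /tmp/scripted_model.pt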

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--scripted-model-file, --smf

Where the scripted model checkpoint will be saved
Default: _scripted.pt.

--input, --in

Input string to pass into the encoder of the scripted model, to test it against the unscripted version. Separate lines with a pipe

--script-module, --sm

Module to TorchScript. Example: parlai.torchscript.modules:TorchScriptGreedySearch
Default: parlai.torchscript.modules:TorchScriptGreedySearch.

--enable-inference-optimization, --eio

Enable inference optimizations on the scripted model.


vacuum

Short description: Shrink a model file for release.

Reduces the size of a model file by stripping the optimizer.

Assumes we are working with a TorchAgent
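
Examples

A minimal sketch (the model path is illustrative; --model-file and --no-backup are documented below):

parlai vacuum --model-file /path/to/model_file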

CLI Arguments

Argument

Description

--model-file, --mf

Path to model file.

--no-backup

Do not create a backup.


verify_data

Short description: Check tasks for common errors

Verify data doesn’t have basic mistakes, like empty text fields or empty label candidates.

Examples

parlai verify_data --task convai2 --datatype valid

CLI Arguments

Argument

Description

--init-opt, --o

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

--task, --t

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

--datatype, --dt

Choose from: train, train:ordered, valid, test. to stream data add “:stream” to any option (e.g., train:stream). by default train is random with replacement, valid is ordered, test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:stream:ordered.

--batchsize, --bs

Batch size for minibatch training schemes
Default: 1.

--dynamic-batching, --dynb

Use dynamic batching
Choices: full, batchsort, None

--verbose, --v

Print all messages

--debug

Enables some debug behavior

--datapath, --dp

Path to datasets, defaults to {parlai_dir}/data

--model, --m

The model class name. can match parlai/agents/ for agents in that directory, or can provide a fully specified module for from X import Y via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent)

--model-file, --mf

Model file name for loading and saving models

--init-model, --im

Initialize model weights and dict from this file

--log-every-n-secs, --ltim

Default: 2.

--display-examples, --d