Command Line Usage

This page documents the command line usage for each of the standard scripts we release. Each of these scripts is included in parlai/scripts.

extract_image_feature

Basic example which iterates through the specified tasks and loads/extracts the image features.

For more options, check parlai.core.image_featurizers

Examples

To extract the image features of COCO images:

python examples/extract_image_feature.py -t vqa_v1 -im resnet152
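
If you would instead like the features saved to a single hdf5 file, the --dataset and --use-hdf5-extraction options documented below can be combined with the above; a plausible (untested) invocation, where the boolean-style value and the dataset name are illustrative assumptions, would be:

python examples/extract_image_feature.py -t vqa_v1 -im resnet152 --dataset vqa_v1 --use-hdf5-extraction true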

CLI help

usage: python -m parlai.scripts.extract_image_feature [-h] [-v] [-t TASK]
                                                      [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                                      [-im IMAGE_MODE]
                                                      [-nt NUMTHREADS]
                                                      [-bs BATCHSIZE]
                                                      [-bsrt BATCH_SORT]
                                                      [-clen CONTEXT_LENGTH]
                                                      [-incl INCLUDE_LABELS]
                                                      [-dp DATAPATH]
                                                      [-pyt PYTORCH_TEACHER_TASK]
                                                      [-pytd PYTORCH_TEACHER_DATASET]
                                                      [--pytorch-datapath PYTORCH_DATAPATH]
                                                      [-nw NUMWORKERS]
                                                      [--pytorch-preprocess PYTORCH_PREPROCESS]
                                                      [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                                      [--batch-sort-cache-type {pop,index,none}]
                                                      [--batch-length-range BATCH_LENGTH_RANGE]
                                                      [--shuffle SHUFFLE]
                                                      [--batch-sort-field BATCH_SORT_FIELD]
                                                      [-pyclen PYTORCH_CONTEXT_LENGTH]
                                                      [-pyincl PYTORCH_INCLUDE_LABELS]
                                                      [--dataset DATASET]
                                                      [-at]
                                                      [--use-hdf5-extraction USE_HDF5_EXTRACTION]

Load/extract image features

optional arguments:
  -h, --help            show this help message and exit

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either
                        via specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example episodes.
                        (For use with the PytorchDataTeacher.)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with the
                        PytorchDataTeacher.)

Image Extraction:
  --dataset DATASET     Pytorch Dataset; if specified, will save the images in
                        one hdf5 file according to how they are returned by
                        the specified dataset
  -at, --attention      Whether to extract image features with attention (Note
                        - this is specifically for the mlb_vqa model)
  --use-hdf5-extraction USE_HDF5_EXTRACTION
                        Whether to extract images into an hdf5 dataset

interactive_rank

Does human evaluation on a task with label_candidates.

The human evaluator can exit with Ctrl+C, at which point metrics will be computed and displayed.

Examples

python examples/interactive_rank.py -t babi:task10k:1 -dt valid

When prompted, enter the index of the label_candidate you think is correct. Candidates are shuffled for each example. With datatype train, examples are randomly sampled with replacement; use train:ordered to avoid repeating examples. With datatype valid or test, examples are shown in order, not shuffled.
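
For instance, to rank candidates on the training set without repeating examples, per the note above, a plausible invocation would be:

python examples/interactive_rank.py -t babi:task10k:1 -dt train:ordered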

CLI help

usage: python -m parlai.scripts.interactive_rank [-h] [-v] [-t TASK]
                                                 [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                                 [-im IMAGE_MODE]
                                                 [-nt NUMTHREADS]
                                                 [-bs BATCHSIZE]
                                                 [-bsrt BATCH_SORT]
                                                 [-clen CONTEXT_LENGTH]
                                                 [-incl INCLUDE_LABELS]
                                                 [-dp DATAPATH]
                                                 [-pyt PYTORCH_TEACHER_TASK]
                                                 [-pytd PYTORCH_TEACHER_DATASET]
                                                 [--pytorch-datapath PYTORCH_DATAPATH]
                                                 [-nw NUMWORKERS]
                                                 [--pytorch-preprocess PYTORCH_PREPROCESS]
                                                 [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                                 [--batch-sort-cache-type {pop,index,none}]
                                                 [--batch-length-range BATCH_LENGTH_RANGE]
                                                 [--shuffle SHUFFLE]
                                                 [--batch-sort-field BATCH_SORT_FIELD]
                                                 [-pyclen PYTORCH_CONTEXT_LENGTH]
                                                 [-pyincl PYTORCH_INCLUDE_LABELS]

ParlAI parser

optional arguments:
  -h, --help            show this help message and exit

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either
                        via specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example episodes.
                        (For use with the PytorchDataTeacher.)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with the
                        PytorchDataTeacher.)

build_pytorch_data

Generates a pytorch data file from the training data; for use in the PytorchDataTeacher.

Note that with our given implementation of batch act, episodes are compressed such that each episode is one example for a model.

One can set the --context-length flag to specify how many past utterances are used in a flattened episode.
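
For example, a plausible (illustrative) invocation that builds pytorch data for a bAbI task, assuming the task is given via the -pyt flag documented below and using an arbitrary context length, would be:

python -m parlai.scripts.build_pytorch_data -pyt babi:task10k:1 --context-length 4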

CLI help

usage: python -m parlai.scripts.build_pytorch_data [-h] [-v] [-t TASK]
                                                   [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                                   [-im IMAGE_MODE]
                                                   [-nt NUMTHREADS]
                                                   [-bs BATCHSIZE]
                                                   [-bsrt BATCH_SORT]
                                                   [-clen CONTEXT_LENGTH]
                                                   [-incl INCLUDE_LABELS]
                                                   [-dp DATAPATH]
                                                   [-pyt PYTORCH_TEACHER_TASK]
                                                   [-pytd PYTORCH_TEACHER_DATASET]
                                                   [--pytorch-datapath PYTORCH_DATAPATH]
                                                   [-nw NUMWORKERS]
                                                   [--pytorch-preprocess PYTORCH_PREPROCESS]
                                                   [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                                   [--batch-sort-cache-type {pop,index,none}]
                                                   [--batch-length-range BATCH_LENGTH_RANGE]
                                                   [--shuffle SHUFFLE]
                                                   [--batch-sort-field BATCH_SORT_FIELD]
                                                   [-pyclen PYTORCH_CONTEXT_LENGTH]
                                                   [-pyincl PYTORCH_INCLUDE_LABELS]
                                                   [-m MODEL] [-mf MODEL_FILE]

Builds a pytorch data file.

optional arguments:
  -h, --help            show this help message and exit

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either
                        via specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example episodes.
                        (For use with the PytorchDataTeacher.)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with the
                        PytorchDataTeacher.)

ParlAI Model Arguments:
  -m MODEL, --model MODEL
                        the model class name. can match parlai/agents/<model>
                        for agents in that directory, or can provide a fully
                        specified module for `from X import Y` via `-m X:Y`
                        (e.g. `-m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent`)
  -mf MODEL_FILE, --model-file MODEL_FILE
                        model file name for loading and saving models

train_model

The standard way to train a model. After training, it also computes validation and test error.

The user must provide a model (with --model) and a task (with --task or --pytorch-teacher-task).

Examples

python -m parlai.scripts.train_model -m ir_baseline -t dialog_babi:Task:1 -mf /tmp/model
python -m parlai.scripts.train_model -m seq2seq -t babi:Task10k:1 -mf '/tmp/model' -bs 32 -lr 0.5 -hs 128
python -m parlai.scripts.train_model -m drqa -t babi:Task10k:1 -mf /tmp/model -bs 10
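
The training-loop options documented below can be combined with these; for example, a plausible (illustrative, not benchmarked) invocation that validates every 300 seconds, stops after 10 validations without improvement, and checkpoints every 600 seconds would be:

python -m parlai.scripts.train_model -m seq2seq -t babi:Task10k:1 -mf /tmp/model -bs 32 -vtim 300 -vp 10 -stim 600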

CLI help

usage: python -m parlai.scripts.train_model [-h] [-v] [-t TASK]
                                            [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                            [-im IMAGE_MODE] [-nt NUMTHREADS]
                                            [-bs BATCHSIZE] [-bsrt BATCH_SORT]
                                            [-clen CONTEXT_LENGTH]
                                            [-incl INCLUDE_LABELS]
                                            [-dp DATAPATH]
                                            [-pyt PYTORCH_TEACHER_TASK]
                                            [-pytd PYTORCH_TEACHER_DATASET]
                                            [--pytorch-datapath PYTORCH_DATAPATH]
                                            [-nw NUMWORKERS]
                                            [--pytorch-preprocess PYTORCH_PREPROCESS]
                                            [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                            [--batch-sort-cache-type {pop,index,none}]
                                            [--batch-length-range BATCH_LENGTH_RANGE]
                                            [--shuffle SHUFFLE]
                                            [--batch-sort-field BATCH_SORT_FIELD]
                                            [-pyclen PYTORCH_CONTEXT_LENGTH]
                                            [-pyincl PYTORCH_INCLUDE_LABELS]
                                            [-m MODEL] [-mf MODEL_FILE]
                                            [-et EVALTASK]
                                            [--display-examples DISPLAY_EXAMPLES]
                                            [-eps NUM_EPOCHS]
                                            [-ttim MAX_TRAIN_TIME]
                                            [-vtim VALIDATION_EVERY_N_SECS]
                                            [-stim SAVE_EVERY_N_SECS]
                                            [-sval SAVE_AFTER_VALID]
                                            [-veps VALIDATION_EVERY_N_EPOCHS]
                                            [-vp VALIDATION_PATIENCE]
                                            [-vmt VALIDATION_METRIC]
                                            [-vmm {max,min}]
                                            [-tblog TENSORBOARD_LOG]
                                            [-tbtag TENSORBOARD_TAG]
                                            [-tbmetrics TENSORBOARD_METRICS]
                                            [--dict-maxexs DICT_MAXEXS]
                                            [--dict-include-valid DICT_INCLUDE_VALID]
                                            [--dict-include-test DICT_INCLUDE_TEST]
                                            [-ltim LOG_EVERY_N_SECS]
                                            [-df DICT_FILE]
                                            [--dict-minfreq DICT_MINFREQ]
                                            [--dict-maxtokens DICT_MAXTOKENS]
                                            [-tok DICT_TOKENIZER]
                                            [--dict-lower DICT_LOWER]

Train a model

optional arguments:
  -h, --help            show this help message and exit

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either
                        via specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example episodes.
                        (For use with the PytorchDataTeacher.)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with the
                        PytorchDataTeacher.)

ParlAI Model Arguments:
  -m MODEL, --model MODEL
                        the model class name. can match parlai/agents/<model>
                        for agents in that directory, or can provide a fully
                        specified module for `from X import Y` via `-m X:Y`
                        (e.g. `-m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent`)
  -mf MODEL_FILE, --model-file MODEL_FILE
                        model file name for loading and saving models

Training Loop Arguments:
  -et EVALTASK, --evaltask EVALTASK
                        task to use for valid/test (defaults to the one used
                        for training if not set)
  --display-examples DISPLAY_EXAMPLES
  -eps NUM_EPOCHS, --num-epochs NUM_EPOCHS
  -ttim MAX_TRAIN_TIME, --max-train-time MAX_TRAIN_TIME
  -vtim VALIDATION_EVERY_N_SECS, --validation-every-n-secs VALIDATION_EVERY_N_SECS
                        Validate every n seconds. Whenever the best
                        validation metric is found, saves the model to the
                        model_file path if set.
  -stim SAVE_EVERY_N_SECS, --save-every-n-secs SAVE_EVERY_N_SECS
                        Saves the model to model_file.checkpoint after every n
                        seconds (default -1, never).
  -sval SAVE_AFTER_VALID, --save-after-valid SAVE_AFTER_VALID
                        Saves the model to model_file.checkpoint after every
                        validation (default False).
  -veps VALIDATION_EVERY_N_EPOCHS, --validation-every-n-epochs VALIDATION_EVERY_N_EPOCHS
                        Validate every n epochs. Whenever the best
                        validation metric is found, saves the model to the
                        model_file path if set.
  -vp VALIDATION_PATIENCE, --validation-patience VALIDATION_PATIENCE
                        number of iterations of validation where result does
                        not improve before we stop training
  -vmt VALIDATION_METRIC, --validation-metric VALIDATION_METRIC
                        key into report table for selecting best validation
  -vmm {max,min}, --validation-metric-mode {max,min}
                        how to optimize validation metric (max or min)

Tensorboard Arguments:
  -tblog TENSORBOARD_LOG, --tensorboard-log TENSORBOARD_LOG
                        Tensorboard logging of metrics, default is False
  -tbtag TENSORBOARD_TAG, --tensorboard-tag TENSORBOARD_TAG
                        Specify all opt keys which you want to be presented
                        in the TB name
  -tbmetrics TENSORBOARD_METRICS, --tensorboard-metrics TENSORBOARD_METRICS
                        Specify metrics which you want to track; they will be
                        extracted from the report dict.

Dictionary Loop Arguments:
  --dict-maxexs DICT_MAXEXS
                        max number of examples to build dict on
  --dict-include-valid DICT_INCLUDE_VALID
                        Include validation set in dictionary building for
                        task.
  --dict-include-test DICT_INCLUDE_TEST
                        Include test set in dictionary building for task.
  -ltim LOG_EVERY_N_SECS, --log-every-n-secs LOG_EVERY_N_SECS

Dictionary Arguments:
  -df DICT_FILE, --dict-file DICT_FILE
                        path to dictionary file. defaults to [model_file].dict
                        if not set and model_file is set.
  --dict-minfreq DICT_MINFREQ
                        minimum frequency of words to include them in sorted
                        dict or minimum frequency of bpe codecs
  --dict-maxtokens DICT_MAXTOKENS
                        max number of tokens to include in dictionary or bpe
                        codecs
  -tok DICT_TOKENIZER, --dict-tokenizer DICT_TOKENIZER
                        Which tokenizer to use. Defaults to "split", which
                        splits on whitespace as well as recognizing basic
                        punctuation. Other options include nltk and spacy.
  --dict-lower DICT_LOWER
                        Whether or not to lowercase all text seen.

eval_wordstat

This helper script can be used on its own with a model file and task: the output will contain word statistics of the model outputs. The function defined here can also be used elsewhere to obtain such statistics for any agent, given the agent object (with the corresponding dict) and a sequence.

Additionally provides the function get_word_stats, which can be used in other parts of runtime code since it depends only on the agent object. For example:

from parlai.scripts.eval_wordstat import get_word_stats
reqs, cnt = get_word_stats(predictions.tolist(), self.dict)

Examples

eval_wordstat.py -mf data/model -t convai2:self --freq-bins 10,100,1000
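
To additionally dump the model's predictions to a file and report the percentage of unique responses, the -dup and -cun options documented below can be added; a plausible invocation, where the boolean-style value and the output path are illustrative assumptions, would be:

eval_wordstat.py -mf data/model -t convai2:self --freq-bins 10,100,1000 -dup /tmp/preds.txt -cun true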

CLI help

usage: python -m parlai.scripts.eval_wordstat [-h] [-v] [-t TASK]
                                              [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                              [-im IMAGE_MODE]
                                              [-nt NUMTHREADS] [-bs BATCHSIZE]
                                              [-bsrt BATCH_SORT]
                                              [-clen CONTEXT_LENGTH]
                                              [-incl INCLUDE_LABELS]
                                              [-dp DATAPATH]
                                              [-pyt PYTORCH_TEACHER_TASK]
                                              [-pytd PYTORCH_TEACHER_DATASET]
                                              [--pytorch-datapath PYTORCH_DATAPATH]
                                              [-nw NUMWORKERS]
                                              [--pytorch-preprocess PYTORCH_PREPROCESS]
                                              [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                              [--batch-sort-cache-type {pop,index,none}]
                                              [--batch-length-range BATCH_LENGTH_RANGE]
                                              [--shuffle SHUFFLE]
                                              [--batch-sort-field BATCH_SORT_FIELD]
                                              [-pyclen PYTORCH_CONTEXT_LENGTH]
                                              [-pyincl PYTORCH_INCLUDE_LABELS]
                                              [-m MODEL] [-mf MODEL_FILE]
                                              [-df DICT_FILE]
                                              [--dict-minfreq DICT_MINFREQ]
                                              [--dict-maxtokens DICT_MAXTOKENS]
                                              [-tok DICT_TOKENIZER]
                                              [--dict-lower DICT_LOWER]
                                              [-ne NUM_EXAMPLES]
                                              [-ltim LOG_EVERY_N_SECS]
                                              [-ed EXTERNAL_DICT]
                                              [-fb FREQ_BINS]
                                              [-dup DUMP_PREDICTIONS_PATH]
                                              [-cun COMPUTE_UNIQUE]
                                              [-tblog TENSORBOARD_LOG]
                                              [-tbtag TENSORBOARD_TAG]
                                              [-tbmetrics TENSORBOARD_METRICS]

compute statistics from model predictions

optional arguments:
  -h, --help            show this help message and exit
  -ne NUM_EXAMPLES, --num-examples NUM_EXAMPLES
  -ltim LOG_EVERY_N_SECS, --log-every-n-secs LOG_EVERY_N_SECS
  -ed EXTERNAL_DICT, --external-dict EXTERNAL_DICT
                        External dictionary for stat computation
  -fb FREQ_BINS, --freq-bins FREQ_BINS
                        Bins boundaries for rare words stat
  -dup DUMP_PREDICTIONS_PATH, --dump-predictions-path DUMP_PREDICTIONS_PATH
                        Dump predictions into file
  -cun COMPUTE_UNIQUE, --compute-unique COMPUTE_UNIQUE
                        Compute % of unique responses from the model

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either
                        via specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example episodes.
                        (For use with the PytorchDataTeacher.)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with the
                        PytorchDataTeacher.)

ParlAI Model Arguments:
  -m MODEL, --model MODEL
                        the model class name. can match parlai/agents/<model>
                        for agents in that directory, or can provide a fully
                        specified module for `from X import Y` via `-m X:Y`
                        (e.g. `-m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent`)
  -mf MODEL_FILE, --model-file MODEL_FILE
                        model file name for loading and saving models

Dictionary Arguments:
  -df DICT_FILE, --dict-file DICT_FILE
                        path to dictionary file. defaults to [model_file].dict
                        if not set and model_file is set.
  --dict-minfreq DICT_MINFREQ
                        minimum frequency of words to include them in sorted
                        dict or minimum frequency of bpe codecs
  --dict-maxtokens DICT_MAXTOKENS
                        max number of tokens to include in dictionary or bpe
                        codecs
  -tok DICT_TOKENIZER, --dict-tokenizer DICT_TOKENIZER
                        Which tokenizer to use. Defaults to "split", which
                        splits on whitespace as well as recognizing basic
                        punctuation. Other options include nltk and spacy.
  --dict-lower DICT_LOWER
                        Whether or not to lowercase all text seen.

Tensorboard Arguments:
  -tblog TENSORBOARD_LOG, --tensorboard-log TENSORBOARD_LOG
                        Tensorboard logging of metrics, default is False
  -tbtag TENSORBOARD_TAG, --tensorboard-tag TENSORBOARD_TAG
                        Specify all opt keys which you want to be presented
                        in the TB name
  -tbmetrics TENSORBOARD_METRICS, --tensorboard-metrics TENSORBOARD_METRICS
                        Specify metrics which you want to track; they will be
                        extracted from the report dict.

interactive

Basic script which allows local human keyboard input to talk to a trained model.

Examples

python examples/interactive.py -m drqa -mf "models:drqa/squad/model"

When prompted, enter something like: Bob is Blue.\nWhat is Bob?

Input is often model- or task-specific, but for drqa it is always context '\n' question.
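
The display options documented below can also be combined with this; for example, a plausible invocation that prettifies candidate display and treats each input as a single-turn episode (assuming boolean-style values for both flags) would be:

python examples/interactive.py -m drqa -mf "models:drqa/squad/model" --display-prettify true --single-turn true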

CLI help

usage: python -m parlai.scripts.interactive [-h] [-v] [-t TASK]
                                            [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                            [-im IMAGE_MODE] [-nt NUMTHREADS]
                                            [-bs BATCHSIZE] [-bsrt BATCH_SORT]
                                            [-clen CONTEXT_LENGTH]
                                            [-incl INCLUDE_LABELS]
                                            [-dp DATAPATH]
                                            [-pyt PYTORCH_TEACHER_TASK]
                                            [-pytd PYTORCH_TEACHER_DATASET]
                                            [--pytorch-datapath PYTORCH_DATAPATH]
                                            [-nw NUMWORKERS]
                                            [--pytorch-preprocess PYTORCH_PREPROCESS]
                                            [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                            [--batch-sort-cache-type {pop,index,none}]
                                            [--batch-length-range BATCH_LENGTH_RANGE]
                                            [--shuffle SHUFFLE]
                                            [--batch-sort-field BATCH_SORT_FIELD]
                                            [-pyclen PYTORCH_CONTEXT_LENGTH]
                                            [-pyincl PYTORCH_INCLUDE_LABELS]
                                            [-m MODEL] [-mf MODEL_FILE]
                                            [-d DISPLAY_EXAMPLES]
                                            [--display-prettify DISPLAY_PRETTIFY]
                                            [--display-ignore-fields DISPLAY_IGNORE_FIELDS]
                                            [-fixedCands LOCAL_HUMAN_CANDIDATES_FILE]
                                            [--single-turn SINGLE_TURN]

Interactive chat with a model

optional arguments:
  -h, --help            show this help message and exit
  -d DISPLAY_EXAMPLES, --display-examples DISPLAY_EXAMPLES
  --display-prettify DISPLAY_PRETTIFY
                        Set to use a prettytable when displaying examples with
                        text candidates
  --display-ignore-fields DISPLAY_IGNORE_FIELDS
                        Do not display these fields

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher.** Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either via
                        specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example
                        episodes. (For use with PytorchDataTeacher)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with
                        PytorchDataTeacher)

ParlAI Model Arguments:
  -m MODEL, --model MODEL
                        the model class name. can match parlai/agents/<model>
                        for agents in that directory, or can provide a fully
                        specified module for `from X import Y` via `-m X:Y`
                        (e.g. `-m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent`)
  -mf MODEL_FILE, --model-file MODEL_FILE
                        model file name for loading and saving models

Local Human Arguments:
  -fixedCands LOCAL_HUMAN_CANDIDATES_FILE, --local-human-candidates-file LOCAL_HUMAN_CANDIDATES_FILE
                        File of label_candidates to send to other agent
  --single-turn SINGLE_TURN
                        If on, assumes single turn episodes.

verify_data

Verify data doesn’t have basic mistakes, like empty text fields or empty label candidates.

Examples

python parlai/scripts/verify_data.py -t convai2 -dt train:ordered
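
The same check can be run on the validation split; for instance (an illustrative variant of the command above, using the -dt values documented in the CLI help below):

python parlai/scripts/verify_data.py -t convai2 -dt valid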

CLI help

usage: python -m parlai.scripts.verify_data [-h] [-v] [-t TASK]
                                            [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                            [-im IMAGE_MODE] [-nt NUMTHREADS]
                                            [-bs BATCHSIZE] [-bsrt BATCH_SORT]
                                            [-clen CONTEXT_LENGTH]
                                            [-incl INCLUDE_LABELS]
                                            [-dp DATAPATH]
                                            [-pyt PYTORCH_TEACHER_TASK]
                                            [-pytd PYTORCH_TEACHER_DATASET]
                                            [--pytorch-datapath PYTORCH_DATAPATH]
                                            [-nw NUMWORKERS]
                                            [--pytorch-preprocess PYTORCH_PREPROCESS]
                                            [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                            [--batch-sort-cache-type {pop,index,none}]
                                            [--batch-length-range BATCH_LENGTH_RANGE]
                                            [--shuffle SHUFFLE]
                                            [--batch-sort-field BATCH_SORT_FIELD]
                                            [-pyclen PYTORCH_CONTEXT_LENGTH]
                                            [-pyincl PYTORCH_INCLUDE_LABELS]
                                            [-m MODEL] [-mf MODEL_FILE]
                                            [-ltim LOG_EVERY_N_SECS]
                                            [-d DISPLAY_EXAMPLES]

Lint for ParlAI tasks

optional arguments:
  -h, --help            show this help message and exit
  -ltim LOG_EVERY_N_SECS, --log-every-n-secs LOG_EVERY_N_SECS
  -d DISPLAY_EXAMPLES, --display-examples DISPLAY_EXAMPLES

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either via
                        specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example
                        episodes. (For use with PytorchDataTeacher)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with
                        PytorchDataTeacher)

ParlAI Model Arguments:
  -m MODEL, --model MODEL
                        the model class name. can match parlai/agents/<model>
                        for agents in that directory, or can provide a fully
                        specified module for `from X import Y` via `-m X:Y`
                        (e.g. `-m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent`)
  -mf MODEL_FILE, --model-file MODEL_FILE
                        model file name for loading and saving models

build_dict

Generates a dictionary file from the training data.

Examples

# learn the vocabulary from one task, then train on another task.
python -m parlai.scripts.build_dict -t convai2 --dict-file premade.dict
python -m parlai.scripts.train_model -t squad --dict-file premade.dict -m seq2seq
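
The dictionary options listed in the CLI help below can be added to the same command; for instance, a sketch that lowercases text and caps the vocabulary size (the flag values are illustrative, not recommendations):

python -m parlai.scripts.build_dict -t convai2 --dict-file premade.dict --dict-lower True --dict-maxtokens 30000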

CLI help

usage: python -m parlai.scripts.build_dict [-h] [-v] [-t TASK]
                                           [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                           [-im IMAGE_MODE] [-nt NUMTHREADS]
                                           [-bs BATCHSIZE] [-bsrt BATCH_SORT]
                                           [-clen CONTEXT_LENGTH]
                                           [-incl INCLUDE_LABELS]
                                           [-dp DATAPATH]
                                           [-pyt PYTORCH_TEACHER_TASK]
                                           [-pytd PYTORCH_TEACHER_DATASET]
                                           [--pytorch-datapath PYTORCH_DATAPATH]
                                           [-nw NUMWORKERS]
                                           [--pytorch-preprocess PYTORCH_PREPROCESS]
                                           [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                           [--batch-sort-cache-type {pop,index,none}]
                                           [--batch-length-range BATCH_LENGTH_RANGE]
                                           [--shuffle SHUFFLE]
                                           [--batch-sort-field BATCH_SORT_FIELD]
                                           [-pyclen PYTORCH_CONTEXT_LENGTH]
                                           [-pyincl PYTORCH_INCLUDE_LABELS]
                                           [-m MODEL] [-mf MODEL_FILE]
                                           [--dict-maxexs DICT_MAXEXS]
                                           [--dict-include-valid DICT_INCLUDE_VALID]
                                           [--dict-include-test DICT_INCLUDE_TEST]
                                           [-ltim LOG_EVERY_N_SECS]
                                           [-df DICT_FILE]
                                           [--dict-minfreq DICT_MINFREQ]
                                           [--dict-maxtokens DICT_MAXTOKENS]
                                           [-tok DICT_TOKENIZER]
                                           [--dict-lower DICT_LOWER]

Build a dictionary.

optional arguments:
  -h, --help            show this help message and exit

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either via
                        specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example
                        episodes. (For use with PytorchDataTeacher)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with
                        PytorchDataTeacher)

ParlAI Model Arguments:
  -m MODEL, --model MODEL
                        the model class name. can match parlai/agents/<model>
                        for agents in that directory, or can provide a fully
                        specified module for `from X import Y` via `-m X:Y`
                        (e.g. `-m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent`)
  -mf MODEL_FILE, --model-file MODEL_FILE
                        model file name for loading and saving models

Dictionary Loop Arguments:
  --dict-maxexs DICT_MAXEXS
                        max number of examples to build dict on
  --dict-include-valid DICT_INCLUDE_VALID
                        Include validation set in dictionary building for
                        task.
  --dict-include-test DICT_INCLUDE_TEST
                        Include test set in dictionary building for task.
  -ltim LOG_EVERY_N_SECS, --log-every-n-secs LOG_EVERY_N_SECS

Dictionary Arguments:
  -df DICT_FILE, --dict-file DICT_FILE
                        path to dictionary file. defaults to [model_file].dict
                        if not set and model_file is set.
  --dict-minfreq DICT_MINFREQ
                        minimum frequency of words to include them in sorted
                        dict or minimum frequency of bpe codecs
  --dict-maxtokens DICT_MAXTOKENS
                        max number of tokens to include in dictionary or bpe
                        codecs
  -tok DICT_TOKENIZER, --dict-tokenizer DICT_TOKENIZER
                        Which tokenizer to use. Defaults to "split", which
                        splits on whitespace as well as recognizing basic
                        punctuation. Other options include nltk and spacy.
  --dict-lower DICT_LOWER
                        Whether or not to lowercase all text seen.

display_model

Basic example which iterates through the tasks specified and runs the given model on them.

Examples

python examples/display_model.py -t babi:task1k:1 -m "repeat_label"
python examples/display_model.py -t "#MovieDD-Reddit" -m "ir_baseline" -mp "-lp 0.5" -dt test

CLI help

usage: python -m parlai.scripts.display_model [-h] [-v] [-t TASK]
                                              [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                              [-im IMAGE_MODE]
                                              [-nt NUMTHREADS] [-bs BATCHSIZE]
                                              [-bsrt BATCH_SORT]
                                              [-clen CONTEXT_LENGTH]
                                              [-incl INCLUDE_LABELS]
                                              [-dp DATAPATH]
                                              [-pyt PYTORCH_TEACHER_TASK]
                                              [-pytd PYTORCH_TEACHER_DATASET]
                                              [--pytorch-datapath PYTORCH_DATAPATH]
                                              [-nw NUMWORKERS]
                                              [--pytorch-preprocess PYTORCH_PREPROCESS]
                                              [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                              [--batch-sort-cache-type {pop,index,none}]
                                              [--batch-length-range BATCH_LENGTH_RANGE]
                                              [--shuffle SHUFFLE]
                                              [--batch-sort-field BATCH_SORT_FIELD]
                                              [-pyclen PYTORCH_CONTEXT_LENGTH]
                                              [-pyincl PYTORCH_INCLUDE_LABELS]
                                              [-m MODEL] [-mf MODEL_FILE]
                                              [-n NUM_EXAMPLES]
                                              [--display-ignore-fields DISPLAY_IGNORE_FIELDS]

Display model predictions.

optional arguments:
  -h, --help            show this help message and exit
  -n NUM_EXAMPLES, --num-examples NUM_EXAMPLES
  --display-ignore-fields DISPLAY_IGNORE_FIELDS

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either via
                        specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example
                        episodes. (For use with PytorchDataTeacher)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with
                        PytorchDataTeacher)

ParlAI Model Arguments:
  -m MODEL, --model MODEL
                        the model class name. can match parlai/agents/<model>
                        for agents in that directory, or can provide a fully
                        specified module for `from X import Y` via `-m X:Y`
                        (e.g. `-m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent`)
  -mf MODEL_FILE, --model-file MODEL_FILE
                        model file name for loading and saving models

display_data

Basic example which iterates through the tasks specified and prints them out. Used for verification of data loading and iteration.

For example, to make sure that bAbI task 1 (1k exs) loads correctly and to see a few of its examples, one can run:

Examples

python display_data.py -t babi:task1k:1
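
The number of examples shown and the maximum display length can be adjusted with the flags documented below, e.g. (illustrative values):

python display_data.py -t babi:task1k:1 -ne 10 -mdl 200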

CLI help

usage: python -m parlai.scripts.display_data [-h] [-v] [-t TASK]
                                             [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                             [-im IMAGE_MODE] [-nt NUMTHREADS]
                                             [-bs BATCHSIZE]
                                             [-bsrt BATCH_SORT]
                                             [-clen CONTEXT_LENGTH]
                                             [-incl INCLUDE_LABELS]
                                             [-dp DATAPATH]
                                             [-pyt PYTORCH_TEACHER_TASK]
                                             [-pytd PYTORCH_TEACHER_DATASET]
                                             [--pytorch-datapath PYTORCH_DATAPATH]
                                             [-nw NUMWORKERS]
                                             [--pytorch-preprocess PYTORCH_PREPROCESS]
                                             [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                             [--batch-sort-cache-type {pop,index,none}]
                                             [--batch-length-range BATCH_LENGTH_RANGE]
                                             [--shuffle SHUFFLE]
                                             [--batch-sort-field BATCH_SORT_FIELD]
                                             [-pyclen PYTORCH_CONTEXT_LENGTH]
                                             [-pyincl PYTORCH_INCLUDE_LABELS]
                                             [-m MODEL] [-mf MODEL_FILE]
                                             [-ne NUM_EXAMPLES]
                                             [-mdl MAX_DISPLAY_LEN]
                                             [--display-ignore-fields DISPLAY_IGNORE_FIELDS]

Display data from a task

optional arguments:
  -h, --help            show this help message and exit
  -ne NUM_EXAMPLES, --num-examples NUM_EXAMPLES
  -mdl MAX_DISPLAY_LEN, --max-display-len MAX_DISPLAY_LEN
  --display-ignore-fields DISPLAY_IGNORE_FIELDS

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either via
                        specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example
                        episodes. (For use with PytorchDataTeacher)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with
                        PytorchDataTeacher)

ParlAI Model Arguments:
  -m MODEL, --model MODEL
                        the model class name. can match parlai/agents/<model>
                        for agents in that directory, or can provide a fully
                        specified module for `from X import Y` via `-m X:Y`
                        (e.g. `-m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent`)
  -mf MODEL_FILE, --model-file MODEL_FILE
                        model file name for loading and saving models

detect_offensive_language

Basic example which iterates through the tasks specified and checks them for offensive language.

Examples

python -m parlai.scripts.detect_offensive_language -t "convai_chitchat" --display-examples True
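
As with the other scripts, a specific split can be selected with -dt and the logging frequency tuned with -ltim; for example (illustrative values):

python -m parlai.scripts.detect_offensive_language -t "convai_chitchat" -dt valid -ltim 10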

CLI help

usage: python -m parlai.scripts.detect_offensive_language [-h] [-v] [-t TASK]
                                                          [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                                          [-im IMAGE_MODE]
                                                          [-nt NUMTHREADS]
                                                          [-bs BATCHSIZE]
                                                          [-bsrt BATCH_SORT]
                                                          [-clen CONTEXT_LENGTH]
                                                          [-incl INCLUDE_LABELS]
                                                          [-dp DATAPATH]
                                                          [-pyt PYTORCH_TEACHER_TASK]
                                                          [-pytd PYTORCH_TEACHER_DATASET]
                                                          [--pytorch-datapath PYTORCH_DATAPATH]
                                                          [-nw NUMWORKERS]
                                                          [--pytorch-preprocess PYTORCH_PREPROCESS]
                                                          [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                                          [--batch-sort-cache-type {pop,index,none}]
                                                          [--batch-length-range BATCH_LENGTH_RANGE]
                                                          [--shuffle SHUFFLE]
                                                          [--batch-sort-field BATCH_SORT_FIELD]
                                                          [-pyclen PYTORCH_CONTEXT_LENGTH]
                                                          [-pyincl PYTORCH_INCLUDE_LABELS]
                                                          [-m MODEL]
                                                          [-mf MODEL_FILE]
                                                          [-ltim LOG_EVERY_N_SECS]
                                                          [-d DISPLAY_EXAMPLES]

Check task for offensive language

optional arguments:
  -h, --help            show this help message and exit
  -ltim LOG_EVERY_N_SECS, --log-every-n-secs LOG_EVERY_N_SECS
  -d DISPLAY_EXAMPLES, --display-examples DISPLAY_EXAMPLES

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either via
                        specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example
                        episodes. (For use with PytorchDataTeacher)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with
                        PytorchDataTeacher)

ParlAI Model Arguments:
  -m MODEL, --model MODEL
                        the model class name. can match parlai/agents/<model>
                        for agents in that directory, or can provide a fully
                        specified module for `from X import Y` via `-m X:Y`
                        (e.g. `-m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent`)
  -mf MODEL_FILE, --model-file MODEL_FILE
                        model file name for loading and saving models

eval_ppl

Base script for model-agnostic perplexity evaluation.

While resistant to choices of model-added tokens like START and END, this requires fixing a specific vocabulary. Be sure to use the same build_dict parameters for all comparisons.

Tokens which are present in the data being evaluated but not in the vocabulary do not contribute to the perplexity score, but they are still sent to the model so the model can update its state. If a token is in the vocabulary but the model assigns it a probability of zero, the model will get a perplexity score of inf.

This requires agents to implement the following function:

def next_word_probability(self, partial_out):

Return a probability distribution over the next word, given a partial true output. This is used to calculate the per-word perplexity.

Arguments: partial_out – list of previous “true” words

Returns a dict, where each key is a word and each value is a probability score for that word. Unset keys assume a probability of zero.

e.g. (previous observation: {'text': 'Run test program.'})
[] => {'hello': 1.0}
['hello'] => {'world': 1.0}
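
As a concrete illustration, below is a minimal sketch of an agent exposing this interface. It assumes the standard parlai.core.agents.Agent base class; the class name and the fixed two-word vocabulary are made up for the example.

from parlai.core.agents import Agent

class ToyPPLAgent(Agent):
    """Toy agent for eval_ppl: spreads probability uniformly over a tiny, fixed word list."""

    def __init__(self, opt, shared=None):
        super().__init__(opt, shared)
        self.id = 'ToyPPLAgent'
        self.words = ['hello', 'world']  # hypothetical vocabulary, for illustration only

    def next_word_probability(self, partial_out):
        # partial_out is the list of previous "true" words in the current label;
        # return a dict mapping candidate next words to probabilities.
        # Words missing from the dict are treated as having probability zero.
        prob = 1.0 / len(self.words)
        return {word: prob for word in self.words}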

CLI help

usage: python -m parlai.scripts.eval_ppl [-h] [-v] [-t TASK]
                                         [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                         [-im IMAGE_MODE] [-nt NUMTHREADS]
                                         [-bs BATCHSIZE] [-bsrt BATCH_SORT]
                                         [-clen CONTEXT_LENGTH]
                                         [-incl INCLUDE_LABELS] [-dp DATAPATH]
                                         [-pyt PYTORCH_TEACHER_TASK]
                                         [-pytd PYTORCH_TEACHER_DATASET]
                                         [--pytorch-datapath PYTORCH_DATAPATH]
                                         [-nw NUMWORKERS]
                                         [--pytorch-preprocess PYTORCH_PREPROCESS]
                                         [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                         [--batch-sort-cache-type {pop,index,none}]
                                         [--batch-length-range BATCH_LENGTH_RANGE]
                                         [--shuffle SHUFFLE]
                                         [--batch-sort-field BATCH_SORT_FIELD]
                                         [-pyclen PYTORCH_CONTEXT_LENGTH]
                                         [-pyincl PYTORCH_INCLUDE_LABELS]
                                         [-m MODEL] [-mf MODEL_FILE]

Evaluate perplexity

optional arguments:
  -h, --help            show this help message and exit

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either via
                        specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example
                        episodes. (For use with PytorchDataTeacher)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with
                        PytorchDataTeacher)

ParlAI Model Arguments:
  -m MODEL, --model MODEL
                        the model class name. can match parlai/agents/<model>
                        for agents in that directory, or can provide a fully
                        specified module for `from X import Y` via `-m X:Y`
                        (e.g. `-m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent`)
  -mf MODEL_FILE, --model-file MODEL_FILE
                        model file name for loading and saving models

eval_model

Basic example which iterates through the tasks specified and evaluates the given model on them.

Examples

python eval_model.py -t "babi:Task1k:2" -m "repeat_label"
python eval_model.py -t "#CornellMovie" -m "ir_baseline" -mp "-lp 0.5"

CLI help

usage: python -m parlai.scripts.eval_model [-h] [-v] [-t TASK]
                                           [-dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}]
                                           [-im IMAGE_MODE] [-nt NUMTHREADS]
                                           [-bs BATCHSIZE] [-bsrt BATCH_SORT]
                                           [-clen CONTEXT_LENGTH]
                                           [-incl INCLUDE_LABELS]
                                           [-dp DATAPATH]
                                           [-pyt PYTORCH_TEACHER_TASK]
                                           [-pytd PYTORCH_TEACHER_DATASET]
                                           [--pytorch-datapath PYTORCH_DATAPATH]
                                           [-nw NUMWORKERS]
                                           [--pytorch-preprocess PYTORCH_PREPROCESS]
                                           [-pybsrt PYTORCH_TEACHER_BATCH_SORT]
                                           [--batch-sort-cache-type {pop,index,none}]
                                           [--batch-length-range BATCH_LENGTH_RANGE]
                                           [--shuffle SHUFFLE]
                                           [--batch-sort-field BATCH_SORT_FIELD]
                                           [-pyclen PYTORCH_CONTEXT_LENGTH]
                                           [-pyincl PYTORCH_INCLUDE_LABELS]
                                           [-m MODEL] [-mf MODEL_FILE]
                                           [-ne NUM_EXAMPLES]
                                           [-d DISPLAY_EXAMPLES]
                                           [-ltim LOG_EVERY_N_SECS]
                                           [--metrics METRICS]
                                           [-tblog TENSORBOARD_LOG]
                                           [-tbtag TENSORBOARD_TAG]
                                           [-tbmetrics TENSORBOARD_METRICS]

Evaluate a model

optional arguments:
  -h, --help            show this help message and exit
  -ne NUM_EXAMPLES, --num-examples NUM_EXAMPLES
  -d DISPLAY_EXAMPLES, --display-examples DISPLAY_EXAMPLES
  -ltim LOG_EVERY_N_SECS, --log-every-n-secs LOG_EVERY_N_SECS
  --metrics METRICS     list of metrics to show/compute, e.g.
                        ppl,f1,accuracy,hits@1. If 'all' is specified
                        [default] all are shown.

Main ParlAI Arguments:
  -v, --show-advanced-args
                        Show hidden command line options (advanced users only)
  -t TASK, --task TASK  ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"
  -dt {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}, --datatype {train,train:stream,train:ordered,train:ordered:stream,train:stream:ordered,train:evalmode,train:evalmode:stream,train:evalmode:ordered,train:evalmode:ordered:stream,train:evalmode:stream:ordered,valid,valid:stream,test,test:stream}
                        choose from: train, train:ordered, valid, test. to
                        stream data add ":stream" to any option (e.g.,
                        train:stream). by default: train is random with
                        replacement, valid is ordered, test is ordered.
  -im IMAGE_MODE, --image-mode IMAGE_MODE
                        image preprocessor to use. default is "raw". set to
                        "none" to skip image loading.
  -nt NUMTHREADS, --numthreads NUMTHREADS
                        number of threads. If batchsize set to 1, used for
                        hogwild; otherwise, used for number of threads in
                        threadpool loading, e.g. in vqa
  -dp DATAPATH, --datapath DATAPATH
                        path to datasets, defaults to {parlai_dir}/data

Batching Arguments:
  -bs BATCHSIZE, --batchsize BATCHSIZE
                        batch size for minibatch training schemes
  -bsrt BATCH_SORT, --batch-sort BATCH_SORT
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. If enabled (default
                        False), create batches by flattening all episodes to
                        have exactly one utterance exchange and then sorting
                        all the examples according to their length. This
                        dramatically reduces the amount of padding present
                        after examples have been parsed, speeding up training.
  -clen CONTEXT_LENGTH, --context-length CONTEXT_LENGTH
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Number of past
                        utterances to remember when building flattened batches
                        of data in multi-example episodes.
  -incl INCLUDE_LABELS, --include-labels INCLUDE_LABELS
                        **NOTE: This is deprecated; if you would like to make
                        use of batch sort functionality, please use -pybsrt
                        with the PytorchDataTeacher**. Specifies whether or not
                        to include labels as past utterances when building
                        flattened batches of data in multi-example episodes.

PytorchData Arguments:
  -pyt PYTORCH_TEACHER_TASK, --pytorch-teacher-task PYTORCH_TEACHER_TASK
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a standard ParlAI
                        task, e.g. "babi:Task1k"
  -pytd PYTORCH_TEACHER_DATASET, --pytorch-teacher-dataset PYTORCH_TEACHER_DATASET
                        Specify to use the PytorchDataTeacher for
                        multiprocessed data loading with a pytorch Dataset,
                        e.g. "vqa_1" or "flickr30k"
  --pytorch-datapath PYTORCH_DATAPATH
                        datapath for pytorch data loader (note: only specify
                        if the data does not reside in the normal ParlAI
                        datapath)
  -nw NUMWORKERS, --numworkers NUMWORKERS
                        how many workers the Pytorch dataloader should use
  --pytorch-preprocess PYTORCH_PREPROCESS
                        Whether the agent should preprocess the data while
                        building the pytorch data
  -pybsrt PYTORCH_TEACHER_BATCH_SORT, --pytorch-teacher-batch-sort PYTORCH_TEACHER_BATCH_SORT
                        Whether to construct batches of similarly sized
                        episodes when using the PytorchDataTeacher (either via
                        specifying `-pyt` or `-pytd`)
  --batch-sort-cache-type {pop,index,none}
                        how to build up the batch cache
  --batch-length-range BATCH_LENGTH_RANGE
                        degree of variation of size allowed in batch
  --shuffle SHUFFLE     Whether to shuffle the data
  --batch-sort-field BATCH_SORT_FIELD
                        What field to use when determining the length of an
                        episode
  -pyclen PYTORCH_CONTEXT_LENGTH, --pytorch-context-length PYTORCH_CONTEXT_LENGTH
                        Number of past utterances to remember when building
                        flattened batches of data in multi-example episodes.
                        (For use with PytorchDataTeacher)
  -pyincl PYTORCH_INCLUDE_LABELS, --pytorch-include-labels PYTORCH_INCLUDE_LABELS
                        Specifies whether or not to include labels as past
                        utterances when building flattened batches of data in
                        multi-example episodes. (For use with
                        PytorchDataTeacher)

ParlAI Model Arguments:
  -m MODEL, --model MODEL
                        the model class name. can match parlai/agents/<model>
                        for agents in that directory, or can provide a fully
                        specified module for `from X import Y` via `-m X:Y`
                        (e.g. `-m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent`)
  -mf MODEL_FILE, --model-file MODEL_FILE
                        model file name for loading and saving models

Tensorboard Arguments:
  -tblog TENSORBOARD_LOG, --tensorboard-log TENSORBOARD_LOG
                        Tensorboard logging of metrics, default is False
  -tbtag TENSORBOARD_TAG, --tensorboard-tag TENSORBOARD_TAG
                        Specify all opt keys which you want to be presented in
                        the TB name
  -tbmetrics TENSORBOARD_METRICS, --tensorboard-metrics TENSORBOARD_METRICS
                        Specify the metrics which you want to track; they will
                        be extracted from the report dict.
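
As an illustrative, untested example (the model file path is a placeholder), Tensorboard logging during evaluation might be enabled and restricted to a couple of the metrics listed above as follows:

python examples/eval_model.py -t "babi:Task1k:2" -mf /path/to/model_file -tblog True -tbmetrics ppl,f1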