Torch Agent implements much of the boilerplate necessary for creating a neural dialogue agent, so you can focus on modeling. Torch Agent limits its functionality to maintaining dialogue history, transforming text into vectors of indices, and loading/saving models. The user is required to implement their own logic in methods like train_step and eval_step.

Torch Ranker Agent and Torch Generator Agent have more specialized stub methods, and provide many rich features and benefits. Torch Ranker Agent assumes your model ranks possible responses from a set of possible candidates, and provides options around negative sampling, candidate sampling, and large-scale candidate prediction. Torch Generator Agent assumes your model generates utterances auto-regressively, and provides generic implementations of beam search.

Torch Agent

General utility code for building PyTorch-based agents in ParlAI.

Contains the following main utilities:

  • TorchAgent class which serves as a useful parent class for other model agents
  • Batch namedtuple which is the input type of the main abstract methods of the TorchAgent class
  • Output namedtuple which is the expected output type of the main abstract methods of the TorchAgent class

See below for documentation on each specific tool.

class parlai.core.torch_agent.Batch

Bases: tuple

Batch is a namedtuple containing data being sent to an agent.

This is the input type of the train_step and eval_step functions. Agents can override the batchify function to return an extended namedtuple with additional fields if they would like, though we recommend calling the parent function to set up these fields as a base.


bsz x seqlen tensor containing the parsed text data.


list of length bsz containing the lengths of the text in the same order as text_vec; necessary for pack_padded_sequence.


bsz x seqlen tensor containing the parsed label (one per batch row).


list of length bsz containing the lengths of the labels in same order as label_vec.


list of length bsz containing the selected label for each batch row (some datasets have multiple labels per input example).


list of length bsz containing the original indices of each example in the batch. We use these to map predictions back to their proper rows, since e.g. we may sort examples by length or some examples may be invalid.


list of lists of text. Outer list has size bsz; inner lists vary in size based on the number of candidates for each row in the batch.


list of lists of tensors. Outer list has size bsz; inner lists vary in size based on the number of candidates for each row in the batch.


list of image features in the format specified by the --image-mode arg.


list of lists of tensors. Outer list has size bsz; inner lists vary based on the number of memories for each row in the batch. These memories are generated by splitting the input text on newlines, with the last line put in the text field and the remaining lines put in this one.


the original observations in the batched order

class parlai.core.torch_agent.Output

Bases: tuple

Output is a namedtuple containing agent predictions.

This is the expected return type of the train_step and eval_step functions, though agents can choose to return None if they do not want to answer.


list of strings of length bsz containing the predictions of the model


list of lists of length bsz containing ranked predictions of the model. Each sub-list is an ordered ranking of strings of variable length.
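Since Output is a plain namedtuple, a minimal eval_step simply builds one and returns it. The sketch below mirrors the two fields documented above; the model "logic" (upper-casing the input) is purely illustrative, and in ParlAI you would import Output rather than redefine it:

```python
from collections import namedtuple

# Local stand-in for parlai.core.torch_agent.Output; in ParlAI you would
# import the real namedtuple rather than redefining it.
Output = namedtuple('Output', ['text', 'text_candidates'])

def eval_step_sketch(batch_texts):
    """Toy eval_step: 'predict' by echoing each input in upper case."""
    preds = [t.upper() for t in batch_texts]   # one string per batch row
    ranked = [[p] for p in preds]              # one ranked list per batch row
    return Output(text=preds, text_candidates=ranked)

out = eval_step_sketch(['hello', 'world'])
```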

class parlai.core.torch_agent.TorchAgent(opt, shared=None)

Bases: parlai.core.agents.Agent

A provided base agent for any model that wants to use Torch.

Exists to make it easier to implement a new agent. Not necessary, but reduces duplicated code.

Many methods are intended either to be used as-is when the default is acceptable, or to be overridden and called with super(), with the extra functionality added to the initial result. See each method’s comment for recommended behavior.

This agent serves as a common framework for all ParlAI models which want to use PyTorch.

__init__(opt, shared=None)

Initialize agent.


Call batch_act with the singleton batch.

classmethod add_cmdline_args(argparser)

Add the default commandline args we expect most agents to want.


Process a batch of observations (batchsize list of message dicts).

These observations have been preprocessed by the observe method.

Subclasses can override this for special functionality, but if the default behaviors are fine then just override the train_step and eval_step methods instead. The former is called when labels are present in the observations batch; otherwise, the latter is called.

batchify(obs_batch, sort=False, is_valid=<function TorchAgent.<lambda>>)

Create a batch of valid observations from an unchecked batch.

A valid observation is one that passes the lambda provided to the function, which defaults to checking that the preprocessed ‘text_vec’ field is present; that field would have been set by this agent’s ‘vectorize’ function.

Returns a namedtuple Batch. See original definition above for in-depth explanation of each field.

If you want to include additional fields in the batch, you can subclass this function and return your own “Batch” namedtuple: copy the Batch namedtuple at the top of this class, and then add whatever additional fields you want to be able to access. You can then call super().batchify(…) to set up the original fields, set up the additional fields in your subclass, and return that batch instead.

  • obs_batch – List of vectorized observations
  • sort – Default False, orders the observations by length of vectors. Set to true when using torch.nn.utils.rnn.pack_padded_sequence. Uses the text vectors if available, otherwise uses the label vectors if available.
  • is_valid – Function that determines whether an observation is valid; the default checks that ‘text_vec’ is in the observation
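The extended-Batch pattern described above can be sketched with a plain namedtuple. The topic_vec field is purely illustrative, and a real subclass would call super().batchify(obs_batch) for the base fields rather than rebuilding them:

```python
from collections import namedtuple

# A cut-down stand-in for parlai.core.torch_agent.Batch with one extra
# field ('topic_vec' is purely illustrative) appended at the end.
MyBatch = namedtuple('MyBatch', ['text_vec', 'labels', 'topic_vec'])

def batchify_sketch(obs_batch):
    """Build the base fields, then fill in the extra one."""
    text_vec = [obs['text_vec'] for obs in obs_batch]
    labels = [obs.get('labels') for obs in obs_batch]
    # In a real subclass you would call super().batchify(obs_batch) for
    # the base fields and only compute the additions yourself.
    topic_vec = [obs.get('topic_vec') for obs in obs_batch]
    return MyBatch(text_vec=text_vec, labels=labels, topic_vec=topic_vec)

b = batchify_sketch([{'text_vec': [1, 2], 'labels': 'hi', 'topic_vec': [7]}])
```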
static dictionary_class()

Return the dictionary class that this agent expects to use.

Can be overridden if a more complex dictionary is required.


Process one batch but do not train on it.

get_dialog_history(observation, reply=None, add_person_tokens=False, add_p1_after_newln=False)

Retrieve dialog history and add current observations to it.

  • observation – current observation
  • reply – past utterance from the model to add to the history, such as the past label or response generated by the model.
  • add_person_tokens – add tokens identifying each speaker before their utterances in the text & history.
  • add_p1_after_newln – add the other speaker token before the last newline in the input instead of at the beginning. This is useful for tasks that include some kind of context before the actual utterance (e.g. squad, babi, personachat).

observation with text replaced with full dialog
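A rough sketch of how person tokens change the assembled history. The `__p1__`/`__p2__` token spellings are an assumption for illustration, not taken from this page:

```python
def build_history_sketch(turns, add_person_tokens=False):
    """Join alternating utterances into one newline-separated history.

    `turns` alternates between the other speaker and the model, starting
    with the other speaker, mirroring how TorchAgent accumulates text.
    """
    tokens = ('__p1__', '__p2__')  # assumed speaker-token spellings
    lines = []
    for i, utt in enumerate(turns):
        if add_person_tokens:
            lines.append('{} {}'.format(tokens[i % 2], utt))
        else:
            lines.append(utt)
    return '\n'.join(lines)

hist = build_history_sketch(['hi there', 'hello!'], add_person_tokens=True)
```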

init_optim(params, optim_states=None, saved_optim_type=None)

Initialize optimizer with model parameters.

  • params – parameters from the model, for example: [p for p in model.parameters() if p.requires_grad]
  • optim_states – optional argument providing states of optimizer to load

  • saved_optim_type – type of optimizer being loaded; if it differs, loading of the optimizer states is skipped


Retrieve the last reply from the model.

If available, we use the true label instead of the model’s prediction.

By default, batch_act stores the batch of replies and this method will extract the reply of the current instance from the batch.

Parameters:use_label – default true, use the label when available instead of the model’s generated response.

Return opt and model states.

Override this method for more specific loading.

match_batch(batch_reply, valid_inds, output=None)

Match sub-batch of predictions to the original batch indices.

Batches may be only partially filled (i.e. when completing the remainder at the end of the validation or test set), or we may want to sort by e.g. the length of the input sequences if using pack_padded_sequence.

This matches rows back with their original row in the batch for calculating metrics like accuracy.

If output is None (model choosing not to provide any predictions), we will just return the batch of replies.

Otherwise, output should be a parlai.core.torch_agent.Output object. This is a namedtuple which can provide text predictions and/or text_candidates predictions. If you would like to map additional fields into the batch_reply, you can override this method and provide your own namedtuple with additional fields.

  • batch_reply – Full-batchsize list of message dictionaries to put responses into.
  • valid_inds – Original indices of the predictions.
  • output – Output namedtuple which contains sub-batchsize list of text outputs from model. May be None (default) if model chooses not to answer. This method will check for text and text_candidates fields.
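The index bookkeeping performed here can be sketched in a few lines (a sketch of the idea, not ParlAI's exact code):

```python
def match_batch_sketch(batch_reply, valid_inds, texts):
    """Write sub-batch predictions back into their original rows.

    batch_reply: full-batchsize list of reply dicts.
    valid_inds: original row index for each prediction in `texts`.
    texts: predictions for the valid (possibly re-sorted) sub-batch.
    """
    for pred, idx in zip(texts, valid_inds):
        batch_reply[idx]['text'] = pred
    return batch_reply

replies = [{}, {}, {}]
# Rows 2 and 0 were valid, and may have been re-sorted by length.
out = match_batch_sketch(replies, valid_inds=[2, 0], texts=['a', 'b'])
```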

Process incoming message in preparation for producing a response.

This includes remembering the past history of the conversation.


Use the metrics to decide when to adjust LR schedule.

This uses the loss as the validation metric if present; if not, this function does nothing. Note that the model must be reporting loss for this to work. Override this to change the behavior.


Get the model’s predicted reply history within this episode.

Parameters:batch – (default False) return the reply history for every row in the batch, otherwise will return just for this example.
Returns:list of lists of strings, each of the past model replies in the current episode. Will be None wherever the model did not reply.

Clear internal states.


Save model parameters to path (or default to model_file arg).

Override this method for more specific saving.


Share fields from parent as well as useful objects in this class.

Subclasses will likely want to share their model as well.


Process one batch with training labels.

vectorize(obs, add_start=True, add_end=True, truncate=None, split_lines=False)

Make vectors out of observation fields and store in the observation.

In particular, the ‘text’ and ‘labels’/’eval_labels’ fields are processed and a new field is added to the observation with the suffix ‘_vec’.

If you want to use additional fields on your subclass, you can override this function, call super().vectorize(…) to process the text and labels, and then process the other fields in your subclass.

  • obs – Single observation from observe function.
  • add_start – default True, adds the start token to each label.
  • add_end – default True, adds the end token to each label.
  • truncate – default None, if set truncates all vectors to the specified length. Note that this truncates to the rightmost for inputs and the leftmost for labels and, when applicable, candidates.
  • split_lines – If set, returns list of vectors instead of a single vector for input text, one for each substring after splitting on newlines.

the input observation, with ‘text_vec’, ‘label_vec’, and ‘cands_vec’ fields added.
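The truncation convention above (inputs keep their rightmost, most recent tokens; labels keep their leftmost) can be sketched as follows; the helper name is illustrative:

```python
def truncate_sketch(vec, truncate, truncate_left):
    """Keep the last `truncate` tokens if truncate_left, else the first."""
    if truncate is None or len(vec) <= truncate:
        return vec
    return vec[-truncate:] if truncate_left else vec[:truncate]

# Inputs keep their rightmost (most recent) tokens ...
text_vec = truncate_sketch([1, 2, 3, 4, 5], truncate=3, truncate_left=True)
# ... while labels keep their leftmost tokens.
label_vec = truncate_sketch([1, 2, 3, 4, 5], truncate=3, truncate_left=False)
```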

Torch Ranker Agent

class parlai.core.torch_ranker_agent.TorchRankerAgent(opt, shared=None)

Bases: parlai.core.torch_agent.TorchAgent

__init__(opt, shared=None)

Initialize agent.

static add_cmdline_args(argparser)

Add the default commandline args we expect most agents to want.


Build a new model (implemented by child classes).


Evaluate a single batch of examples.


Report loss and mean_rank from model’s perspective.


Reset metrics.

score_candidates(batch, cand_vecs)

Given a batch and candidate set, return scores (for ranking)


Load a set of fixed candidates and their vectors (or vectorize them here)

self.fixed_candidates will contain a [num_cands] list of strings; self.fixed_candidate_vecs will contain a [num_cands, seq_len] LongTensor.

See the note on the --fixed-candidate-vecs flag for an explanation of the ‘reuse’, ‘replace’, or path options.

Note: TorchRankerAgent by default converts candidates to vectors by vectorizing in the common sense (i.e., replacing each token with its index in the dictionary). If a child model wants to additionally perform encoding, it can overwrite the vectorize_fixed_candidates() method to produce encoded vectors instead of just vectorized ones.


Load the tokens from the vocab as candidates

self.vocab_candidates will contain a [num_cands] list of strings; self.vocab_candidate_vecs will contain a [num_cands, 1] LongTensor.


Share model parameters.


Train on a single batch of examples.


Do optim step and clip gradients if needed.


Convert a batch of candidates from text to vectors

Parameters:cands_batch – a [batchsize] list of candidates (strings)
Returns:a [num_cands] list of candidate vectors

By default, candidates are simply vectorized (tokens replaced by token ids). A child class may choose to overwrite this method to perform vectorization as well as encoding if so desired.

Torch Generator Agent

BETA: This module is in beta. Feedback is most welcome, and the API may change underneath you.

Generic Pytorch-based Generator agent. Implements quite a bit of boilerplate, including Beam search.

Contains the following utilities:

  • TorchGeneratorAgent class, which serves as a useful parent for generative torch agents.
  • Beam class which provides some generic beam functionality for classes to use
class parlai.core.torch_generator_agent.Beam(beam_size, min_length=3, padding_token=0, bos_token=1, eos_token=2, min_n_best=3, cuda='cpu', block_ngram=0)

Bases: object

Generic beam class. It keeps information about beam_size hypotheses.

__init__(beam_size, min_length=3, padding_token=0, bos_token=1, eos_token=2, min_n_best=3, cuda='cpu', block_ngram=0)

Instantiate Beam object.

  • beam_size – number of hypotheses in the beam
  • min_length – minimum length of the predicted sequence
  • padding_token – Set to 0 as usual in ParlAI
  • bos_token – Set to 1 as usual in ParlAI
  • eos_token – Set to 2 as usual in ParlAI
  • min_n_best – the beam is not marked done until at least this many hypotheses have finished (ended with EOS)
  • cuda – What device to use for computations
  • block_ngram – if nonzero, block hypotheses from repeating n-grams of this size

Advance the beam one step.


Check if self.finished is empty and add hyptail in that case.

This will be a suboptimal hypothesis, since the model never produced an EOS.


Return whether beam search is complete.

static find_ngrams(input_list, n)

Get list of ngrams with context length n-1
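A minimal version of this helper, using the standard zip-based sliding-window idiom (a sketch, not necessarily ParlAI's exact code):

```python
def find_ngrams_sketch(input_list, n):
    """Return all n-grams of input_list as a list of tuples.

    Each slice input_list[i:] is shifted by one position; zipping them
    yields every window of n consecutive elements.
    """
    return list(zip(*[input_list[i:] for i in range(n)]))

grams = find_ngrams_sketch([1, 2, 3, 4], 2)
```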


Get the backtrack at the current step.

get_beam_dot(dictionary=None, n_best=None)

Create pydot graph representation of the beam.

  • outputs – self.outputs from the beam
  • dictionary – tok 2 word dict to save words in the tree nodes

pydot graph


Extract hypothesis ending with EOS at timestep with hyp_id.

  • timestep – timestep with range up to len(self.outputs)-1
  • hyp_id – id with range up to beam_size-1

hypothesis sequence


Get the output at the current step.

static get_pretty_hypothesis(list_of_hypotails)

Return prettier version of the hypotheses.


Return finished hypotheses in rescored order.

Parameters:n_best – how many of the best hypotheses to return
Returns:list of hypotheses

Get single best hypothesis.

Returns:hypothesis sequence and the final score
class parlai.core.torch_generator_agent.PerplexityEvaluatorAgent(opt, shared=None)

Bases: parlai.core.torch_generator_agent.TorchGeneratorAgent

Subclass for doing standardized perplexity evaluation.

This is designed to be used in conjunction with the PerplexityWorld at parlai/scripts/. It uses the next_word_probability function to calculate the probability of tokens one token at a time.

__init__(opt, shared=None)

Initialize evaluator.


Return probability distribution over next words.

This probability is based on both nn input and partial true output. This is used to calculate the per-word perplexity.

  • observation – input observation dict
  • partial_out – list of previous “true” words

a dict, where each key is a word and each value is a probability score for that word. Unset keys will use a probability of 1e-7.

e.g. {‘text’: ‘Run test program.’}, [‘hello’] => {‘world’: 1.0}

class parlai.core.torch_generator_agent.TorchGeneratorAgent(opt, shared=None)

Bases: parlai.core.torch_agent.TorchAgent

Abstract Generator agent. Only meant to be extended.

TorchGeneratorAgent aims to handle much of the bookkeeping and infrastructure work for any generative model, like seq2seq or transformer. It implements train_step and eval_step. The only requirement is that your model implement the TorchGeneratorModel interface.

__init__(opt, shared=None)

Initialize agent.

classmethod add_cmdline_args(argparser)

Add the default commandline args we expect most agents to want.

Beam search given the model and Batch

This function expects to be given a TorchGeneratorModel. Please refer to that interface for information.

  • model (TorchGeneratorModel) – Implements the above interface
  • batch (Batch) – Batch structure with input and labels
  • beam_size (int) – Size of each beam during the search
  • start (int) – start of sequence token
  • end (int) – end of sequence token
  • pad (int) – padding token
  • min_length (int) – minimum length of the decoded sequence
  • min_n_best (int) – minimum number of completed hypotheses generated from each beam
  • max_ts (int) – the maximum length of the decoded sequence

tuple (beam_preds_scores, n_best_preds_scores, beams)

  • beam_preds_scores: list of (prediction, score) pairs for each sample in the Batch
  • n_best_preds_scores: list of n_best lists of (prediction, score) tuples for each sample in the Batch
  • beams: list of Beam instances defined in the Beam class; can be used for any following postprocessing, e.g. dot logging.


Construct the loss function. By default, torch.nn.CrossEntropyLoss. The criterion function should be set to self.criterion.

If overridden, this method should (1) handle calling cuda and (2) produce a sum that can be used for a per-token loss.

Construct the model. The model should be set to self.model and support the TorchGeneratorModel interface.


Evaluate a single batch of examples.


Report loss and perplexity from model’s perspective.

Note that this includes predicting __END__ and __UNK__ tokens and may differ from a truly independent measurement.


Reset metrics for reporting loss and perplexity.


Share internal states between parent and child instances.


Train on a single batch of examples.


Do one optimization step.


Zero out optimizer.

class parlai.core.torch_generator_agent.TorchGeneratorModel(padding_idx=0, start_idx=1, end_idx=2, unknown_idx=3, input_dropout=0, longest_label=1)

Bases: torch.nn.modules.module.Module

This interface expects you to implement a model with the following requirements:

Field model.encoder:
 takes input and returns a tuple (enc_out, enc_hidden, attn_mask)
Field model.decoder:
 takes decoder params and returns decoder outputs after attention
Field model.output:
 takes decoder outputs and returns a distribution over the dictionary
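The composition of these three fields can be sketched with toy callables standing in for real modules. Everything below other than the encoder/decoder/output names (and the data-flow order) is illustrative; real implementations return tensors, not integers:

```python
class GeneratorModelSketch:
    """Toy stand-in showing how encoder, decoder, and output compose."""

    def encoder(self, xs):
        # Real encoders return (enc_out, enc_hidden, attn_mask); a single
        # summary value stands in for all three here.
        return sum(xs)

    def decoder(self, enc_state, prev_token):
        # The decoder consumes the encoder state plus the previous token.
        return enc_state + prev_token

    def output(self, dec_out):
        # Map the decoder output to a 'distribution' over a 3-word dictionary
        # (here just a one-hot membership list, for illustration).
        return [dec_out % 3 == i for i in range(3)]

m = GeneratorModelSketch()
dist = m.output(m.decoder(m.encoder([1, 2]), prev_token=1))
```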
__init__(padding_idx=0, start_idx=1, end_idx=2, unknown_idx=3, input_dropout=0, longest_label=1)

Initialize self. See help(type(self)) for accurate signature.

decode_forced(encoder_states, ys)

Decode with a fixed, true sequence, computing loss. Useful for training, or ranking fixed candidates.

  • ys (LongTensor[bsz, time]) – the prediction targets. Contains both the start and end tokens.
  • encoder_states (model specific) – Output of the encoder. Model specific types.

pair (logits, choices) containing the logits and MLE predictions

Return type:

(FloatTensor[bsz, ys, vocab], LongTensor[bsz, ys])

decode_greedy(encoder_states, bsz, maxlen)

Greedy search

  • bsz (int) – Batch size. Because encoder_states is model-specific, it cannot infer this automatically.
  • encoder_states (model specific) – Output of the encoder model.
  • maxlen (int) – Maximum decoding length

pair (logits, choices) of the greedy decode

Return type:

(FloatTensor[bsz, maxlen, vocab], LongTensor[bsz, maxlen])

forward(xs, ys=None, cand_params=None, prev_enc=None, maxlen=None)

Get output predictions from the model.

  • xs (LongTensor[bsz, seqlen]) – input to the encoder
  • ys (LongTensor[bsz, outlen]) – Expected output from the decoder. Used for teacher forcing to calculate loss.
  • prev_enc – if you know you’ll pass in the same xs multiple times, you can pass in the encoder output from the last forward pass to skip recalculating the same encoder output.
  • maxlen – max number of tokens to decode. If not set, uses the length of the longest label this model has seen. Ignored when ys is not None.

(scores, candidate_scores, encoder_states) tuple

  • scores contains the model’s predicted token scores. (FloatTensor[bsz, seqlen, num_features])
  • candidate_scores are the scores the model assigned to each candidate. (FloatTensor[bsz, num_cands])
  • encoder_states are the output of model.encoder. Model specific types. Feed this back in to skip encoding on the next call.

reorder_decoder_incremental_state(incremental_state, inds)

Reorder incremental state for the decoder.

Used to expand selected beams in beam_search. Unlike reorder_encoder_states, implementing this method is optional. However, without incremental decoding, decoding a single beam becomes O(n^2) instead of O(n), which can make beam search impractically slow.

In order to fall back to non-incremental decoding, just return None from this method.

  • incremental_state (model specific) – second output of model.decoder
  • inds (LongTensor[n]) – indices to select and reorder over.

The re-ordered decoder incremental states. It should be the same type as incremental_state, and usable as an input to the decoder. This method should return None if the model does not support incremental decoding.

Return type:

model specific

reorder_encoder_states(encoder_states, indices)

Reorder encoder states according to a new set of indices.

This is an abstract method, and must be implemented by the user.

Its purpose is to provide beam search with a model-agnostic interface for manipulating the encoder output. For example, this method is used to sort hypotheses, expand beams, etc.

For example, assume that encoder_states is a bsz x 1 tensor of values:

    indices = [0, 2, 2]
    encoder_states = [[0.1]
                      [0.2]
                      [0.3]]

then the output will be:

    output = [[0.1]
              [0.3]
              [0.3]]
  • encoder_states (model specific) – output from encoder. type is model specific.
  • indices (list[int]) – the indices to select over. The user must support non-tensor inputs.

The re-ordered encoder states. It should be of the same type as encoder states, and it must be a valid input to the decoder.

Return type:

model specific
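With list-based encoder states, the reordering above reduces to indexed selection; for tensor states the equivalent is index_select along the batch dimension. A sketch under those assumptions, mirroring the worked example in the reorder_encoder_states docs:

```python
def reorder_encoder_states_sketch(encoder_states, indices):
    """Select (and possibly duplicate) rows of encoder_states by index.

    For tensor states the equivalent operation would be
    encoder_states.index_select(0, indices).
    """
    return [encoder_states[i] for i in indices]

states = [[0.1], [0.2], [0.3]]
reordered = reorder_encoder_states_sketch(states, [0, 2, 2])
```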