Mutators

Author: Stephen Roller

Mutators are task-independent data transformations, which are applicable to any dataset. Examples of mutators include:

  • Reversing all the turns in a conversation

  • Down-sampling the dataset

  • Shuffling the words in a turn

  • and much more

Mutators are particularly useful when you want to train or test on different variants of the data. For example, Sundkar et al. (2019) showed that different models react differently to having their turns or words shuffled.

For a full list of Mutators existing in ParlAI, check our [mutators][Mutators Reference].

:::{warning} New feature Mutators is a brand new feature in ParlAI. If you experience any issues with it, please file an issue on GitHub. :::

Usage

The --mutators argument should be available for every script where a --task argument is available. Simply begin adding it to use a mutated dataset.

For example, one of the simplest mutators is flatten, which just flattens the conversation.

parlai display_data --task dailydialog --mutators flatten

Composability

Mutators are intentionally designed to be composable. That is, we can stack mututators on top of each other by specifying multiple on the command line:

parlai display_data -t dailydialog --mutators word_shuffle+flatten
parlai display_data -t dailydialog --mutators word_shuffle,flatten  # equivalent

This runs the word_shuffle mutator, and pipes the output to the flatten mutator

Multi-task mutators

Mutators default to being applied on every task. For example, this applies the same mutators to both tasks (independently):

parlai display_data -t dailydialog,convai2 --mutators word_shuffle+flatten

If necessary, you may also supply mutators to specific tasks. Note that in this case, you can only use the + joiner, and , is unavailable.

parlai display_data -t dailydialog:mutators=word_shuffle,convai2:mutators=flatten+word_shuffle

Mutator arguments

Some mutators have additional arguments. For example, episode_shuffle has an argument preserve_context.

parlai display_data -t dailydialog --mutators episode_shuffle --preserve_context True

Unfortunately, mutator arguments cannot be directly specified when using the --task X:mutators= format. Instead, we can pass mutator arguments through the task argument.

parlai display_data -t dailydialog:mutators=episode_shuffle:preserve_context=True

Writing your own Mutators

Mutators are meant to be added too. Following other patterns in ParlAI, you can add your own mutators by making sure you decorate your Mutator class with @register_mutator("example_name") before the script runs if you’re using ParlAI in an IPython notebook; if you’ve checked out ParlAI code locally to make your own modifications, you can add a new file in parlai/mutators.

ParlAI has 3 base classes for Mutators. Choosing the right base class is only about making bookkeeping easier.

  • MessageMutator is used when you need to make changes to individual turns, and is no relationship between turns.

  • EpisodeMutator is when you want to make changes to whole conversations (episodes), but you want to keep the number of episodes fixed.

  • ManyEpisodeMutator is the most powerful setting, and lets you map each episode to 0 or more episodes. It is also slightly more complex.

:::{warning} Sharing Unlike you may expect from other parts of ParlAI, Mutators do not have any sort of sharing mechanism. There is always exactly one instance of each mutator specified on the command line. :::

For information on writing Mutators, please see the API Reference. As additional resources, we provide the following examples:

  • Word Shuffle: shows how to implement a simple MessageMutator.

  • Episode Reverse: shows how to implement a simple EpisodeMutator.

  • Flatten: shows how to implement a ManyEpisodeMutator.