agents.unigram

Baseline model which always emits the N most common non-punctuation unigrams. Typically these are mostly stopwords. This model is a poor conversationalist, but may still achieve a reasonable F1 score, since stopwords overlap with most reference responses.
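
To illustrate why a stopword-only reply can still score on F1, here is a minimal sketch of a token-level F1 computation. The function name unigram_f1, the whitespace tokenization, and the example strings are assumptions made for illustration; this is not necessarily how ParlAI computes its F1 metric.

```python
from collections import Counter


def unigram_f1(prediction, reference):
    """Token-level F1 between two whitespace-tokenized strings (illustrative)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection: a shared token counts at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: a fixed stopword reply versus a made-up reference.
stopword_reply = "i you the a to"
reference = "i want to go to the park"
score = unigram_f1(stopword_reply, reference)
print(round(score, 3))
```

In this toy case three of the five predicted tokens appear in the seven-token reference, so the reply earns partial credit despite carrying no meaning.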

UnigramAgent has one option, --num-words, which controls how many unigrams are output.

This also makes a nice reference for a simple, minimalist agent.

class parlai.agents.unigram.unigram.UnigramAgent(opt, shared=None)

Bases: parlai.core.agents.Agent

__init__(opt, shared=None)

Construct a UnigramAgent.

Parameters:
  • opt – parlai options
  • shared – Used to duplicate the model for batching/hogwild.

act()

Stub act, which always makes the same prediction.
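
The observe/act pattern behind a constant-prediction agent can be sketched without ParlAI at all. ToyConstantAgent below is a hypothetical stand-in, and the "id"/"text" message keys are assumptions about the message format rather than a copy of the real class.

```python
class ToyConstantAgent:
    """Hypothetical minimalist agent; not the ParlAI UnigramAgent."""

    def __init__(self, agent_id, reply):
        self.agent_id = agent_id
        self.reply = reply
        self.last_observation = None

    def observe(self, obs):
        # Stub: store the incoming message, but never use it.
        self.last_observation = obs
        return obs

    def act(self):
        # Always return the same reply, whatever was observed.
        return {"id": self.agent_id, "text": self.reply}


agent = ToyConstantAgent("ToyUnigram", "i you the a to")
agent.observe({"text": "hello there"})
first = agent.act()
agent.observe({"text": "what is your favorite food?"})
second = agent.act()
print(first == second)
```

Because act() ignores the stored observation, the two replies are identical even though the inputs differ.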

classmethod add_cmdline_args(parser)

Adds command-line arguments.

classmethod dictionary_class()

Returns the DictionaryAgent used for tokenization.

get_prediction()

Core algorithm, which gathers the most common unigrams into a string.
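
A minimal sketch of that idea, assuming the word frequencies are available as a collections.Counter. The helper name most_common_unigrams, its parameters, and the sample counts are all hypothetical; the real agent reads its frequencies from the dictionary.

```python
from collections import Counter


def most_common_unigrams(freqs, num_words, is_valid=lambda word: True):
    """Join the num_words most frequent valid words into one string."""
    kept = []
    # most_common() yields (word, count) pairs from most to least frequent.
    for word, _count in freqs.most_common():
        if len(kept) >= num_words:
            break
        if is_valid(word):
            kept.append(word)
    return " ".join(kept)


# Made-up frequencies, including punctuation and a special token.
freqs = Counter(
    {"the": 50, ".": 45, "i": 40, "you": 30, "__null__": 25, "to": 20, "park": 2}
)
# str.isalpha is used here as a stand-in validity check.
prediction = most_common_unigrams(freqs, num_words=3, is_valid=str.isalpha)
print(prediction)
```

Filtering happens before counting toward the limit, so skipped tokens such as "." do not use up one of the N slots.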

is_valid_word(word)

Marks whether a string may be included in the unigram list. Used to filter punctuation and special tokens.
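
One way such a filter could look is sketched below. The double-underscore convention for special tokens (for example "__null__") and the use of ASCII punctuation are assumptions, and looks_like_valid_word is a hypothetical name rather than the library's implementation.

```python
import string


def looks_like_valid_word(word):
    """Return True if word is neither a special token nor pure punctuation."""
    # Assumption: special tokens are wrapped in double underscores.
    if word.startswith("__") and word.endswith("__"):
        return False
    # Reject empty or whitespace-only tokens such as "\n".
    if not word.strip():
        return False
    # Reject tokens made up entirely of ASCII punctuation, e.g. "." or "...".
    return not all(ch in string.punctuation for ch in word)


candidates = ["the", ".", "__null__", "don't", "...", "\n", "you"]
valid = [w for w in candidates if looks_like_valid_word(w)]
print(valid)
```

Words that merely contain punctuation, such as "don't", pass this check; only tokens consisting solely of punctuation are dropped.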

load(path)

Stub load which ignores the model on disk, as UnigramAgent depends on the dictionary, which is saved elsewhere.

observe(obs)

Stub observe method.

save(path=None)

Stub save which dumps options. Necessary for evaluation scripts to load the model.

share()

Basic sharing function.