Baseline model which always emits the N most common non-punctuation unigrams. Typically these are mostly stopwords. This model is a poor conversationalist, but may still get a reasonable F1 score.
UnigramAgent has one option, --num-words, which controls how many unigrams are output.
This also makes a nice reference for a simple, minimalist agent.
Construct a UnigramAgent.
- opt – ParlAI options
- shared – Used to duplicate the model for batching/hogwild.
Stub act, which always makes the same prediction.
Adds command-line arguments.
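As a rough illustration of registering the single option, the sketch below uses the standard-library argparse rather than ParlAI's own parser. The function name add_unigram_args, the group title, and the default of 10 are assumptions made for this example, not values taken from the documentation above.

```python
import argparse


def add_unigram_args(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Register a --num-words option on a plain argparse parser (hypothetical helper)."""
    group = parser.add_argument_group("UnigramAgent Arguments")
    group.add_argument(
        "--num-words",
        type=int,
        default=10,  # assumed default, chosen only for illustration
        help="Number of unigrams to output.",
    )
    return parser


# argparse converts the dashes in the flag to an underscore in the result.
parser = add_unigram_args(argparse.ArgumentParser())
opt = vars(parser.parse_args(["--num-words", "5"]))
default_opt = vars(parser.parse_args([]))
```

Here opt["num_words"] holds the value passed on the command line, and default_opt["num_words"] falls back to the assumed default.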
Returns the DictionaryAgent used for tokenization.
Core algorithm, which gathers the most common unigrams into a string.
Marks whether a string may be included in the unigram list. Used to filter punctuation and special tokens.
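The core algorithm and its filter can be sketched without ParlAI at all. The example below assumes the dictionary's frequencies are available as a collections.Counter; the set of special tokens and the function names are hypothetical stand-ins, and the real agent may filter differently.

```python
import string
from collections import Counter

# Assumed special tokens, for illustration only.
SPECIAL_TOKENS = {"__null__", "__start__", "__end__", "__unk__"}


def is_valid_word(word: str) -> bool:
    """Return True if the token may appear in the unigram list."""
    if word in SPECIAL_TOKENS:
        return False
    # Reject empty tokens and tokens made up solely of punctuation.
    return not all(ch in string.punctuation for ch in word)


def most_common_unigrams(freq: Counter, num_words: int) -> str:
    """Join the num_words most frequent valid tokens into one string."""
    # most_common() yields (token, count) pairs in descending count order.
    words = [w for w, _ in freq.most_common() if is_valid_word(w)]
    return " ".join(words[:num_words])


freq = Counter(
    {"the": 50, ".": 45, "i": 40, "__null__": 35, "you": 30, "?": 20, "a": 10}
)
prediction = most_common_unigrams(freq, 3)
print(prediction)  # the i you
```

Filtering happens before truncation, so punctuation and special tokens never use up one of the N slots.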
Stub load which ignores the model on disk, as UnigramAgent depends on the dictionary, which is saved elsewhere.
Stub observe method.
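The stub observe and act methods can be pictured with a small stand-alone class. This is a simplified sketch, not the ParlAI Agent interface: the class name ConstantReplyAgent, the message keys, and the default id are assumptions for illustration.

```python
from typing import Optional


class ConstantReplyAgent:
    """Hypothetical stand-in whose reply never depends on the input."""

    def __init__(self, prediction: str, agent_id: str = "UnigramAgent"):
        self.prediction = prediction  # precomputed string of common unigrams
        self.id = agent_id
        self.observation: Optional[dict] = None

    def observe(self, observation: dict) -> dict:
        # Stub: store the message, but do nothing else with it.
        self.observation = observation
        return observation

    def act(self) -> dict:
        # Stub: always return the same precomputed prediction.
        return {"id": self.id, "text": self.prediction}


agent = ConstantReplyAgent("the i you")
agent.observe({"text": "How was your weekend?"})
first = agent.act()
agent.observe({"text": "What is the capital of France?"})
second = agent.act()
```

Both replies are identical, which is why the model is a poor conversationalist even when its token overlap with reference answers is reasonable.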
Stub save which dumps options. Necessary for evaluation scripts to load the model.
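A minimal sketch of a save that writes only the options, assuming they are JSON-serializable and stored next to the model path with a ".opt" suffix. Both the suffix and the helper names save_options and load_options are assumptions for this example.

```python
import json
import os
import tempfile


def save_options(opt: dict, path: str) -> str:
    """Write the options as JSON beside the model path; no weights are saved."""
    opt_path = path + ".opt"  # assumed suffix, for illustration only
    with open(opt_path, "w", encoding="utf-8") as handle:
        json.dump(opt, handle, indent=2, sort_keys=True)
    return opt_path


def load_options(path: str) -> dict:
    """Read back the options that an evaluation script would need."""
    with open(path + ".opt", "r", encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as tmpdir:
    model_path = os.path.join(tmpdir, "unigram_model")
    written = save_options({"num_words": 10, "model": "unigram"}, model_path)
    restored = load_options(model_path)
```

Because the prediction is derived entirely from the dictionary, the options file is the only state this sketch needs to persist.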
Basic sharing function.