Am I Me or You? State-of-the-Art Dialogue Models Cannot Maintain Identity

Kurt Shuster, Jack Urbanek, Arthur Szlam, Jason Weston

Abstract

State-of-the-art dialogue models still often stumble with regards to factual accuracy and self-contradiction. Anecdotally, they have been observed to fail to maintain character identity throughout discourse; and more specifically, may take on the role of their interlocutor. In this work we formalize and quantify this deficiency, and show experimentally through human evaluations that this is indeed a problem. In contrast, we show that discriminative models trained specifically to recognize who is speaking can perform well; and further, these can be used as automated metrics. Finally, we evaluate a wide variety of mitigation methods, including changes to model architecture, training protocol, and decoding strategy. Our best models reduce mistaken identity issues by nearly 65% according to human annotators, while simultaneously improving engagingness. Despite these results, we find that maintaining character identity still remains a challenging problem.

Paper

Link to arXiv

Tasks

RPA Classifier Training

Full Datasplit:

parlai dd -t projects.light_whoami.task.agents:WhoIsSpeakingTeacher

Left to Right:

parlai dd -t projects.light_whoami.task.agents:WhoIsSpeakingLeftToRightTeacher

RPA Evaluation of Model Responses

parlai dd -t projects.light_whoami.task.agents:ResponseClassifierTeacher

Multi-Objective Training

parlai dd -t projects.light_whoami.task.agents:MultiObjectiveTeacher

Agent Code

NOTE: Each agent specified below can be used in tandem with the long-context generator agents from the MSC project by simply adding Long in front of the final agent name. E.g., projects.light_whoami.agents.rpa_rerank:RPARerankAgent becomes projects.light_whoami.agents.rpa_rerank:LongRPARerankAgent`, and so on.

RPA Re-ranker Agents

These agents will re-rank beams from the base model according to RPA score. One must specify a --predictor-model-file pointing to an RPA Classifier.

parlai i -m projects.light_whoami.agents.rpa_rerank:RPARerankAgent \
-mf <path_to_model> --predictor-model-file <path_to_predictor_model>

If you'd like to use a predictor model file other than that used for RPA re-ranking, please see instructions here for how to implement your own re-ranker. Then, subclass the AbstractGeneratorRerankAgent, implementing the get_reranker_class method to point to your re-ranker.

PACER Agents

In addition to re-ranking the final beams according to RPA score, these models will apply ranking on partial sequences. Use the following parameters to control this level of ranking:

--pacer-n-tokens: How many tokens to consider when rescoring on partial sequences
--pacer-frequency-ratio: How often to apply PACER re-ranking when decoding.

parlai i -m projects.light_whoami.agents.pacer:PacerAgent \ -mf --predictor-model-file

If you'd like to use a predictor model file other than that used for RPA Re-Ranking, simply subclass the PacerAgent and implement get_reranker_class() to return your constructed re-ranker object (see steps here).

Unlikelihood Agents

One can apply RPA Unlikelihood in order to discourage the agent from generating tokens that yield the wrong predicted speaker. This agent requires a predictor model file as well. The following parameters are important for controlling training:

--ul-top-k-toks: How many tokens to apply the UL loss to.
--only-wrong-class-toks: Set True to only apply UL loss to tokens that yield the wrong predicted speaker.
--all-wrong-class-toks: Set True to apply UL loss to all tokens in an utterance that result in the wrong predicted speaker.

parlai train_model -m projects.light_whoami.agents.rpa_ul:RpaUlAgent \ --predictor-model-file \ --init-model ...

Multi-Objective Agents

One can utilize the multi-objective agents to train both the generator NLL loss and a character prediction ranking loss. Important parameters:

--n-multiobjective-layers/heads: Specify number of layers/heads to use as additional components in predicting the speaker.
--multiobjective-latent-representation: One of ['encoder_final_layer', 'decoder_final_layer', 'encoder_and_decoder'], sets which representations to use when predicting the speaker.

parlai train_model -m projects.light_whoami.agents.multi_objective:MultiObjectiveGeneratorAgent \ --init-model ...

Profile Expanded Decoder Attention

Use these agents in an "expanded" attention scenario, where a portion of the input (or something otherwise specified) is attended to in a third round of attention in the decoder (following self-attention and encoder-attention). The following parameters are useful:

To set the context from which to pull expanded attention input - --expanded-attention-input-key: Key in the teacher message to pull from for expanded attention - --expanded-attention-input-extractor-phrases: If specified, the input for expanded attention will consist only of pieces of the delimited input that contain these phrases. - --expanded-attention-num-rounds: How many rounds to apply the expanded attention.

parlai train_model -m projects.light_whoami.agents.expanded_attention:ExpandedDecoderAttentionAgent \
--init-model <path_to_init_model> ...

Automated Expanded Decoder Attention

To automatically learn what to re-attend to within the context, you can use the same agent as above, but specify --expanded-attention-type <automated_classifier/automated_trainable_mask>. For automated_trainable_mask, there are no additional parameters required. For automated_classifier, one must specify the --predictor-model-file as before.

Automated Expanded Decoder Attention + Multi-Objective Training

To leverage multi-objective training within an automated expanded attention scenario, simply set --expanded-attention-type automated_trainable_mask, and the proper agent, along with any desired multi-objective arguments from above:

parlai train_model -m projects.light_whoami.agents.expanded_attention:ExpandedDecoderAttentionAndMultiObjectiveAgent \
--expanded-attention-type automated_trainable_mask --init-model <path_to_init_model> \
...

Expanded Decoder Attention + RPA Re-ranking / PACER

The following agents combine expanded decoder attention with RPA Re-Ranking or PACER Re-Ranking functionality:

parlai i -m projects.light_whoami.agents.expanded_attention:ExpandedDecoderAttentionAndRPARerankerAgent \
--model-file <path_to_expanded_attention_agent> --predictor-model-file <path_to_predictor_model_file >...

parlai i -m projects.light_whoami.agents.expanded_attention:ExpandedDecoderAttentionAndPacerAgent \
--model-file <path_to_expanded_attention_agent> --predictor-model-file <path_to_predictor_model_file >...

Pre-Trained Models

The following table provides the zoo paths for the released pre-trained models (used in --model-file or --init-model):

Model	RPA	Mistaken Identity	Zoo Path
LTR RPA Re-Ranker	-	-	zoo:light_whoami/rpa_reranker/model
128-Truncate Vanilla Baseline	87.61	6.45%	zoo:light_whoami/vanilla_128/model
1024-Truncate Vanilla Baseline	87.71	7.35%	zoo:light_whoami/vanilla_1024/model
128-Truncate RPA Unlikelihood (Top1)	87.48	7.13%	zoo:light_whoami/rpa_ul_128/model
1024-Truncate RPA Unlikelihood (Top1)	-	-	zoo:light_whoami/rpa_ul_1024/model
Multi-Objective (Vanilla, Dec. Only)	87.67	10.00%	zoo:light_whoami/multiobjective/model
Profile Expanded Attention (128, 2 rounds over ABC)	91.70	4.82%	zoo:light_whoami/profile_expanded_attention_128/model
Profile Expanded Attention (1024, 2 rounds over ABCD)	92.18	4.00%	zoo:light_whoami/profile_expanded_attention_1024/model
Automated Expanded Attention (1024, Classifier Attn.)	90.93	5.51%	zoo:light_whoami/automated_expanded_attention_1024/model
Automated Expanded Attention + Multi-Objective (1024, Dec. Only)	88.95	4.43%	zoo:light_whoami/expanded_and_multiobjective_1024/model