BlenderBot 3: A 175B-parameter, publicly available chatbot that improves its skills & safety over time
- BlenderBot 3 (BB3) is a 175B-parameter, publicly available chatbot released with model weights, code, datasets, and model cards. We’ve deployed it in a live interactive conversational AI demo.
- BB3 searches the internet to chat about nearly any topic, and is designed to learn how to improve its skills and safety through natural conversations and feedback from people "in the wild."
- Initial experiments show that the more people interact with the model, the more it can learn, particularly when using our new Director architecture.
- Learning from people "in the wild" is not straightforward. We have developed new techniques that enable learning from helpful teachers while avoiding learning from people who are trying to trick the model into unhelpful or toxic responses.
- We are committed to sharing organic conversational data collected from participants in the live demo, as well as model snapshots, in the future. The goal is to help the community build ever-improving AI systems that can interact with people in safer and more helpful ways.
Papers
BB3 main technical report:
- BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage. Kurt Shuster†, Jing Xu†, Mojtaba Komeili†, Da Ju†, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora+, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston.
We are also concurrently releasing two companion papers describing key innovations:
- Learning New Skills after Deployment: Improving open-domain internet-driven dialogue with human feedback. Jing Xu, Megan Ung, Mojtaba Komeili, Kushal Arora, Y-Lan Boureau, Jason Weston.
- Learning from data in the mixed adversarial non-adversarial case: Finding the helpers and ignoring the trolls. Da Ju, Jing Xu, Y-Lan Boureau, Jason Weston.
Finally, BB3 depends on other recent work we have published, in particular SaFeRDialogues: Taking Feedback Gracefully after Conversational Safety Failures, DIRECTOR: Generator-Classifiers For Supervised Language Modeling, and SeeKeR: An Open-Source Search-Augmented Language Model. BB3 also builds on all our previous work, including BB1, BB2, and related papers. See our team's projects and publications here.
Logbook
We are also releasing a BB3 Logbook documenting the development of our system, available here.
Models
We are releasing three model sizes: 3B, 30B and 175B.
The 3B and 30B models are available in the ParlAI model zoo.
- BlenderBot 3 3B: --model-file zoo:bb3/bb3_3B/model
- BlenderBot 3 30B: --model-file zoo:bb3/bb3_30B/model
The BB3 175B model is shared by request here.
Model card
The BB3 model card is available here.
Data card
See here for the BB3 data card.
Training datasets
We are releasing the new FITS (Feedback on Internet Talk & Search) dataset used to train BB3.
Training is also multi-tasked with all the existing datasets from BB1 and BB2, e.g. the BST tasks from BlenderBot 1, and Multi-Session Chat and Wizard of the Internet from BB2. To train for safety, we use the SaFeRDialogues and Bot Adversarial Dialogue (BAD) datasets. In addition, we use a number of QA tasks and task-oriented dialogue datasets that are all available in ParlAI; see the tech report for the full list.
See the ParlAI quickstart for help.
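As a quick sanity check, you can view samples from these datasets directly with ParlAI's display_data script (the task names below assume the standard ParlAI task registry; adjust if your install differs):
parlai display_data -t fits
parlai display_data -t blended_skill_talk
parlai display_data -t msc
parlai display_data -t wizard_of_internet
parlai display_data -t saferdialogues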
BB3 Module Tasks
These tasks are used to train BB3's modules, and are hence adapted slightly, e.g. with appropriate control tokens provided in the context (see the paper for a full explanation). We thus provide here the explicit setup used to train BB3. The following multitask teachers train each of the BB3 modules:
projects.bb3.tasks.module_level_tasks:AlwaysSearchTeacher
projects.bb3.tasks.module_level_tasks:MaybeSearchTeacher
projects.bb3.tasks.module_level_tasks:MemoryDecisionTeacher
projects.bb3.tasks.module_level_tasks:SearchQueryGenerationTeacher
projects.bb3.tasks.module_level_tasks:MemoryGenerationTeacher
projects.bb3.tasks.module_level_tasks:MemoryKnowledgeGenerationTeacher
projects.bb3.tasks.module_level_tasks:SearchKnowledgeGenerationTeacher
projects.bb3.tasks.module_level_tasks:EntityKnowledgeGenerationTeacher
projects.bb3.tasks.module_level_tasks:SearchDialogueGenerationTeacher
projects.bb3.tasks.module_level_tasks:EntityDialogueGenerationTeacher
projects.bb3.tasks.module_level_tasks:MemoryDialogueGenerationTeacher
projects.bb3.tasks.module_level_tasks:VanillaDialogueGenerationTeacher
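Any of these teachers can be inspected individually with display_data; for example, to view samples from the search query generation task (a quick check, assuming a standard ParlAI install):
parlai display_data -t projects.bb3.tasks.module_level_tasks:SearchQueryGenerationTeacher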
Code
Below we give a top-level overview of how to download and interact with our models. For more detailed information, visit this page.
BB3 3B Model: Training
The following command demonstrates how the BB3 3B model was trained:
TASKS="projects.bb3.tasks.module_level_tasks:AlwaysSearchTeacher"
TASKS+=",projects.bb3.tasks.module_level_tasks:MaybeSearchTeacher"
TASKS+=",projects.bb3.tasks.module_level_tasks:MemoryDecisionTeacher"
TASKS+=",projects.bb3.tasks.module_level_tasks:SearchQueryGenerationTeacher"
TASKS+=",projects.bb3.tasks.module_level_tasks:MemoryGenerationTeacher"
TASKS+=",projects.bb3.tasks.module_level_tasks:MemoryKnowledgeGenerationTeacher"
TASKS+=",projects.bb3.tasks.module_level_tasks:SearchKnowledgeGenerationTeacher"
TASKS+=",projects.bb3.tasks.module_level_tasks:EntityKnowledgeGenerationTeacher"
TASKS+=",projects.bb3.tasks.module_level_tasks:SearchDialogueGenerationTeacher"
TASKS+=",projects.bb3.tasks.module_level_tasks:EntityDialogueGenerationTeacher"
TASKS+=",projects.bb3.tasks.module_level_tasks:MemoryDialogueGenerationTeacher"
TASKS+=",projects.bb3.tasks.module_level_tasks:VanillaDialogueGenerationTeacher"
EVAL_TASKS="projects.bb3.tasks:WoiSearchQueryTeacher"
EVAL_TASKS+=",projects.bb3.tasks:MSCMemoryGeneratorTeacher"
EVAL_TASKS+=",projects.bb3.tasks:MSCMemoryKnowledgePersOverlapTeacher"
EVAL_TASKS+=",projects.bb3.tasks:Convai2MemoryKnowledgePersOverlapTeacher"
EVAL_TASKS+=",projects.bb3.tasks:WoiSearchKnowledgeTeacher"
EVAL_TASKS+=",projects.bb3.tasks:WowSearchKnowledgeTeacher"
EVAL_TASKS+=",projects.bb3.tasks:Convai2EntityKnowledgeTeacher"
EVAL_TASKS+=",projects.bb3.tasks:WowSearchDialogueTeacher"
EVAL_TASKS+=",projects.bb3.tasks:WoiSearchDialogueTeacher"
EVAL_TASKS+=",projects.bb3.tasks:EDEntityDialogueTeacher"
EVAL_TASKS+=",projects.bb3.tasks:BSTEntityDialogueTeacher"
EVAL_TASKS+=",projects.bb3.tasks:BSTEntityDialogueTeacher"
EVAL_TASKS+=",projects.bb3.tasks:MSCMemoryDialogueFromPersOverlapTeacher"
EVAL_TASKS+=",projects.bb3.tasks:Convai2MemoryDialogueFromPersOverlapTeacher"
EVAL_TASKS+=",projects.bb3.tasks:FitsSearchDialogueTeacher"
EVAL_TASKS+=",projects.bb3.tasks:SaferdialoguesVanillaDialogueTeacher"
EVAL_TASKS+=",projects.bb3.tasks:GoogleSgdSearchDialogueTeacher"
EVAL_TASKS+=",projects.bb3.tasks:LightVanillaDialogueTeacher"
python -m parlai.scripts.multiprocessing_train \
-t $TASKS -et $EVAL_TASKS \
-vstep 1000 -lstep 50 --batchsize 1 --init-opt arch/r2c2_base_3B \
--init-model zoo:seeker/r2c2_blenderbot_3B/model \
--model projects.seeker.agents.seeker:ComboFidGoldDocumentAgent \
--n-docs 5 --text-truncate 1024 --label-truncate 128 --truncate 1024 --fp16 True \
-lr 1e-06 --lr-scheduler reduceonplateau --lr-scheduler-patience 3 --optimizer adamw \
--save-after-valid True --warmup-updates 100 --update-freq 1 --gradient-clip 1.0 --skip-generation True \
--dropout 0.1 --attention-dropout 0.0 -vp 10 -vmt ppl -vmm min -vme 100000 --load-from-checkpoint true \
--ddp-backend zero2 --checkpoint-activations true --model-file /path/to/model/
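Once training completes, the resulting checkpoint can be evaluated on the same validation tasks. A minimal sketch (flags mirror the training command above; multiprocessing_eval is ParlAI's multi-GPU counterpart to eval_model):
python -m parlai.scripts.multiprocessing_eval \
-t $EVAL_TASKS --model-file /path/to/model/ \
--batchsize 1 --skip-generation True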
BB3 3B Model: Download + Interact
We provide the BB3 3B model in ParlAI's model zoo. You can interact with the model via the following:
parlai safe_interactive --model-file zoo:bb3/bb3_3B/model --init-opt gen/r2c2_bb3
BB3 30B Model: Download
You can download the BB3 30B model via the following command:
wget http://parl.ai/downloads/_models/bb3/bb3_30B/consolidated.pt
BB3 175B Model: Download
If your request is approved, you will receive instructions for downloading the 175B model.
BB3 30B/175B: Interact
(Docs adapted from OPT docs)
After downloading the consolidated BB3 30B or 175B checkpoints, you will need to reshard them according to your GPU resources. The 30B model checkpoint requires 56GB of GPU memory in total, while the 175B checkpoint requires 384GB; each model-parallel shard must fit on a single GPU, so with 8-way model parallelism (MP=8 below) the 30B shards come to roughly 7GB each, and with MP=16 the 175B shards come to roughly 24GB each.
After identifying how many GPUs you will need to run the models, you can use the following commands to reshard appropriately:
BB3 30B:
CONSOLIDATED=/path/to/bb3_30B/consolidated/
RESHARD=/save/path/to/bb3_30B/resharded/
MP=8
python -m metaseq.scripts.reshard_model_parallel $CONSOLIDATED/consolidated $MP --save-prefix $RESHARD/reshard
BB3 175B:
CONSOLIDATED=/path/to/bb3_175B/consolidated/
RESHARD=/save/path/to/bb3_175B/resharded/
MP=16
python -m metaseq.scripts.reshard_model_parallel $CONSOLIDATED/consolidated $MP --save-prefix $RESHARD/reshard
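If resharding succeeds, each save path should contain one weight file per model-parallel part. For example, for the 175B model with MP=16 (the file-name pattern below is our assumption based on metaseq's reshard script; check your metaseq version):
ls $RESHARD
# expect reshard-model_part-0.pt through reshard-model_part-15.pt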
Then, you can follow the instructions for running an API in metaseq to spin up the API. You will need to update the constants in metaseq/service/constants.py to point to the right directories; specifically, set CHECKPOINT_FOLDER to where you have downloaded the models (e.g., CHECKPOINT_FOLDER = "/save/path/to/bb3_30B/resharded").
Note that the gpt2-merges.txt and gpt2-vocab.json files in projects/OPT/assets/ will need to be moved to the corresponding directories defined in the constants.py file. You can directly download them with:
cd /path/to/resharded-weights
wget https://github.com/facebookresearch/metaseq/raw/main/projects/OPT/assets/gpt2-merges.txt
wget https://github.com/facebookresearch/metaseq/raw/main/projects/OPT/assets/gpt2-vocab.json
Once you have an API up and running, you can use the BB3 agents we provide to interact with the model:
parlai safe_interactive --init-opt gen/opt_bb3 --opt-server API_SERVER --loglevel debug --raw-search-server RELEVANT_SEARCH_SERVER
Here, API_SERVER is the address of the metaseq API you just spun up, and RELEVANT_SEARCH_SERVER is the address of an internet search server that you host.
Holistic Bias
Commands for evaluating the BB3 models on the HolisticBias dataset of sentences with demographic terms can be found here.
Live deployment / demo
The live demo is available here. We have been placing ads and conducting user studies to allow members of the public to participate in using the system, and to optionally record interaction and feedback data for use by the research community. See the tech report for full details and evaluation metrics thus far.
Sharing interaction data & model improvements: coming next!
We are committed to openly sharing de-identified organic conversational data collected from participants in the live demo, as well as model snapshots, in the future, as soon as we have collected enough data and assessed quality, safety, and other issues. The overall goal of this project is to help the community build ever-improving open AI systems that can interact with people in safer and more helpful ways.