BlenderBot 3: A 175B-parameter, publicly available chatbot that improves its skills & safety over time


BB3 main technical report:

We are also concurrently releasing two companion papers describing key innovations:

Finally, BB3 is dependent on other recent work we have published, in particular SaFeRDialogues: Taking Feedback Gracefully after Conversational Safety Failures, DIRECTOR: Generator-Classifiers For Supervised Language Modeling and SeeKeR: An Open source Search-Augmented Language Model. BB3 also builds on all our previous work, including BB1 and BB2 and related papers. See our team's projects and publications here.


We are also releasing a BB3 Logbook documenting the development of our system, available here.


We are releasing three model sizes: 3B, 30B and 175B.

The 3B and 30B models are available in the ParlAI model zoo:

- BlenderBot 3 3B: --model-file zoo:bb3/bb3_3B/model
- BlenderBot 3 30B: --model-file zoo:bb3/bb3_30B/model

The BB3 175B model is shared by request here.

Model card

The BB3 model card is available here.

Data card

See here for the BB3 data card.

Training datasets

We are releasing the new FITS dataset of Feedback on Internet Talk & Search used to train BB3.

Training is also multi-tasked with all the existing datasets from BB1 and BB2, e.g. the existing BST tasks from BlenderBot 1, and Multi-Session Chat and Wizard of the Internet from BB2. To train for safety we use the SaFeRDialogues and BAD datasets. In addition, we use a number of QA tasks and task-oriented dialogue datasets that are all available in ParlAI. See the tech report for the full list.

See the ParlAI quickstart for help.

BB3 Module Tasks

These tasks are used to train BB3's modules, and are hence adapted slightly, e.g. with appropriate control tokens provided in the context (see the paper for a full explanation). We thus provide here the explicit setup used to train BB3. The following multitask teachers train each of the BB3 modules:


We describe below the top-level overview of how to download + interact with our models. For more detailed information, visit this page.

BB3 3B Model: Download + Interact

We provide the BB3 3B model in ParlAI's model zoo. You can interact with the model via the following:

parlai interactive --model-file zoo:bb3/bb3_3B/model --init-opt gen/r2c2_bb3

BB3 30B Model: Download

You can download the BB3 30B model via the following command:


BB3 175B Model: Download

You will receive instructions for downloading the 175B model if approved.

BB3 30B/175B: Interact

(Docs adapted from OPT docs)

After downloading the consolidated BB3 30B or 175B checkpoints, you will need to reshard according to your GPU resources. The 30B model checkpoint requires 56GB of GPU memory, while the 175B checkpoint requires 384GB of GPU memory.
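As a rough aid for picking a GPU count from the memory figures above, here is a minimal sketch in Python. The function name `min_gpus`, the `REQUIRED_GB` table, and the optional power-of-two rounding are illustrative assumptions, not part of ParlAI or metaseq. The result is only a lower bound, since it ignores any runtime overhead beyond the checkpoint itself.

```python
import math

# GPU memory requirements quoted in the text above, in GB.
REQUIRED_GB = {"30B": 56, "175B": 384}


def min_gpus(model: str, per_gpu_gb: float, power_of_two: bool = False) -> int:
    """Lower bound on the number of GPUs needed to hold a checkpoint.

    `power_of_two` is an assumption for illustration: it rounds the count
    up to the next power of two, a common choice for model-parallel sizes.
    """
    if per_gpu_gb <= 0:
        raise ValueError("per_gpu_gb must be positive")
    count = math.ceil(REQUIRED_GB[model] / per_gpu_gb)
    if power_of_two:
        # Smallest power of two that is >= count.
        count = 1 << (count - 1).bit_length()
    return count


if __name__ == "__main__":
    # Example: hypothetical GPUs with 80 GB of memory each.
    for name in REQUIRED_GB:
        print(name, min_gpus(name, 80), min_gpus(name, 80, power_of_two=True))
```

The value you settle on is what you would pass as the model-parallel size when resharding.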

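After identifying how many GPUs you will need to run the models, you can use the following commands to reshard appropriately. Here $CONSOLIDATED is the directory holding the consolidated checkpoint, $MP is the model-parallel size (the number of GPUs identified above), and $RESHARD is the directory where the resharded checkpoint is saved: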

BB3 30B:

python -m metaseq.scripts.reshard_model_parallel $CONSOLIDATED/consolidated $MP --save-prefix $RESHARD/reshard

BB3 175B:

python -m metaseq.scripts.reshard_model_parallel $CONSOLIDATED/consolidated $MP --save-prefix $RESHARD/reshard
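The reshard commands above rely on three shell variables. As a hedged sketch of how they fit together, the Python helper below assembles the same command line from explicit arguments. The helper name `reshard_command` and the example paths are hypothetical; only the module name and the `--save-prefix` flag come from the commands shown above.

```python
import shlex
from pathlib import PurePosixPath
from typing import List


def reshard_command(consolidated_dir: str, model_parallel: int, reshard_dir: str) -> List[str]:
    """Build the argv for the reshard command shown above.

    consolidated_dir plays the role of $CONSOLIDATED, model_parallel of $MP,
    and reshard_dir of $RESHARD.
    """
    if model_parallel < 1:
        raise ValueError("model_parallel must be at least 1")
    return [
        "python", "-m", "metaseq.scripts.reshard_model_parallel",
        str(PurePosixPath(consolidated_dir) / "consolidated"),
        str(model_parallel),
        "--save-prefix", str(PurePosixPath(reshard_dir) / "reshard"),
    ]


if __name__ == "__main__":
    # Hypothetical paths and model-parallel size; substitute your own.
    cmd = reshard_command("/path/to/consolidated", 8, "/path/to/resharded")
    print(shlex.join(cmd))  # paste into a shell, or pass cmd to subprocess.run
```

Building the argument list rather than a single string avoids quoting problems if a path contains spaces.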

Then, you can follow the instructions for running an API in metaseq to spin up the API. You will need to update the constants in metaseq/service/ to point to the right directories. Specifically, set CHECKPOINT_FOLDER to the directory where you have downloaded the models.

Note that the gpt2-merges.txt and gpt2-vocab.json files in projects/OPT/assets/ will need to be moved to the corresponding directories defined in the file. You can directly download them with:

cd /path/to/resharded-weights

Once you have an API up and running, you can utilize the BB3 agents we provide to interact with the model:

parlai interactive --init-opt gen/opt_bb3 --opt-server API_SERVER --loglevel debug --raw-search-server RELEVANT_SEARCH_SERVER

Holistic Bias

Commands for evaluating the BB3 models on the HolisticBias dataset of sentences with demographic terms can be found here.

Live deployment / demo

The live demo is available here. We have been placing ads and conducting user studies so that members of the public can use the system and can optionally have their interaction and feedback data recorded for use by the research community. See the tech report for full details and evaluation metrics thus far.

Sharing interaction data & model improvements: coming next!

We are committed to openly sharing de-identified organic conversational data from participating users of the live demo, as well as model snapshots, as soon as we have collected enough data and assessed quality, safety and other issues. The overall goal of this project is to help the community build ever-improving open AI systems that can interact with people in safer and more helpful ways.