Generating Model Cards

Author: Wendy Zhang

What is a model card?

Think of model cards as a condensed medical card for models. It is a great way for people who might not have the time to read a paper in detail to get the gist of what a model is doing, the datasets involved, how it is performing, and any concerns that the author might have about the model.

You can check out the Model Cards for Model Reporting paper, and here’s a sample model card for the Blenderbot2.0 2.7B model. In addition, here is a link to some more model card examples.

The Process

There are two steps in generating the model cards. imageonline-co-whitebackgroundremoved (3)

For both steps, we should specify the following arguments:

  • --model-file / -mf: the model file

  • --folder-to-save / -fts: the location where we’re saving reports

and add the command --mode gen to signify we’re in (report) generation mode.

Step 1: Generating reports

In general, we can use a command like this for report generation:

# template
parlai gmc -mf <model file> -fts <folder name> --mode gen
# sample
parlai gmc -mf zoo:dialogue_safety/multi_turn/model -fts safety_single --mode gen

However, depending on the situation, we might need to add these arguments as well:

  • --wrapper / -w only if the model is a generation model

  • --model-type / -mt only if the model isn’t added to or already in model_list.py

    • possible choices include ranker, generator, classifier, retriever

  • --task / -t and --evaltask/-et only if the original model.opt used task/datasets not in the form of a teacher or if the task/dataset is no longer accessible

    • tasks starting with fromfile or jsonfile will be ignored unless --ignore-unfound-tasks is set to False (by default, it’s true)

In addition, if the model itself needs certain arguments (ie. --search-server), we should specify them at this stage too. We can also add --batchsize for faster generation.

Check out the section about generating reports for more information on the report generation process and how to generate single reports (very useful for debugging).

Step 2: Model Card Generation

If some kind of model description has already been added to the model_list.py (distinguished by path, which should be the same as model_file), and reports were sucessfully generated in the step before, then we can simply run the following command

# template
parlai gmc -mf <model file> -fts <folder to save>
# example
parlai gmc -mf zoo:dialogue_safety/multi_turn/model -fts safety_multi

Examples

Here are some samples commands:

  • Dialogue Safety (multi-turn)

parlai gmc -mf zoo:dialogue_safety/multi_turn/model -fts safety_multi -bs 128  --mode gen -t dialogue_safety:wikiToxicComments,dialogue_safety:adversarial:round-only=False:round=1,dialogue_safety:multiturn -et dialogue_safety:wikiToxicComments,dialogue_safety:adversarial:round-only=False:round=1,dialogue_safety:multiturn --data-parallel False
parlai gmc -mf zoo:dialogue_safety/multi_turn/model -fts safety_multi
  • Blenderbot 90M

parlai gmc -mf zoo:blender/blender_90M/model -fts blenderbot_90M -w blenderbot_90M -bs 128 --mode gen
parlai gmc -mf zoo:blender/blender_90M/model -fts blenderbot_90M

Report Generation Details

imageonline-co-whitebackgroundremoved (4)

In the end, it should generate the following reports under the --folder-to-save

  • a folder data_stats/ that contains the data stats of the training set

  • a eval_results.json that contains the evaluation results based on the evaltasks

  • a sample.json file contain a sample input and output from the model

  • for generators, it should generate a folder safety_bench_res that contains the safety_bench results (click here to learn more about the safety bench).

Here are some images of the expected behavior:

  • Successful generations should end with a green message like this: Screen Shot 2021-07-26 at 3 58 33 PM

  • Unsucessful generations should tell us which reports are missing and why. Screen Shot 2021-07-26 at 11 32 17 AM

  • When tasks are dropped due to being unaccessible or in a fromfile or jsonfile format, it should look like this (w/o the blackout) Screen Shot 2021-08-06 at 9 10 56 AM

Generating single reports

Sometimes, you might want to generate only certain reports. In this case, instead of using --mode gen, we should use following possibilites:

  • --mode gen:data_stats to generate the data_stats/ folder

  • --mode gen:eval to generate the eval_results.json file (evaluation results)

  • --mode gen:safety to generate the safety_bench_res folder

  • --mode gen:sample to generate the sample.json file

Optional Customizations

  • Use --evaluation-report-file to specify the location of your own evaluation report file.

  • Use --mode editing/final to specify which mode you would like to use for model card generation.

    Currently, there are two different modes editing or final for step 2. For the editing mode, the code will generate messages like this:

    :warning: missing section name: Probably need to be grabbed from paper & added to model_list.py by u (the creator) :warning:

    In final mode, such messages will not exist. By default, the mode is editing.

Using --extra-args-path

We can use --extra-args-path to pass in longer arguments. By default, the ---extra-args-path will be <folder-to-save>/args.json, so if we create a file at that location, we don’t need to add args.json.

Adding Custom Dataset and Model Info

By default, the code will try to find a sections in model_list.py. However, instead of changing model_list.py, we can also pass in a .json file to --extra-args-path with out new section. Here’s us trying to add the intended use section

# args.json
{
   "extra_models": {
      "zoo:blender/blender_90M/model": {
         # section name (lowercased and underscores removed): section content
         "privacy": "Our model is intended for research purposes only, and is not yet production ready...."
      }
   }   
}

Similarly, if we don’t want to touch task_list.py (information about the tasks), we can also pass the details via --extra-args-path. Here’s us trying add a description for dummy_task:

# args.json
{
   "extra_tasks": {
      "dummy_task": {
         # type of info: info
         "description": "This is a dummy task, not a real task"
      }
   }   
}

The information passed via this method can partially overwrite what’s written in task_list.py and model_list.py.

Add Custom Sections or Changing Section Order

To add sections, there’s two ways to do this.

  1. After we generate the inital model card, we can directly edit the generated markdown file.

  2. If there’s a lot section movement or deletion, use add a user_sections key to specify the entire section order to the .json file that we pass to --extra-args-path. For instance, this is the default order of sections:

         section_list = [
            "model_details",
            "model_details:_quick_usage",
            "model_details:_sample_input_and_output",
            "intended_use",
            "limitations",
            "privacy",
            "datasets_used",
            "evaluation",
            "extra_analysis",
            "related_paper",
            "hyperparameters",
            "feedback",
         ]
    

    Note that adding :_ implies that it’s a subsection, and I would advise to use underscore _ in place of spaces (don’t worry; they’ll be changed back to spaces for the section title).

    Here’s us trying to to reverse the order and remove the model_details section (for kudos):

    # args.json
    {
       "user_sections": [
          "feedback",
          "hyperparameters",
          "related_paper",
          "extra_analysis",
          "evaluation",
          "datasets_used",
          "privacy",
          "limitations",
          "intended_use"
       ]
    }