parlai.core.teachers¶

This module provides a set of teachers that deal with dialog.

FixedDialogTeacher(Teacher)
Base class for teachers in tasks that have fixed dialog, i.e., dialog that is not dynamically generated but rather is pulled from set examples. However, the class can be extended to all tasks involving fixed data. Implements much of the basic functionality of these teachers, including observe(), act(), and next_example().

DialogTeacher(FixedDialogTeacher)
Base teacher class for doing dialog specifically with fixed chat logs.

ParlAIDialogTeacher(FixedDialogTeacher)
Teacher class that provides access to data in the ParlAI Dialog format. See the class description for more details.

ConversationTeacher(DialogTeacher)
Teacher class that provides access to data in the Conversations format. See the class description for more details.

FbDeprecatedDialogTeacher(DialogTeacher)
Teacher class that provides access to data in the Facebook Dialog format. See the class description for more details. This class is deprecated.

This module also includes DataLoader, a threadpool data loader for FixedDialogTeacher, and DialogData/StreamDialogData, data structures for accessing textual dialog data that are used by DialogTeacher.

class parlai.core.teachers.DataLoader(opt)[source]¶

Bases: Thread

A worker thread that provides a threadpool for data loading.

A teacher may submit a request to the loader, which will return the appropriate data.

To submit a request, a teacher should call request_load.

__init__(opt)[source]¶

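This description is inherited from threading.Thread; DataLoader itself is constructed with opt only. The Thread constructor should always be called with keyword arguments. Arguments are: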

group should be None; reserved for future extension when a ThreadGroup class is implemented.

target is the callable object to be invoked by the run() method. Defaults to None, meaning nothing is called.

name is the thread name. By default, a unique name is constructed of the form “Thread-N” where N is a small decimal number.

args is the argument tuple for the target invocation. Defaults to ().

kwargs is a dictionary of keyword arguments for the target invocation. Defaults to {}.

If a subclass overrides the constructor, it must make sure to invoke the base class constructor (Thread.__init__()) before doing anything else to the thread.

request_load(receive_fn, load_fn, args)[source]¶

Queue a request for loading.

Parameters

receive_fn – a receive function (for receiving the data)
load_fn – a load function (for loading the data)
args – arguments for the load function. args can be either a dictionary of arguments for a function, or a list of positional arguments
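As a rough illustration of how the two accepted forms of args might be dispatched, the sketch below uses Python's concurrent.futures rather than ParlAI's own DataLoader. The names submit_load, load_square, and receive are hypothetical, and the assumption that a dictionary is passed as keyword arguments while a list is passed positionally follows only from the parameter description above.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_load(executor, receive_fn, load_fn, args):
    """Run load_fn in the pool and hand the resulting Future to receive_fn."""
    if isinstance(args, dict):
        future = executor.submit(load_fn, **args)  # dictionary -> keyword arguments
    else:
        future = executor.submit(load_fn, *args)   # list -> positional arguments
    future.add_done_callback(receive_fn)
    return future


def load_square(x, offset=0):
    # Hypothetical load function standing in for real data loading.
    return x * x + offset


received = []


def receive(future):
    # Hypothetical receive function: collect the loaded value.
    received.append(future.result())


with ThreadPoolExecutor(max_workers=2) as pool:
    submit_load(pool, receive, load_square, [3])
    submit_load(pool, receive, load_square, {"x": 2, "offset": 1})
```

Leaving the with block waits for both loads, so received then holds both results, in whichever order the worker threads finished.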

run()[source]¶: Run the execution loop.

class parlai.core.teachers.Teacher(opt: Opt, shared=None)[source]¶

Bases: Agent

Basic Teacher agent that keeps track of how many times it’s received messages.

Teachers provide the report() method to get back metrics.

__init__(opt: Opt, shared=None)[source]¶

act()[source]¶: Act upon the previous observation.

epoch_done()[source]¶: Return whether the epoch is done.

num_examples()[source]¶

Return the number of examples (e.g. individual utterances) in the dataset.

Default implementation returns None, indicating an unknown number.

num_episodes()[source]¶

Return the number of episodes (e.g. conversations) in the dataset.

Default implementation returns None, indicating an unknown number.

report()[source]¶: Return metrics showing total examples and accuracy if available.

reset()[source]¶: Reset the teacher.

reset_metrics()[source]¶: Reset metrics.

share()[source]¶: In addition to default Agent shared parameters, share metrics.

class parlai.core.teachers.FixedDialogTeacher(opt, shared=None)[source]¶

Bases: Teacher

A teacher agent for all teachers involved in tasks with fixed data.

This class provides the following functionality for its subclasses:

Resets a teacher
Provides an observe method
Computes and retrieves the next episode index for a teacher
Provides a threadpool option for loading data (especially useful for large data, e.g. images)

In order to take advantage of the first few features, all a subclass has to implement is three functions: num_episodes, num_examples, and get (which returns a specific example from a specific episode).
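The three functions can be pictured with a plain-Python stand-in. This is a minimal sketch that does not import ParlAI: TinyFixedData is a hypothetical class, and storing episodes as a list of lists of dictionaries is an assumption made for the example.

```python
class TinyFixedData:
    """Stand-in (not a ParlAI class) showing the three methods a subclass supplies."""

    def __init__(self, episodes):
        # episodes: a list of episodes, each a list of example dictionaries
        self.episodes = episodes

    def num_episodes(self):
        return len(self.episodes)

    def num_examples(self):
        return sum(len(episode) for episode in self.episodes)

    def get(self, episode_idx, entry_idx=0):
        episode = self.episodes[episode_idx]
        example = dict(episode[entry_idx])
        # Flag the last entry so a caller knows when to move to the next episode.
        example["episode_done"] = entry_idx == len(episode) - 1
        return example


data = TinyFixedData([
    [
        {"text": "Where is the milk?", "labels": ["kitchen"]},
        {"text": "Where is the milk now?", "labels": ["hallway"]},
    ],
    [{"text": "Hi!", "labels": ["Hello."]}],
])
```

Here data.get(0, 1) returns the second example of the first episode, with episode_done set to True because it is the last entry of that episode.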

To utilize the DataLoader for threadpool loading, a teacher should implement the submit_load_request function to send a load request to the DataLoader by calling self.data_loader.request_load with the appropriate arguments (receive_fn, load_fn, args). The DataLoader then returns the data to the teacher’s data_queue, which the teacher can poll in its act method.

The following is an example of the DataLoader usage in the VQA-V1 teacher.

In the teacher’s init function, the teacher calls its submit_load_request function to preload an image.
The submit_load_request function gets the next episode_idx, and computes the image path for the load request.
At the end of submit_load_request, the teacher calls self.data_loader.request_load with three args:
- self.receive_data - the function that the DataLoader calls to return the loaded object
- self.image_loader.load - the function used to load the image from the image path
- [img_path] - a list of arguments for the load function, which in this case is the path of the image.
In the teacher’s act function, the teacher loads the data from its data queue.
At the end of the act function, the teacher calls submit_load_request to preload an image for the next example.

To see this in action, take a look at this teacher in tasks.vqa_v1.agents.
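The preload-and-poll pattern described above can be sketched without ParlAI. This is an illustration only: PreloadingReader and fake_image_load are hypothetical names, and a ThreadPoolExecutor from the standard library stands in for the DataLoader.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class PreloadingReader:
    """Toy stand-in for a teacher that preloads its next item in the background."""

    def __init__(self, paths, load_fn):
        self.paths = paths
        self.load_fn = load_fn
        self.next_idx = 0
        self.data_queue = queue.Queue()
        self.pool = ThreadPoolExecutor(max_workers=1)
        self.submit_load_request()  # preload the first item during init

    def submit_load_request(self):
        if self.next_idx >= len(self.paths):
            return  # nothing left to preload
        path = self.paths[self.next_idx]
        self.next_idx += 1
        future = self.pool.submit(self.load_fn, path)
        future.add_done_callback(self.receive_data)

    def receive_data(self, future):
        # Called when the load finishes; place the result on the queue.
        self.data_queue.put(future.result())

    def act(self):
        item = self.data_queue.get()  # blocks until the preloaded item arrives
        self.submit_load_request()    # start loading the item for the next call
        return item

    def shutdown(self):
        self.pool.shutdown()


def fake_image_load(path):
    # Hypothetical loader standing in for an image loader.
    return f"<pixels of {path}>"


reader = PreloadingReader(["a.jpg", "b.jpg"], fake_image_load)
first = reader.act()
second = reader.act()
reader.shutdown()
```

Because each act call requests the next item before returning, the load for the following example overlaps with whatever the caller does with the current one.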

__init__(opt, shared=None)[source]¶

reset()[source]¶: Reset the dialog to the start of the epoch, and reset all metrics.

submit_load_request()[source]¶

Submit a load request.

An agent should implement this method to submit requests to the data loader. At the end of this method, the agent should call self.data_loader.request_load() with the appropriate args.

By default, this method does nothing.

receive_data(future: Future)[source]¶

Receive data from the data loader.

Parameters: future – result from the load request.

share()[source]¶: Share the data and dataloader.

next_episode_idx(num_eps=None, loop=None)[source]¶

Return the next episode index.

Parameters

num_eps – default None uses num_episodes value.
loop – default None loops during training but not evaluation.

next_example()[source]¶

Return the next example.

If there are multiple examples in the same episode, returns the next one in that episode. If that episode is over, gets a new episode index and returns the first example of that episode.
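The traversal described above can be sketched as a generator. This is a simplified illustration that visits episodes strictly in order; iterate_examples is a hypothetical helper, and it ignores the random ordering and looping that next_episode_idx may apply.

```python
def iterate_examples(episodes):
    """Yield (example, episode_done) pairs, one episode after another."""
    episode_idx, entry_idx = 0, 0
    while episode_idx < len(episodes):
        episode = episodes[episode_idx]
        example = episode[entry_idx]
        episode_done = entry_idx == len(episode) - 1
        yield example, episode_done
        if episode_done:
            episode_idx += 1  # episode over: move to the first example of the next one
            entry_idx = 0
        else:
            entry_idx += 1    # stay in the same episode


episodes = [["q1", "q2"], ["q3"]]
walk = list(iterate_examples(episodes))
```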

num_episodes() → int[source]¶: Get the number of episodes in this dataset.

num_examples() → int[source]¶: Get the total number of examples in this dataset.

get(episode_idx, entry_idx=0)[source]¶

Get the specified episode and the specified entry in that episode.

Children must override this method in order to inherit the next_example method.

Parameters

episode_idx – which episode to return examples from
entry_idx – which example to return from the episode. Many datasets have only single-entry episodes, so this defaults to zero.

observe(observation)[source]¶: Process observation for metrics.

custom_evaluation(teacher_action: Message, labels: Optional[Tuple[str]], model_response: Message) → None[source]¶

A method designated for hooking custom evaluations into teachers.

Generally, a user will want to use self.metrics.add to record any specialized metrics that only make sense for this one dataset.

Parameters

teacher_action – The message last sent from this teacher.
labels – The previous correct labels, if there were any.
model_response – The raw response from the model. Generally you want to rely on the text field, but others may be necessary in specific situations.

act()[source]¶: Send new dialog message.

get_orig_action() → Message[source]¶

Get the unprocessed action and reset if needed.

This function will return the raw action from self.next_example(), before the self.last_act and self.lastY attributes have been defined based on this action for metrics or custom evaluations. This is so that wrapper teachers can modify the raw action first, such as to change the contents of its ‘text’ and ‘label’ fields, without the action becoming out of sync with self.last_act and self.lastY.

process_action(action: Message) → Message[source]¶: Remember the raw action and prepare its fields for passing out of the teacher.

class parlai.core.teachers.DialogTeacher(opt, shared=None)[source]¶

Bases: FixedDialogTeacher

A base teacher class for doing dialog with fixed chat logs.

This class provides a set of basic functionality:

uses data class to store and query text data
generates action tables to send to the student agent from the data

In order to subclass this class, you must implement setup_data() in your class, which reads your data file as an iterator.

__init__(opt, shared=None)[source]¶

abstract setup_data(datafile: str)[source]¶

The core method which the user should override.

Yields the data, one message at a time, as well as markers indicating new episodes.

Parameters: datafile (str) – If the initializer set a ‘datafile’ field within the initialization, this will be provided here. Otherwise, datafile will be the fold: either “train”, “valid”, or “test”.
Returns: Yields pairs (message, new_episode) containing a Message object and whether the message marks the beginning of a totally new episode.
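A minimal sketch of the shape of such a generator follows. It is written as a free function with hard-coded data so that it runs without ParlAI; plain dictionaries stand in for Message objects, and the episode contents are invented for illustration.

```python
def setup_data(datafile):
    """Yield (message, new_episode) pairs; datafile is unused in this toy version."""
    episodes = [
        [("Where is the milk?", "kitchen"), ("Where is it now?", "hallway")],
        [("Hi, how's it going?", "It's going great.")],
    ]
    for episode in episodes:
        for turn_idx, (text, label) in enumerate(episode):
            message = {"text": text, "labels": [label]}
            # Only the first turn of each episode is marked as a new episode.
            yield message, turn_idx == 0


pairs = list(setup_data("train"))
```

In a real subclass the same loop would typically read from the file or fold named by datafile instead of a hard-coded list.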

reset()[source]¶: Reset the dialog to the start of the epoch, reset all metrics.

share()[source]¶: Share the data.

label_candidates()[source]¶

Provide consistent label candidates for all examples.

The default implementation always returns None, but subclasses may override it to provide the same candidate set for every example. See FbDeprecatedDialogTeacher.

num_episodes() → int[source]¶: Return the number of episodes in the data.

num_examples() → int[source]¶: Return the number of examples in the data.

get(episode_idx, entry_idx=0)[source]¶: Get a specific example.

next_example()[source]¶: Get the next example.

class parlai.core.teachers.DialogData(opt, data_loader=None, cands=None, shared=None, **kwargs)[source]¶

Bases: object

Provides a data structure for accessing textual dialog data.

This can be used whenever the dialog data is a fixed log of chats (i.e., not a simulator setting). The logs can include dialog text and possibly supervised labels, candidate labels and rewards.

All these are stored in this internal data format which is used by the DialogTeacher class.

Parameters

opt – options to initialize the class
data_loader – an iterable with each call returning a tuple in the form ((x, y, r, c, i), new_episode?) where the x and new_episode fields are mandatory and other fields may be omitted or None.
cands – can be set to provide a list of candidate labels for every example in this dataset, which the agent can choose from (the correct answer should be in this set).
random – tells the data class whether or not to visit episodes sequentially or randomly when returning examples to the caller.

The contents of the ((x, y, r, c, i), new_episode?) tuples returned by the data loader are as follows:

x (str) is a query and possibly context
y (iter) is an iterable of label(s) for that query
r (str) is the str reward for getting that query correct
c (iter) is an iterable of label candidates that the student can choose from
i (str) is a str path to an image on disk, which will be loaded by the data class at request-time. It should always point to the raw image file.
new_episode? (bool) is a boolean value specifying whether that example is the start of a new episode. If you don’t use episodes, set this to True every time.
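To make the tuple layout concrete, the sketch below groups a stream of such tuples into episodes. It is an illustration rather than the DialogData implementation: group_into_episodes is a hypothetical helper, and padding omitted trailing fields with None is an assumption based on the note that fields other than x may be omitted.

```python
def group_into_episodes(data_loader):
    """Collect ((x, y, r, c, i), new_episode) tuples into a list of episodes."""
    episodes = []
    for entry, new_episode in data_loader:
        # Pad omitted trailing fields with None so every entry is (x, y, r, c, i).
        entry = tuple(entry) + (None,) * (5 - len(entry))
        if new_episode or not episodes:
            episodes.append([])
        episodes[-1].append(entry)
    return episodes


stream = [
    (("Where is the milk?", ["kitchen"], "1", ["hallway", "kitchen"]), True),
    (("Where is it now?", ["hallway"]), False),
    (("Hi!", ["Hello."]), True),
]
episodes = group_into_episodes(stream)
```

With this layout, the number of episodes is len(episodes) and the number of examples is the total number of entries across all episodes.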

__init__(opt, data_loader=None, cands=None, shared=None, **kwargs)[source]¶

share()[source]¶: Share the data.

num_episodes()[source]¶: Return number of episodes in the dataset.

num_examples()[source]¶

Return total number of entries available.

Each episode has at least one entry, but might have many more.

get(episode_idx, entry_idx=0)[source]¶

Get the specified episode and the specified entry in that episode.

Parameters

episode_idx – which episode to return examples from
entry_idx – which example to return from the episode. Many datasets have only single-entry episodes, so this defaults to zero.

build_table(entry)[source]¶

Packs an entry into an action-observation dictionary.

Parameters: entry – a tuple in the form described in the class docstring.

class parlai.core.teachers.StreamDialogData(opt, data_loader=None, cands=None, shared=None, **kwargs)[source]¶

Bases: DialogData

Provides a data structure for streaming textual dialog data.

This can be used whenever the dialog data follows the format described in DialogData but cannot fit entirely into memory.

The additional keyword argument cycle defines whether the stream should restart from the beginning after an epoch is finished (defaults to True).

Parameters

opt – options to initialize the class
data_loader – an iterable with each call returning a tuple in the form ((x, y, r, c, i), new_episode?) where the x and new_episode fields are mandatory and other fields may be omitted or None.
cands – can be set to provide a list of candidate labels for every example in this dataset, which the agent can choose from (the correct answer should be in this set).
random – tells the data class whether or not to visit episodes sequentially or randomly when returning examples to the caller.
cycle – (default True) whether to restart at beginning when end of stream reached without reset being called.

__init__(opt, data_loader=None, cands=None, shared=None, **kwargs)[source]¶

share()[source]¶: Share the stream.

load_length()[source]¶

Calculate the length of the dataset and cache it in a file.

Note that this can take some time for large datasets. Episode and entry indexes cannot be specified during streaming.

num_examples()[source]¶: Return the number of examples in the data.

num_episodes()[source]¶: Return the number of episodes in the data.

get()[source]¶

Get the next entry from the stream.

When an episode is done, returns the first entry of the next episode.

reset()[source]¶: Reset the datastream to its beginning.

class parlai.core.teachers.FbDeprecatedDialogTeacher(opt, shared=None)[source]¶

Bases: DialogTeacher

This class provides access to data in the Facebook Dialog format.

Subclasses DialogTeacher for functionality and provides an implementation of setup_data() which iterates over datasets in the “fbdialog” format. If your data is in the format below, use this class to handle file parsing for you.

The way FB Dialog data is set up is as follows:

Sam went to the kitchen.
Pat gave Sam the milk.
Where is the milk?<TAB>kitchen<TAB>1<TAB>hallway|kitchen|bathroom
Sam went to the hallway.
Pat went to the bathroom.
Where is the milk?<TAB>hallway<TAB>1<TAB>hallway|kitchen|bathroom

Lines 1-6 represent a single episode, with two different examples: the first example is lines 1-3, and the second is lines 4-6.

Lines 1, 2, 4, and 5 represent contextual information.

Lines 3 and 6 contain a query, a label, a reward for getting the question correct, and three label candidates.

Since both of these examples are part of the same episode, the information provided in the first example is relevant to the query in the second example and therefore the agent must remember the first example in order to do well.

In general, dialog in this format can contain any speech, not just QA pairs:

Hi how's it going?<TAB>It's going great. What's new?
Well I'm working on a new project at work.<TAB>Oh me too!
Oh cool!<TAB>Tell me about yours.

etc.

Note that dialogs are interpreted as being one-way. For example, consider this dialog:

X1    Y1
X2    Y2
X3    Y3

A set of examples X1 => Y1, X2 => Y2, and X3 => Y3 will be generated. However, Y1 => X2 and Y2 => X3 are not created as separate examples by default. This makes sense for some data (we don’t need to train on the idea that “kitchen” should be followed by “Sam went to the hallway…” above), but for other datasets it may be helpful to add additional examples in the reverse direction (“Oh cool!” is a response to “Oh me too!” above).

__init__(opt, shared=None)[source]¶

share()[source]¶: Share the data and candidates.

label_candidates()[source]¶: Return the candidates.

load_cands(path)[source]¶

Load a global fixed set of candidates.

The candidates will be provided by the teacher for every example (the true labels for a specific example are also added to this set, so that it’s possible to get the right answer).

setup_data(path)[source]¶

Read data in the fbdialog format.

Returns ((x,y,r,c), new_episode?) tuples.

x represents a query, y represents the labels, r represents any reward, and c represents any label_candidates.

The example above will be translated into the following tuples:

x: 'Sam went to the kitchen\nPat gave Sam the milk\nWhere is the milk?'
y: ['kitchen']
r: '1'
c: ['hallway', 'kitchen', 'bathroom']
new_episode = True (this is the first example in the episode)

x: 'Sam went to the hallway\nPat went to the bathroom\nWhere is the milk?'
y: ['hallway']
r: '1'
c: ['hallway', 'kitchen', 'bathroom']
new_episode = False (this is the second example in the episode)
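The translation above can be approximated with a short parser. This is a sketch under stated assumptions, not the class's actual setup_data: parse_fbdialog_episode is a hypothetical helper, it receives the lines of a single episode exactly as shown in the class description, and it assumes that multiple labels, like candidates, are separated by "|".

```python
def parse_fbdialog_episode(lines):
    """Turn the lines of one episode into ((x, y, r, c), new_episode) tuples."""
    context = []
    new_episode = True
    for line in lines:
        fields = line.split("\t")
        context.append(fields[0])
        if len(fields) == 1:
            continue  # context-only line: keep accumulating until a query appears
        x = "\n".join(context)
        y = fields[1].split("|")
        r = fields[2] if len(fields) > 2 and fields[2] else None
        c = fields[3].split("|") if len(fields) > 3 else None
        yield (x, y, r, c), new_episode
        context = []          # the next example starts with fresh context lines
        new_episode = False   # later examples belong to the same episode


lines = [
    "Sam went to the kitchen.",
    "Pat gave Sam the milk.",
    "Where is the milk?\tkitchen\t1\thallway|kitchen|bathroom",
    "Sam went to the hallway.",
    "Pat went to the bathroom.",
    "Where is the milk?\thallway\t1\thallway|kitchen|bathroom",
]
examples = list(parse_fbdialog_episode(lines))
```

Unlike the tuples listed above, this sketch leaves the sentence-final periods of the context lines in place.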

class parlai.core.teachers.ParlAIDialogTeacher(opt, shared=None)[source]¶

Bases: FixedDialogTeacher

This class provides access to data in the ParlAI Text Dialog format.

Subclasses FixedDialogTeacher for functionality and provides an implementation of setup_data() which iterates over datasets in the “ParlAI text” format. If your data is in the format below, use this class to handle file parsing for you.

The way the data is set up is as follows:

text:Sam went to the kitchen. <NEWL>
Pat gave Sam the milk. <NEWL>
Where is the milk? <TAB> labels:kitchen <TAB> reward:1
<TAB> label_candidates:hallway|kitchen|bathroom
text:Sam went to the hallway. <NEWL>
Pat went to the bathroom. <NEWL>
Where is the milk? <TAB> labels:hallway <TAB> reward:1
<TAB> label_candidates:hallway|kitchen|bathroom <TAB> episode_done:True

Lines 1-2 represent a single episode, with a different example on each line (each example is a single line in the file; the listing above wraps them for readability). Each line contains a query, a label, a reward for getting the question correct, and three label candidates.

Since both of these examples are part of the same episode, the information provided in the first example is relevant to the query in the second example and therefore the agent must remember the first example in order to do well.

In general, dialog in this format can contain any speech, not just QA pairs:

text:Hi how's it going?<TAB>labels:It's going great. What's new?
text:Well I'm working on a new project at work.<TAB>labels:Oh me too!
text:Oh cool!<TAB>labels:Tell me about yours.

etc.

Note that dialogs are interpreted as being one-way. For example, consider this dialog:

X1    Y1
X2    Y2
X3    Y3

A set of examples X1 => Y1, X2 => Y2, and X3 => Y3 will be generated. However, Y1 => X2 and Y2 => X3 are not created as separate examples by default. This makes sense for some data (we don’t need to train on the idea that “kitchen” should be followed by “Sam went to the hallway…” above), but for other datasets it may be helpful to add additional examples in the reverse direction (“Oh cool!” is a response to “Oh me too!” above).
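As a rough sketch of how one line in this format breaks down into fields, the function below splits on tabs and then on the first colon of each field. It is not the parser ParlAI uses: parse_parlai_line is a hypothetical helper, and it assumes that only labels and label_candidates hold "|"-separated lists and that episode_done is written as True.

```python
def parse_parlai_line(line):
    """Parse one line of tab-separated "key:value" fields into a dictionary."""
    message = {}
    for field in line.rstrip("\n").split("\t"):
        # Split on the first colon only, so colons inside the value are preserved.
        key, _, value = field.partition(":")
        key, value = key.strip(), value.strip()
        if not key:
            continue
        if key in ("labels", "label_candidates"):
            message[key] = value.split("|")
        elif key == "episode_done":
            message[key] = value == "True"
        else:
            message[key] = value
    return message


first = parse_parlai_line(
    "text:Where is the milk?\tlabels:kitchen\treward:1"
    "\tlabel_candidates:hallway|kitchen|bathroom"
)
last = parse_parlai_line(
    "text:Oh cool!\tlabels:Tell me about yours.\tepisode_done:True"
)
```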

__init__(opt, shared=None)[source]¶

share()[source]¶: Share the episodes.

num_examples()[source]¶: Return the number of examples from the data.

num_episodes()[source]¶: Return the number of episodes from the data.

get(episode_idx, entry_idx=None)[source]¶: Get a specific example from the dataset.

class parlai.core.teachers.YamlTeacher(opt, shared=None)[source]¶

Bases: DialogTeacher

Teacher which loads data generated by parlai.utils.testing.AutoTeacherTest.

__init__(opt, shared=None)[source]¶

setup_data(datafile)[source]¶

The core method which the user should override.

Yields the data, one message at a time, as well as markers indicating new episodes.

Parameters: datafile (str) – If the initializer set a ‘datafile’ field within the initialization, this will be provided here. Otherwise, datafile will be the fold: either “train”, “valid”, or “test”.
Returns: Yields pairs (message, new_episode) containing a Message object and whether the message marks the beginning of a totally new episode.

class parlai.core.teachers.ConversationTeacher(opt, shared=None)[source]¶

Bases: DialogTeacher

This class provides access to data in the Conversations format.

Subclasses DialogTeacher for functionality and provides an implementation of setup_data() which iterates over datasets in the “Conversations” format. If your data is in the format below, use this class to handle file parsing for you.

The data should be set up so that each dialogue instance (or episode) occupies one line of valid JSON. The way the data is set up is as follows:

{ "dialog": [ [ {"id": "partner1", "text": "hello!"}, {"id": "partner2", "text": "hi back!"} ] ] }

NOTE: If the data is not on one line per dialogue, it will not load. Further, note that by default, dialogs are interpreted as being one-way. For example, consider this dialog:

{
    "dialog":[ [
        {"id":"modelx", "text": X1},
        {"id":"modely", "text": Y1},
        {"id":"modelx", "text": X2},
        {"id":"modely", "text": Y2},
        {"id":"modelx", "text": X3},
        {"id":"modely", "text": Y3}
    ] ]
}

(Note: we use line breaks for readability above, but this data will not load as shown; it must be on one line.)

A set of examples X1 => Y1, X2 => Y2, and X3 => Y3 will be generated, forming one episode. However, Y1 => X2 and Y2 => X3 are not created as separate examples by default. To change this behavior, you can set opt['label_turns'] or pass the --label-turns flag. The default value is ‘secondspeaker’ (i.e., the second speaker’s utterances are used as labels), but ‘firstspeaker’ and ‘both’ are also options. In the case of ‘both’, two episodes are generated for each conversation.
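The default pairing can be sketched as follows. This covers only the behavior described for ‘secondspeaker’ and is not the class's own code: conversation_to_examples is a hypothetical helper, and it assumes strictly alternating speakers and reads only the first inner list of "dialog".

```python
import json


def conversation_to_examples(json_line):
    """Pair each first-speaker turn with the reply that follows it."""
    record = json.loads(json_line)
    turns = record["dialog"][0]
    examples = []
    for i in range(0, len(turns) - 1, 2):
        examples.append({
            "text": turns[i]["text"],
            "labels": [turns[i + 1]["text"]],
            "new_episode": i == 0,  # only the first pair opens the episode
        })
    return examples


# json.dumps emits a single line, matching the one-line-per-dialogue requirement.
line = json.dumps({"dialog": [[
    {"id": "modelx", "text": "X1"},
    {"id": "modely", "text": "Y1"},
    {"id": "modelx", "text": "X2"},
    {"id": "modely", "text": "Y2"},
]]})
examples = conversation_to_examples(line)
```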

__init__(opt, shared=None)[source]¶

setup_data(path)[source]¶

The core method which the user should override.

Yields the data, one message at a time, as well as markers indicating new episodes.

Parameters: datafile (str) – If the initializer set a ‘datafile’ field within the initialization, this will be provided here. Otherwise, datafile will be the fold: either “train”, “valid”, or “test”.
Returns: Yields pairs (message, new_episode) containing a Message object and whether the message marks the beginning of a totally new episode.

class parlai.core.teachers.AbstractImageTeacher(opt, shared=None)[source]¶

Bases: FixedDialogTeacher

Abstract class to allow easier creation of image + dialogue tasks.

This class handles creating image features via ImageLoader if applicable (resnet, resnext variants) or loading existing image features from a dict path as per get_image_features_path().

Important methods and properties (override in subclass if needed):

get_data_path(): where the data file is found (default: <datapath>/<task>)
get_image_path(): where images are found (default: <datapath>/<task>/images)
get_image_features_path(): dict of image features (default: <datapath>/<task>/image_features)
@property image_id_key: which key in data file objects represents image_id
@property text_key: which key in data file objects represents text

Note: Assumes data files are named <dt>.json

@abstractmethod image_id_to_image_path() must be implemented in subclass

Example with the key defaults (but the keys can be customized):

obs = {
    'text': <caption>,
    'image': <image features if specified else image>
}

__init__(opt, shared=None)[source]¶

get_available_image_mode_names()[source]¶

Available image model names.

resnet and resnext variants available from the ImageLoader. resnext101_XXXXX_wsl is the open-sourced FB AI model (960m images, 1.5k hashtags, finetuned on ImageNet).

property image_id_key¶

Which key in the input data dict objects uniquely identifies each image.

Common image keys are “image_id” or “image_num”. May be implemented by subclass.

property text_key¶

Which key in the input data dict objects identifies the text.

Common keys are “text” or “comment”. May be implemented by subclass.

abstract image_id_to_image_path(image_id)[source]¶

Get the path of the image on disk.

Must be implemented by subclass.

get_data_path(opt)[source]¶: Determines path to the data file.

get_image_path(opt)[source]¶

Return the path to the data directory and to the image directory.

Is based on opt fields: task, datatype (train, valid, test), datapath.

Subclass can override this.

get_image_features_path(task, image_model_name, dt)[source]¶

Image features for the dataset images are stored here.

Can be overridden in subclass to use custom paths. Image features can be manually copied into this directory or in the case of ImageLoader eligible models, they will be built and stored here if not already there.

is_image_mode_buildable(model_name)[source]¶

Is buildable if features can be calculated by ImageLoader.

Users may wish to compute features for the dataset offline and use them in the model, in which case this method should return False for that image model and get_image_features() should be overridden in the subclass.

load_data(data_path, opt)[source]¶

Load the data file, which is the index to the images and text.

It is often a .json file named <datatype>.json (e.g., train.json). Stores in self.data.

Can be overridden by subclass.

setup_image_features(data_path)[source]¶

Load text and image data.

The image features all live in dicts by default in <data_path>/image_features/ but get_image_features_path() above can be overridden by subclass to put them elsewhere.

In the (very odd) case that the resnet or resnext dicts (models buildable using ImageLoader) are not found, we build them.

reset()[source]¶: Reset the dialog to the start of the epoch, and reset all metrics.

num_episodes()[source]¶: Get the number of episodes in this dataset.

num_examples()[source]¶: Get the total number of examples in this dataset.

get_image_features(example)[source]¶

Get image features for example.

Can be overridden in subclass for different behavior. For large datasets, it may be more appropriate to use the ImageLoader.load() method to load image features (as this is essentially streaming the features from disk, so that we do not have to load a large image feature dict in memory). #TODO Could be the default option if we are using -dt train:stream

get(episode_idx, entry_idx=0)[source]¶: Override this in subclass if your data should be handled in a different format.

share()[source]¶: Share the data and dataloader.

class parlai.core.teachers.MultiTaskTeacher(opt: Opt, shared=None)[source]¶

Bases: Teacher

MultiTaskTeacher which teaches multiple tasks.

Creates a teacher that is actually a set of teachers each based on a task string – each of these teachers will get called in turn, either randomly or in order. They are all in the same world (they are the same agent switching tasks).

The task string format is described under create_task_agent_from_taskname() below.

__init__(opt: Opt, shared=None)[source]¶

num_examples()[source]¶: Return the number of examples.

num_episodes()[source]¶: Return the number of episodes.

observe(observation)[source]¶: Make an observation.

act()[source]¶: Act on the previous observation.

epoch_done()[source]¶: Return whether all subtasks are completed.

report()[source]¶: Report aggregated metrics across all subtasks.

reset()[source]¶: Reset all subtasks.

reset_metrics()[source]¶: Reset metrics for each subtask.

share()[source]¶: Shares this teacher by sharing each subtask.

shutdown()[source]¶: Shutdown each agent.

class parlai.core.teachers.ChunkTeacher(opt, shared=None)[source]¶

Bases: FixedDialogTeacher, ABC

Useful for loading large amounts of data.

Data is separated into chunks and loaded one chunk at a time. Loads the data off of the main thread.

__init__(opt, shared=None)[source]¶

abstract get_num_samples(opt: Opt) → Tuple[int, int][source]¶

[Abstract] Return the number of samples.

Returns a tuple of (num_examples, num_episodes) based on the data split.

abstract get_fold_chunks(opt: Opt) → List[int][source]¶

[Abstract] Return a list of chunk IDs (integer).

Given the datatype (train/test/valid), return the list of chunk IDs that correspond to that split.

get_buffersize()[source]¶

Size of buffer.

Override this in your child class to change the buffer size.

share()[source]¶: Share the data and dataloader.

num_episodes()[source]¶: Get the number of episodes in this dataset.

num_examples()[source]¶: Get the total number of examples in this dataset.

next_episode_idx()[source]¶

Return the next episode index.

Parameters

num_eps – default None uses num_episodes value.
loop – default None loops during training but not evaluation.

receive_data(future)[source]¶

Receive loaded data and place it onto the sample queue.

Parameters: future – A Future object which will return a value from a call to get_chunk()

abstract load_from_chunk(chunk_idx: int) → List[ChunkOutput][source]¶

[Abstract] Given the chunk index, load examples from that chunk.

Return a list of tuples. The create_message function will take these tuples to form the Message object that is returned by the teacher.

abstract create_message(queue_output: ChunkOutput, entry_idx=0) → Message[source]¶

[Abstract] Given the tuple output of the queue, return an act.

May depend on entry index if queue output is a multi-turn episode.
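How the four abstract methods fit together can be illustrated with a plain-Python stand-in. This is a sketch only: ToyChunkSource is a hypothetical class that does not subclass ChunkTeacher, its methods take no opt argument, and the chunk contents and fold assignment are invented for the example.

```python
class ToyChunkSource:
    """Plain-Python stand-in for the abstract methods of a chunked teacher."""

    # Hypothetical data: chunk ID -> list of (text, label) tuples.
    CHUNKS = {
        0: [("hello", "hi"), ("how are you?", "fine")],
        1: [("bye", "goodbye")],
        2: [("test question", "test answer")],
    }
    # Hypothetical assignment of chunk IDs to data splits.
    FOLDS = {"train": [0, 1], "valid": [2], "test": [2]}

    def __init__(self, datatype):
        self.datatype = datatype

    def get_fold_chunks(self):
        return self.FOLDS[self.datatype]

    def get_num_samples(self):
        n = sum(len(self.CHUNKS[c]) for c in self.get_fold_chunks())
        return n, n  # (num_examples, num_episodes); episodes are single-turn here

    def load_from_chunk(self, chunk_idx):
        return list(self.CHUNKS[chunk_idx])

    def create_message(self, queue_output, entry_idx=0):
        text, label = queue_output
        return {"text": text, "labels": [label], "episode_done": True}


source = ToyChunkSource("train")
messages = [
    source.create_message(item)
    for chunk_idx in source.get_fold_chunks()
    for item in source.load_from_chunk(chunk_idx)
]
```

In the real class the chunks are loaded off the main thread and buffered; this sketch loads them eagerly to keep the flow from chunk ID to message visible.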

get_chunk()[source]¶: Refill the buffer.

next_example()[source]¶

Return the next example.

If there are multiple examples in the same episode, returns the next one in that episode. If that episode is over, gets a new episode index and returns the first example of that episode.

get(episode_idx, entry_idx=0)[source]¶

Get the specified episode and the specified entry in that episode.

Children must override this method in order to inherit the next_example method.

Parameters

episode_idx – which episode to return examples from
entry_idx – which example to return from the episode. Many datasets have only single-entry episodes, so this defaults to zero.

reset()[source]¶: Reset the dialog to the start of the epoch, and reset all metrics.

shutdown()[source]¶: Perform any final cleanup if needed.

parlai.core.teachers.create_task_agent_from_taskname(opt: Opt)[source]¶

Create task agent(s) assuming the input task_dir:teacher_class.

For example, the task string def_string can be a shorthand path like babi:Task1k:1 or #babi, or a complete path like parlai.tasks.babi.agents:Task1kTeacher:1, which essentially performs from parlai.tasks.babi.agents import Task1kTeacher with the parameter 1 in opt['task'] to be used by the class Task1kTeacher.
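The relationship between the shorthand and complete forms can be sketched as a string transformation. This is an illustration inferred from the single example above, not the function's actual logic: expand_task_string is a hypothetical helper, it performs no import, and it does not handle the #babi form.

```python
def expand_task_string(task):
    """Expand a shorthand like babi:Task1k:1 into the complete-path form."""
    if task.startswith("#"):
        # The "#" shorthand is mentioned above but not covered by this sketch.
        raise ValueError("'#' shorthand is not handled in this sketch")
    parts = task.split(":")
    if "." in parts[0]:
        return task  # already a complete module path
    module = f"parlai.tasks.{parts[0]}.agents"
    rest = parts[1:]
    if rest and not rest[0].endswith("Teacher"):
        rest[0] += "Teacher"  # Task1k -> Task1kTeacher
    return ":".join([module] + rest)


expanded = expand_task_string("babi:Task1k:1")
unchanged = expand_task_string("parlai.tasks.babi.agents:Task1kTeacher:1")
```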