A base teacher class for doing dialog with fixed chat logs.
This class provides a set of basic functionality:
- uses data class to store and query text data
- generates action tables to send to the student agent from the data
- metrics tracking count of sent vs correctly answered queries
If you have opt.numthreads > 1, this also activates a shared memory array for the data and lock-protected shared-memory metrics.
To subclass this class, you must implement setup_data() in your class (or subclass another class which does, like FbDialogTeacher), which reads your data file as an iterator.
None by default, but override this in children (such as FbDialogTeacher) to load up candidate labels for every example.
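The setup_data() requirement described above can be sketched as a plain generator. This is a minimal illustration, not the library's implementation: it is written as a standalone function rather than a method, and the file layout (one "query<TAB>answer" turn per line, a blank line between episodes) is a hypothetical format chosen for the example.

```python
def setup_data(path):
    """Yield ((query, labels), new_episode) tuples from a text file.

    Assumed (hypothetical) file format: one turn per line written as
    "query<TAB>answer", with a blank line separating episodes.
    """
    new_episode = True  # the first turn always starts an episode
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                # A blank line marks the boundary before the next episode.
                new_episode = True
                continue
            query, answer = line.split("\t", 1)
            # Labels are yielded as an iterable, even for a single answer.
            yield (query, [answer]), new_episode
            new_episode = False
```

Because the function is a generator, the file is read lazily, one turn at a time, as the caller iterates.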
DialogData(opt, data_loader=None, cands=None, shared=None, **kwargs)
Provides a data structure for accessing textual dialog data. This can be used whenever the dialog data is a fixed log of chats (i.e., not a simulator setting). The logs can include dialog text and possibly supervised labels, candidate labels and rewards.
All these are stored in this internal data format which is used by the
data_loader is an iterable, with each call returning:
(x, ...), new_episode?
x is a query and possibly context
... can contain additional fields, specifically
y is an iterable of label(s) for that query
r is the str reward for getting that query correct
c is an iterable of label candidates that the student can choose from
i is a str path to an image on disk, which will be loaded by the data class at request-time. It should always point to the raw image file.
new_episode? is a boolean value specifying whether that example is the start of a new episode. If you don’t use episodes set this to
cands can be set to provide a list of candidate labels for every example in this dataset, which the agent can choose from (the correct answer should be in this set).
random tells the data class whether to visit episodes sequentially or randomly when returning examples to the caller.
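To make the data_loader format concrete, the sketch below feeds a few hand-written tuples through a small grouping helper. The helper name group_into_episodes and the sample dialog are hypothetical and only illustrate how the new_episode? flag could be used to split entries into episodes; the class itself may store its data differently.

```python
def group_into_episodes(data_loader):
    """Collect (entry, new_episode) pairs into a list of episodes."""
    episodes = []
    for entry, new_episode in data_loader:
        # Start a new episode when flagged (or if none exists yet).
        if new_episode or not episodes:
            episodes.append([])
        episodes[-1].append(entry)
    return episodes


# Each item is ((x, y, r, c), new_episode?); trailing fields are optional.
loader = [
    (("Hello", ["Hi"]), True),
    (("How are you?", ["Fine"], "1"), False),
    (("What is 2+2?", ["4"], "1", ["3", "4", "5"]), True),
]

episodes = group_into_episodes(loader)
print(len(episodes))                    # 2 episodes
print(sum(len(ep) for ep in episodes))  # 3 entries in total
```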
Returns total number of entries available. Each episode has at least one entry, but might have many more.
Return number of episodes in the dataset.
Returns a specific entry from the dataset.
Packs an entry into an action-observation dictionary.
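As a rough illustration of packing an entry into an action-observation dictionary, the sketch below maps the positional fields (x, y, r, c) onto named keys. The function name and the key names ("text", "labels", "reward", "label_candidates", "episode_done") are assumptions made for this example, and the image field is left out because loading it is handled by the data class.

```python
def build_table(entry, episode_done):
    """Pack a positional entry tuple into a dictionary (illustrative only)."""
    # Assumed key names, in the same order as the (x, y, r, c) fields.
    fields = ("text", "labels", "reward", "label_candidates")
    # zip() stops at the shorter sequence, so missing trailing fields
    # are simply absent from the result; explicit None values are skipped.
    table = {
        name: value for name, value in zip(fields, entry) if value is not None
    }
    table["episode_done"] = episode_done
    return table


action = build_table(("What is 2+2?", ["4"], "1", ["3", "4", "5"]), True)
```

Optional fields that an entry does not carry are omitted rather than set to None, so a consumer can test for a field with a plain `in` check.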
StreamDialogData(opt, data_loader=None, cands=None, shared=None, **kwargs)
Provides a data structure for streaming textual dialog data. This can be used whenever the dialog data follows the format described in DialogData but cannot fit entirely into memory.
Additional keyword-argument cycle defines if the stream should restart from the beginning after an epoch is finished (defaults to True).
Returns the next entry from the stream in the current episode for this instance. When the episode is done, returns the first entry of the next episode.
Resets the data stream to its beginning.
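The effect of the cycle argument can be sketched with a generator that re-creates the loader after each pass. This is an assumption-laden illustration, not the class's implementation: stream_entries and make_loader are hypothetical names, and make_loader stands for any zero-argument callable that returns a fresh iterator over the data.

```python
from itertools import islice


def stream_entries(make_loader, cycle=True):
    """Yield items one at a time, restarting after each epoch if cycle is set."""
    while True:
        yielded = False
        for item in make_loader():
            yielded = True
            yield item
        # Stop after one pass when not cycling, or if the data is empty
        # (which would otherwise loop forever).
        if not cycle or not yielded:
            return


def make_loader():
    # A fresh iterator per epoch; only one item is held at a time.
    return iter([(("Hello", ["Hi"]), True), (("Bye", ["Goodbye"]), False)])


first_five = list(islice(stream_entries(make_loader, cycle=True), 5))
one_epoch = list(stream_entries(make_loader, cycle=False))
```

With cycle=True the stream never ends on its own, so the caller decides when to stop, here by taking five items with islice.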