# reader

This section contains the documentation of `movinets_helper/reader.py`.

Functionalities to read a TFRecordDataset ready to train a network.
### `add_states(video, label, stream_states={})`

This function is expected to modify the dataset to make it ready for the MoViNet stream models, although training them was not achieved.

Parameters:

Name | Type | Description | Default
---|---|---|---
`video` | _type_ | description | required
`label` | _type_ | description | required
`stream_states` | `dict` | description. Defaults to `{}`. | `{}`

Returns:

Type | Description
---|---
`Tuple[Dict[str, tf.Tensor], tf.Tensor]` | description

Source code in `movinets_helper/reader.py`
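The docstring above leaves the behavior unspecified, but the return type `Tuple[Dict[str, tf.Tensor], tf.Tensor]` suggests the function merges the video and the stream states into a single feature dictionary. A minimal sketch of that shape, using plain Python containers instead of `tf.Tensor`; the `"image"` key and the state names are assumptions, not taken from `movinets_helper`:

```python
# Hypothetical sketch of the (video, label) -> (features, label) mapping that
# add_states appears to perform. Plain lists stand in for tf.Tensor values so
# the example is self-contained; the "image" key is an assumption.
def add_states_sketch(video, label, stream_states={}):
    # Merge the video with the recurrent stream states into one feature dict,
    # matching the documented return type Tuple[Dict[str, ...], ...].
    features = {"image": video, **stream_states}
    return features, label

features, label = add_states_sketch([[0.1, 0.2]], 3, {"state_0": [0.0]})
```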
### `encode_label(label, num_classes)`

One-hot encodes the labels according to the number of classes.

Parameters:

Name | Type | Description | Default
---|---|---|---
`label` | `str` | Label representing the movement of the video. | required
`num_classes` | `int` | Total number of classes in the dataset. | required

Returns:

Type | Description
---|---
`tf.Tensor` | Encoded representation of the label.

Source code in `movinets_helper/reader.py`
### `format_features(video, label, resolution=172, scaling_factor=255.0, num_classes=2)`

Transforms the data to have the appropriate shape.

This function must be called on a `tf.data.Dataset` (passed via its `.map` method).

Parameters:

Name | Type | Description | Default
---|---|---|---
`video` | `tf.Tensor` | Decoded video. | required
`label` | `str` | Corresponding class of the video. | required
`resolution` | `int` | The resolution is model dependent: MoViNet a0 and a1 use 172, a2 uses 224. Defaults to 172. | `172`
`scaling_factor` | `float` | Given that the videos have pixels in the range [0, 255], transforms the data to the range [0, 1]. Defaults to 255.0. | `255.0`
`num_classes` | `int` | Number of classes the model is trained on, e.g. 600 for Kinetics 600, 101 for UCF101. Defaults to 2. | `2`

Returns:

Type | Description
---|---
`Tuple[tf.Tensor, tf.Tensor]` | When iterated, the first element will be the video, and the second will be the label as required by the model.

Source code in `movinets_helper/reader.py`
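A minimal sketch (assumed behavior) of the two value transformations described above: rescaling pixels from [0, 255] to [0, 1] and one-hot encoding the label. The real function operates on `tf.Tensor` videos and also resizes the frames to `resolution`; that step is omitted here to keep the sketch dependency-free:

```python
# Sketch of format_features on a single flattened frame: divide each pixel by
# the scaling factor and one-hot encode the (already integer) label index.
# Resizing to `resolution` x `resolution` is left out of this sketch.
def format_features_sketch(frame, label_index, scaling_factor=255.0, num_classes=2):
    scaled = [pixel / scaling_factor for pixel in frame]
    one_hot = [1.0 if i == label_index else 0.0 for i in range(num_classes)]
    return scaled, one_hot

video_frame = [0.0, 127.5, 255.0]
scaled, one_hot = format_features_sketch(video_frame, 1)
# scaled -> [0.0, 0.5, 1.0], one_hot -> [0.0, 1.0]
```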
### `get_dataset(filenames)`

Generates a `tf.data.Dataset` from the TFRecord files.

This is the appropriate format to be passed to `model.fit`, after it is formatted and batched, so the final video object ingested by the model will have the shape `[n_videos, n_frames, resolution, resolution, channels]`.

Parameters:

Name | Type | Description | Default
---|---|---|---
`filenames` | `List[str]` | List of `.tfrecord` files. | required

Returns:

Type | Description
---|---
`tf.data.Dataset` | Dataset ready to train the model.

Example

`target_path` is the path to the directory of `.tfrecords` files:

```python
ds = get_dataset(list(Path(target_path).iterdir()))
```

This iterable may be formatted appropriately:

```python
ds = get_dataset(list(Path(target_path_train).iterdir()))
ds = ds.map(format_features)
```

To see a single example:

```python
next(iter(ds))
```
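To make the 5-D batch shape mentioned above concrete, here is a sketch using nested Python lists in place of tensors; the sizes (2 videos, 4 frames, 172x172, RGB) are arbitrary choices for illustration:

```python
# Illustrate the final shape [n_videos, n_frames, resolution, resolution,
# channels] that the model ingests after formatting and batching.
n_videos, n_frames, resolution, channels = 2, 4, 172, 3

batch = [
    [[[[0.0] * channels for _ in range(resolution)] for _ in range(resolution)]
     for _ in range(n_frames)]
    for _ in range(n_videos)
]

shape = (len(batch), len(batch[0]), len(batch[0][0]),
         len(batch[0][0][0]), len(batch[0][0][0][0]))
# shape -> (2, 4, 172, 172, 3)
```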