Takes the dataframe and the path to a directory and generates batches of augmented/normalized data.
flow_images_from_dataframe(
dataframe,
directory = NULL,
x_col = "filename",
y_col = "class",
generator = image_data_generator(),
target_size = c(256, 256),
color_mode = "rgb",
classes = NULL,
class_mode = "categorical",
batch_size = 32,
shuffle = TRUE,
seed = NULL,
save_to_dir = NULL,
save_prefix = "",
save_format = "png",
subset = NULL,
interpolation = "nearest",
drop_duplicates = TRUE
)
dataframe: data.frame containing the file paths, relative to directory (or absolute if directory is NULL), of the images in a character column. It should include other column/s depending on class_mode:
if class_mode is "categorical" (default value), it must include the y_col column with the class/es of each image. Values in the column can be character or list for a single class, or a list for multiple classes.
if class_mode is "binary" or "sparse", it must include the given y_col column with class values as strings.
if class_mode is "other", it should contain the columns specified in y_col.
if class_mode is "input" or NULL, no extra column is needed.
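As a rough illustration of the layouts described above, the following base R sketch builds data frames for three of the class modes. The file names, class labels, and the numeric columns age and weight are made up for the example; only the roles of x_col and y_col come from the description. How a list column is converted on the Python side is not covered here.

```r
# Hypothetical file names and labels, for illustration only.

# "categorical", "binary" or "sparse": one class per image, stored as character.
df_single <- data.frame(
  filename = c("cat_001.png", "dog_001.png", "cat_002.png"),
  class = c("cat", "dog", "cat"),
  stringsAsFactors = FALSE
)

# "categorical" with several classes per image: y_col is a list column.
df_multi <- data.frame(
  filename = c("img_001.png", "img_002.png"),
  stringsAsFactors = FALSE
)
df_multi$class <- list(c("cat", "indoor"), "dog")

# "other": y_col names one or more columns that hold the targets.
df_other <- data.frame(
  filename = c("img_001.png", "img_002.png"),
  age = c(3.5, 7.0),
  weight = c(4.2, 12.8),
  stringsAsFactors = FALSE
)

str(df_single)
```

With df_other, y_col would be given as the column names holding the targets, for example c("age", "weight").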
directory: character, path to the directory to read images from. If NULL, data in the x_col column should be absolute paths.
x_col: character, column in dataframe that contains the filenames (or absolute paths if directory is NULL).
y_col: string or list, column/s in dataframe that have the target data.
generator: Image data generator to use for augmenting/normalizing image data.
target_size: Either NULL (default to original size) or integer vector (img_height, img_width).
color_mode: one of "grayscale", "rgb". Default: "rgb". Whether the images will be converted to have 1 or 3 color channels.
classes: optional list of classes (e.g. c('dogs', 'cats')). Default: NULL. If not provided, the list of classes is inferred automatically from y_col, and the classes are mapped to label indices in alphanumeric order. The dictionary containing the mapping from class names to class indices can be obtained via the attribute class_indices.
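The alphanumeric mapping from class names to label indices can be pictured with a small base R sketch. This is only an approximation of the inference under the assumption of C-locale ordering and zero-based indices; the labels are invented, and the authoritative mapping is whatever the class_indices attribute of the returned generator reports.

```r
# Illustrative labels; not taken from any real dataset.
labels <- c("dogs", "cats", "birds", "cats")

# Assumed equivalent of the automatic inference: unique classes in
# alphanumeric (C-locale) order.
classes <- sort(unique(labels), method = "radix")

# Zero-based indices, following the Python-side convention.
class_indices <- setNames(seq_along(classes) - 1L, classes)
print(class_indices)
```

Here "birds" maps to 0, "cats" to 1 and "dogs" to 2.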
class_mode: one of "categorical", "binary", "sparse", "input", "other" or NULL. Default: "categorical". Mode for yielding the targets:
"binary": 1D array of binary labels,
"categorical": 2D array of one-hot encoded labels. Supports multi-label output.
"sparse": 1D array of integer labels,
"input": images identical to input images (mainly used to work with autoencoders),
"other": array of y_col data,
NULL: no targets are returned (the generator will only yield batches of image data, which is useful with predict_generator()).
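To make the target shapes concrete, the sketch below reproduces the "sparse", "categorical" and "binary" encodings by hand in base R for a handful of invented labels. It mimics the shapes listed above under the assumption of alphanumeric, zero-based class indices; it is not the generator's own code.

```r
# Invented labels for a four-image batch.
labels <- c("cat", "dog", "cat", "bird")
classes <- sort(unique(labels), method = "radix")  # "bird" "cat" "dog"

# "sparse": 1D vector of zero-based integer labels.
y_sparse <- match(labels, classes) - 1L

# "categorical": 2D one-hot matrix, one row per image, one column per class.
y_categorical <- diag(length(classes))[y_sparse + 1L, , drop = FALSE]
colnames(y_categorical) <- classes

# "binary": 1D vector of 0/1 labels; only meaningful with two classes.
two_class <- c("cat", "dog", "cat")
y_binary <- as.numeric(two_class == "dog")

print(y_sparse)
print(y_categorical)
print(y_binary)
```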
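batch_size: int, number of samples per batch (default: 32).
shuffle: boolean, whether to shuffle the data (default: TRUE).
seed: int, optional random seed for shuffling and transformations (default: NULL).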
save_to_dir: NULL or str (default: NULL). Optionally specifies a directory in which to save the augmented pictures being generated (useful for visualizing what you are doing).
save_prefix: str (default: ''). Prefix to use for filenames of saved pictures (only relevant if save_to_dir is set).
save_format: one of "png", "jpeg" (only relevant if save_to_dir is set). Default: "png".
subset: Subset of data ("training" or "validation") if validation_split is set in image_data_generator().
interpolation: Interpolation method used to resample the image if the target size is different from that of the loaded image. Supported methods are "nearest", "bilinear", and "bicubic". If PIL version 1.1.3 or newer is installed, "lanczos" is also supported. If PIL version 3.4.0 or newer is installed, "box" and "hamming" are also supported. By default, "nearest" is used.
drop_duplicates: Boolean, whether to drop duplicate rows based on filename.
Yields batches of (x, y), where x is an array of image data and y is an array of corresponding labels. The generator loops indefinitely.
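A typical way to combine the arguments above is to build a training and a validation generator from the same data frame. The sketch below wraps this in a helper; make_split_generators, df and image_dir are illustrative names, the rescale factor, 20% split and 128 x 128 target size are arbitrary choices, and the code assumes the keras R package and its Python backend are installed.

```r
# Sketch only: `make_split_generators`, `df` and `image_dir` are illustrative
# names. Nothing from keras is evaluated until the function is called.
make_split_generators <- function(df, image_dir, batch_size = 32) {
  datagen <- keras::image_data_generator(
    rescale = 1 / 255,
    validation_split = 0.2  # reserve 20% of the rows for validation
  )

  # Same data frame and generator; only `subset` differs.
  make_flow <- function(subset) {
    keras::flow_images_from_dataframe(
      dataframe = df,
      directory = image_dir,
      x_col = "filename",
      y_col = "class",
      generator = datagen,
      target_size = c(128, 128),
      class_mode = "categorical",
      batch_size = batch_size,
      subset = subset
    )
  }

  list(train = make_flow("training"), validation = make_flow("validation"))
}
```

Given a suitable data frame and image directory, gens <- make_split_generators(df, image_dir) followed by keras::generator_next(gens$train) should return one batch, with the image array first and the label array second. Because the generator never runs out, the number of steps per epoch has to be set explicitly when training.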
Other image preprocessing: fit_image_data_generator(), flow_images_from_data(), flow_images_from_directory(), image_load(), image_to_array()