Convert the data into a torch::dataset() from which the vaeac model creates batches.
vaeac_dataset(X, one_hot_max_sizes)
A torch_tensor containing the data, of shape N x p, where N and p are the number of observations and features, respectively.
A torch tensor of dimension n_features containing the one-hot sizes of the n_features features. That is, if the i-th feature is a categorical feature with 5 levels, then one_hot_max_sizes[i] = 5, while the size for a continuous feature can be either 0 or 1.
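As an illustration of the rule above, the following base R sketch derives the one-hot sizes from a data frame. The data frame `df` and its column names are hypothetical, and continuous features are assigned size 1 (0 would be equally valid according to the description above). The result is a plain integer vector; converting it to a torch tensor is left out here.

```r
# Hypothetical data: two continuous features and one categorical feature
# whose factor has 5 levels (only 3 of them are observed).
df <- data.frame(
  age = c(23, 35, 41),
  income = c(1.2, 3.4, 2.2),
  colour = factor(
    c("red", "green", "blue"),
    levels = c("red", "green", "blue", "grey", "pink")
  )
)

# Categorical features get their number of levels, continuous features get 1.
one_hot_max_sizes <- vapply(
  df,
  function(col) if (is.factor(col)) nlevels(col) else 1L,
  integer(1)
)

print(one_hot_max_sizes)
```

Note that the size is taken from the factor's declared levels, not from the levels that happen to occur in the data.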
This function creates a torch::dataset() object that represents a map from keys to data samples.
It is used by torch::dataloader() to load the data from which the batches are extracted
for all epochs in the training phase of the neural network. Note that a dataset object
is an R6 instance, see https://r6.r-lib.org/articles/Introduction.html, which follows classical
object-oriented programming with self-reference. That is, vaeac_dataset() is a subclass
of torch::dataset().
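To make the "map from keys to data samples" idea concrete, here is a minimal sketch of a torch::dataset() subclass. It is not the actual implementation of vaeac_dataset(); the name `toy_dataset` and its fields are hypothetical, and it assumes the torch package is installed. The key is the row index, and the sample is the corresponding row of the data tensor.

```r
library(torch)

# Hypothetical stand-in for illustration only, not vaeac_dataset() itself.
toy_dataset <- dataset(
  name = "toy_dataset",
  # Store the data and the one-hot sizes on the R6 instance via `self`.
  initialize = function(X, one_hot_max_sizes) {
    self$X <- X
    self$one_hot_max_sizes <- one_hot_max_sizes
  },
  # Map a key (row index) to a data sample (one observation).
  .getitem = function(index) {
    self$X[index, ]
  },
  # Number of keys, i.e. the number of observations N.
  .length = function() {
    self$X$size()[[1]]
  }
)

X <- torch_tensor(matrix(rnorm(14 * 5), ncol = 5), dtype = torch_float())
toy_ds <- toy_dataset(X, rep(1, 5))

toy_ds$.length()    # number of observations
toy_ds$.getitem(1)  # first observation, a tensor with 5 entries
```

A dataloader only needs `.getitem()` and `.length()` to assemble batches, which is why a dataset is described as a key-to-sample map.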
if (FALSE) { # \dontrun{
p <- 5
N <- 14
batch_size <- 10
one_hot_max_sizes <- rep(1, p)
vaeac_ds <- vaeac_dataset(
  torch::torch_tensor(matrix(rnorm(p * N), ncol = p),
    dtype = torch::torch_float()
  ),
  one_hot_max_sizes
)
vaeac_ds
vaeac_dl <- torch::dataloader(
  vaeac_ds,
  batch_size = batch_size,
  shuffle = TRUE,
  drop_last = FALSE
)
vaeac_dl$.length()
vaeac_dl$.iter()
vaeac_iterator <- vaeac_dl$.iter()
vaeac_iterator$.next() # batch1
vaeac_iterator$.next() # batch2
vaeac_iterator$.next() # Empty
} # }
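The example above yields two batches before the iterator is exhausted. The arithmetic behind that can be sketched in base R as follows; the variable names `n_batches_keep`, `n_batches_drop`, and `last_batch_size` are introduced here for illustration only.

```r
N <- 14
batch_size <- 10

# With drop_last = FALSE, the final incomplete batch is kept.
n_batches_keep <- ceiling(N / batch_size)

# With drop_last = TRUE, the final incomplete batch would be discarded.
n_batches_drop <- floor(N / batch_size)

# Size of the last batch when it is kept.
last_batch_size <- N - (n_batches_keep - 1) * batch_size

print(c(n_batches_keep, n_batches_drop, last_batch_size))
```

With N = 14 and batch_size = 10, the first batch holds 10 observations and the second holds the remaining 4, so a third request for a batch finds nothing left.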