This function takes a collection of documents and bootstraps words from them until there are N sentences in total (both pseudo and original).
Bootstrap_Vocab(vocab, N, stopwds, min_length = 7, max_length = 15)
vocab: The collection of documents to bootstrap.
N: The total number of sentences to end up with.
stopwds: A list of stopwords to exclude from the bootstrapping process.
min_length: The shortest allowable bootstrapped document.
max_length: The longest allowable bootstrapped document.
A vector of bootstrapped sentences.
The min_length and max_length arguments do not guarantee that a sentence will reach those lengths. The bootstrapped sentences will be nonsensical.
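The behaviour described above can be illustrated with a minimal sketch of word-level bootstrapping: pool the words of all documents, drop the stopwords, and sample words with replacement to build pseudo sentences until N sentences exist. The function name `bootstrap_sketch` and its internals are hypothetical and written for illustration only; this is not the package's source, and the real Bootstrap_Vocab may work differently. In particular, this sketch always produces pseudo sentences whose word count lies between min_length and max_length, which the real function does not guarantee.

```r
# Hypothetical illustration only; not the implementation of Bootstrap_Vocab.
bootstrap_sketch <- function(vocab, N, stopwds, min_length = 7, max_length = 15) {
  # Split every document on whitespace and pool all words together.
  words <- unlist(strsplit(vocab, "\\s+"))
  # Drop empty strings and stopwords so they are never sampled.
  words <- words[nzchar(words) & !(words %in% stopwds)]
  if (length(words) == 0) stop("no words left after removing stopwords")

  # Number of pseudo sentences needed on top of the originals.
  n_new <- max(0, N - length(vocab))

  pseudo <- vapply(seq_len(n_new), function(i) {
    # Draw a target length in [min_length, max_length]; sample.int avoids
    # the sample(x, 1) pitfall when the range contains a single value.
    len <- min_length + sample.int(max_length - min_length + 1, 1) - 1
    # Sample word positions with replacement and join them into a sentence.
    paste(words[sample.int(length(words), len, replace = TRUE)], collapse = " ")
  }, character(1))

  # Originals first, pseudo sentences after.
  c(vocab, pseudo)
}

set.seed(1)
testing_set <- paste("this is test", 1:10)
out <- bootstrap_sketch(testing_set, 20, c("this"))
length(out)  # 20: the 10 originals plus 10 pseudo sentences
```

Sampling with replacement is what makes this a bootstrap: the same word can appear several times in one pseudo sentence, so the output has the vocabulary of the input but none of its word order.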
# NOT RUN {
testing_set = c(paste('this is test', as.character(seq(1, 10, 1))))
Bootstrap_Vocab(testing_set, 20, c('this'))
# }