A wrapper for the CTM function, based on Blei's original code, that returns a nicely formatted topic model.
FitCtmModel(
  dtm,
  k,
  calc_coherence = TRUE,
  calc_r2 = FALSE,
  return_all = TRUE,
  ...
)
dtm: A document term matrix of class dgCMatrix.
k: Number of topics.
calc_coherence: Do you want to calculate probabilistic coherence of topics after the model is trained? Defaults to TRUE.
calc_r2: Do you want to calculate R-squared after the model is trained? Defaults to FALSE.
return_all: Logical. Do you want the raw results of the underlying function returned along with the formatted results? Defaults to TRUE.
...: Other arguments to pass to CTM or TmParallelApply. See note below.
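The dtm argument is expected to be a sparse dgCMatrix. As a minimal sketch of what such an input looks like, the block below builds a toy matrix with the Matrix package. The documents, terms, and counts are made up for illustration and are not part of this package.

```r
library(Matrix)

# Toy corpus (hypothetical data): three documents over a four-term vocabulary
dtm <- sparseMatrix(
  i = c(1, 1, 2, 3, 3),   # document (row) index of each non-zero count
  j = c(1, 2, 2, 3, 4),   # term (column) index of each non-zero count
  x = c(2, 1, 3, 1, 1),   # the counts themselves
  dims = c(3, 4),
  dimnames = list(
    paste0("doc_", 1:3),
    c("gene", "cell", "protein", "trial")
  )
)

# Numeric counts in column-compressed form give a dgCMatrix
class(dtm)
dim(dtm)
```

Rows index documents and columns index terms, so a matrix of this shape could in principle be supplied as dtm.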
Returns a list with a minimum of two objects, phi and theta. The rows of phi index topics and the columns index tokens. The rows of theta index documents and the columns index topics.
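To make the orientation of the two matrices concrete, here is a small sketch in base R that uses hand-written stand-ins for phi and theta rather than a fitted model. The values, the topic names t_1 and t_2, and the assumption that each row sums to one are illustrative only.

```r
# Hypothetical phi: 2 topics (rows) by 3 tokens (columns)
phi <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.3, 0.6),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("t_1", "t_2"), c("gene", "cell", "trial"))
)

# Hypothetical theta: 2 documents (rows) by 2 topics (columns)
theta <- matrix(
  c(0.90, 0.10,
    0.25, 0.75),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc_1", "doc_2"), c("t_1", "t_2"))
)

# In this sketch every row is treated as a probability distribution
rowSums(phi)
rowSums(theta)

# Most probable token for each topic, and dominant topic for each document
top_token <- colnames(phi)[apply(phi, 1, which.max)]
top_topic <- colnames(theta)[apply(theta, 1, which.max)]
```

The same indexing pattern would apply to the phi and theta elements of a returned model, assuming they carry row and column names.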
When passing additional arguments to CTM, you must unlist the elements in the control argument and pass them one by one. See examples for how to do this correctly.
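If the control settings already live in a list, one way to pass them one by one is to splice the list into the call with do.call(). The sketch below uses a hypothetical stand-in function, fit_stub, that only reports which extra arguments it received through `...`. It does not fit a model and is not part of this package.

```r
# Hypothetical stand-in for a wrapper with the same leading arguments.
# It returns the names of whatever arrived through `...`.
fit_stub <- function(dtm, k, ...) {
  names(list(...))
}

# Control settings kept together in one list
control <- list(
  verbose = 0,
  seed = 42L,
  em = list(iter.max = 1000, tol = 1e-4)
)

# c() on lists is not recursive, so `em` stays a single list-valued argument
args <- c(list(dtm = NULL, k = 3), control)

# Each element of `control` becomes its own named argument
received <- do.call(fit_stub, args)
print(received)
```

With the real function, the equivalent call would presumably be do.call(FitCtmModel, c(list(dtm = my_dtm, k = 3), control)), where my_dtm is a placeholder for your own matrix. Treat this as an assumption about how `...` is forwarded, not as documented behavior.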
# Load a pre-formatted dtm
data(nih_sample_dtm)
# Fit a CTM model on a sample of documents
model <- FitCtmModel(
  dtm = nih_sample_dtm[sample(1:nrow(nih_sample_dtm), 10), ],
  k = 3,
  return_all = FALSE
)
# The correct way to pass control arguments to CTM
if (FALSE) {
  topics_CTM <- FitCtmModel(
    dtm = nih_sample_dtm[sample(1:nrow(nih_sample_dtm), 10), ],
    k = 10,
    calc_coherence = TRUE,
    calc_r2 = TRUE,
    return_all = TRUE,
    estimate.beta = TRUE,
    verbose = 0,
    prefix = tempfile(),
    save = 0,
    keep = 0,
    seed = as.integer(Sys.time()),
    nstart = 1L,
    best = TRUE,
    var = list(iter.max = 500, tol = 10^-6),
    em = list(iter.max = 1000, tol = 10^-4),
    initialize = "random",
    cg = list(iter.max = 500, tol = 10^-5)
  )
}