R/topic_modeling_core.R
Cluster2TopicModel.Rd
Represents a document clustering as a topic model defined by two matrices:
phi: P(term | cluster)
theta: P(cluster | document)
Cluster2TopicModel(dtm, clustering, ...)
dtm: A document term matrix of class dgCMatrix or whose class inherits from the Matrix package. Columns must index terms, rows must index documents.
clustering: A vector of length nrow(dtm) whose entries form a partitional clustering of the documents.
...: Other arguments to be passed to TmParallelApply.
Returns a list with two elements, phi and theta. phi is a matrix whose j-th row represents P(terms | cluster_j). theta is a matrix whose j-th row represents P(clusters | document_j). Because the clustering is partitional, each row of theta should have exactly one non-zero element.
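The shape of the two returned matrices can be illustrated with a small base-R sketch. This is not the package's implementation: it is a hand-rolled approximation that assumes phi is obtained by summing term counts within each cluster and normalizing each row, and that theta places a 1 in the column of each document's assigned cluster. The toy data, the dense matrix (used in place of a dgCMatrix to avoid dependencies), and the names `clusters` and `counts` are all illustrative assumptions.

```r
# Toy document-term matrix: 4 documents (rows) x 3 terms (columns).
# A dense base-R matrix is used only to keep this sketch dependency-free;
# the documented function expects a dgCMatrix or similar sparse class.
dtm <- matrix(
  c(2, 0, 1,
    1, 1, 0,
    0, 3, 1,
    0, 1, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("d", 1:4), c("apple", "bird", "cat"))
)

# One cluster label per document, i.e. a partitional clustering.
clustering <- c(1, 1, 2, 2)
clusters <- sort(unique(clustering))

# theta: documents x clusters, with a 1 in the assigned cluster's column.
theta <- outer(clustering, clusters, FUN = "==") * 1
dimnames(theta) <- list(rownames(dtm), paste0("c", clusters))

# phi: clusters x terms. Sum the term counts of the documents in each
# cluster (assumed aggregation), then normalize each row to sum to 1.
counts <- rowsum(dtm, group = clustering)
phi <- counts / rowSums(counts)
rownames(phi) <- paste0("c", clusters)

print(theta)
print(phi)
```

Under these assumptions every row of theta has a single non-zero entry and every row of phi sums to 1, which matches the return value described above. With the package installed, the equivalent call would presumably take the form Cluster2TopicModel(dtm = dtm, clustering = clustering) on a sparse dtm.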