Represents a document clustering as a topic model of two matrices: phi, which holds P(term | cluster), and theta, which holds P(cluster | document).

Cluster2TopicModel(dtm, clustering, ...)



dtm: A document-term matrix of class dgCMatrix, or of a class that inherits from the Matrix package. Rows must index documents and columns must index terms.


clustering: A vector of length nrow(dtm) whose entries form a partitional clustering of the documents, so that each entry is the cluster label of the corresponding row of dtm.


...: Other arguments to be passed to TmParallelApply.


Returns a list with two elements, phi and theta. phi is a matrix whose j-th row represents P(terms | cluster_j). theta is a matrix whose j-th row represents P(clusters | document_j). Because the clustering is partitional, each row of theta should have exactly one non-zero element.
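The shape of the two returned matrices can be illustrated with a minimal base R sketch. This is not the package's implementation: it assumes a small dense toy matrix instead of a dgCMatrix, made-up cluster labels, and that phi is obtained by summing term counts within each cluster and normalizing each row. All object names below are illustrative.

```r
# Toy document-term matrix: 4 documents (rows) by 3 terms (columns).
# A dense base R matrix is an assumption made for brevity.
dtm <- matrix(
  c(2, 0, 1,
    1, 1, 0,
    0, 3, 1,
    0, 1, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("d", 1:4), c("apple", "bank", "cat"))
)

# Hypothetical partitional clustering: exactly one label per document.
clustering <- c("A", "A", "B", "B")
clusters <- sort(unique(clustering))

# theta: documents x clusters, 1 for the assigned cluster and 0 elsewhere.
theta <- outer(clustering, clusters, FUN = "==") * 1
dimnames(theta) <- list(rownames(dtm), clusters)

# phi: clusters x terms. Sum term counts within each cluster,
# then divide each row by its total so that it sums to 1.
counts <- rowsum(dtm, group = clustering)
phi <- counts / rowSums(counts)

result <- list(phi = phi, theta = theta)
```

In this sketch every row of theta contains a single 1, and every row of phi is a probability distribution over terms, which matches the description of the return value above.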


if (FALSE) {
# Load pre-formatted data for use
data(nih_sample_dtm)
data(nih_sample)

result <- Cluster2TopicModel(dtm = nih_sample_dtm,
                             clustering = nih_sample$IC_NAME)
}