Calculates the probabilistic coherence of one or more topics. The score approximates semantic coherence, i.e. how understandable a topic is to a human reader.
CalcProbCoherence(phi, dtm, M = 5)
phi: A numeric matrix or a numeric vector. The vector, or each row of the matrix, represents the numeric relationship between a topic and the terms. For example, this relationship may be p(word|topic) or p(topic|word).
dtm: A document term matrix or co-occurrence matrix of class matrix or whose class inherits from the Matrix package. Columns must index terms.
M: An integer for the number of top-ranked words per topic to be used in the calculation. Defaults to 5.
Returns an object of class numeric corresponding to the probabilistic coherence of the input topic(s).
library(textmineR)
# Load a pre-formatted dtm and topic model
data(nih_sample_topic_model)
data(nih_sample_dtm)
CalcProbCoherence(phi = nih_sample_topic_model$phi, dtm = nih_sample_dtm, M = 5)
#>        t_1        t_2        t_3        t_4        t_5        t_6        t_7
#> 0.05409345 0.41333333 0.16700000 0.19807143 0.15443478 0.26428571 0.24666667
#>        t_8        t_9       t_10       t_11       t_12       t_13       t_14
#> 0.26169048 0.16490323 0.09089027 0.05856349 0.21644444 0.17644444 0.11154266
#>       t_15       t_16       t_17       t_18       t_19       t_20       t_21
#> 0.35659259 0.20107407 0.24287931 0.48583598 0.15270529 0.37666667 0.38766667
#>       t_22       t_23       t_24       t_25       t_26       t_27       t_28
#> 0.03250000 0.04352301 0.23952941 0.05400000 0.12155796 0.16772038 0.08503759
#>       t_29       t_30
#> 0.04866667 0.33266667
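
To build intuition for what a score of this kind measures, the following base R sketch computes a coherence-style value on toy data. It is an illustration under stated assumptions, not the package's code: the helper name sketch_coherence, the toy dtm and phi, and the scoring rule (the average of P(term j | term i) - P(term j) over pairs of the top M terms, with probabilities taken as document frequencies) are all assumptions made here, so results need not match CalcProbCoherence exactly.

```r
# Hypothetical helper (not part of any package): scores a single topic.
# `topic` is a named numeric vector, e.g. one row of phi.
# `dtm` is a base matrix whose columns index terms.
sketch_coherence <- function(topic, dtm, M = 5) {
  # Pick the M highest-weighted terms of the topic, in rank order
  top_terms <- names(sort(topic, decreasing = TRUE))[seq_len(M)]

  # Document-by-term indicator: does the term occur in the document at all?
  present <- (dtm[, top_terms, drop = FALSE] > 0) * 1

  # p_joint[i, j] is the share of documents containing both terms;
  # its diagonal is the share of documents containing each single term.
  p_joint <- crossprod(present) / nrow(dtm)
  p_marginal <- diag(p_joint)

  # Assumed scoring rule: for each pair (higher-ranked i, lower-ranked j),
  # how much does seeing term i raise the probability of term j?
  pairs <- utils::combn(M, 2)
  i <- pairs[1, ]
  j <- pairs[2, ]
  diffs <- p_joint[cbind(i, j)] / p_marginal[i] - p_marginal[j]

  mean(diffs, na.rm = TRUE)
}

# Toy document term matrix: 5 documents, 4 terms (made-up data)
dtm <- matrix(
  c(1, 1, 0, 0,
    2, 1, 0, 0,
    0, 0, 1, 1,
    0, 0, 3, 1,
    1, 0, 0, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("d", 1:5), c("gene", "cell", "tax", "law"))
)

# Toy topics: rows are topics, columns are terms (made-up weights)
phi <- rbind(
  t_1 = c(gene = 0.50, cell = 0.40, tax = 0.06, law = 0.04),
  t_2 = c(gene = 0.05, cell = 0.05, tax = 0.50, law = 0.40),
  t_3 = c(gene = 0.45, cell = 0.05, tax = 0.40, law = 0.10)
)

# One score per topic, named after the rows of phi
scores <- vapply(
  rownames(phi),
  function(k) sketch_coherence(phi[k, ], dtm, M = 2),
  numeric(1)
)
print(scores)
```

In this toy data, t_1 and t_2 pair terms that appear in the same documents and score above zero, while t_3 pairs two terms that never co-occur and scores below zero. That ordering is the behavior a coherence measure is meant to capture.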