Calculates the probabilistic coherence of one or more topics. Probabilistic coherence approximates the semantic coherence, or human understandability, of a topic.

CalcProbCoherence(phi, dtm, M = 5)

Arguments

phi

A numeric matrix or a numeric vector. The vector, or each row of the matrix, represents the numeric relationship between a topic and the terms. For example, this relationship may be p(word|topic) or p(topic|word).

dtm

A document term matrix or co-occurrence matrix of class matrix or whose class inherits from the Matrix package. Columns must index terms.

M

An integer for the number of words to be used in the calculation. Defaults to 5.

Value

Returns an object of class numeric corresponding to the probabilistic coherence of the input topic(s).

Examples

# Load a pre-formatted dtm and topic model
data(nih_sample_topic_model)
data(nih_sample_dtm)

CalcProbCoherence(phi = nih_sample_topic_model$phi, dtm = nih_sample_dtm, M = 5)
#>        t_1        t_2        t_3        t_4        t_5        t_6        t_7
#> 0.05409345 0.41333333 0.16700000 0.19807143 0.15443478 0.26428571 0.24666667
#>        t_8        t_9       t_10       t_11       t_12       t_13       t_14
#> 0.26169048 0.16490323 0.09089027 0.05856349 0.21644444 0.17644444 0.11154266
#>       t_15       t_16       t_17       t_18       t_19       t_20       t_21
#> 0.35659259 0.20107407 0.24287931 0.48583598 0.15270529 0.37666667 0.38766667
#>       t_22       t_23       t_24       t_25       t_26       t_27       t_28
#> 0.03250000 0.04352301 0.23952941 0.05400000 0.12155796 0.16772038 0.08503759
#>       t_29       t_30
#> 0.04866667 0.33266667
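
To make the returned numbers easier to interpret, the measure can be illustrated outside the package. The following is a minimal sketch, assuming probabilistic coherence averages P(b|a) - P(b) over every pair of a topic's top M terms in which a ranks above b, with probabilities estimated from document frequencies in dtm. The helper prob_coherence_sketch and the toy data are hypothetical and are not part of the package; CalcProbCoherence itself may differ in details such as tie handling or edge cases.

```r
# Hypothetical helper (not part of the package): coherence of a single topic.
# phi_row: named numeric vector of term weights for one topic
# dtm:     numeric matrix, documents in rows, named terms in columns
# M:       number of top terms to use (at least 2 for a defined result)
prob_coherence_sketch <- function(phi_row, dtm, M = 5) {
  M <- min(M, length(phi_row))

  # Top M terms, highest weight first
  top_terms <- names(sort(phi_row, decreasing = TRUE))[seq_len(M)]

  # Presence/absence of each top term in each document
  present <- dtm[, top_terms, drop = FALSE] > 0
  n_docs <- nrow(present)

  diffs <- numeric(0)
  for (i in seq_len(M - 1)) {
    for (j in (i + 1):M) {
      n_a  <- sum(present[, i])                 # documents containing a
      n_ab <- sum(present[, i] & present[, j])  # documents containing a and b
      p_b  <- sum(present[, j]) / n_docs        # P(b)
      p_b_given_a <- if (n_a > 0) n_ab / n_a else 0  # P(b | a)
      diffs <- c(diffs, p_b_given_a - p_b)
    }
  }

  # Average over all ordered pairs of top terms
  mean(diffs)
}

# Toy data: 4 documents, 3 terms (assumed for illustration only)
toy_dtm <- matrix(
  c(1, 1, 0,
    2, 1, 0,
    0, 0, 3,
    1, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("d", 1:4), c("gene", "cell", "tax"))
)
toy_phi <- c(gene = 0.6, cell = 0.3, tax = 0.1)

# "cell" appears in 2 of the 3 documents containing "gene" but in only
# 2 of 4 documents overall, so the score is 2/3 - 1/2 = 1/6
prob_coherence_sketch(toy_phi, toy_dtm, M = 2)
```

Under this reading, a value near zero means the top terms co-occur no more often than their overall document frequency predicts, and a negative value means they tend to avoid each other. In the toy data, raising M to 3 pulls in "tax", which never co-occurs with "cell", and the score becomes negative.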