I needed this today as well, coincidentally, so I coded something up based on Fleiss, Nee, and Landis (1979), "Large sample variance of kappa in the case of different sets of raters," equation 3 (which paper, incidentally, says don't do it). This is what Stata uses. If the number of raters is not the same for each subject, they don't produce anything for inference.
```python
import numpy as np

def fleiss_standard_error(table):
    n, k = table.shape  # n_subjects, n_choices
    m = table.sum(axis=1)[0]  # assume every subject has the same number of ratings
    p_bar = table.sum(axis=0) / (n * m)
    q_bar = 1 - p_bar
    return (
        (2 ** .5 / (p_bar.dot(q_bar) * np.sqrt(n * m * (m - 1))))
        * (
            (p_bar.dot(q_bar) ** 2) - np.sum(p_bar * q_bar * (q_bar - p_bar))
        ) ** .5
    )
```
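For inference, this SE is paired with the point estimate to form a z-statistic against H0: kappa = 0 (the test Stata reports). A sketch with a small made-up ratings table; the `fleiss_kappa` helper below is a textbook reimplementation (not the statsmodels one), and the SE function is repeated so the snippet runs on its own:

```python
import math
import numpy as np

def fleiss_standard_error(table):
    # SE of Fleiss' kappa under the null; repeated from above so this runs standalone
    n, k = table.shape
    m = table.sum(axis=1)[0]
    p_bar = table.sum(axis=0) / (n * m)
    q_bar = 1 - p_bar
    return (
        (2 ** .5 / (p_bar.dot(q_bar) * np.sqrt(n * m * (m - 1))))
        * (
            (p_bar.dot(q_bar) ** 2) - np.sum(p_bar * q_bar * (q_bar - p_bar))
        ) ** .5
    )

def fleiss_kappa(table):
    # textbook Fleiss' kappa for a subjects x categories count table
    n, k = table.shape
    m = table.sum(axis=1)[0]
    p_j = table.sum(axis=0) / (n * m)  # marginal category proportions
    P_i = (np.sum(table ** 2, axis=1) - m) / (m * (m - 1))  # per-subject agreement
    P_e = np.sum(p_j ** 2)
    return (P_i.mean() - P_e) / (1 - P_e)

# toy table: 4 subjects, 3 categories, 5 raters per subject (made-up data)
table = np.array([
    [5, 0, 0],
    [3, 2, 0],
    [0, 4, 1],
    [1, 1, 3],
])
kappa = fleiss_kappa(table)
se = fleiss_standard_error(table)
z = kappa / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
```

Since the SE is derived under the null, this only tests kappa = 0; it isn't a confidence interval for the estimate.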
jseabold added a commit to jseabold/statsmodels that referenced this issue on May 9, 2024.
https://stackoverflow.com/questions/78323943/statistic-values-of-fleiss-kappa-using-statsmodels-stats-inter-rater/78324041#78324041
Note that our fleiss_kappa also includes Randolph's kappa, i.e. we would need p-values for those as well.
(needs reference, I have not looked at this in a long time)
copy from answer