Reflection after meeting with Sophia


October 20, 2022

From this link, we learned that the posterior mean under a Dirichlet prior amounts to adding the concentration parameter (alpha) as pseudo-counts to the observed counts for each outcome. The sensitivity of the posterior means to the alpha value therefore depends on the sample size (N) and on whether the data-generating model includes an outlier class. If it does, a large alpha will shrink the posterior mean of the outlier class too strongly towards balanced class proportions.
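
For the record, the standard Dirichlet-multinomial result: with a Dirichlet(alpha_1, ..., alpha_K) prior on the proportions and observed class counts n_1, ..., n_K (N = n_1 + ... + n_K), the posterior mean is

E[pi_k | data] = (alpha_k + n_k) / (alpha_1 + ... + alpha_K + N),

so each alpha_k acts as a pseudo-count added to its class. A hypothetical example: with K = 3, a symmetric alpha = 2, N = 30, and an outlier class with n_k = 1, the posterior mean is (2 + 1) / (6 + 30) ≈ 0.083, more than double the raw proportion 1/30 ≈ 0.033 and pulled towards the balanced 1/3.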

Therefore, alpha = 2 is relatively uninformative (a nearly flat distribution), so why does our Bayesian approach give a weird solution (an outlier class with large s.d. estimates)?

Recall that our GMM is a kind of latent class model, so we never actually observe the classes; inference on the mixing proportions is conditional on all the other parameters.
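
To make that conditional structure concrete, here is a minimal Gibbs-style sketch (an illustration only, not our actual sampler; the function names, scalar alpha, and one-dimensional data are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def update_proportions(z, K, alpha):
    """Conditional update of the mixing proportions pi, given the current
    latent assignments z: Dirichlet(alpha + class counts)."""
    counts = np.bincount(z, minlength=K)
    return rng.dirichlet(alpha + counts)

def update_assignments(y, pi, mu, sigma):
    """Conditional update of the latent assignments z, given pi and the
    class-specific Gaussian parameters mu and sigma (1-D data)."""
    # log p(z_i = k | ...) up to a constant, shape (N, K)
    log_post = np.log(pi) - np.log(sigma) - 0.5 * ((y[:, None] - mu) / sigma) ** 2
    prob = np.exp(log_post - log_post.max(axis=1, keepdims=True))
    prob /= prob.sum(axis=1, keepdims=True)
    return np.array([rng.choice(len(pi), p=p) for p in prob])
```

The feedback loop is visible here: if one class's sigma drifts large, its likelihood flattens, it can soak up stray observations, and the pi update then conditions on those assignments.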

So maybe we should look instead at the prior distribution for the class-specific random-effects standard-deviation parameters: if the outlier class captures only a few observations, the posterior for its s.d. is driven largely by that prior.
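
As a minimal sketch of why that prior matters, suppose (hypothetically; this is not our actual model) a conjugate inverse-gamma prior on a class variance, with the class mean known to be zero. When a class captures almost no observations, its "posterior" s.d. is essentially the prior, so a diffuse or heavy-tailed prior would produce exactly the large s.d. estimates we are seeing:

```python
import numpy as np
from scipy import stats

def sd_posterior_draws(y_k, a0=1.0, b0=1.0, n_draws=10_000, seed=0):
    """Posterior draws of a class s.d. under a (hypothetical) InvGamma(a0, b0)
    prior on the variance, with the class mean fixed at 0 for simplicity.
    With n observations, the posterior is InvGamma(a0 + n/2, b0 + SS/2),
    so an (almost) empty class just returns draws from the prior."""
    n = len(y_k)
    ss = np.sum(np.square(y_k))
    var = stats.invgamma.rvs(a0 + n / 2, scale=b0 + ss / 2,
                             size=n_draws, random_state=seed)
    return np.sqrt(var)

rng = np.random.default_rng(1)
print(np.median(sd_posterior_draws(rng.normal(0, 1, size=200))))  # concentrates near 1
print(np.median(sd_posterior_draws(np.array([]))))                # reverts to the prior
```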