Finding the mean in a partition distribution
Journal Publication
ResearchOnline@JCU
Abstract
Bayesian clustering algorithms, in particular those utilizing Dirichlet Processes (DP), return a sample from the posterior distribution of partitions of a set. In many applied cases, however, a single clustering solution is desired, so a 'best' partition has to be derived from the posterior sample. Which solution should be recommended in which situation remains an open research question. One candidate is the sample mean, defined as the clustering with minimal squared distance to all partitions in the posterior sample, weighted by their probability. In this article, we review an algorithm that approximates this sample mean by using the Hungarian Method to compute the distance between partitions. The algorithm leaves room for further acceleration.
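To make the idea in the abstract concrete, the following is a minimal sketch and not the algorithm reviewed in the article. It assumes that the distance between two partitions is the number of elements that must be moved to turn one into the other, which equals the set size minus the maximal cluster overlap found by the Hungarian Method. It also assumes, as a simplification, that the mean is searched only among the sampled partitions themselves. The function names `hungarian`, `partition_distance` and `approximate_mean` are hypothetical and chosen for illustration.

```python
from collections import Counter


def hungarian(cost):
    """Return the minimal total cost of a square assignment problem.

    Standard O(n^3) Hungarian Method (Kuhn-Munkres) with 1-based potentials.
    """
    n = len(cost)
    inf = float("inf")
    u = [0] * (n + 1)    # row potentials
    v = [0] * (n + 1)    # column potentials
    p = [0] * (n + 1)    # p[j] = row currently matched to column j (0 = none)
    way = [0] * (n + 1)  # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sum(cost[p[j] - 1][j - 1] for j in range(1, n + 1))


def partition_distance(a, b):
    """Assumed distance: elements to move so that partition a becomes b.

    a and b are equal-length label lists; label names are irrelevant.
    """
    index_a = {lab: i for i, lab in enumerate(sorted(set(a)))}
    index_b = {lab: i for i, lab in enumerate(sorted(set(b)))}
    k = max(len(index_a), len(index_b))
    # Negated overlap counts, zero-padded to a square matrix, so that the
    # minimal assignment cost corresponds to the maximal total overlap.
    cost = [[0] * k for _ in range(k)]
    for (x, y), count in Counter(zip(a, b)).items():
        cost[index_a[x]][index_b[y]] = -count
    return len(a) + hungarian(cost)


def approximate_mean(sample, weights=None):
    """Pick the sampled partition with minimal weighted squared distance.

    Simplification: candidates are restricted to the sample itself.
    """
    if weights is None:
        weights = [1.0 / len(sample)] * len(sample)

    def score(candidate):
        return sum(w * partition_distance(candidate, s) ** 2
                   for s, w in zip(sample, weights))

    return min(sample, key=score)
```

For a posterior sample given as a list of label lists, `approximate_mean(sample)` returns one of the sampled partitions. Passing `weights` lets duplicate partitions be collapsed into a single entry with its estimated posterior probability.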
Journal
BMC Bioinformatics
Volume
19
ISBN/ISSN
1471-2105
Pages Count
10
Publisher
BioMed Central
DOI
10.1186/s12859-018-2359-z