Finding the mean in a partition distribution

Journal Publication ResearchOnline@JCU
Glassen, Thomas J.; Oertzen, Timo von; Konovalov, Dmitry A.
Abstract

Bayesian clustering algorithms, in particular those utilizing Dirichlet Processes (DP), return a sample from the posterior distribution of partitions of a set. In many applied cases, however, a single clustering solution is desired, which requires a 'best' partition to be derived from the posterior sample. Which solution should be recommended in which situation remains an open research question. One candidate is the sample mean, defined as the clustering with minimal squared distance to all partitions in the posterior sample, weighted by their probability. In this article, we review an algorithm that approximates this sample mean, using the Hungarian Method to compute the distance between partitions. The algorithm leaves room for further acceleration.
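The idea in the abstract can be illustrated with a minimal sketch. It is not the algorithm reviewed in the article and rests on two assumptions: the distance between two partitions is taken to be the partition distance (the number of elements minus the largest overlap achievable by matching clusters one to one), and candidates for the mean are restricted to the partitions in the sample itself. The function names `hungarian`, `partition_distance`, and `approximate_mean` are hypothetical and chosen only for this example.

```python
def hungarian(cost):
    """Minimum-cost assignment for a square matrix (Hungarian Method, O(n^3)).

    Returns (total_cost, assignment), where assignment[i] is the column
    matched to row i.
    """
    n = len(cost)
    inf = float("inf")
    u = [0] * (n + 1)      # row potentials (1-based)
    v = [0] * (n + 1)      # column potentials (1-based)
    p = [0] * (n + 1)      # p[j]: row currently matched to column j, 0 if none
    way = [0] * (n + 1)    # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path that was found.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment


def partition_distance(a, b):
    """Partition distance between two label lists of equal length.

    Assumption of this sketch: distance = n - (maximum total overlap over
    one-to-one matchings of the clusters of a with the clusters of b).
    """
    labels_a = {x: i for i, x in enumerate(sorted(set(a)))}
    labels_b = {y: i for i, y in enumerate(sorted(set(b)))}
    k = max(len(labels_a), len(labels_b))   # pad to a square matrix
    overlap = [[0] * k for _ in range(k)]
    for x, y in zip(a, b):
        overlap[labels_a[x]][labels_b[y]] += 1
    # Maximizing overlap equals minimizing the negated overlap.
    neg_total, _ = hungarian([[-c for c in row] for row in overlap])
    return len(a) + neg_total


def approximate_mean(sample, weights=None):
    """Pick the sampled partition with the smallest weighted sum of squared
    distances to the whole sample (candidates limited to the sample)."""
    if weights is None:
        weights = [1.0 / len(sample)] * len(sample)

    def score(candidate):
        return sum(w * partition_distance(candidate, s) ** 2
                   for s, w in zip(sample, weights))

    return min(sample, key=score)
```

For example, with the sample `[[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 2], [0, 1, 1, 1]]` and equal weights, `approximate_mean` returns `[0, 0, 1, 1]`. Restricting the candidates to the sample keeps the sketch short; a search over all partitions would be needed to find the true minimizer.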

Journal

BMC Bioinformatics

Volume

19

ISBN/ISSN

1471-2105

Pages Count

10

Publisher

BioMed Central

DOI

10.1186/s12859-018-2359-z