Clustering is a well-established area of data and text analysis, aimed at segmenting the phenomenon in question into homogeneous fragments. Yet a number of haunting issues and concerns remain in clustering, of which probably the most pressing are: (a) Is there any cluster structure at all? (b) If so, how many clusters are there? (c) Which object-to-object similarity measure should be chosen? (d) Which features are useful for clustering, and which are not? (e) Can a mixed feature space, with both categorical and numerical features, be used for clustering? (f) How can clustering solutions be reconciled when they differ from each other? The talk presents some novel and not-so-novel developments in clustering that are useful in addressing these and similar concerns. Specifically, an SVD-like model will be shown to underlie approaches such as k-means clustering, Ward divisive clustering, one-cluster clustering, consensus clustering, spectral clustering, and network community detection. The first lecture will concentrate on the case of object-to-feature data, and the second on object-to-object (dis)similarity data. A number of application examples will be presented as well.
Model- and experiment-driven recommendations for haunting issues in clustering
Organizing committee:
Prof. Dr. Boris Mirkin
National Research University Higher School of Economics, Russia
Committee member
Professor
Program committee