Tech Report
Hierarchical clustering speed-up using position lists and data position hierarchy
This paper examines hierarchical clustering problems in systems with very large numbers of entities and proposes specific speed improvements to the clustering algorithm. The work is motivated by the challenge of visualising the geographic and logical distribution of many tens of thousands of distance-learning students at the UK's Open University. The paper first outlines a general algorithm for hierarchical clustering. It then describes (i) a speed-up technique based on lists sorted according to particular dimensions or attributes of the entities to be visualised and (ii) a speed-up technique based on hierarchical partitioning into regions. Finally, the paper discusses the algorithm's complexity and presents experimental results.
Keywords
hierarchical clustering, position hierarchy, position list, geographical information system