
  DOI Prefix   10.20431


 

International Journal of Research Studies in Computer Science and Engineering
Volume 6, Issue 1, 2019, Page No: 6-15

A Comparison-Based Soft Clustering Algorithm for Documents

Ganesh Yadav1, Vipul Kumar Verma2

1. Assistant Professor, Department of CSE, IIMT Greater Noida, India.
2. Assistant Professor, Department of CSE, IIMT Greater Noida, India.

Citation: Ganesh Yadav, Vipul Kumar Verma, A Comparison-Based Soft Clustering Algorithm for Documents. International Journal of Research Studies in Computer Science and Engineering 2019, 6(1): 6-15.

Abstract

Document clustering is an important tool for document search applications such as Web search engines. Clustering documents gives the user a good overall view of the information contained in a collection. However, existing clustering algorithms have shortcomings: hard clustering algorithms (where each document belongs to exactly one cluster) cannot detect the multiple themes of a document, while soft clustering algorithms (where each document can belong to multiple clusters) are usually inefficient. We propose CSCA (Comparison-based Soft Clustering Algorithm), an efficient soft clustering algorithm based on a given similarity measure. CSCA requires only a similarity measure for clustering and uses randomization to help make the clustering efficient. Comparison with existing hard clustering algorithms such as K-means and its variants shows that CSCA is both effective and efficient.
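
The distinction between hard and soft assignment drawn in the abstract can be illustrated with a minimal sketch. This is not the CSCA algorithm, whose details the abstract does not give; the cosine similarity over word counts, the cluster representatives, the `threshold` value, and all function names are assumptions chosen purely for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the bag-of-words vectors of two texts.

    Assumed similarity measure; any other measure could be plugged in.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hard_assign(doc, representatives, sim=cosine_similarity):
    """Hard clustering: return the index of the single most similar cluster."""
    scores = [sim(doc, r) for r in representatives]
    return max(range(len(scores)), key=scores.__getitem__)


def soft_assign(doc, representatives, threshold=0.3, sim=cosine_similarity):
    """Soft clustering: return every cluster whose similarity reaches the threshold.

    The threshold of 0.3 is a hypothetical value, not taken from the paper.
    """
    return [i for i, r in enumerate(representatives) if sim(doc, r) >= threshold]


# Hypothetical cluster representatives: one sports theme, one finance theme.
representatives = ["football match goal team", "stock market shares price"]
# A document that mixes both themes.
doc = "team shares price goal"

print(hard_assign(doc, representatives))  # one cluster only
print(soft_assign(doc, representatives))  # possibly several clusters
```

In this toy example the document is equally similar to both representatives, so hard assignment must drop one theme while soft assignment reports both, which is the limitation of hard clustering that the abstract points to.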

