Text Classification Using Mahout

International Journal of Research Studies in Computer Science and Engineering

Volume 1, Issue 5, 2014, Page No: 1-5

G.V. Ramana Reddy¹, K. Mounika², A. Chinmayi², S. Fareed Hussain²

1. Assistant Professor, Dept. of CSE, BITS-Knl, JNTUA University.
2. IV Year Student of ECE, Dept. of ECE, BITS-Knl, JNTUA University.

Citation: G.V. Ramana Reddy, K. Mounika, A. Chinmayi, S. Fareed Hussain, "Text Classification Using Mahout," International Journal of Research Studies in Computer Science and Engineering, 2014, 1(5): 1-5.

Abstract

The storage, processing, and analysis of big data present many new challenges to computer science researchers and IT professionals. Mahout is a set of distributed data mining libraries that interface with an underlying distributed system. The framework for that distributed system is Hadoop, which implements MapReduce. Mahout provides a library of scalable machine learning algorithms for big data analysis on Hadoop or other storage systems. Classification techniques decide whether, and to what degree, an item belongs to a category or has a given attribute. Classification, like clustering, is ubiquitous, but it operates even further behind the scenes. This paper demonstrates text classification using Mahout. The sample data was taken from the 20 Newsgroups data set, and the resulting confusion matrix is presented.
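As a rough illustration of what a confusion matrix records, the following minimal Python sketch counts how often each actual category was assigned each predicted category. It is not the paper's Mahout pipeline: the function names, the two newsgroup labels chosen, and the toy label lists are all assumptions made up for illustration, not results from the paper.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return matrix[actual_label][predicted_label] = number of documents."""
    counts = Counter(zip(actual, predicted))
    return {a: {p: counts[(a, p)] for p in labels} for a in labels}


def accuracy(matrix):
    """Fraction of documents on the diagonal (predicted == actual)."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total if total else 0.0


# Hypothetical toy data: two categories and five documents, invented for
# this sketch. These are NOT the paper's results.
labels = ["sci.space", "rec.autos"]
actual = ["sci.space", "sci.space", "rec.autos", "rec.autos", "rec.autos"]
predicted = ["sci.space", "rec.autos", "rec.autos", "rec.autos", "sci.space"]

matrix = confusion_matrix(actual, predicted, labels)

# Rows are actual categories, columns are predicted categories.
for a in labels:
    print(a, [matrix[a][p] for p in labels])
print("accuracy:", accuracy(matrix))
```

Diagonal entries count correctly classified documents, and off-diagonal entries show which categories the classifier confuses with one another.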
