HiSP-GC: A Classification Method Based on Probabilistic Analysis of Patterns
Abstract
Classification is one of the most important tasks in data mining and is now applied to problems in many areas, such as administration, finance, education, and health. The construction of accurate and computationally efficient classifiers is therefore a relevant challenge in the data mining field. In previous work we presented an efficient method for protein classification, the HiSP (Highest Subset Probability) classifier, which yields highly accurate results and outperforms the results obtained by other researchers. To construct a general-purpose classifier based on the ideas explored for the protein classification problem, we adapted and extended that method. Here we present the resulting general classification method, called HiSP-GC (HiSP General Classifier), and show that it is appropriate and efficient for several kinds of databases associated with different applications.