Learning Accurate and Interpretable Classifiers Using Optimal Multi-Criteria Rules
Keywords:
Association rules, Classification, Pareto frontierAbstract
The Occam's Razor principle has become the basis for many Machine Learning algorithms, under the interpretation that the classifiershould not be more complex than necessary. Recently, this principle has shown to be well suited to associative classifiers, where the
number of rules composing the classifier can be substantially reduced by using condensed representations such as maximal or closed rules.
While it is shown that such a decrease in the complexity of the classifier (usually) does not compromise its accuracy, the number of remaining rules
is still larger than necessary and making it hard for experts to interpret the corresponding classifier. In this paper we propose a much more
aggressive filtering strategy, which decreases the number of rules within the classifier
dramatically without hurting its accuracy.
Our strategy consists in evaluating each rule under different statistical criteria, and filtering only those rules that show a positive
balance between all the criteria considered. Specifically, each candidate rule is associated with a point in an n-dimensional scattergram, where
each coordinate corresponds to a statistical criterion. Points that are not dominated by any other point in the scattergram compose
the {\em Pareto frontier}, and correspond to rules that are optimal in the sense that there is no rule that is better off when all
the criteria are taken into account. Finally, rules lying in the Pareto frontier are filtered and compose the classifier. Our Pareto-Optimal filtering strategy may receive as input either the entire set of rules or even a condensed representation (i.e., closed rules).
A systematic set of experiments involving benchmark data as well as recent data from actual application scenarios,
followed by an extensive set of significance tests, reveal that the proposed strategy
decreases the number of rules by up to two orders of magnitude and
produces classifiers that are extremely readable (i.e., allow interpretability of the classification results) without hurting accuracy.
Downloads
Download data is not yet available.
Downloads
Additional Files
Published
2013-09-11
Issue
Section
SBBD Articles