CSJM v.9, n.2 (26), 2001

An efficient algorithm for mining interesting set-valued rules

Authors: Savinov Alexandr
Keywords: Data mining, Rule induction, Set-valued possibilistic rule, Prime disjunction, Dual transformation.

Abstract

We describe the problem of mining set-valued rules in large relational tables containing categorical attributes that take a finite number of values. An example of such a rule might be “IF HOUSEHOLDSIZE = {Two OR Three} AND OCCUPATION = {Professional OR Clerical} THEN PAYMENT_METHOD = {CashCheck (Max=249, Sum=4952) OR DebitCard (Max=175, Sum=3021)} WHERE Confidence=85%, Support=10%.” Such rules allow an interval of possible values to be selected for each attribute in the condition, rather than the single value used in association rules, while the conclusion contains a projection of the data restricted by the condition onto a target attribute. An original conceptual and formal framework for representing multidimensional distributions induced from data is used. The distribution is represented by a number of so-called prime disjunctions that upper-bound its surface and are interpreted as wide multidimensional intervals of impossible combinations of attribute values. This original formalism generalises the conventional Boolean approach in two directions: (i) finite-valued attributes (instead of only 0 and 1), and (ii) continuous-valued semantics (instead of true and false). In addition, we describe an efficient algorithm that carries out the generalised dual transformation from the possibilistic disjunctive normal form (DNF), which represents data, into the conjunctive normal form (CNF), which represents knowledge.
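
To make the shape of a set-valued rule concrete, the following minimal Python sketch evaluates one such rule against a toy table. It is an illustration under stated assumptions, not the algorithm of the paper: the function name `evaluate_rule` and the five sample rows are hypothetical, and support and confidence are computed with the conventional frequency-based definitions rather than the possibilistic semantics (including the Max and Sum figures) used in the article.

```python
from collections import Counter


def evaluate_rule(rows, condition, target, conclusion):
    """Evaluate a set-valued rule on a list of dict rows (hypothetical helper).

    condition  -- dict mapping attribute name -> set of admissible values
    target     -- name of the target attribute
    conclusion -- set of target values named in the THEN part
    """
    # A row satisfies the condition if every constrained attribute
    # takes one of the values in its admissible set.
    matching = [
        r for r in rows
        if all(r[attr] in values for attr, values in condition.items())
    ]
    # Projection of the restricted data onto the target attribute.
    projection = Counter(r[target] for r in matching)
    # Assumption: support = share of all rows that satisfy the condition.
    support = len(matching) / len(rows) if rows else 0.0
    # Assumption: confidence = share of matching rows whose target value
    # lies in the conclusion set.
    hits = sum(n for value, n in projection.items() if value in conclusion)
    confidence = hits / len(matching) if matching else 0.0
    return support, confidence, dict(projection)


# Toy data, invented purely for illustration.
rows = [
    {"HOUSEHOLDSIZE": "Two", "OCCUPATION": "Professional", "PAYMENT_METHOD": "CashCheck"},
    {"HOUSEHOLDSIZE": "Three", "OCCUPATION": "Clerical", "PAYMENT_METHOD": "DebitCard"},
    {"HOUSEHOLDSIZE": "Two", "OCCUPATION": "Clerical", "PAYMENT_METHOD": "CreditCard"},
    {"HOUSEHOLDSIZE": "One", "OCCUPATION": "Professional", "PAYMENT_METHOD": "CashCheck"},
    {"HOUSEHOLDSIZE": "Four", "OCCUPATION": "Manual", "PAYMENT_METHOD": "DebitCard"},
]
condition = {
    "HOUSEHOLDSIZE": {"Two", "Three"},
    "OCCUPATION": {"Professional", "Clerical"},
}
support, confidence, projection = evaluate_rule(
    rows, condition, "PAYMENT_METHOD", {"CashCheck", "DebitCard"}
)
```

On this toy table three of the five rows satisfy the condition, and two of those three fall inside the conclusion set. The brute-force scan shown here only checks a given rule; discovering interesting rules efficiently is the purpose of the dual transformation described in the abstract.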

Alexandr A. Savinov,
GMD - German National Research Center
for Information Technology
Schloss Birlinghoven,
D-53754 Sankt-Augustin, Germany
E-mail:
Institute of Mathematics,
Academy of Sciences of Moldova
str. Academiei 5,
MD-2028 Chisinau, Moldova
