Date of Completion

12-10-2015

Embargo Period

12-9-2015

Keywords

Data mining, Algorithms, Frequent itemsets, Apriori algorithm, Weighted frequent pattern mining algorithms, Uncertain databases, Conjunctive rules mining, Disjunctive rules mining, Causality, Partial association, Causal rule

Major Advisor

Sanguthevar Rajasekaran

Associate Advisor

Reda A. Ammar

Associate Advisor

Chun-Hsi Huang

Field of Study

Computer Science and Engineering

Degree

Doctor of Philosophy

Open Access

Open Access

Abstract

Association rules mining is a common data mining problem that explores the relationships among items based on their occurrences in transactions. Traditional approaches to mining frequent patterns may not be applicable to many real-life applications. In domains such as social networks, sensor networks, protein-protein interaction analysis, and inaccurate surveys, the data are uncertain. In deterministic (certain) data, the occurrence of an item in a transaction is definite. In an uncertain database, by contrast, the occurrence of an item in a transaction is modeled as a discrete random variable and is therefore represented by a probability distribution. In this setting, the frequency of an item (or an itemset) is calculated as the expected number of occurrences of the item (itemset) in the transactions. In this research work, we present efficient computational algorithms for three important data mining problems involving uncertain data: weighted frequent pattern mining, disjunctive association rules mining, and causal rules mining. Although algorithms for these three versions of rules mining can be found in the literature, we are the first to address these problems in the context of uncertain data.
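The expected-frequency definition in the abstract can be illustrated with a minimal sketch. It is not the dissertation's algorithm. It assumes, for illustration only, that each item in a transaction carries a single existence probability and that items within a transaction occur independently; the function name `expected_support` and the toy database `db` are hypothetical.

```python
from math import prod


def expected_support(itemset, transactions):
    """Expected number of transactions that contain every item in `itemset`.

    Assumption (not stated in the abstract): each transaction maps an item to
    its existence probability, and items occur independently of one another.
    """
    total = 0.0
    for transaction in transactions:
        # Probability that all items of the itemset occur together in this
        # transaction; an item missing from the transaction has probability 0.
        total += prod(transaction.get(item, 0.0) for item in itemset)
    return total


# Hypothetical uncertain database: item -> existence probability.
db = [
    {"a": 0.5, "b": 1.0},
    {"a": 1.0, "c": 0.5},
    {"a": 0.5, "b": 0.5},
]

print(expected_support(["a"], db))       # 0.5 + 1.0 + 0.5
print(expected_support(["a", "b"], db))  # 0.5*1.0 + 0 + 0.5*0.5
```

Under these assumptions, an itemset would be called frequent when its expected support meets a user-chosen minimum threshold, mirroring the role of support counts in deterministic data.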
