Hiding Sensitive XML Association Rules via Bayesian Network


Abstract—Privacy Preserving Data Mining (PPDM) is attracting the attention of researchers in different domains, especially in Association Rule Mining. The purpose of preserving association rules is to minimize the risk of disclosing shared information to external parties. In this paper, we propose a PPDM model for XML Association Rules (XARs). The proposed model identifies the most probable item, called the sensitive item, and uses it to modify the original data source with more accuracy and reliability. Such reliability has not been addressed before in the literature by any methodology used in the PPDM domain, and especially not in XML association rule mining. Thus, the suggested model opens a new dimension for academia in controlling sensitive information in a more robust manner.

Keywords: XARs, PPDM, K2 algorithm, Bayesian Network, Association Rules

I. INTRODUCTION

In data mining, trends and patterns are identified in large data sets to discover knowledge. A variety of algorithms exist for extracting such knowledge, including clustering, classification and association rule mining. Association rule mining is thus one domain for delivering knowledge from complex data. Discovered association rules are usually qualified by a minimum support s% and a minimum confidence c% over the transactional items in a database D. A rule has the form of an implication A → B, where A is the antecedent and B is the consequent. The problem with exposing such rules is the disclosure of sensitive information to external parties when data is shared. Hence Privacy Preserving Data Mining (PPDM) for association rules emerges.
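The support and confidence thresholds mentioned above can be illustrated with a minimal sketch. It is not taken from the paper: the function names (`support`, `confidence`, `is_strong`) and the toy transactions are assumptions made only for illustration, and it uses the standard definitions in which support is the fraction of transactions in D containing an itemset and confidence of A → B is support(A ∪ B) / support(A).

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    sup_a = support(transactions, a)
    if sup_a == 0.0:
        return 0.0  # the rule never fires, so treat confidence as 0
    return support(transactions, both) / sup_a


def is_strong(transactions: List[FrozenSet[str]],
              antecedent: Iterable[str],
              consequent: Iterable[str],
              min_sup: float,
              min_conf: float) -> bool:
    """True if the rule meets both the minimum support and confidence."""
    both = frozenset(antecedent) | frozenset(consequent)
    return (support(transactions, both) >= min_sup
            and confidence(transactions, antecedent, consequent) >= min_conf)


if __name__ == "__main__":
    # Hypothetical toy database D (not from the paper).
    D = [
        frozenset({"bread", "milk"}),
        frozenset({"bread", "butter"}),
        frozenset({"bread", "milk", "butter"}),
        frozenset({"milk"}),
    ]
    print(support(D, {"bread", "milk"}))
    print(confidence(D, {"bread"}, {"milk"}))
    print(is_strong(D, {"bread"}, {"milk"}, min_sup=0.5, min_conf=0.6))
```

A rule that passes both thresholds would be reported by a miner. If such a rule is considered sensitive, hiding it means changing the data so that its support or confidence falls below the chosen minimum.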

In PPDM, Sensitive information is con...

... middle of paper ...

...066-1395, IEEE Computer Society Washington, DC, USA

[7]. M. Atallah, E. Bertino, A. Elmagarmid, M. Ibrahim, V. Verykios, “Disclosure Limitation of Sensitive Rules”, pages 45-52, 1999, ISBN 0-7695-0453-1, IEEE Computer Society, Washington, DC, USA

[8]. G.F. Cooper and E. Herskovits, “A Bayesian method for the induction of probabilistic networks from data”, Mach. Learn., 9(4):309-347, 1992.

[9]. R. Agrawal, T. Imielinski, and A. Swami, “Mining association rules between sets of items in large databases”, in P. Buneman and S. Jajodia, editors, SIGMOD93, pages 207-216, Washington, DC, USA, May 1993

[10]. O. Doguc and J.E. Ramirez-Marquez, “A generic method for estimating system reliability using Bayesian Networks”, in Proc. Reliability Engineering and System Safety, (2008)

[11]. http://tunedit.org/repo/UCI/lymph.arff, dataset access date: 31-03-2010
