Discovery of Frequent Itemsets: Frequent Item Tree-Based Approach

A. V. Senthil Kumar, R. S. D. Wahidabanu

Abstract


Mining frequent patterns in large transactional databases is a heavily researched area of data mining. Existing frequent-pattern discovery algorithms suffer from high memory dependency when mining large amounts of data, as well as from high computational and I/O costs. In addition, the recursive process used to mine these structures consumes excessive memory. In this paper, we describe a more efficient algorithm for mining the complete set of frequent itemsets from transactional databases. The proposed algorithm is partially based on the FP-tree hypothesis and extracts the frequent itemsets directly from the tree. A further benefit of the new method is that its memory requirement is independent of the number of processed transactions. We present performance comparisons of our algorithm against the Apriori algorithm and FP-growth.
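For readers unfamiliar with the FP-tree that the abstract refers to, the following is a minimal sketch of how such a tree is commonly built from a list of transactions. It is not the algorithm proposed in the paper, which is not specified here. The names `FPNode`, `build_fp_tree`, and `count_nodes` and the sample data are illustrative assumptions, and `min_support` is assumed to be an absolute transaction count.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, its count, and links to children."""

    def __init__(self, item=None, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree from a list of transactions (each a list of items).

    Illustrative sketch only; min_support is an absolute count (assumption).
    """
    # Pass 1: count how many transactions contain each item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    root = FPNode()
    # Pass 2: insert each transaction's frequent items, most frequent first,
    # so that transactions sharing a prefix share a path in the tree.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


def count_nodes(node):
    """Number of item nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical sample data: four transactions, minimum support of 2.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
root, frequent = build_fp_tree(transactions, min_support=2)
```

In this sketch the infrequent item `d` is dropped, and the first three transactions share the prefix `a`. That sharing is why an FP-tree is usually much smaller than the database it summarizes.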

DOI: http://dx.doi.org/10.5614/itbj.ict.2007.1.1.4

Contact Information:

ITB Journal Publisher, LPPM – ITB, 

Center for Research and Community Services (CRCS) Building, 7th Floor,
Jl. Ganesha No. 10 Bandung 40132, Indonesia,

Tel. +62-22-86010080,

Fax.: +62-22-86010051;

e-mail: jictra@lppm.itb.ac.id.