A Narrative Approach for Removal Embedded Prototype from Big Tree Data
DOI:
https://doi.org/10.53555/nncse.v6i9.790
Keywords:
Data mining, Algorithm design and analysis, Informatics, Computational modeling, Databases, Upper bound, Data models
Abstract
Many modern applications and systems represent and exchange data in tree-structured form, and they process and produce large tree datasets. Discovering informative patterns in large tree datasets is an important research area with many practical applications. We propose a novel approach that exploits efficient homomorphic pattern matching algorithms to compute pattern support incrementally, avoiding the costly enumeration of all pattern matchings required by previous approaches. To reduce space consumption, the matching information of already computed patterns is materialized as bitmaps. We further optimize our basic support computation method by designing an algorithm that incrementally generates the bitmaps of the embeddings of a new candidate pattern without first explicitly computing the embeddings of this pattern. Our extensive experimental results on real and synthetic large tree datasets show that our approach achieves orders-of-magnitude performance improvements over a state-of-the-art tree mining algorithm and a recent graph mining algorithm.
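The idea of reusing bitmaps of already computed patterns can be illustrated with a small sketch. The code below is not the algorithm of the paper: it is a simplified illustration that keeps one bit per tree in the dataset (rather than bitmaps of embeddings), reads every pattern edge as an ancestor-descendant edge, and uses hypothetical names (`Node`, `matches_at`, `occurrence_bitmap`, `support`) chosen only for this example. It shows how the bitmaps of two smaller patterns give an upper bound on where a larger candidate can occur, so that matching is attempted only on the trees that survive the bitwise AND.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A labeled tree node; used for both data trees and patterns (assumption)."""
    label: str
    children: tuple = ()


def descendants(node):
    """Yield every proper descendant of node."""
    for child in node.children:
        yield child
        yield from descendants(child)


def matches_at(pattern, node):
    """True if pattern maps homomorphically onto the subtree rooted at node.

    Each pattern edge is read as an ancestor-descendant edge, and two pattern
    children may map to the same data node (homomorphism, not injection).
    """
    if pattern.label != node.label:
        return False
    return all(
        any(matches_at(pc, d) for d in descendants(node))
        for pc in pattern.children
    )


def occurs_in(pattern, tree):
    """True if pattern matches at the root or at any descendant of tree."""
    return matches_at(pattern, tree) or any(
        matches_at(pattern, d) for d in descendants(tree)
    )


def occurrence_bitmap(pattern, trees, candidates=None):
    """Return an int whose bit i is set when pattern occurs in trees[i].

    Only trees whose bit is set in `candidates` are actually matched.
    """
    if candidates is None:
        candidates = (1 << len(trees)) - 1
    bitmap = 0
    for i, tree in enumerate(trees):
        if (candidates >> i) & 1 and occurs_in(pattern, tree):
            bitmap |= 1 << i
    return bitmap


def support(bitmap):
    """Number of trees in which the pattern occurs."""
    return bin(bitmap).count("1")


# A toy dataset of three trees: a(b(c), d), a(b, c), a(d(c)).
trees = [
    Node("a", (Node("b", (Node("c"),)), Node("d"))),
    Node("a", (Node("b"), Node("c"))),
    Node("a", (Node("d", (Node("c"),)),)),
]

# Bitmaps of two already computed patterns: a//b and a//c.
bm_ab = occurrence_bitmap(Node("a", (Node("b"),)), trees)
bm_ac = occurrence_bitmap(Node("a", (Node("c"),)), trees)

# The candidate a//b//c can only occur where both smaller patterns occur,
# so the AND of their bitmaps bounds its support from above.
upper_bound = bm_ab & bm_ac
bm_abc = occurrence_bitmap(
    Node("a", (Node("b", (Node("c"),)),)), trees, candidates=upper_bound
)
```

In this toy run the third tree is never matched against the candidate, because its bit is already cleared in the upper bound. Using plain Python integers as bitsets keeps the sketch short; a real implementation over large datasets would likely use a compressed bitmap structure instead.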
License
Copyright (c) 2019 Journal of Advance Research in Computer Science & Engineering (ISSN: 2456-3552)
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.