The complexity of the aforementioned problem is determined by
parameters such as the number of authors and the size of the training
set. Both parameters play a vital role in determining prediction
accuracy. Although these parameters are considered critical to the
complexity of the problem, and therefore to the prediction accuracy,
no studies have examined their impact on authorship-identification
performance in a systematic way. The problem of authorship attribution
has been explored well for literature, newspapers, etc., but limited
work has been done on authorship identification of online messages
such as blogs, emails, and chat. This comparative study concluded that
performance degrades as the number of authors increases and the size
of the training set decreases. Thus, a direction for further research
is to improve prediction accuracy while taking all these parameters
into account.

REFERENCES
[1] [Estival 2008] [Abbasi et al. 2008] [Koppel et al. 2003] [De Vel et al. 2001].
[2] Li, J., Chen, H., & Huang, Z. A Framework for Authorship Identification of Online Messages: Writing-Style Features and
[3] Abbasi, A., & Chen, H. Visualizing Authorship for Identification, English, 60–71, (2006).
[4] Stańczyk, U., & Cyran, K. A. Machine learning approach to authorship attribution of literary texts, Journal of Applied Mathematics, 1(4), 151–158, (2007).
[5] Pavelec, D., Justino, E., & Oliveira, L. S. Author Identification using Stylometric Features, Inteligencia Artificial, 11(36), 59–65. doi:10.4114/ia.v11i36.892, (2007).
[6] Stamatatos, E. Author identification: Using text sampling to handle the class imbalance problem, English, 44, 790–799. doi:10.1016/j.ipm.2007.05.012, (2008).
[7] Iqbal, F., Hadjidj, R., Fung, B. C. M., & Debbabi, M. A novel approach of mining write-prints for authorship attribution in e-mail forensics, Information Systems, 5, 42–51. doi:10.1016/j.diin.2008.05.001, (2008).
[8] Iqbal, F., Binsalleeh, H., Fung, B. C. M., & Debbabi, M. Mining writeprints from anonymous e-mails for forensic investigation, Digital Investigation, 1–9. doi:10.1016/j.diin.2010.03.003, (2010).
[9] Mikros, G. K., & Perifanos, K. Authorship identification in large email collections: Experiments using features that belong to different linguistic levels, (2011).
[10] Tanguy, L., Sajous, F., Calderone, B., & Hathout, N. Authorship attribution: using rich linguistic features when training data is scarce, (2012).
35 | Page
www.ijacsa.thesai.org