Big Data in Forensic Linguistics: Improving Authorship Attribution and Threat Detection in Cyber Contexts

Authors

  • Moh Atikurrahman, Indonesian Literature Study Program, UIN Sunan Ampel Surabaya

Keywords:

Authorship Attribution, Big Data, Cybercrime, Forensic Linguistics, Threat Detection

Abstract

Forensic linguistics, the study of language in legal and investigative contexts, has gained relevance in the digital era. The proliferation of online communication has created both challenges and opportunities for authorship attribution and threat detection. This study explores how Big Data enhances forensic linguistic practice by enabling large-scale analysis of digital texts such as emails, chat messages, and social media posts. Using natural language processing (NLP), stylometry, and machine learning techniques, we analyze millions of documents to identify linguistic fingerprints, detect threatening language, and attribute authorship in cybercrime cases. The results indicate that Big Data improves the accuracy of authorship attribution based on stylistic markers and enhances threat detection through the analysis of lexical, syntactic, and pragmatic patterns. However, ethical concerns, including privacy, consent, and the risk of algorithmic bias, pose significant challenges. This article argues that Big Data-driven forensic linguistics is a powerful tool for law enforcement and legal proceedings, but that its application must be guided by strict ethical frameworks. By combining linguistic theory, computational models, and Big Data analytics, forensic linguistics can contribute significantly to cybercrime prevention and the protection of digital communities.

Published

30-12-2024

How to Cite

Moh Atikurrahman. (2024). Big Data in Forensic Linguistics: Improving Authorship Attribution and Threat Detection in Cyber Contexts. Prosiding SENALA (Seminar Nasional Linguistik Indonesia), 1(1), 23–27. Retrieved from https://senala.upnjatim.ac.id/index.php/senala/article/view/6
