Andi Ashari

Tech Voyager & Digital Visionary

Corruption Cases Mapping Based on Indonesia’s Corruption Perception Index

File Type: application/pdf
File Size: 1,038,554 KB
Published At: 3/24/2017
File Hash: 7a20ca9f677a4f236e235f49f9f59bf6

I am Andi Muhammad Muqsith Ashari, a Computer Science student specializing in Artificial Intelligence at BINUS University, and my research focuses on developing a corruption mapping system. The work was supervised by Lili Ayu Wulandhari, S.Si., M.Sc., Ph.D., and co-authored with Noerlina, Sasmoko, and M Alamsyah from various faculties within BINUS University. Our aim was to contribute to the fight against corruption by leveraging technology.


This study aims to develop a system that maps corruption in Indonesia by analyzing online news, so that the map reflects the nation's current state. Given the significant role of government in economic growth and the persistent problem of corruption among civil servants, the research argues for closer central government oversight of the regions. A Naïve Bayes classifier separates corruption-related news articles from unrelated ones. N-Gram and Hash Table methods then map the geographic distribution of corruption cases onto Indonesia's administrative boundaries. In our experiments, the Naïve Bayes classifier reached 100% accuracy in both the training and testing phases. Geographical mapping accuracy was lower, at 85%, so pinpointing where each case occurred remains the main area for improvement.
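As a rough illustration of how word N-grams and a hash table can work together for location mapping, here is a minimal Python sketch. It is not the code used in the study: the `GAZETTEER` entries, the function names, and the choice of N-grams up to three words are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical gazetteer (a hash table): lowercase place name -> region.
# The entries are illustrative only; the study's actual table is not shown here.
GAZETTEER = {
    "jakarta": "DKI Jakarta",
    "jakarta selatan": "DKI Jakarta",
    "bandung": "Jawa Barat",
    "jawa barat": "Jawa Barat",
    "surabaya": "Jawa Timur",
}


def word_ngrams(tokens, n):
    """Return every contiguous run of n tokens as a space-joined string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def find_regions(text, max_n=3):
    """Count gazetteer hits per region over word n-grams of length 1..max_n."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for n in range(1, max_n + 1):
        for gram in word_ngrams(tokens, n):
            region = GAZETTEER.get(gram)  # average O(1) dictionary lookup
            if region is not None:
                counts[region] += 1
    return counts


article = "Kasus korupsi di Jakarta Selatan dan Bandung"
regions = find_regions(article)
```

N-grams let multi-word names such as "jakarta selatan" match as a unit, while the dictionary keeps each lookup cheap even across a large corpus. Note that in this sketch a multi-word name also triggers its single-word match ("jakarta"), so overlapping hits are counted twice; a real system would need a rule for that.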


To train the Naïve Bayes classifier, we used 60 articles, equally divided between corruption-related and non-corruption topics (30 each). The testing phase used 20 additional articles, split the same way (10 each). We gathered the data from seven prominent Indonesian news sites: Tempo, Detik, Kompas, Merdeka, Liputan6, Tribun News, and Sindo News. Data collection relied on three techniques: web crawling to recursively navigate and index news articles, web scraping to extract the news content, and cron scheduling to run both processes automatically at regular intervals.
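The crawling step described above can be sketched with Python's standard library alone. This is an assumed, simplified design rather than the crawler used in the study: `LinkExtractor`, `same_site_links`, and `crawl` are hypothetical names, the traversal is breadth-first with a page limit, and the `fetch` function is passed in so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def same_site_links(html, base_url):
    """Return de-duplicated links that stay on the host of base_url."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    host = urlparse(base_url).netloc
    result = []
    for link in parser.links:
        if urlparse(link).netloc == host and link not in result:
            result.append(link)
    return result


def crawl(start_url, fetch, max_pages=10):
    """Visit pages breadth-first; fetch(url) must return the page HTML."""
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.append(url)
        for link in same_site_links(fetch(url), url):
            if link not in visited:
                queue.append(link)
    return visited


# Stand-in pages on a made-up host, used instead of real HTTP requests.
PAGES = {
    "https://news.example/": '<a href="/a">A</a><a href="/b">B</a>',
    "https://news.example/a": '<a href="/b">B</a><a href="https://other.example/">X</a>',
    "https://news.example/b": "",
}
visited = crawl("https://news.example/", lambda url: PAGES.get(url, ""))
```

In a real deployment `fetch` would perform an HTTP request, a scraper would extract the article text from each visited page, and a cron entry would rerun the script on a schedule.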

Results and Analysis

The Naïve Bayes classifier reached 100% accuracy in identifying both corruption and non-corruption articles in the testing phase. Through text mining, we extracted data on corruption modes from online media; each article is then analyzed further to identify the location it refers to. This approach yields a dataset of classified articles that the central government can use to target its anti-corruption efforts. In total we collected 396,565 news articles with this methodology, which gives the analysis and mapping system a broad base to work from, with the aim of helping to reduce corruption across Indonesia.
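For readers unfamiliar with the classification step, the following Python sketch shows one common form of Naïve Bayes text classifier and how test accuracy is computed. It assumes a multinomial model with add-one (Laplace) smoothing, which may differ from the variant used in the study, and the tiny training and test sets are invented for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed variant)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted label matches the true label."""
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Invented toy data; the study used 60 training and 20 test articles.
train_texts = [
    "pejabat ditahan kasus korupsi suap",
    "kpk tangkap tersangka korupsi anggaran",
    "tim sepak bola menang pertandingan",
    "harga beras naik di pasar",
]
train_labels = ["corruption", "corruption", "other", "other"]
test_texts = ["tersangka suap ditahan kpk", "pertandingan sepak bola"]
test_labels = ["corruption", "other"]

model = NaiveBayes().fit(train_texts, train_labels)
test_accuracy = accuracy(model, test_texts, test_labels)
```

Accuracy here is simply the share of test articles labeled correctly, so a perfect score on a small test set means every one of those articles was classified as expected.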

The study demonstrates the potential of using AI and machine learning techniques to map corruption cases in Indonesia. While the classification accuracy is high, further refinement is needed in the mapping process to better support government and anti-corruption agencies in targeting and reducing corruption.

This research, published in the Journal of Physics: Conference Series and available on ResearchGate, represents a step towards utilizing technology in governance and public administration to address corruption more effectively.