Scientific Publications
Improving Apache Spot Using Autoencoders for Network Anomaly Detection
Topics: Apache Spot, Security, Machine Learning, Hadoop, Spark
Apache Spot is an increasingly popular open-source platform for advanced network insights, focusing on the detection and analysis of anomalies that may correspond to security incidents. In this paper, we propose an improvement over Apache Spot’s built-in Machine Learning algorithm, Latent Dirichlet Allocation (LDA), replacing it with an Autoencoder based on deep learning techniques. We implement the Autoencoder functional block and deploy it into Apache Spot’s pipeline, integrating it with Hadoop and Spark. Finally, we evaluate and benchmark the Autoencoder against the built-in LDA, using a publicly available network traffic dataset containing cyber-attacks. The result is a considerable increase in accuracy, precision and recall.
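As an illustration of the general idea only, the sketch below shows how an autoencoder can flag anomalies by reconstruction error and how accuracy, precision and recall can be computed from the resulting labels. It is not the architecture, feature set, or Spark integration used in the paper: the single tanh hidden layer, the six synthetic features, the 99th-percentile threshold, and all function names (`train_autoencoder`, `reconstruction_error`, `binary_metrics`, `make_synthetic`) are assumptions chosen to keep the example small and self-contained with NumPy.

```python
import numpy as np


def train_autoencoder(X, hidden=2, epochs=2000, lr=0.05, seed=0):
    """Fit a one-hidden-layer autoencoder with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)         # encoder: compress to `hidden` units
        R = H @ W2 + b2                  # decoder: linear reconstruction
        gR = 2.0 * (R - X) / n           # gradient of the squared reconstruction loss
        gH = (gR @ W2.T) * (1.0 - H**2)  # backpropagate through tanh
        W2 -= lr * (H.T @ gR)
        b2 -= lr * gR.sum(axis=0)
        W1 -= lr * (X.T @ gH)
        b1 -= lr * gH.sum(axis=0)
    return W1, b1, W2, b2


def reconstruction_error(X, params):
    """Per-row mean squared reconstruction error, used as the anomaly score."""
    W1, b1, W2, b2 = params
    R = np.tanh(X @ W1 + b1) @ W2 + b2
    return np.mean((R - X) ** 2, axis=1)


def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for boolean anomaly labels."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    tn = np.sum(~y_true & ~y_pred)
    accuracy = (tp + tn) / y_true.size
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return float(accuracy), float(precision), float(recall)


def make_synthetic(n_normal, n_anomalous, seed):
    """Hypothetical data: 'normal' rows lie near a 2-D subspace, anomalies do not."""
    rng = np.random.default_rng(seed)
    A = np.array([[1, 1, 0, 0, 0.5, 0.5],
                  [0, 0, 1, 1, 0.5, -0.5]])
    normal = 0.5 * rng.normal(size=(n_normal, 2)) @ A
    normal += 0.02 * rng.normal(size=(n_normal, 6))
    anomalous = 0.8 * rng.normal(size=(n_anomalous, 6))
    return normal, anomalous


if __name__ == "__main__":
    train, _ = make_synthetic(500, 0, seed=1)
    test_normal, test_anomalous = make_synthetic(200, 20, seed=2)
    params = train_autoencoder(train)
    # Assumed rule: flag rows whose error exceeds the 99th percentile seen in training.
    threshold = np.percentile(reconstruction_error(train, params), 99)
    X_test = np.vstack([test_normal, test_anomalous])
    y_true = np.r_[np.zeros(200), np.ones(20)]
    y_pred = reconstruction_error(X_test, params) > threshold
    print(binary_metrics(y_true, y_pred))
```

The model is trained on presumed-normal data only, so rows it cannot reconstruct well receive high scores. A real deployment would replace the synthetic rows with features extracted from network traffic and would need a threshold tuned on labelled data.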