Self-Supervised Log Anomaly Detection with LogBERT-Style Transformers: Full Empirical Evaluation on a Reproducible SynHDFS Benchmark


Qi Xin

Abstract

Log-based anomaly detection is a core problem in AIOps because system logs provide fine-grained evidence of failures, performance regressions, and security incidents. Recent work has shown that self-supervised sequence modeling substantially improves generalization compared with purely frequency-based detectors, especially when labeled anomalies are scarce. This paper presents a LogBERT-style transformer framework for session-level log anomaly detection and reports a complete, reproducible experimental evaluation. Because large archived log datasets could not be downloaded in our experimental environment, we construct a faithful fallback benchmark, SynHDFS-6k, which mimics HDFS-style block workflows by composing normal execution patterns and injecting five realistic anomaly types. SynHDFS-6k contains 6000 sessions with a fixed 5.0% anomaly rate and a vocabulary of 20 event templates. We train a two-layer transformer encoder with masked language modeling on normal sessions only and derive an anomaly score from the pseudo log-likelihood (PLL), computed by masking each token position once. We compare against unigram and bigram probabilistic models, PCA reconstruction error, one-class SVM, isolation forest, a DeepLog-style GRU next-event predictor, and a supervised logistic regression upper bound. On the SynHDFS-6k test split, the proposed LogBERT-PLL achieves Precision=0.615, Recall=0.533, F1=0.571, ROC-AUC=0.898, and PR-AUC=0.594. We additionally analyze transformer scoring strategies (PLL mean, PLL top-k, PLL max, random masking, and CLS Mahalanobis), report runtime and model capacity trade-offs, and quantify per-anomaly-type detection behavior. The study provides an end-to-end blueprint for transformer-based self-supervised log anomaly detection under a fully specified protocol, and it highlights strengths and limitations that inform deployment on real-world HDFS logs.
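The PLL scoring procedure described above (mask each token position once, score the true token under the masked language model, and average) can be sketched as follows. This is a minimal, self-contained illustration: the `fit_toy_mlm` frequency model is a hypothetical stand-in for the paper's trained two-layer transformer encoder, and the function and variable names are our own, not from the paper.

```python
import math

MASK = "<MASK>"  # placeholder token used when masking a position

def fit_toy_mlm(normal_sessions):
    """Fit a Laplace-smoothed unigram model on normal sessions only.
    Toy stand-in for a masked-LM transformer: it ignores context and
    returns the same vocabulary distribution for any masked position."""
    counts, total, vocab = {}, 0, set()
    for session in normal_sessions:
        for event in session:
            counts[event] = counts.get(event, 0) + 1
            total += 1
            vocab.add(event)
    v = len(vocab)

    def predict(masked_session, pos):
        # Distribution over the vocabulary for the masked position.
        return {e: (counts.get(e, 0) + 1) / (total + v) for e in vocab}

    return predict

def pll_score(session, predict):
    """Anomaly score = negative mean pseudo log-likelihood.
    Each token position is masked exactly once; the true token is
    scored under the model's prediction for that position."""
    logps = []
    for i, true_event in enumerate(session):
        masked = session[:i] + [MASK] + session[i + 1:]
        dist = predict(masked, i)
        p = dist.get(true_event, 1e-12)  # tiny floor for unseen events
        logps.append(math.log(p))
    return -sum(logps) / len(logps)  # higher score = more anomalous
```

With a real masked LM, `predict` would be one forward pass per masked position, so scoring a session of length L costs L forward passes; sessions containing rare or unseen events receive low pseudo log-likelihood and hence high anomaly scores.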


How to Cite

Self-Supervised Log Anomaly Detection with LogBERT-Style Transformers: Full Empirical Evaluation on a Reproducible SynHDFS Benchmark. (2026). JEECS (Journal of Electrical Engineering and Computer Sciences), 11(1), 23-35. https://doi.org/10.54732/jeecs.v11i1.3

References

[1] S. He, P. He, Z. Chen, T. Yang, Y. Su, and M. R. Lyu, “A Survey on Automated Log Analysis for Reliability Engineering,” ACM Computing Surveys, vol. 54, no. 6, 2022, doi: 10.1145/3460345.

[2] S. He, J. Zhu, P. He, and M. R. Lyu, “Experience Report: System Log Analysis for Anomaly Detection,” Proceedings - International Symposium on Software Reliability Engineering, ISSRE, pp. 207–218, 2016, doi: 10.1109/ISSRE.2016.21.

[3] J. Zhu, S. He, P. He, J. Liu, and M. R. Lyu, “Loghub: A Large Collection of System Log Datasets for AI-driven Log Analytics,” Proceedings - International Symposium on Software Reliability Engineering, ISSRE, pp. 355–366, 2023, doi: 10.1109/ISSRE59848.2023.00071.

[4] P. He, J. Zhu, Z. Zheng, and M. R. Lyu, “Drain: An Online Log Parsing Approach with Fixed Depth Tree,” Proceedings - 2017 IEEE 24th International Conference on Web Services, ICWS 2017, pp. 33–40, 2017, doi: 10.1109/ICWS.2017.13.

[5] M. Du, F. Li, G. Zheng, and V. Srikumar, “DeepLog: Anomaly detection and diagnosis from system logs through deep learning,” Proceedings of the ACM Conference on Computer and Communications Security, pp. 1285–1298, 2017, doi: 10.1145/3133956.3134015.

[6] W. Meng et al., “LogAnomaly: Unsupervised detection of sequential and quantitative anomalies in unstructured logs,” IJCAI International Joint Conference on Artificial Intelligence, vol. 2019-August, pp. 4739–4745, 2019, doi: 10.24963/ijcai.2019/658.

[7] X. Zhang et al., “Robust log-based anomaly detection on unstable log data,” ESEC/FSE 2019 - Proceedings of the 2019 27th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 807–817, 2019, doi: 10.1145/3338906.3338931.

[8] M. Landauer, S. Onder, F. Skopik, and M. Wurzenberger, “Deep learning for anomaly detection in log data: A survey,” Machine Learning with Applications, vol. 12, p. 100470, 2023, doi: 10.1016/j.mlwa.2023.100470.

[9] Z. A. Khan, D. Shin, D. Bianculli, and L. C. Briand, “Impact of log parsing on deep learning-based anomaly detection,” Empirical Software Engineering, vol. 29, no. 6, Art. no. 139, 2024, doi: 10.1007/s10664-024-10533-w.

[10] S. Ali, C. Boufaied, D. Bianculli, P. Branco, and L. Briand, “A comprehensive study of machine learning techniques for log-based anomaly detection,” Empirical Software Engineering, vol. 30, no. 5, Art. no. 129, 2025, doi: 10.1007/s10664-025-10669-3.

[11] Y. Lee, J. Kim, and P. Kang, “LAnoBERT: System log anomaly detection based on BERT masked language model,” Applied Soft Computing, vol. 146, p. 110689, 2023, doi: 10.1016/j.asoc.2023.110689.

[12] W. Niu, X. Liao, S. Huang, Y. Li, X. Zhang, and B. Li, “A robust Wide & Deep learning framework for log-based anomaly detection,” Applied Soft Computing, vol. 153, p. 111314, 2024, doi: 10.1016/j.asoc.2024.111314.

[13] A. Vaswani et al., “Attention Is All You Need,” 31st Conference on Neural Information Processing Systems (NIPS 2017), 2017.

[14] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” NAACL HLT 2019, vol. 1, pp. 4171–4186, 2019, doi: 10.18653/v1/N19-1423.

[15] H. Guo, S. Yuan, and X. Wu, “LogBERT: Log Anomaly Detection via BERT,” Proceedings of the International Joint Conference on Neural Networks, vol. 2021-July, 2021, doi: 10.1109/IJCNN52387.2021.9534113.

[16] P. Himler, M. Landauer, F. Skopik, and M. Wurzenberger, “Anomaly detection in log-event sequences: A federated deep learning approach and open challenges,” Machine Learning with Applications, vol. 16, p. 100554, 2024, doi: 10.1016/j.mlwa.2024.100554.

[17] I. T. Jolliffe, Principal Component Analysis. New York: Springer-Verlag, 2002.

[18] B. Schölkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson, “Estimating the Support of a High-Dimensional Distribution,” Neural Computation, vol. 13, no. 7, pp. 1443–1471, 2001, doi: 10.1162/089976601750264965.

[19] F. T. Liu, K. M. Ting, and Z. H. Zhou, “Isolation forest,” Proceedings - IEEE International Conference on Data Mining, ICDM, pp. 413–422, 2008, doi: 10.1109/ICDM.2008.17.

[20] P. C. Mahalanobis, “On the Generalised Distance in Statistics,” Proc. National Institute of Sciences of India, vol. 2, pp. 49–55, 1936.