Existing log parsers extract the common part as log templates using statistical features. Log parsing, which converts raw log messages into structured data, is the first step to enabling automated log analytics. Logs generated by large-scale software systems provide crucial information for engineers to understand the system status and diagnose problems of the systems. Precisely, with only 30% training data labeled, our model can achieve the comparable results as the fully supervised version. Experimental results show favorable effect of our model on prevalent HDFS and Hadoop Application datasets. Furthermore, to further utilize unlabeled samples effectively, we propose a flexible label screening strategy that takes into account the confidence and stability of pseudo-labels. In the teacher model, the log features are augmented with small Gaussian noise, while in the student model, the strong augmentation is injected to force the model to learn a more robust feature representation with the guidance of teacher model provided soft labels. Specifically, our model consists of two homogeneous networks that share the same parameters, one is called weak augmented teacher model and the other is termed as strong augmented student model. In this paper, we put forward a novel semi-supervised dual branch model that alleviate the need for large scale labeled logs for training a deep system log anomaly detector. Thus, the data incompleteness is not conducive to the deep learning for this practical application. However, abnormal system logs in the real world are often difficult to collect, and effectively and accurately categorize the logs is an even time-consuming project. To detect anomalies in system logs, deep learning is a promising way to go. To effectively monitor system’s status, system logs are critical. With versatility and complexity of computer systems, warning and errors are inevitable.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |