HIGH TECH IN EARTH SPACE RESEARCH

Binary classification of multi-attribute tagged data about anomalous events in computer systems using the SVDD algorithm

Sheluhin O.I., Rakovskiy D.I.

Introduction: The volume of system logs produced by computer systems integrated into a distributed network infrastructure is now too large to check manually in real time. Typically, each log record contains the numeric value of an observed attribute and a corresponding flag that marks the record as normal or abnormal.

The Support Vector Data Description (SVDD) algorithm demonstrates high classification accuracy even with a small training sample. The algorithm works with a multi-attribute dataset in which each observation carries a single classifying label. Consequently, the set of per-attribute labels in the initial data must be reduced to one label for the entire observation.

Purpose: to investigate the accuracy of binary classification of experimental data by the SVDD algorithm with a small training sample, given that the data are labeled for each attribute separately.

Methods: a method is proposed for reducing the set of per-attribute labels in the initial data to a single label for the entire observation by means of two approaches: "completely normal observation" and majority voting. Two types of data are considered: ordered in time and uniformly mixed. Classification accuracy was assessed by calculating the area under the ROC curve with cross-validation for different numbers of attributes.

Results: a comparative analysis of observation labeling methods showed the advantage of the "completely normal observation" approach over the "majority vote" approach without weighting. Classification accuracy on mixed data is shown to be 7% higher than on data ordered in time. The accuracy of the algorithm was investigated for different numbers of attributes using the "completely normal observation" approach. The maximum classification accuracy achieved was about 96%, obtained with 6 attributes and uniform mixing of the input dataset. A further increase in the number of attributes decreases the average classification accuracy because the proportion of anomalous observations grows. It is shown that uniform mixing of the input data can increase accuracy by 15–20%.

Practical relevance: the algorithm's consumption of computing resources grows exponentially as the amount of input data increases.

Discussion: to achieve maximum classification accuracy with acceptable resource consumption, it is necessary to form a compact input dataset that most fully reflects the functioning of the computer system in normal mode.
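
The two label-reduction approaches named in the Methods can be illustrated with a minimal Python sketch. It assumes each attribute flag is encoded as 0 (normal) or 1 (abnormal). The function names, the encoding, and the tie rule for the majority vote are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def completely_normal(flags: Sequence[int]) -> int:
    """Label an observation normal (0) only if every attribute flag is 0.

    A single abnormal attribute makes the whole observation anomalous (1).
    """
    return 0 if all(f == 0 for f in flags) else 1


def majority_vote(flags: Sequence[int]) -> int:
    """Label an observation anomalous (1) if more than half of its flags are 1.

    Unweighted vote. Ties are resolved as normal (0), which is an assumption.
    """
    return 1 if 2 * sum(flags) > len(flags) else 0


# Hypothetical per-attribute flags for three observations with three attributes.
observations: List[List[int]] = [
    [0, 0, 0],  # no attribute flagged
    [0, 1, 0],  # one of three attributes flagged
    [1, 1, 0],  # two of three attributes flagged
]

labels_cn = [completely_normal(o) for o in observations]
labels_mv = [majority_vote(o) for o in observations]

print(labels_cn)
print(labels_mv)
```

Under these assumptions, the strict rule labels an observation anomalous as soon as any one attribute is flagged, which is consistent with the remark in the Results that adding attributes raises the proportion of anomalous observations.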

Editorial board

Bobrowsky V.I.
(Ph.D., Associate Professor, Head of Department of "INTELTEH")

Borisov V.V.
(Ph.D., Professor, Full Member of the Academy of Military Sciences, Professor, Department of Computer Science of MPEI)

Budko P.A.
(Ph.D., Professor, Department of Technical Communication and Automation in S.M. Budjonny Military Academy of the Signal Corps)

Budnikov S.A.
(Ph.D., Associate Professor, Full Member of the Academy of Education Informatization, Head of the Automated Control Systems Department in Russian Air Force Military Educational and Scientific Center “Air Force Academy named after Professor N.E. Zhukovsky and Y.A. Gagarin”)

Verhova G.V.
(Ph.D., Professor, Head of Department of Automation communication companies in the Bonch-Bruevich Saint Petersburg State University of Telecommunications)

Goncharevsky V.S.
(Ph.D., Professor, Honored Worker of Science and Technology of the Russian Federation, Professor of technologies and technical support and maintenance of the automated control systems in Military Space Academy of A.F. Mozhaysky)

Komashinskiy V.I.
(Ph.D., Professor, Professor of processing and transmission of discrete messages in the Bonch-Bruevich Saint Petersburg State University of Telecommunications)

Kirpanev A.V.
(Ph.D., Associate Professor, Head of JSC "Scientific Production Enterprise 'Radar MMS'")

Kurnosov V.I.
(Ph.D., Professor, Academician of the Academy of Sciences of the Arctic, Academician of the International Academy of Informatization and of the International Academy of Defense, Security, Law and Order, Corresponding Member of the Academy of Natural Sciences, Senior Researcher at Open Joint Stock Company "Scientific Research Institute 'Rubin'")

Manuilov Y.S.
(Ph.D., Professor, Department of automated control systems space complexes in Military Space Academy of A.F. Mozhaysky)

Morozov A.V.
(Ph.D., Professor, Full Member of the Academy of Military Sciences, Head of the Department of automated command and control systems in Military Academy of troops of antiaircraft defense)

Moshak N.N.
(Ph.D., Associate Professor, Head of Department of "INTELTEH")

Prorok V.Y.
(Ph.D., Professor, Professor of automatic control systems in Military Space Academy of A.F. Mozhaysky)

Semenov S.S.
(Ph.D., Associate Professor, Professor of technical communication and automation in S.M. Budjonny Military Academy of the Signal Corps)

Sinicyn E.A.
(Ph.D., Professor, Head of the Research Department of JSC "The All-Russian research institute of radio equipment")

Shatrakov Y.G.
(Ph.D., Professor, Honored Worker of Science, Scientific Secretary of JSC "The All-Russian research institute of radio equipment")