
Authors

Okunev Boris V.

Degree
Cand. Sci. (Eng.), Associate Professor, Department of Information Technology in Economics and Management, Branch of the National Research University MPEI in Smolensk
E-mail
ok-bmv@rambler.ru
Location
Smolensk, Russia
Articles

Solving the problem of calendar data preprocessing during the implementation of Data Mining technology

Dirty (low-quality) data is becoming one of the main obstacles to solving Data Mining tasks effectively. Because source data is accumulated from a variety of sources, the probability of obtaining dirty data is very high. Consequently, one of the most important tasks in a Data Mining process is the initial processing (cleaning) of data, i.e. preprocessing. Preprocessing calendar data is a rather time-consuming procedure that can take up to half of the total time spent implementing Data Mining technology. This time can be reduced by automating the cleaning procedure with specially designed tools (algorithms and programs). Such tools do not guarantee complete cleaning of dirty data, however, and in some cases may even introduce additional errors into the source data. The authors developed a model for automated preprocessing of calendar data based on parsing and regular expressions. The proposed algorithm is characterized by flexible configuration of preprocessing parameters, fairly simple implementation, and high interpretability of results, which in turn provides additional opportunities for analyzing unsuccessful applications of Data Mining technology. Although the proposed algorithm cannot clean every type of dirty calendar data, it works successfully in a significant share of real practical situations.
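The abstract names parsing and regular expressions as the basis of the preprocessing model but does not give the algorithm itself. The following is only a minimal illustrative sketch of that general idea, not the authors' method: the pattern list, the `clean_date` function name, and the choice of ISO 8601 as the target format are all assumptions made for the example.

```python
import re
from datetime import date

# Hypothetical, configurable list of accepted formats (an assumption:
# the abstract does not say which formats the authors' algorithm handles).
PATTERNS = [
    # Year first: 2021-03-07, 2021/3/7, 2021.03.07
    re.compile(r"^\s*(?P<y>\d{4})[-/.](?P<m>\d{1,2})[-/.](?P<d>\d{1,2})\s*$"),
    # Day first: 07.03.2021, 7/3/2021, 07-03-2021
    re.compile(r"^\s*(?P<d>\d{1,2})[-/.](?P<m>\d{1,2})[-/.](?P<y>\d{4})\s*$"),
]


def clean_date(raw):
    """Return the date as an ISO 8601 string, or None if it cannot be cleaned."""
    for pattern in PATTERNS:
        match = pattern.match(raw)
        if match is None:
            continue
        try:
            # date() rejects impossible values such as 31 February.
            return date(int(match["y"]), int(match["m"]), int(match["d"])).isoformat()
        except ValueError:
            return None
    # No pattern matched: leave the value for manual inspection.
    return None
```

Returning `None` instead of guessing keeps the result interpretable: every value the sketch could not clean stays visible for later analysis, which is in line with the abstract's remark that automated cleaning cannot handle every kind of dirty calendar data.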

Virtualization of information object vulnerability testing container based on DeX technology and deep learning neural networks

Modern information security tools, together with improved remote access methods, allow software and hardware to be audited without direct access to the system under test. One line of this development concerns implementing software on mobile ARM processor architectures. Within it, an approach stands out that integrates Linux kernel-based distributions by introducing a virtual chroot (change root) container into an Android OS-based system, which makes it possible to perform penetration testing without personal computers. An example of this approach is the Kali NetHunter distribution, which provides remote system administration functionality through the KeX module. Besides the obvious advantages of KeX, some disadvantages should be mentioned: first, slow GUI processing caused by translation to remote hosts and the need to support translation at the operating system level; second, the energy consumed when using the desktop features of the KeX module. To solve these problems, a system for virtualizing an energy-efficient container for testing the vulnerabilities of critical information objects has been developed, based on the principle of multi-containerization. The system's software consists of two components: an enlarged module for integrating the chroot container into the DeX environment (primary), and an enlarged module for ensuring energy efficiency using predictive neural network models based on variable time intervals (secondary). A comparison of the effectiveness of existing and implemented approaches to penetration testing shows that the proposed system can be used to test the security of particular platforms and systems, including highly sensitive information objects or resources.