

Okunev Boris V.

Cand. Sci. (Eng.), Associate Professor, Department of Information Technology in Economics and Management, Branch of the National Research University MPEI in Smolensk
Smolensk, Russia

Solving the problem of calendar data preprocessing during the implementation of Data Mining technology

Dirty (low-quality) data has become one of the main obstacles to solving Data Mining tasks effectively. Because source data is accumulated from many different sources, the probability of obtaining dirty data is very high. One of the most important tasks in implementing the Data Mining process is therefore the initial processing (cleaning) of data, i.e. preprocessing. Preprocessing calendar data is a rather time-consuming procedure that can take up to half of the total time spent implementing the Data Mining technology. This time can be reduced by automating the cleaning process with specially designed tools (algorithms and programs). Such tools do not, however, guarantee complete cleaning of dirty data, and in some cases they may even introduce additional errors into the source data. The authors developed a model for automated preprocessing of calendar data based on parsing and regular expressions. The proposed algorithm is characterized by flexible configuration of preprocessing parameters, fairly simple implementation, and high interpretability of results, which in turn provides additional opportunities for analyzing unsuccessful applications of Data Mining technology. Although the proposed algorithm cannot clean every type of dirty calendar data, it works successfully in a significant share of real practical situations.
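To make the idea of regular-expression-based cleaning of calendar data more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the three input formats, the `MONTHS` table, the `clean_date` function and the two-digit-year `pivot` parameter are all assumptions chosen for illustration, since the abstract does not list the supported formats or the configuration parameters.

```python
import re
from datetime import date

# Hypothetical month lookup for English three-letter prefixes.
MONTHS = {"jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
          "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12}

# Assumed input formats; a real configuration would be adjustable.
PATTERNS = [
    # Year first: 2023-04-17, 2023/4/17, 2023.04.17
    re.compile(r"^(?P<y>\d{4})[-/.](?P<m>\d{1,2})[-/.](?P<d>\d{1,2})$"),
    # Day first: 17.04.2023, 17/04/23
    re.compile(r"^(?P<d>\d{1,2})[-/.](?P<m>\d{1,2})[-/.](?P<y>\d{4}|\d{2})$"),
    # Day, month name, year: 17 Apr 2023, 17 April, 2023
    re.compile(r"^(?P<d>\d{1,2})\s+(?P<m>[A-Za-z]{3})[A-Za-z]*\.?,?\s+(?P<y>\d{4})$"),
]


def clean_date(raw, pivot=50):
    """Return a datetime.date for a recognized value, or None if it is dirty."""
    text = raw.strip()
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match is None:
            continue
        year = int(match["y"])
        if year < 100:
            # Two-digit years below the pivot are read as 20xx, others as 19xx.
            year += 2000 if year < pivot else 1900
        month_raw = match["m"]
        if month_raw.isdigit():
            month = int(month_raw)
        else:
            month = MONTHS.get(month_raw.lower())
        if month is None:
            return None
        try:
            return date(year, month, int(match["d"]))
        except ValueError:
            # Well-formed text but an impossible date, e.g. 31.02.2023.
            return None
    return None
```

Returning `None` instead of guessing keeps unrecognized values visible for later inspection, which is one simple way to keep the cleaning step interpretable.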

Virtualization of information object vulnerability testing container based on DeX technology and deep learning neural networks

Modern information security tools, together with improved remote access methods, allow software and hardware to be audited without direct access to the system under test. One line of this development concerns implementing software on mobile ARM processor architectures. Within it, one notable approach integrates Linux kernel-based distributions into an Android-based system by introducing a chroot (change root) virtual container, which makes it possible to perform penetration testing without personal computers. An example of this approach is the Kali NetHunter distribution, which provides remote system administration functionality through the KeX module. Alongside its obvious advantages, KeX has some disadvantages: first, slow GUI processing caused by translation to remote hosts and the need to support that translation at the operating system level; second, the energy consumed when using the desktop features of the KeX module. To solve these problems, a system for virtualizing an energy-efficient container for testing the vulnerabilities of critical information objects has been developed, based on the principle of multi-containerization. The system's software consists of two components: an enlarged module that integrates the chroot container into the DeX environment (primary), and an enlarged module that ensures energy efficiency using predictive neural network models based on variable time intervals (secondary). A comparison of the effectiveness of the existing and implemented approaches to penetration testing indicates that the proposed system can be used to test the security of particular platforms and systems, including highly sensitive information objects or resources.

Fuzzy model of a multi-stage chemical-energy-technological system for processing fine ore raw materials

The paper presents the results of a study aimed at building a software model of a multi-stage integrated system for processing finely dispersed ore raw materials. Such raw materials can include waste from the mining and processing of apatite-nepheline and other ores, which accumulates in large volumes in tailing dumps. This waste poses a significant environmental threat to the territories adjacent to the plants through weathering, dust formation, and the penetration of chemical compounds and substances hazardous to human health into the soil and aquifers. The relevance of the chosen research area is therefore justified by the need to improve existing production processes and to develop new technological systems for mining and processing plants, including through the principles of the circular economy and waste recycling. The proposed software model is based on trainable trees of fuzzy inference systems (blocks) of the first and second types. This approach avoids the unnecessarily complicated fuzzy rule bases that arise when a single fuzzy block is used to build a multi-parameter model of the entire multi-stage complex system. Using several fuzzy inference blocks that describe the behavior of individual units of the system, configured in accordance with the system's physical structure, allows relatively simple rule sets for the individual blocks. Jointly selecting their parameters while training the tree of fuzzy blocks makes it possible to achieve high accuracy of the solutions obtained. The novelty of the research lies in the proposed software fuzzy model of an integrated system for processing finely dispersed ore raw materials. The results of a simulation experiment conducted in the MATLAB environment using a synthetic data set generated in Simulink are presented. They show that the trained fuzzy model reproduces the parameters and variables from the test part of the synthetic set with good fidelity.
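The structure described in this abstract, several small fuzzy blocks wired together like the stages of the plant, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' MATLAB model: it uses type-1, zero-order Sugeno blocks with two inputs each, inputs normalized to [0, 1], hand-picked rule constants, and no training step. The names `tri`, `FuzzyBlock`, `RULES` and `tree` are hypothetical.

```python
from dataclasses import dataclass


def tri(a, b, c):
    """Triangular membership function on (a, c) with its peak at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu


# Two assumed linguistic terms covering normalized inputs in [0, 1].
TERMS = {"low": tri(-1.0, 0.0, 1.0), "high": tri(0.0, 1.0, 2.0)}

# Assumed rule base: (term for input 1, term for input 2, output constant).
RULES = [("low", "low", 0.0), ("low", "high", 0.5),
         ("high", "low", 0.5), ("high", "high", 1.0)]


@dataclass
class FuzzyBlock:
    """Zero-order Sugeno block with two inputs and a small rule base."""
    rules: list

    def infer(self, x1, x2):
        num = den = 0.0
        for t1, t2, out in self.rules:
            w = TERMS[t1](x1) * TERMS[t2](x2)  # rule firing strength
            num += w * out
            den += w
        return num / den if den else 0.0


def tree(x1, x2, x3, x4):
    """Two first-level blocks feed one root block, mirroring a two-stage layout."""
    stage_a = FuzzyBlock(RULES).infer(x1, x2)
    stage_b = FuzzyBlock(RULES).infer(x3, x4)
    return FuzzyBlock(RULES).infer(stage_a, stage_b)
```

Each block needs only four rules here, whereas a single block over all four inputs with the same two terms per input would need sixteen. That difference is the rule-base simplification the abstract refers to. In a trainable version, the membership parameters and rule constants of all blocks would be fitted jointly.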