Journal of Applied Informatics

Editorial Board

Journal archive

For authors

For advertisers

Useful links

Contacts

Journal archive

< Back to archive

№2(116) March-april 2025 year

Content:

IT management
Performance management
Fuzzy bioinspired method for forming a set of candidates for linear positions Authors: Olga Bulygina, Denis D. Yartsev, Nikolay S. Kulyasov, Margarita Yu. Vorotilova	Line personnel occupy the vast majority of positions in many organizations, which determines the importance of timely and successful filling of such vacancies. The search for candidates for such positions is carried out through mass recruitment, which is characterized by high labor intensity, budgetary and time constraints, and the need for regular repetition due to high staff turnover rates. The noted features make it impossible to carry out this process without the use of modern software. Since mass recruitment does not require finding the best candidate for each vacancy and is limited to searching for specialists based on formal criteria from their resume, the main share of labor and time costs falls on the primary selection of candidates. Existing software does not have sufficient functionality to effectively automate this process. Given the need to process large volumes of multidimensional data, they do not provide a comprehensive accounting of different types of candidate characteristics and automatic adjustment of selection criteria taking into account their priority for the vacancy being filled. To solve the problem, an automated method for forming a set of candidates for linear positions was developed. It is based on the integrated use of an adaptive neuro-fuzzy inference system and a bioinspired algorithm inspired by the behavior of a fish school. The developed hybrid method was implemented as a computer program using the Python language. The results of its testing showed the convergence of the optimization algorithm, and their comparison with manual selection confirmed the prospects for using it to solve tasks of mass recruitment of line personnel.
Software engineering
Forecasting mudflow characteristics with incomplete and imprecise data based on machine learning models Authors: Larisa A. Lyutikova, Elena M. Kazakova	In this paper we propose a method for analyzing incomplete and inaccurate data in order to identify factors for predicting the volume of mudflows. The analysis is based on the mudflow activity inventory data for the south of Russia, which is poorly formalized, has missing values in the mudflow types field, and requires significant additional processing. Due to the lack of information on the mudflow type in the cadastral records, the primary objective of the study is to develop and apply a methodology for classifying mudflow types to fill in the missing data. For this purpose, a comparative study of machine learning methods was performed, including neural networks, support vector machines, and logistic regression. The experimental results indicate that the neural network-based model has the highest prediction accuracy among the methods considered. However, the support vector machine method demonstrated a higher sensitivity rate for classes represented by a small number in the test sample. In this regard, it was concluded that an integrated approach is appropriate, combining the strengths of both methods, which can help improve the overall classification accuracy in this subject area. Forecasting the volume of material removal and data clustering showed the presence of nonlinear dependencies, incompleteness and poor structuring of data even after filling in missing values of the mudflow type, which required a transition from numerical data to categorical data. This transition increased the model’s resistance to outliers and noise, allowing for a highly accurate forecast of a one-time removal. Since the forecast does not reveal the factors influencing its result, an analysis was conducted to identify these factors and present the found patterns in the form of logical rules. The formation of logical rules was carried out using two methods: the associative analysis method and the construction of a logical classifier. As a result of applying associative analysis, rules were found that reflect some patterns in the data, which, as it turned out, need significant correction. The use of the developed logical methods made it possible to clarify and correct the patterns identified using associative rules, which, in turn, ensured the determination of a set of factors influencing the volume of the mudflow.
Algorithmic efficiency
Application of causal inference and machine learning methods for automated interpretation of medical data Authors: I. Ilin, Daniel B. Alliti	In the context of rapid advancements in machine learning and causal inference methodologies, their integration into medical research is of paramount importance. Implementing appropriate methods in the medical domain facilitates robust assessment of treatment efficacy at the individual level. This study aims to conduct experiments on synthetic data and evaluate the accuracy of predicting individual treatment effects using T-learner and S-learner methods. The article presents an integrated approach to medical data analysis, combining causal inference techniques with machine learning algorithms. For the first time, a comprehensive comparison of the effectiveness of T-learner and S-learner methods in assessing individual treatment effects has been conducted. Based on simulated data, the study experimentally determines the optimal application conditions for these methods, depending on the characteristics of clinical data. The experiments revealed that the T-learner method demonstrated higher accuracy (87%) compared to the S-learner (84%), making it preferable when there are significant differences between treatment and control groups. However, the S-learner method exhibited greater generalization capability in scenarios with limited data volume. The c-for-benefit index was employed to validate the predicted treatment effects, with results confirming the high accuracy of both methods. These findings underscore the potential of integrating machine learning and causal inference methods to develop personalized therapeutic strategies and automate medical data analysis, thereby improving clinical outcomes and treatment quality. The developed approach enhances the precision of predicting treatment outcomes at the individual level and can be integrated into clinical decision support systems. The presented results offer new opportunities for personalized healthcare and can serve as a foundation for subsequent research in this field.
Probabilistic forecasting and data reliability assessment in intelligent transportation systems Authors: Petr M. Trefilov, Vladimir I. Venets, Maria A. Romanova	Probabilistic models for forecasting and assessing the reliability of navigation parameters in intelligent transportation systems are proposed. The relevance of the study is driven by the need to enhance the reliability of robotic transportation systems operating in dynamically changing urban environments. In such environments, sensor failures, signal distortions, and a high degree of data uncertainty are possible. The proposed approach is based on the application of probabilistic analysis methods and statistical control to detect anomalies in navigation parameters such as coordinates, speed, and orientation. The concept of navigation data reliability is introduced as a quantitative measure characterizing the degree of correspondence between the measured parameters and the actual state of the system. Key validity criteria are defined: confidence probability, significance level and confidence coefficients. To improve the reliability of parameter assessment, a combination of statistical analysis methods and filtering algorithms is proposed. Forecasting involves preliminary data processing aimed at smoothing noise and verifying data consistency. Outlier detection is performed using statistical methods, including confidence intervals and variance minimization. An forecasting model based on the Kalman filter and dynamic updating of probabilistic estimates has been developed. The integration of various methods into a unified system minimizes the impact of random and systematic errors, ensuring more accurate assessment of navigation parameters. The proposed approach is applicable to the development of navigation systems for autonomous robots and unmanned vehicles, enabling them to adapt to external conditions without the need for precise a priori data.
Models and methods
Organization of control of digital devices based on the assessment of the reliability of their functioning Authors: E. Berezkin, V. SHuvalov	The performance reliability indicators characterize the operability of the “test object – test tool” system and significantly depend on the performance reliability parameters of the testing equipment. Consequently, they can serve as criteria for selecting the necessary tools at the design stage of digital devices and assessing their effectiveness. The paper proposes quantitative criteria for assessing the effectiveness of the hardware testing method based on the assumption that a digital device performs a certain generalizing function, the values of which depend on a set of quantities reflecting individual operating modes of the device, and can be classified as correct only if there are no errors in the operating device. To quantitatively assess the performance reliability of the equipment, it is proposed to use the probability value that the digital device as a whole will function error-free provided that there is no detectable fault. The calculated evaluation value allows you to select the best of several possible test circuit options or synthesize a new one. Cases of organizing test procedures based on various principles and their combinations are considered. An optimization problem of placing test circuits in a tested device is formulated and a technique for solving it under certain restrictions is proposed. A distinctive feature of the proposed approach is the elimination of the need to use the values of conditional probabilities of detected faults, on the use of which known methods are built, although their practical receipt is very labor-intensive. The operation of the method of rational placement of control circuits is illustrated by the example of a control signal block.
Software engineering
Forward-looking ways of enhancing software development life cycle processes Authors: O. Stoyanova, Ivan S. Okuskov	The article explores the ways to increase the operational efficiency of IT companies by improving approaches to solving tasks at different stages of the software development life cycle (SDLC). It is shown that the greatest potential for growth of operational efficiency lies in the tasks of the stage of processing requirements. Automation of these tasks via artificial intelligence tools, especially those based on large language models (LLM), will significantly reduce the duration of SDLC by cutting the number of iterations for rework and error correction. The paper analyses the capabilities of such tools, as well as the complexities associated with their implementation, including the risk of technological dependence, the problems of integrating new tools into the current IT landscape of companies, the risks of disrupting the continuity of processes during implementation, as well as the difficulties of assessing the economic effects. The paper concludes that a comprehensive approach to SDLC modernisation is needed, which should combine technological innovation with organisational changes in the form of transformation of corporate culture and processes aimed at adapting to new technologies. Directions for further research are suggested, including the development of universal methodologies for implementing AI tools and models for assessing their economic efficiency.
Laboratory
Researching of processes and systems
Application of neural networks to reconstruct a continuous representation of a signal from discrete samples taken at random moments in time Authors: A. Puchkov, Maksim V. Maksimkin, N. Prokimnov, Petr N. Mashegov	Digital signal processing in cyber-physical technological systems is based on algorithms that operate with information presented in a discretized form both by level and by time. In the latter case, the constancy of the time quantization interval is assumed as one of the postulated conditions for the application of algorithms. At the same time, in practice, such constancy is not always ensured, which leads to the omission of individual samples or even to a random nature of the discretization. Therefore, an urgent research task is to develop methods and algorithms for signal processing under conditions of random discretization, in particular, for the restoration of continuous signals from their discrete samples taken with violation of the requirements of the Kotelnikov – Shannon theorem. If the discretization interval of a continuous signal is calculated taking into account its requirements (i. e. discretization is carried out with a frequency not lower than the Nyquist frequency), then its exact restoration from discrete samples is allowed, otherwise it is impossible. However, even for this situation, there are approaches to the restoration of continuous signals that take into account additional a priori information about the nature of the signal. Some of these approaches are based on complex mathematical apparatus, which makes them difficult to apply and not universal, while others use deep machine learning models that are expensive in terms of computational resources and demanding in terms of training data volumes. Under these conditions, a method is proposed for restoring a signal with a limited spectrum from discrete samples, the time interval between which is random, and its mathematical expectation is greater than the value determined by the Kotelnikov – Shannon theorem for regular discretization. The novelty of the research results lies in the proposed method and algorithm for restoring a continuous signal, as well as in the results of the analysis of a numerical experiment conducted with a software model executed in the MatLab environment and implementing the developed algorithm.
Construction of fuzzy situational-precedent models of regional economic systems using ontologies Authors: Maxim Dli, M. Chernovalova, Andrey Sokolov	The article is devoted to solving the problem of constructing and using models of regional economic systems that take into account management situational aspects. The specifics of information about these systems functioning, which cause difficulties in forming analytical and statistical dependencies, allowed us to conclude that it is advisable to use fuzzy situational models based on precedents for solving this problem. A procedure of constructing fuzzy situational precedent models of regional economic systems is proposed. It is characterized by the presence of additional transitions between nodes of the network graph (states) due to the presence of uncertainty factors leading to different observed reactions of the system to similar control effects. The procedure also allows the use of natural language information, which significantly expands the possibilities for economic and mathematical modeling of situational aspects of rational systems and processes management. This is achieved by applying domain ontology to determine the degree of proximity between the elements of the use case database and the current situation. The results of using the developed software tools implementing the proposed procedure to support decision-making on managing a regional IT cluster when implementing joint programs by its participants showed a fairly high degree of validity of the proposed recommendations.
Forecasting electricity consumption time series in the R programming environment Authors: Diana V. Dzizinskaya, Olga V. Ledneva, Maria G. Tindova, Svetlana V. Yazykova	To ensure effective management of energy systems, it is necessary to analyze and forecast time series of electricity consumption. Obtaining accurate forecasts of electricity consumption to optimize the operation of energy networks, planning the production and distribution of electricity explains the relevance of this study. This article presents a comparative analysis of medium-term electricity consumption forecasting models utilizing the R programming environment. The study encompasses classical forecasting models such as SARIMA and ETS, as well as less commonly referenced machine-oriented models like TBATS and Prophet in the scientific literature. The paper details the R functions necessary for performing calculations and includes a code snippet intended for preliminary data analysis and forecasting. All examined models demonstrate high accuracy in medium-term electricity consumption forecasting. However, variability in model fitting quality metrics is observed depending on the regional branches of the Unified Energy System of Russia. The application of ETS algorithms and bagging ETS yields the best forecasts with a minimal mean absolute error (slightly over 1%) for Russia as a whole, as well as for the consolidated energy system of the Urals. The TBATS model is recommended for predicting electricity consumption in the Center and East zones, while SARIMA is suggested for the South zone. Although the Prophet model exhibited satisfactory forecasting quality, the analysis indicates that its effectiveness significantly increases when applied to high-frequency data, such as weekly or hourly time series.