Journal of Applied Informatics

Authors

Ivashko Alexander G.

Degree	Dr. Sci. (Eng.), Professor, Head of the Software and Systems Engineering Department, Institute of Mathematics and Computer Sciences, University of Tyumen
E-mail	a.g.ivashko@utmn.ru
Location	Tyumen, Russia
Articles	Classification system for documents with mine surveying data
All enterprises engaged in exploration activities in the Russian Federation need to formulate tasks for the mine surveying service and monitor their execution, which affects the enterprise's workflow. This creates the problem of organizing efficient document processing in electronic document management systems (EDMS), in particular the timely identification of documents containing mine surveying data. The article presents a possible solution: an automated document classification system integrated into the EDMS as an optional add-on for 1C:Document Management. As part of building the classification system, a preprocessing script for raw document texts was developed and implemented; it covers cleaning, lemmatization, stop-word removal, and preparation of input features for the classifier. The applicability of different machine learning algorithms to this classification problem was studied, and the hyperparameter values giving the highest ROC AUC were determined. The quality of all resulting models was assessed using precision, recall, and F-measure, and the stability of classification quality under changes in the input data was investigated. The identified instability of classification results was resolved by building and deploying a machine learning model in the form of an ensemble of classifiers. The classification model (an ensemble of classifiers) was tested on a set of real documents from Gazprom nedra Ltd; classification quality on the test sample, measured by ROC AUC, was 0.91. In addition to the classification module itself, the system contains a database for storing training results, a function library for working with the database, and APIs that process classification requests coming from external systems.
These APIs, in particular, make it possible to load saved trained models, validate data coming from external systems, preprocess input text documents, train new models and assess their quality, and save both trained models and their test results. Additional training of models on new data is also supported.

Mathematical modeling of the assessment of credibility in a message in social networks on Russian language
The problem of unreliable information is currently the most critical one in the dissemination of information on the Internet. The global migration of information sources to the Internet means that information spreads very quickly and its accuracy is difficult to verify. This issue arises in discussions of the media, social networks, blogs, and other sources of information. Transmitting information is no longer a matter for the media alone: any Internet user can be a source of information. The growth of free information sources and the digitalization of sources have led to a loss of confidence in the official media. One consequence is the development of methods for automatically detecting false information. The objectives of this work are to study whether a model can be built that automatically determines the level of trust in a Russian-language social network message, and to determine the most influential parameters. The method under consideration analyzes a post from several angles, using parameters obtained from the message text, user data, and the spread of the message through the social network. To apply machine learning methods, a data sample was collected and labeled, and machine learning models were trained on it. The data sample was balanced to obtain stable results.
Training yielded five models, trained on both the balanced and the original data samples. Results were also obtained for models restricted to a subset of parameters in order to identify the most influential ones. The outcome is a set of machine learning models with high metric values on test data, together with the most influential parameters, which include parameters specific to the Russian language.

Development of a retail visitor counting system using computer vision
Information about store traffic is of great value to businesses. It makes it possible to evaluate the effectiveness of marketing campaigns and to optimize staff work schedules. Moreover, data on the number of visitors can be used indirectly to analyze the competitive environment. Although various technological approaches to counting visitors exist, each has significant disadvantages. The purpose of the research is to develop a software system that counts visitors by applying machine vision technologies to a video stream. To this end, the counting task was split into two subtasks, detecting visitors in the frame and tracking their movements, each of which was solved using convolutional neural networks. The neural networks were trained and validated on data collected in real conditions, exclusively from the cameras of the system's customer. Together with the advanced counting algorithm, this made the system capable of: a) excluding from the count employees of the retail chain who wear a corporate uniform; b) correctly handling complex and unpredictable visitor trajectories in a video surveillance scene; c) correctly handling video stream decoding errors that result in dropped frames, without compromising counting accuracy. The system's quality was tested on 504 test videos, in which a total of 739 visitors entered and left the outlet.
When every frame was processed, the final counting error was 3%. A series of experiments further showed that when only every fourth frame was processed (which reduced the load on the system by a factor of 4), the counting error increased by only 1%.
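
The trade-off between frame skipping and counting error described in the last abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `every_nth` and `relative_count_error` are hypothetical, the integers stand in for decoded video frames, and the error definition (absolute difference between counted and actual visitors, divided by the actual number) is an assumption, since the abstract does not state how the error was computed.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def every_nth(frames: Iterable[T], step: int) -> Iterator[T]:
    """Yield frames 0, step, 2*step, ... and skip the rest (hypothetical helper)."""
    if step < 1:
        raise ValueError("step must be >= 1")
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield frame


def relative_count_error(counted: int, actual: int) -> float:
    """Relative error of a visitor count (assumed definition, not from the article)."""
    if actual <= 0:
        raise ValueError("actual must be positive")
    return abs(counted - actual) / actual


# Integers stand in for decoded video frames (illustrative data only).
frames = list(range(12))
kept = list(every_nth(frames, 4))
print(kept)                             # [0, 4, 8]
print(len(frames) / len(kept))          # 4.0, i.e. four times fewer frames to process
print(relative_count_error(97, 100))    # 0.03, i.e. a 3% error on made-up counts
```

Keeping every fourth frame divides the number of frames passed to detection and tracking by four, which is consistent with the fourfold load reduction reported in the abstract; the counts 97 and 100 are made up purely to show what a 3% error looks like under the assumed definition.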