Scientific Journal

Applied Aspects of Information Technology

METHODOLOGY OF INFORMATION MONITORING AND DIAGNOSTICS OF OBJECTS REPRESENTED BY QUANTITATIVE ESTIMATES BASED ON CLUSTER ANALYSIS
Abstract:
The paper discusses the methodological foundations of informational diagnostics based on cluster analysis for objects represented by quantitative estimates. The literature review shows that cluster analysis has been applied successfully in a number of cases; moreover, its theory is well developed, and the properties of its methods and distance measures have been studied, which supports the use of the cluster analysis apparatus. Therefore, developing a general methodology for diagnosing any objects represented by quantitative estimates is a topical task. The purpose of this work is to develop a methodological basis for determining diagnostic states and behavioral patterns of objects represented by quantitative estimates using cluster analysis. Because informational diagnostics is a targeted activity that assesses the state of an object on the basis of a dynamic information model, the model of a diagnosis object is discussed first. We examine the lifecycle of instances of diagnosis objects, which are described by a set of parameters whose values are determined at time slices along the lifeline of the instance. It is shown that each state of the diagnosis object is characterized by a different number of measured values. Characteristics are identified that should be analyzed to detect a threat to the instance and the need for supportive procedures that prevent premature interruption of the instance's lifecycle. Experts should formalize the conditions for terminating the lifecycle of the diagnosis object and compile the list of supporting procedures. Because the quality of any information technology depends on the quality of its input data, a procedure for analyzing diagnostic characters is developed. To start the diagnosis as early as possible and use the available data as fully as possible, methodologies for one-, two-, and N-step diagnosis are developed. All procedures use the cluster order. Transition patterns are defined for the two-step diagnosis, and trend patterns are defined for the N-step diagnosis. Transition patterns make it possible to diagnose improvement, worsening, or stability of the diagnosis object state. The procedure for analyzing diagnostic characters and the diagnosis methodologies are new scientific results. The application of the developed methodologies is demonstrated on the example of diagnosing students' success. In this case, the curriculum provides the domain model. Examples of diagnosing states and behavior, as well as of identifying recommended reactions, are provided. For one-step diagnostics, the influence of a latent factor and the diagnostic signs that show significant instability are investigated. For one- and two-step diagnostics, the conditions for forming a risk segment are provided.
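The idea of transition patterns in a two-step diagnosis can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes that cluster labels for two time slices are already available from some clustering method, that clusters are ordered by their mean score, and that a higher score means a better state. The function names and the sample data are hypothetical.

```python
from statistics import mean


def cluster_ranks(scores, labels):
    """Rank clusters by mean score (0 = lowest mean). Assumption: higher is better."""
    groups = {}
    for score, label in zip(scores, labels):
        groups.setdefault(label, []).append(score)
    ordered = sorted(groups, key=lambda label: mean(groups[label]))
    return {label: rank for rank, label in enumerate(ordered)}


def transition_pattern(rank_before, rank_after):
    """Compare an object's cluster rank at two consecutive time slices."""
    if rank_after > rank_before:
        return "improvement"
    if rank_after < rank_before:
        return "worsening"
    return "stability"


def two_step_diagnosis(scores1, labels1, scores2, labels2):
    """Return one transition pattern per object (objects are aligned by index)."""
    ranks1 = cluster_ranks(scores1, labels1)
    ranks2 = cluster_ranks(scores2, labels2)
    return [
        transition_pattern(ranks1[a], ranks2[b])
        for a, b in zip(labels1, labels2)
    ]


# Hypothetical data: four students, one aggregate score per time slice,
# with cluster labels produced by any clustering method.
scores_t1 = [55, 60, 85, 90]
labels_t1 = ["a", "a", "b", "b"]
scores_t2 = [80, 58, 88, 52]
labels_t2 = ["x", "y", "x", "y"]

patterns = two_step_diagnosis(scores_t1, labels_t1, scores_t2, labels_t2)
print(patterns)
```

Comparing cluster ranks rather than raw scores keeps the diagnosis relative to the group at each time slice. This is one possible reading of "cluster order", not a confirmed detail of the paper.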
DOI: 10.15276/aait.01.2020.1

Received 11.01.2020
Received after revision 18.02.2020
Accepted 20.02.2020

[ © Odessa National Polytechnic University, 2018.]