Scientific Journal

Applied Aspects of Information Technology

A disadvantage of many diagnostic systems is that they cannot adequately assess the reliability of their decisions. In a classification problem, individual examples may be classified with differing degrees of quality, so a measure of the quality of an example's classification (a non-conformity measure) was used. The goal of the research is to improve the evaluation of diagnostic reliability in medicine using conformal predictors, which make probabilistic classification possible and identify abnormal cases in which the classifier either cannot determine the class of a particular object or assigns the object to several classes at once. The paper describes the construction and testing of several probabilistic binary classification models based on machine learning, in particular the SVM method and conformal predictors that use a non-conformity measure. The medical Breast Cancer Wisconsin (Diagnostic) Data Set was used for training and testing to construct linear models, polynomial models of different degrees, and RBF models. We assessed the prediction results for every example in the test set, as well as integral characteristics of model quality that take into account both the correctness of the predictions for each class and the number of anomalies of each type. On the basis of the best models selected (linear, second-degree polynomial, and RBF), we developed an intelligent medical diagnostic system that automates model construction, performs diagnostics, and displays either the confidence of the resulting diagnosis or a message that a diagnosis cannot be made. The program also allows multiple doctors to log in to the system, add new patients, and edit information about them; every patient has a medical record containing the examination results and the diagnoses given. The results of the research can be applied in diagnostic systems for various diseases by taking data on symptoms and the corresponding diagnoses and constructing appropriate models on that basis.
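The idea of a conformal predictor that reports either a single diagnosis with a confidence, no diagnosis, or several diagnoses at once can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an inductive (calibration-set) variant, class labels -1 and +1, and a non-conformity measure equal to the negated signed SVM decision value; the function names, the calibration numbers, and the significance level `epsilon` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def nonconformity(decision_value: float, label: int) -> float:
    """Assumed measure: negated signed margin; larger means 'stranger'."""
    return -label * decision_value


def p_value(calibration_scores: List[float], test_score: float) -> float:
    """Share of scores (calibration plus the test one) at least as strange."""
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least + 1) / (len(calibration_scores) + 1)


def predict(calibration_scores: List[float], decision_value: float,
            epsilon: float) -> Dict[str, object]:
    """Return p-values per label, the prediction region, and its kind."""
    p = {label: p_value(calibration_scores,
                        nonconformity(decision_value, label))
         for label in (-1, 1)}
    region = [label for label, pv in p.items() if pv > epsilon]
    if len(region) == 1:
        kind = "single"      # one diagnosis can be reported
    elif not region:
        kind = "empty"       # classifier cannot determine the class
    else:
        kind = "multiple"    # object fits several classes at once
    return {"p_values": p, "region": region, "kind": kind}


# Hypothetical held-out examples: (SVM decision value, true label).
calibration: List[Tuple[float, int]] = [
    (2.0, 1), (1.5, 1), (-1.8, -1), (-0.9, -1), (0.3, 1),
    (-0.2, -1), (1.1, 1), (-2.5, -1), (0.4, -1),
]
calibration_scores = [nonconformity(d, y) for d, y in calibration]

confident = predict(calibration_scores, 1.7, epsilon=0.15)
ambiguous = predict(calibration_scores, 0.1, epsilon=0.15)
rejected = predict(calibration_scores, 0.1, epsilon=0.25)

for name, result in [("confident", confident), ("ambiguous", ambiguous),
                     ("rejected", rejected)]:
    print(name, result["kind"], result["region"], result["p_values"])
```

With these made-up numbers, the decision value 1.7 yields p-values 0.7 for label +1 and 0.1 for label -1, so at `epsilon = 0.15` only +1 remains in the region. The decision value 0.1 gives 0.2 for both labels, which produces a two-label region at `epsilon = 0.15` and an empty region at `epsilon = 0.25`. These are the two kinds of anomaly the abstract mentions.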


[ © Odessa National Polytechnic University, 2018.]