Development of an object identification algorithm for the forging industry based on standard vision systems

Adrian Litwa, Lukasz Madej

AGH University of Krakow, Mickiewicza 30 av., 30-059 Krakow, Poland.



This work aims to develop an algorithm for identifying objects in a forging plant under production conditions, using only standard vision systems. Particular emphasis is placed on the accurate detection and tracking of forgings as they are transferred along the forging line; where possible, detection also covers the employees who control and support the operation of the forging machines. Such an algorithm supports control of the movement of forged elements, analysis of workplace safety, and monitoring of employee compliance with Occupational Health and Safety Regulations. In the longer term, it can also serve as a basis for additional optimization algorithms that extend the presented model in subsequent work. Three algorithmic solutions of different complexity were considered. The first two are based on artificial neural networks, while the third uses classical image processing algorithms. The training and validation datasets for the neural network models were generated from recordings taken by standard cameras located in the forging plant. Data were acquired from three cameras: two were used to create the training and validation sets, and the third was used to verify how the developed algorithms perform in a variable environment previously unseen by the models. The impact of model parameters on the results is presented at this stage of the research. The results show that machine learning-based solutions cope very well with the object detection problem and achieve high accuracy once the hyperparameters are carefully selected: detection accuracy reached 92.5% for YOLOv5 and 94.3% for Mask R-CNN. However, a competing solution that uses only image transformations, without machine learning, also gave satisfactory results, which indicates that simpler approaches can be sufficient.
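
To illustrate what a detector built only from image transformations can look like, the following is a minimal sketch, not the authors' method: the abstract does not state which transformations were used. It assumes that a hot forging appears as the brightest region of an 8-bit grayscale frame, picks a global threshold with Otsu's method, and returns a bounding box around the bright pixels. The function names `otsu_threshold` and `bright_region_bbox` and the synthetic test frame are hypothetical and serve illustration only.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the Otsu threshold of an 8-bit grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    omega = np.cumsum(prob)           # probability of the "dark" class up to each level
    mu = np.cumsum(prob * levels)     # cumulative mean up to each level
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom  # between-class variance
    sigma_b[denom == 0] = 0.0         # thresholds that leave one class empty
    return int(np.argmax(sigma_b))


def bright_region_bbox(gray: np.ndarray, threshold=None):
    """Bounding box (x_min, y_min, x_max, y_max) of pixels above the threshold.

    Assumption: the object of interest is brighter than the background.
    Returns None when no pixel exceeds the threshold.
    """
    if threshold is None:
        threshold = otsu_threshold(gray)
    mask = gray > threshold
    if not mask.any():
        return None
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(cols[0]), int(rows[0]), int(cols[-1]), int(rows[-1])


# Synthetic frame: dark background with one bright rectangle standing in for a forging.
frame = np.full((100, 120), 20, dtype=np.uint8)
frame[30:50, 40:80] = 230
print(bright_region_bbox(frame))
```

A sketch like this needs no training data, but it depends on the brightness assumption holding in every frame; in a real plant, reflections or several hot objects in view would require further processing steps.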

Cite as:

Litwa, A., Madej, L. (2024). Development of an object identification algorithm for the forging industry based on standard vision systems. Computer Methods in Materials Science, 24(1), 5–14.

Keywords:

Machine learning, Artificial neural networks, YOLOv5, Mask R-CNN, Forging industry, Object identification, Vision systems

