Spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network / Behzad Mahaseni

In this research, we address the problem of event detection and localization in football (soccer) videos. While the problem of event detection in videos is itself a research problem, event detection in sports, especially in football, has an important commercial impact as well. Football is played by...


Bibliographic Details
Main Author: Behzad, Mahaseni
Format: Thesis
Published: 2021
Subjects: QA75 Electronic computers. Computer science; T Technology (General)
Online Access: http://studentsrepo.um.edu.my/14569/
http://studentsrepo.um.edu.my/14569/1/Behzad.pdf
http://studentsrepo.um.edu.my/14569/2/Behzad_Mahaseni.pdf
_version_ 1848775004678782976
author Behzad , Mahaseni
author_facet Behzad , Mahaseni
author_sort Behzad , Mahaseni
building UM Research Repository
collection Online Access
description In this research, we address the problem of event detection and localization in football (soccer) videos. Event detection in videos is a research problem in its own right, and event detection in sports, especially football, also has significant commercial impact. Football is played by more than 250 million players in more than 200 nations and has the highest television audience of any sport, which makes it the most popular sport in the world. Given the advances in streaming technologies on mobile platforms, it is important to develop fast and efficient algorithms for processing the thousands of videos captured and stored in the cloud. Unlike images, videos provide additional temporal information. While this information is helpful, it also makes reasoning more challenging. On the one hand, the local correlation between adjacent frames makes it possible to identify short-range correlations between player movements. On the other hand, one can identify mid-range and long-range correlations between events that are seconds away from each other. An important challenge in analyzing long videos is how to account for the full range of correlations, from short to long, between video frames. Localizing events in a football video (temporal segmentation) is a challenging problem. While the general problem of temporal segmentation in videos has been extensively addressed in the literature, to the best of our knowledge this work is among the first to address the event localization problem in “long” football videos using end-to-end deep learning techniques. Football videos are long, and the correlation between frames ranges from short to long. To model these ranges of correlation, we propose a combination of two-stream CNNs and dilated RNNs with LSTM cells that captures both short-range and long-range correlations.
Our experimental results show an accuracy improvement of 5.4% to 11.4% over the state of the art and the baselines for the spotting problem in long videos on SoccerNet, the largest football dataset available to the research community.
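The dilated recurrence named in the description can be illustrated with a minimal sketch. It is not the thesis's implementation: a single-unit tanh cell stands in for the LSTM cells, the weights `w_in` and `w_rec` are arbitrary illustrative values, and the function names `dilated_rnn_layer` and `dilated_rnn` are hypothetical. The only idea it shows is that a layer with dilation d reads its state from d steps back rather than from the previous step, so stacking layers with growing dilations covers both short and long temporal ranges.

```python
import math


def dilated_rnn_layer(inputs, dilation, w_in=0.5, w_rec=0.9):
    """Single-unit tanh recurrent layer with a dilated skip connection.

    The state at step t depends on the state at step t - dilation
    instead of t - 1. Weights are illustrative assumptions.
    """
    states = []
    for t, x in enumerate(inputs):
        # Look `dilation` steps back; before that point the state is zero.
        prev = states[t - dilation] if t >= dilation else 0.0
        states.append(math.tanh(w_in * x + w_rec * prev))
    return states


def dilated_rnn(inputs, dilations=(1, 2, 4)):
    """Stack dilated layers; each layer consumes the previous layer's output."""
    out = list(inputs)
    for d in dilations:
        out = dilated_rnn_layer(out, d)
    return out


# A single impulse at t = 0 followed by silence.
impulse = [1.0] + [0.0] * 11

# With dilation 4 the impulse only propagates to steps 4, 8, ...
layer_states = dilated_rnn_layer(impulse, dilation=4)
active_steps = [t for t, s in enumerate(layer_states) if s != 0.0]

# Stacked layers with dilations 1, 2, 4 produce one output per time step.
stacked = dilated_rnn(impulse)
```

In this toy setting, `active_steps` lists the time steps that the impulse reaches through the dilation-4 layer alone, which shows how a large dilation shortens the path between distant frames.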
first_indexed 2025-11-14T14:07:18Z
format Thesis
id um-14569
institution University Malaya
institution_category Local University
last_indexed 2025-11-14T14:07:18Z
publishDate 2021
recordtype eprints
repository_type Digital Repository
spelling um-14569 2023-07-02T23:47:11Z Spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network / Behzad Mahaseni Behzad , Mahaseni QA75 Electronic computers. Computer science T Technology (General) 2021-04 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/14569/1/Behzad.pdf application/pdf http://studentsrepo.um.edu.my/14569/2/Behzad_Mahaseni.pdf Behzad , Mahaseni (2021) Spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network / Behzad Mahaseni. Masters thesis, Universiti Malaya. http://studentsrepo.um.edu.my/14569/
spellingShingle QA75 Electronic computers. Computer science
T Technology (General)
Behzad , Mahaseni
Spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network / Behzad Mahaseni
title Spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network / Behzad Mahaseni
title_full Spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network / Behzad Mahaseni
title_fullStr Spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network / Behzad Mahaseni
title_full_unstemmed Spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network / Behzad Mahaseni
title_short Spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network / Behzad Mahaseni
title_sort spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network / behzad mahaseni
topic QA75 Electronic computers. Computer science
T Technology (General)
url http://studentsrepo.um.edu.my/14569/
http://studentsrepo.um.edu.my/14569/1/Behzad.pdf
http://studentsrepo.um.edu.my/14569/2/Behzad_Mahaseni.pdf