Date of Award
5-2023
Document Type
Thesis
Degree Name
Master of Science
Degree Discipline
Electrical Engineering
Abstract
Object detection is a fundamental task in computer vision. This research explored object detection with various models on super-resolution images generated by a super-resolution generative adversarial network (SRGAN). Specifically, the SRGAN computes its perceptual loss using deep features extracted from the Keras VGG19 network. This generative adversarial network is typically used to increase the quality of low-resolution imagery; in the case of satellite imagery, it increases the pixel count per object instance, which can yield better detection results. Previous models trained on the original xView data achieved maximum mean average precision (mAP) scores of 46.6% in classes with adequate representation, with even lower mAP scores for small objects and classes with few instances throughout the dataset. Newer YOLO models have improved performance through advances in anchor calculation, which uses unsupervised learning techniques such as K-means. The results showed that newer models using the SRGAN dataset improved on previous versions with increased Intersection over Union (IOU), recall, and precision scores. However, for more significant improvements, data pre-processing techniques should take priority: model architecture and optimizers aid the process, but the root issue is the challenges presented by this unique form of data. Continued model fine-tuning, together with overcoming obstacles associated with satellite data, such as high instance counts per image, low pixel representation of small objects, and monochromatic photos/backgrounds, should make the goal of accurate and fast object detection a reality. The advancements made in each You Only Look Once (YOLO) model, paired with the fine-tuning of hyperparameters and super-resolution images, are the start of conquering the xView dataset.
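The abstract credits part of the YOLO improvements to anchor calculation via unsupervised K-means. The thesis's own implementation is not shown here; the following is a minimal sketch of the standard YOLO-style approach, clustering ground-truth box shapes with 1 − IOU as the distance. Function names, the deterministic area-based initialization, and the iteration cap are assumptions for illustration, not the author's code.

```python
import numpy as np

def iou_wh(box, anchors):
    # IOU between one (w, h) box and an array of (w, h) anchors,
    # treating all boxes as aligned at the same top-left corner.
    inter = np.minimum(box[0], anchors[:, 0]) * np.minimum(box[1], anchors[:, 1])
    union = box[0] * box[1] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100):
    # Cluster (w, h) box shapes with K-means using 1 - IOU as the
    # distance metric, the approach popularized by YOLOv2 for
    # choosing anchor boxes. Initialization by area quantiles is an
    # illustrative choice (YOLO implementations often init randomly).
    idx = np.argsort(boxes[:, 0] * boxes[:, 1])
    anchors = boxes[idx[np.linspace(0, len(boxes) - 1, k).astype(int)]]
    for _ in range(iters):
        # Assign each box to the anchor it overlaps most.
        dists = np.stack([1 - iou_wh(b, anchors) for b in boxes])
        assign = dists.argmin(axis=1)
        # Move each anchor to the mean shape of its assigned boxes.
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else anchors[i] for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors
```

On satellite data such as xView, where small objects dominate, anchors recomputed this way skew toward small box shapes, which is one reason anchor calculation matters for the classes the abstract identifies as hardest.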
Index terms: Convolutional Neural Networks (CNN), Enhanced Super Resolution Generative Adversarial Network (ESRGAN), Generative Adversarial Network (GAN), Hyperparameters, Intersection Over Union (IOU), K-Means, Mean Squared Error (MSE), Recurrent Convolutional Neural Networks (RCNN), Rectified Linear Unit (ReLU), Regions of Interest (ROI), Super Resolution Generative Adversarial Network (SRGAN), Single Stage Detector (SSD), Unmanned Aerial Vehicles (UAVs), You-Only-Look-Once (YOLO)
Committee Chair/Advisor
Lijun Qian
Committee Member
Pamela Obiomon
Committee Member
Xishuang Dong
Committee Member
Xiangfang Li
Committee Member
Annamalai Annamalai
Publisher
Prairie View A&M University
Rights
© 2021 Prairie View A&M University. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Date of Digitization
1/3/2025
Contributing Institution
John B Coleman Library
City of Publication
Prairie View
MIME Type
application/pdf
Recommended Citation
Dukes, X. (2023). SRGAN Images and Object Detection on the xView Dataset. Retrieved from https://digitalcommons.pvamu.edu/pvamu-theses/1540