Date of Award
5-2023
Document Type
Thesis
Degree Name
Master of Science
Degree Discipline
Electrical Engineering
Abstract
Object detection is a fundamental task in computer vision. This research explored object detection with various models on super-resolution images generated by a super-resolution generative adversarial network (SRGAN). Specifically, the SRGAN computes its perceptual loss using deep features extracted from the Keras VGG19 network. This generative adversarial network is typically used to increase the quality of low-resolution imagery; in the case of satellite imagery, it increases the pixel count per object instance, which can yield better detection results. Previous models trained on the original xView data achieved maximum mean average precision (mAP) scores of 46.6% in classes with adequate representation, with even lower mAP scores for small objects and classes with few instances throughout the dataset. Newer YOLO models have improved performance through advances in anchor calculation, which uses unsupervised learning techniques such as K-means. The results showed that newer models using the SRGAN dataset improved on previous versions with increased Intersection over Union (IOU), recall, and precision scores. However, for more significant improvements, data pre-processing techniques should take priority: model architecture and optimizers aid the process, but the root issue is the challenges presented by this unique form of data. Continued model fine-tuning, together with overcoming obstacles associated with satellite data, such as high instance counts per image, low pixel representation of small objects, and monochromatic photos/backgrounds, should make the goal of accurate and fast object detection a reality. The advancements made in each You Only Look Once (YOLO) model, paired with the fine-tuning of hyperparameters and super-resolution images, are the start of conquering the xView dataset.
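The abstract credits part of the YOLO improvements to anchor calculation via unsupervised K-means. The thesis's own implementation is not shown here; the following is a minimal sketch of the standard YOLO-style approach, clustering ground-truth box shapes with 1 − IOU as the distance. Function names, the deterministic area-based initialization, and the iteration cap are assumptions for illustration, not the author's code.

```python
import numpy as np

def iou_wh(box, anchors):
    # IOU between one (w, h) box and an array of (w, h) anchors,
    # treating all boxes as aligned at the same top-left corner.
    inter = np.minimum(box[0], anchors[:, 0]) * np.minimum(box[1], anchors[:, 1])
    union = box[0] * box[1] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100):
    # Cluster (w, h) box shapes with K-means using 1 - IOU as the
    # distance metric, the approach popularized by YOLOv2 for
    # choosing anchor boxes. Initialization by area quantiles is an
    # illustrative choice (YOLO implementations often init randomly).
    idx = np.argsort(boxes[:, 0] * boxes[:, 1])
    anchors = boxes[idx[np.linspace(0, len(boxes) - 1, k).astype(int)]]
    for _ in range(iters):
        # Assign each box to the anchor it overlaps most.
        dists = np.stack([1 - iou_wh(b, anchors) for b in boxes])
        assign = dists.argmin(axis=1)
        # Move each anchor to the mean shape of its assigned boxes.
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else anchors[i] for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors
```

On satellite data such as xView, where small objects dominate, anchors recomputed this way skew toward small box shapes, which is one reason anchor calculation matters for the classes the abstract identifies as hardest.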
Index terms: Convolutional Neural Networks (CNN), Enhanced Super Resolution Generative Adversarial Network (ESRGAN), Generative Adversarial Network (GAN), Hyperparameters, Intersection Over Union (IOU), K-Means, Mean Squared Error (MSE), Recurrent Convolutional Neural Networks (RCNN), Rectified Linear Unit (ReLU), Regions of Interest (ROI), Super Resolution Generative Adversarial Network (SRGAN), Single Stage Detector (SSD), Unmanned Aerial Vehicles (UAVs), You-Only-Look-Once (YOLO)
Committee Chair/Advisor
Lijun Qian
Committee Member
Pamela Obiomon
Committee Member
Xishuang Dong
Committee Member
Xiangfang Li
Committee Member
Annamalai Annamalai
Publisher
Prairie View A&M University
Rights
© 2021 Prairie View A&M University. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Date of Digitization
1/3/2025
Contributing Institution
John B Coleman Library
City of Publication
Prairie View
MIME Type
application/pdf
Recommended Citation
Dukes, X. (2023). SRGAN Images and Object Detection on the xView Dataset. Retrieved from https://digitalcommons.pvamu.edu/pvamu-theses/1540