Deep Aerial Object Recognition
Aerial imagery captured by drones or other unmanned aerial vehicles is a valuable tool for surveillance because of its wide field of view and because drones can reach places that are physically difficult to visit. It has many applications, such as border security, search and rescue, and image and video understanding. The wide-area view comes at a cost, however: objects of interest occupy only a small number of pixels in the image. As a result, an object detector commonly misses vehicles in an aerial view, and background clutter or other objects make a large number of false positive predictions highly probable.

Aerial vehicle detection and recognition becomes more specific when the goal is not only to detect vehicles but to find particular ones. For example, a detection system can search for a car with a given color, type, and other attributes (e.g., yellow taxi, large green truck). Such a system can be used in applications like finding a suspicious vehicle or a specific target vehicle among many other vehicles, objects, and backgrounds.

In this project, we propose a framework that handles the problem of open-ended classification or prediction. A classical image classification system (see Fig. 1a) receives an image and produces an output label. In contrast, we use a novel architecture (see Fig. 1b) that receives an image together with a text description of the queried object (i.e., a vehicle label), represented by a code-vector, and makes a yes-or-no decision about whether the input label is correct. In other words, it decides whether the input image has the desired class label.
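To make the contrast with a classical classifier concrete, the verification interface can be sketched as follows. This is a minimal illustration, not the architecture used in the project: the attribute vocabulary, the one-hot encoding of the description, the `Verifier` class, and its single linear layer with random, untrained weights are all assumptions made for the example. It only shows the input and output contract, in which an image representation and a code-vector go in and a yes-or-no decision comes out.

```python
import math
import random

# Hypothetical attribute vocabulary; the source does not say how the
# code-vector is actually constructed.
COLORS = ["yellow", "green", "red"]
TYPES = ["taxi", "truck", "sedan"]


def encode_description(color, vehicle_type):
    """Encode a description as two concatenated one-hot vectors (assumed scheme)."""
    vec = [0.0] * (len(COLORS) + len(TYPES))
    vec[COLORS.index(color)] = 1.0
    vec[len(COLORS) + TYPES.index(vehicle_type)] = 1.0
    return vec


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Verifier:
    """Toy stand-in for a verification network: (image features, code-vector) -> yes/no."""

    def __init__(self, feature_dim, code_dim, seed=0):
        rng = random.Random(seed)
        # Random, untrained weights: the decisions below are not meaningful predictions.
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(feature_dim + code_dim)]
        self.bias = 0.0

    def score(self, image_features, code_vector):
        # Fuse the two inputs by concatenation, then apply one linear layer and a sigmoid.
        joint = list(image_features) + list(code_vector)
        z = sum(w * x for w, x in zip(self.weights, joint)) + self.bias
        return sigmoid(z)

    def decide(self, image_features, code_vector, threshold=0.5):
        # True means "the image matches the queried description".
        return self.score(image_features, code_vector) >= threshold


# Placeholder image features; in practice these would come from an image encoder.
image_features = [0.2, 0.7, 0.1, 0.5]
query = encode_description("yellow", "taxi")
verifier = Verifier(feature_dim=len(image_features), code_dim=len(query))
print(verifier.decide(image_features, query))
```

Because the queried label is an input rather than one of a fixed set of outputs, the same model can be asked about any description that the code-vector can express, which is what makes the formulation open-ended.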