The Contemporary Problem of Search: Search and Rescue in the Digital Age
By Matthew Mitchell, Founder and CEO, International Association of Search and Rescue Coordinators (IASARC)
Introduction
The Contemporary Problem of Search can be summed up in a word: pixels. This problem manifests in two distinct questions. First, how do you optimize the collection of pixels? In contemporary search, search units can be viewed as pixel collection units. The more efficiently they can collect the pixels the mission requires, the better. Ultimately, the pixel doesn’t care whether it’s collected from an aircraft, surface vessel, drone, or satellite.
Second, how do you turn the collected pixels into actionable information that search and rescue (SAR) responders can use to save lives? This involves transferring data from a sensor to a computer and then processing the data to locate and recognize the complex shapes that are the objects of a search (e.g., a person in the water or a capsized vessel). To be effective for the SAR mission, the processing must identify and alert the human operator to only those targets the search aims to find, while dismissing all other objects; it must avoid false alerts.
In simple terms, the contemporary search problem comes down to collecting pixels as efficiently as possible and processing them to recognize what we want to find while ignoring everything else.
Setting the Stage
Historically, the search pattern for a SAR response relied on detection: whether a human operator could detect that something, anything, was present. These human operators may have had tools to enhance their detection abilities, such as night vision goggles or binoculars, but ultimately, it was the human operator who detected a potential search object. Once a detection was made, the human operators had to interrogate that object to discern whether it was indeed the object of their search. This most often involved diverting from a designated search plan to close the distance to the object for a better view.
Unfortunately, these human-in-the-loop searches rely on a sensor system that is infinitely variable by nature: the human. This system comprises the sensor itself (the eye) and the cognitive processor (the brain), both of which have attributes that can result in missed detections and false recognition of a potential object in a search.
The International Aeronautical and Maritime Search and Rescue Manual (IAMSAR) describes the situation succinctly:
“…the eye is vulnerable to the vagaries of the mind. We can “see” and identify only what our mind permits us to see.”
“In situations where no fixed reference points exist, such as at sea or in open landscapes, the eye may fail to focus, resulting in empty field myopia, which leaves an observer with a loss of focus and a blurred or absent image.”
Cognitive biases can lead an observer to overlook an object directly before them, even while actively conducting a systematic scan. Additionally, optical illusions and our inherent expectations influence our behaviors and perceptions, often without our awareness. Physical limitations, too, are well-documented; the eye struggles with rapid refocusing, requiring one to two seconds to switch between near and distant objects. Furthermore, while the human eye has incredible resolution in the center of the field of view, it resolves only one-tenth as much detail with just a 20° offset.
Atmospheric conditions, windscreen distortions, glare, and lighting can obscure potential targets. For example, looking toward the sun can cause objects to disappear into the brightness. Meanwhile, hazy conditions can distort perspectives, making distant objects seem closer or further away than they actually are. Additionally, physical factors like fatigue, emotions, age, medication, or even something as trivial as a fallen eyelash can impair vision. Observers in aircraft or moving vessels also encounter vibrations, cabin temperature changes, and even variations in oxygen levels, all of which can diminish optical performance.
The result is that only about 19% of person-in-the-water incidents ever result in a life saved. Historical examples abound of persons in distress being directly overflown in good search conditions yet remaining undetected by human operators. Indeed, the human sensor system was not engineered to solve the problem of search.
What the human excels at is recognizing complex objects. For example, humans can discern with remarkable accuracy the difference between a mannequin and a person, a buoy and a cooler, and so forth. Although many attempts have been made to quantify human performance in detecting and recognizing complex objects, none is more famous than the Johnson Criteria. Developed in 1958, the Johnson Criteria are a method for predicting the probability of accurate target discrimination. This model defines the minimum resolution required to achieve a 50% probability of an observer properly distinguishing a target object.
Under the Johnson Criteria, resolution is described in terms of line pairs: one white line adjacent to one black line in an image, each line pair corresponding to two pixels. Johnson’s criteria also included the effect of the number of cycles (or frames) in which the object was resolvable and thus could be viewed by the human operator. The Criteria’s thresholds are as follows: detection requires at least one line pair, or 2 pixels, on the target for at least one resolvable cycle, while positive recognition of a target (termed identification in this model) requires 6.4 line pairs, or nearly 13 pixels, for six resolvable cycles.
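As a rough illustration, the short Python sketch below converts the Johnson thresholds into the coarsest ground resolution that still supports each task for a target of a given size. The 0.45 m critical dimension assumed for a person in the water is an illustrative figure, not part of the Criteria.

```python
# Back-of-the-envelope Johnson Criteria calculator.
# Thresholds from the text: detection = 1.0 line pair, identification = 6.4
# line pairs across the target's critical dimension (1 line pair = 2 pixels).

JOHNSON_THRESHOLDS_LP = {
    "detection": 1.0,        # something is present
    "identification": 6.4,   # positive recognition of the target
}

def max_gsd_m(target_dim_m: float, line_pairs: float) -> float:
    """Coarsest ground sample distance (m/pixel) that still places the
    required number of line pairs across the target."""
    pixels_required = 2.0 * line_pairs  # one line pair spans two pixels
    return target_dim_m / pixels_required

person_dim_m = 0.45  # assumed critical dimension of a person in the water
for task, lp in JOHNSON_THRESHOLDS_LP.items():
    gsd = max_gsd_m(person_dim_m, lp)
    print(f"{task:>14}: {2 * lp:>4.1f} px on target -> GSD <= {gsd * 100:.1f} cm")
```

For a 0.45 m target, identification demands roughly 3.5 cm per pixel, broadly in line with the 3 cm figure discussed later for recognizing a person in the water.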
When the Johnson Criteria were developed, the sheer volume of data generated by modern electro-optic infrared (EOIR) sensors could not have been imagined. The use cases focused on distinguishing targets within a relatively small amount of available data, such as an observer wearing night vision goggles with a comparatively narrow field of view (FOV). To be clear, the Johnson Criteria, while a starting point for understanding the minimum theoretical resolution a human needs to discern a target, are categorically different from the criteria required to evaluate a human’s ability to process millions or even billions of pixels of visual information to accurately locate the proverbial needle in the haystack.
Contemporary Opportunities
Current technologies offer solutions to the limitations of both the human visual sensor (the eye) and the processing unit (the brain).
Today’s EOIR sensors can collect massive volumes of pixel data. Commercially available drones, priced at less than $1,000, are now equipped with 8K cameras capable of producing nearly 800 million pixels of visual information per second. Advanced military wide-area mapping sensor systems can produce more than 100 billion pixels per second.
Further, unlike the human eye, which has its highest resolution in the central 2° of its FOV, modern EOIR systems maintain consistent resolution throughout their entire FOV. Additionally, EOIR sensors don’t suffer from empty field myopia, lag in adjusting near-to-far focus, or any of the other many human-related performance barriers.
Processing all the information available through modern EOIR sensors is impossible for the human grey matter. At best, it is estimated that the human visual cortex can process 20 million “pixels” worth of information in a single second. Applied to the problem at hand, that processing power is further diminished by the Double Detection Problem: the human must process information through a secondary medium (i.e., a screen), which is necessarily an imperfect representation of the actual imagery collected. At the same time, the human operator must also process other information from their environment (i.e., what’s going on around them) and thus can never focus absolutely on the mission at hand.
With modern processing power and advanced deep learning algorithms, computers can now recognize complex shapes, such as persons in the water and other search objects. While the volume of information processed is limited by the available processing capability, there is ultimately no theoretical upper limit to the volume of visual information that can be processed. Although rudimentary computer vision programs that can detect the presence of anomalous objects have been in operation for some time, recognizing a complex shape accurately requires cutting-edge deep learning algorithms (hereafter referred to as intelligent recognition systems) trained on massive amounts of empirical data.
In general, today’s most advanced intelligent recognition systems require approximately 15 pixels and five frames or cycles to positively recognize a complex shape. Unlike humans, however, these systems can process information immediately; they do not get tired or distracted, do not suffer from cognitive biases or assumptions, and are entirely consistent.
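As a minimal sketch of how these thresholds might be enforced in practice, the snippet below gates alerts on pixel count and frame persistence. The track IDs and per-frame pixel counts are hypothetical inputs from an upstream detector; no particular vendor’s pipeline is implied.

```python
from collections import defaultdict

MIN_PIXELS_ON_TARGET = 15   # minimum pixels for positive recognition (see text)
MIN_CONSECUTIVE_FRAMES = 5  # frames a track must persist before alerting

class AlertGate:
    """Suppress single-frame flukes: alert only when a tracked object shows
    enough pixels on target across enough consecutive frames."""
    def __init__(self):
        self._streak = defaultdict(int)  # track_id -> consecutive hit count

    def update(self, track_id: str, pixels_on_target: int) -> bool:
        if pixels_on_target >= MIN_PIXELS_ON_TARGET:
            self._streak[track_id] += 1
        else:
            self._streak[track_id] = 0  # streak broken; start over
        return self._streak[track_id] >= MIN_CONSECUTIVE_FRAMES

gate = AlertGate()
# A persistent 18-pixel return alerts on its fifth consecutive frame,
# while a whitecap that looks person-like for two frames never would.
for frame, px in enumerate([18, 18, 18, 18, 18], start=1):
    if gate.update("track-07", px):
        print(f"ALERT at frame {frame}")
```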
But why is it necessary to positively recognize a target, rather than merely detect an object that a human can further interrogate? Consider the sheer number of anomalous objects that may appear in any given search area. Much like staring at clouds, look long enough and you’ll begin to see objects that aren’t there (i.e., false positives). While executing a life-saving search, every moment wasted interrogating objects that are not the object of your search is another moment closer to death for those you seek to save.
What does this all mean?
Two fundamental truths quickly emerge regarding the contemporary problem of search.
First, there is no future state in which humans are the primary processors of sensor information for SAR.
Even modest EOIR sensors can produce well over 20 times the information a human can process. For example, an off-the-shelf 4K drone camera operating at 60 frames per second generates roughly 25 times the visual information a single human can process, even with 100% focus and in an optimal environment. At the far end of the spectrum, sensor platforms like the Gorgon Stare system carried on the U.S. Air Force’s MQ-9 produce around 5,000 times more information than a human observer can adequately process.
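A back-of-the-envelope calculation makes the mismatch concrete. The frame rates below are illustrative assumptions, and the exact multiples shift with them:

```python
HUMAN_PX_PER_SEC = 20e6  # the text's estimate for the human visual cortex

def pixels_per_second(width: int, height: int, fps: float) -> float:
    return width * height * fps

sensors = {
    "4K drone @ 60 fps": (3840, 2160, 60),
    "8K drone @ 24 fps": (7680, 4320, 24),
}
for name, (w, h, fps) in sensors.items():
    rate = pixels_per_second(w, h, fps)
    print(f"{name}: {rate / 1e6:,.0f} Mpx/s, ~{rate / HUMAN_PX_PER_SEC:.0f}x human")
# 4K@60 -> ~498 Mpx/s (~25x human); 8K@24 -> ~796 Mpx/s (~40x). A 100-Gpx/s
# wide-area sensor would be ~5,000x, matching the Gorgon Stare figure above.
```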
Solving the contemporary problem of search necessitates an intelligent recognition system to process the sensor data adequately.
Second, most nations’ current sensor systems are ill-suited to the SAR mission.
Standard coast guard and military sensors, such as the Teledyne FLIR 380-HD, feature relatively low-resolution camera systems (1080p daytime, 620p IR) paired with powerful zoom capabilities (120x). These pan-tilt-zoom (PTZ) systems are ideal for interrogating individual targets at great distances but provide limited value for the wide-area mapping and surveillance essential to conducting an effective search over a large area. This is particularly true for detecting and recognizing search objects of limited visibility, such as persons in the water.
Consider that a commercial off-the-shelf 8K drone, purchased for $1,000, can cover a swath of 0.13 nautical miles (NM) at a resolution of 3 cm (0.03 meters per pixel), while the U.S. Coast Guard’s current Teledyne FLIR 380-HD, installed on most of its fixed-wing aircraft, can only cover a swath of 0.02 NM at the same resolution.
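The swath figure follows from simple arithmetic: swath width equals the number of pixels across track multiplied by the ground sample distance (GSD). A sketch, assuming the full 8192-pixel width of DCI 8K (consumer UHD 8K is 7680 pixels, which yields roughly 0.12 NM):

```python
M_PER_NM = 1852.0  # meters per nautical mile

def swath_nm(pixels_across: int, gsd_m: float) -> float:
    """Swath width (NM) covered at a given ground sample distance."""
    return pixels_across * gsd_m / M_PER_NM

# 8K sensor at 3 cm per pixel:
print(f"{swath_nm(8192, 0.03):.2f} NM")  # -> 0.13 NM, as cited above
```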
Thus, fundamental sensor procurement requirements for SAR assets require significant revision. For the wide-area mapping and surveillance the SAR mission entails, expensive PTZ cameras with long focal length lenses are unnecessary. Instead, wide-angle, high-resolution fixed camera systems are far more effective and significantly cheaper.
To function adequately, intelligent recognition systems require pixels. The absolute minimum for positive recognition is 15 pixels, with 20 preferred for maximum accuracy. To reduce false positives, at least five frames of video are necessary, allowing the system to filter out objects that only briefly resemble the desired search object. Frame rate and resolution are therefore critical when matching a sensor to an intelligent recognition system. The focal length of the sensor’s lens, or its zoom capability, determines only search characteristics such as altitude: an extremely wide FOV (e.g., 107°) would necessitate very low flight altitudes, while a narrow FOV (e.g., 19°) could allow significantly higher flight altitudes.
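For a nadir-pointing fixed camera, the altitude needed to hold a given resolution follows from basic trigonometry: swath = 2 x altitude x tan(FOV/2). A sketch using the FOV examples above and the same illustrative 8K sensor:

```python
import math

def altitude_for_gsd_m(pixels_across: int, gsd_m: float, fov_deg: float) -> float:
    """Altitude (m) at which a nadir camera with the given horizontal FOV
    images the ground at the requested GSD across its full swath."""
    swath_m = pixels_across * gsd_m
    return swath_m / (2.0 * math.tan(math.radians(fov_deg) / 2.0))

for fov in (107, 19):
    alt = altitude_for_gsd_m(8192, 0.03, fov)
    print(f"FOV {fov:>3} deg -> fly at ~{alt:,.0f} m for 3 cm GSD")
# 107 deg -> ~91 m; 19 deg -> ~734 m: the wider the FOV, the lower the flight.
```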
What’s the Catch?
The above could almost make the contemporary problem of search sound too easy: buy a bunch of cheap drones with good cameras, and you’re off to the races. Well, there’s a catch… several of them. SAR events in the maritime environment routinely take place at significant distances from land, in harsh conditions with minimal or no terrestrial communications. Each conceivable use case for a pixel collection unit has its own drawbacks and technical hurdles. Let’s examine several of the common use cases.
SAR Aircraft (fixed wing or helicopter)
Aircraft dedicated to SAR missions can generally operate in remote, harsh conditions. With few exceptions, they possess sufficient onboard electrical power and available space for the modest computing capacity required to run an intelligent recognition system that processes data from one or two modest EOIR sensors. The most significant challenge is navigating the airworthiness testing and certification process to ensure that any new hardware does not threaten safe flight operations.
Unmanned Aerial Systems (UAS, aka drones)
Short- and long-range (over-the-horizon) UAS are readily available. Small, short-range units can be deployed from small vessels or mobile land teams but have exceptionally limited payloads. Longer-range units require greater training and have more limited deployment options, but generally offer greater payload flexibility. Both short- and long-range UAS, however, typically lack the payload capacity for the processing power required to run an intelligent recognition system, so leveraging the technology would require transmitting data in real time back to a base station for processing. Unfortunately, commercial off-the-shelf systems usually transmit only low-resolution imagery (less than 1080p), keeping the high-resolution data onboard for later download. To be effective for the SAR mission, the UAS use case would necessitate either processing the data onboard (sacrificing range or other payload capacity) or employing more robust transmission capabilities that allow remote data processing in real time.
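A rough link-budget sketch shows why today’s links fall short; the compression ratio and link capacity figures are illustrative assumptions, not measured values:

```python
def raw_bitrate_gbps(w: int, h: int, fps: float, bits_per_px: int = 24) -> float:
    """Uncompressed video bitrate in Gbit/s (8-bit RGB by default)."""
    return w * h * fps * bits_per_px / 1e9

raw = raw_bitrate_gbps(3840, 2160, 60)  # 4K at 60 fps: ~11.9 Gbit/s raw
compressed_mbps = raw * 1e3 / 100       # assume ~100:1 video compression
print(f"raw: {raw:.1f} Gbit/s, ~100:1 compressed: {compressed_mbps:.0f} Mbit/s")
# Even heavily compressed, full-resolution video demands on the order of
# 100 Mbit/s, while typical COTS drone links carry tens of Mbit/s, so the
# full-quality imagery stays onboard and only a low-res proxy is streamed.
```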
Surface Vessels
Like SAR aircraft, surface vessels dedicated to the SAR mission are often robust craft capable of operating in remote, harsh environments. They also have ample electrical power and space for additional processors and other equipment. Many modern vessels already carry EOIR systems, although these are generally low-resolution models with long-range zoom, which is not what the SAR mission demands. In application, their only drawbacks are relatively slow speed, instability due to sea state, and a low line of sight, which reduces a target’s detectability in rough seas.
Commercially Available Satellites
Most commercially available satellite imagery systems offer a resolution of 30 cm (0.3 meters per pixel), with some newer systems advertising 16 cm (0.16 meters per pixel). The minimum requirement to recognize a person in the water is 3 cm (0.03 meters per pixel). Merely detecting an object similar in size to a person requires significantly less resolution; in wide-area mapping, however, that approach is likely to generate many false positives. This is why advanced deep-learning recognition algorithms are critical: they can positively recognize the desired search target among millions or even billions of pixels.
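A quick calculation shows the gap, assuming a person in the water presents roughly 0.45 m across (an illustrative figure):

```python
TARGET_DIM_M = 0.45  # assumed critical dimension of a person in the water

for label, gsd_m in [("30 cm GSD", 0.30), ("16 cm GSD", 0.16), ("3 cm GSD", 0.03)]:
    print(f"{label}: {TARGET_DIM_M / gsd_m:.1f} px on target")
# 30 cm -> 1.5 px; 16 cm -> 2.8 px; 3 cm -> 15.0 px. Only the last meets the
# ~15-pixel recognition threshold cited earlier.
```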
While individuals in the water and other very small search objects may not be recognizable from current commercial satellites, other potential search targets, such as capsized small boats, are. Unfortunately, the absence of video with multiple sequential frames increases false positives. Until satellite imagery improves to a resolution of 3 cm or better and can provide the required five sequenced frames, locating individuals in the water via space-based SAR from commercial satellites is not feasible.
Summing Up the Use Cases
To sum it all up, each use case above may be able to leverage intelligent recognition systems, but none of the options is without challenges. UAS appear to be an efficient solution; however, they generally cannot transmit high-resolution sensor data for processing at a base station. SAR aircraft, already operational, would require updated sensors, replacing low-resolution PTZ cameras with fixed, wide-area, high-resolution models; this would, of course, necessitate rigorous airworthiness acceptance. Surface vessels provide ample space and power for the required equipment but also need sensor upgrades; while they avoid the airworthiness challenges SAR aircraft face, their slower speed and limited line of sight restrict their overall efficacy. Commercial satellites, while an exceedingly attractive solution, lack both the resolution for small targets and the frame progression intelligent recognition systems need to process data optimally.
Conclusion
Despite the challenges in fully integrating modern EOIR sensors and intelligent recognition systems with the SAR mission, the potential improvements in mission effectiveness are extraordinary. Moreover, the financial costs of implementing such technologies would easily be offset by the efficiencies gained. Some cutting-edge intelligent recognition systems have demonstrated a consistent 96.8% probability of detecting and recognizing complex search objects in a single pass. While the area covered depends on the effectiveness of the pixel collection units, an increase in overall search performance of ten times or more over human-in-the-loop searches is easily attainable with commercial off-the-shelf components.
Although a SAR authority’s specific use case will vary, the strategy is the same for all. First, focus sensor procurement on efficient pixel collection, prioritizing wide-area, high-resolution sensors over the traditionally favored, expensive PTZ packages with long focal length lenses. Second, integrate cutting-edge intelligent recognition systems to overcome the inherent performance limitations of human operators. The watershed improvements to the life-saving mission can only be realized by leveraging both appropriate modern sensors and cutting-edge deep learning recognition systems.