Tutorial: Similar Picture Search
Pictures often circulate around the Internet. One person will see a picture online and cross-post it to another forum, where someone else will take a copy and distribute it through other online services.
Each time someone submits a picture to another online service, information may be lost and the image may be modified. Additional changes to the picture, such as scaling, cropping, or color adjustments, further alter the image. All of these changes impact the overall quality of the picture. Metadata and image artifacts may identify a low quality picture that has been altered by multiple online services. However, multiple resaves and edits may make it impossible to determine if a person was pasted into a picture.
Forensic investigators may have a low quality picture, or an image without any context. Online image search services permit analysts to find variations of the same pictures. The search results may identify high quality versions, distribution patterns, and circumstances that can provide context to the picture.
About Perceptual Searches
Most picture search engines take a text description as input and return a series of pictures that match the text description. If the context around the picture is known, then a wide range of online text-to-image search services are available. Unfortunately, an analyst may spend hours doing keyword searches and manually reviewing results, and may still not find the desired image.
The alternative to text-to-image search is to use an image-to-image (reverse image) search engine. This approach uses algorithms to identify similar pictures.
There are different approaches for finding visually similar images. The most direct approach is to use a cryptographic hash function, such as MD5 or SHA1. The same hash value identifies the exact same picture. However, a single byte change will result in significantly different hash values. Even though pictures may visually look the same, different bytes yield different hashes and are therefore different pictures.
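This sensitivity is easy to demonstrate. The following sketch uses Python's standard `hashlib` on two synthetic byte strings that differ by a single byte (the data stands in for an image file; it is not real image content):

```python
import hashlib

# Two byte strings standing in for the "same" picture, differing by one byte
# (e.g., a single metadata byte changed during a resave). Data is synthetic.
original = b"\x89PNG fake image data for illustration"
altered = b"\x89PNG fake image data for illustratioN"  # only the last byte differs

h1 = hashlib.md5(original).hexdigest()
h2 = hashlib.md5(altered).hexdigest()

print(h1)
print(h2)
print(h1 == h2)  # False: one changed byte yields a completely different hash
```

Because the two hashes share no meaningful relationship, a cryptographic hash can only confirm an exact byte-for-byte match; it cannot measure visual similarity.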
Digital picture analysis typically relies on perceptual hashes for reverse image searches. A perceptual hash is an algorithm that will yield similar hash values for visually-similar pictures. There are a wide variety of perceptual hash functions. Different algorithms may focus on colors, edges, corners, 'blobs', or frequency patterns. In general, these types of algorithms can match similar pictures, even if there are significant size, quality, and coloring differences. They may also match pictures with minor differences in content, cropping, and rotation.
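To make this concrete, here is a minimal sketch of one simple perceptual hash, the average hash (aHash), applied to tiny synthetic 4x4 grayscale grids. Real implementations first resize a full image down to such a grid; the grids and pixel values below are invented for illustration:

```python
# Minimal average-hash (aHash) sketch on tiny 4x4 grayscale "images".
# Real implementations resize a full image to a small grid first; the
# pixel values here are synthetic data for illustration.

def average_hash(pixels):
    """Return a bit string: 1 where a pixel is above the mean, else 0."""
    mean = sum(pixels) / len(pixels)
    return "".join("1" if p > mean else "0" for p in pixels)

def hamming_distance(a, b):
    """Count differing bits; a small distance suggests similar pictures."""
    return sum(x != y for x, y in zip(a, b))

image = [200, 210, 30, 40, 190, 220, 25, 35, 205, 215, 20, 45, 195, 225, 15, 50]
# The same scene after a slight brightness change (every pixel +10).
brighter = [p + 10 for p in image]
# An unrelated image.
other = [10, 240, 15, 230, 250, 5, 245, 20, 30, 220, 25, 210, 240, 10, 235, 15]

h1, h2, h3 = average_hash(image), average_hash(brighter), average_hash(other)
print(hamming_distance(h1, h2))  # 0: a brightness shift does not change the hash
print(hamming_distance(h1, h3))  # 8: half the bits differ for unrelated content
```

Unlike a cryptographic hash, the brightness-adjusted variant produces the exact same hash, while the unrelated image sits far away in Hamming distance. Production algorithms (pHash, dHash, wavelet hashes) are more robust, but the principle is the same.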
While there are many different perceptual hash algorithms, only a few perceptual hash search engines are publicly available. These services allow users to upload pictures as the search criteria. Rather than using words to find pictures, they permit the use of a picture to find similar pictures. The results include web pages that host variations of the picture. A few examples of perceptual search engines include:
- TinEye: This search engine is exceptional at finding partial matches. Although it may not know immediately about new pictures, it will usually identify widely circulated pictures, as well as images from news outlets.
- Google Image Search: This search engine has indexed most of the pictures found by Google, including late-breaking pictures that are only a few hours old. It is exceptional at identifying textual content related to pictures. However, Google Image Search is not strong at identifying significant variations from cropping, splicing, or editing.
- Bing: Microsoft's Bing includes a search engine that matches based on similar shapes. However, it does not have a large corpus of indexed images and it usually does not find variations of the same picture. If the picture has a large rectangular region in the middle, then it will usually find other pictures with large rectangular regions in the middle. This is useful for finding visually similar images without finding variations of the exact same picture.
- Karma Decay: This specialized search engine matches against all pictures that have appeared on the Reddit social network. This is useful for identifying topics such as memes and controversial current events.
Each of these search engines serves a different purpose. For example, Bing is good for finding a variety of pictures with generally similar shapes, identifying known people, and extracting text. TinEye is exceptional at finding variations of the same picture. Google may identify what is in the picture, and Karma Decay helps determine what social media is saying about the picture.
These are not the only reverse image search engines. Some services (not listed here) have very narrow focuses, such as only searching for anime images. Others, like Yandex and Baidu, have been observed serving malicious JavaScript, tracking users aggressively, and changing the type of data they collect based on where you are located. (We do not recommend using any search engine that has been observed providing hostile code to users.)
Identifying Quality
Pictures passed from blog to blog and across online forums are typically resaved. Each resave reduces the quality of the image. The best analysis results will come from the highest quality picture. If the original picture is not available, then a near-original (one or two saves from the original) is likely a good option.
Assuming that the image is available online, how can you find an original (or near-original) picture? The answer typically requires finding the highest quality picture.
Perceptual search services permit analysts to sort results by size or similarity. Unfortunately, none of these search engines sort results by quality. When searching for visually-similar images, there are a couple of attributes that can help identify higher quality pictures. Some attributes are easily identifiable from the search results, while others may require additional analysis. Although these guidelines are not always true, such heuristics are typically good enough to identify a higher quality image.
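As a hedged sketch of this kind of triage, the snippet below ranks hypothetical search results by two commonly used quality proxies, pixel dimensions and file size. The result records, URLs, and field names are all invented for illustration; real search engines expose different fields:

```python
# Hedged sketch: rank search results by two common quality proxies,
# pixel dimensions and file size. All records below are hypothetical.

results = [
    {"url": "a.example/pic.jpg", "width": 640, "height": 480, "bytes": 55_000},
    {"url": "b.example/pic.jpg", "width": 1920, "height": 1080, "bytes": 410_000},
    {"url": "c.example/pic.jpg", "width": 1920, "height": 1080, "bytes": 150_000},
]

# Prefer more pixels first; among equal dimensions, prefer the larger file,
# since more bytes per pixel usually means fewer lossy resaves.
ranked = sorted(results,
                key=lambda r: (r["width"] * r["height"], r["bytes"]),
                reverse=True)
print(ranked[0]["url"])  # → b.example/pic.jpg
```

Automated ranking like this only narrows the candidate list; the top result still needs manual review, since a large file can also be an upscaled or edited variant.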
Identifying Context
When searching for textual content related to a picture, it is valuable to identify the picture's quality. A higher quality picture is usually associated with more authoritative text.
In general, try to identify a time period when the picture first appeared and was discussed. With viral pictures, there may be multiple clusters of discussion. These clusters appear each time a different online group discovers the picture (if it is new to them, they will discuss it as a cluster, even if it is an old picture). For example, Karma Decay may identify multiple threads at Reddit that discuss a picture. Each thread may denote a different time period where someone discovered the image.
Attributing context to an image varies based on the perceptual search engine. Google Image Search will attempt to associate common text to similar pictures; Google may immediately identify the content or context associated with the picture. Karma Decay will identify discussion threads at Reddit that typically provide context. In contrast, TinEye only identifies web sites that host the picture. You may need to visit multiple web sites in order to identify the context.
Similar pictures may also identify distribution patterns. For example, if variations of a picture are widely found on social networking sites, then it is likely a widely discussed topic. If variants are only found on Thai web sites, then the person who generated the variant that you are evaluating may be able to read Thai or may have ties to Thailand.
When identifying textual context, be wary of hoaxes and conspiracy theories. An established hoax/conspiracy typically results in contradicting textual descriptions. One description supports the concept, a different description debunks the issue, and a third may provide the initial story. In these cases, the amount of text and age of the text is typically independent of the ground truth. (There may be more articles around a hoax, but that does not mean it is real.) Do not assume that the initial story, or the most repeated explanation, is accurate. With hoaxes and conspiracies, look for cited sources and identifiable experts; unspecified sources and anonymous experts who are only identified by online handles are unlikely to be authoritative sources.
Search Limitations
Similar image searches are valuable for identifying context, variants, distribution, and information related to a picture. However, search results may not always yield authoritative information. In particular:
- Not every picture is distributed online or indexed by search engines. You may not find the picture you are looking for.
- Viral pictures that are distributed through social media (e.g., Facebook or Twitter) may result in hundreds of variants -- pictures that look similar but have different MD5/SHA1 hash values. These variants result in search noise that may obscure the initial source. Viral pictures may not have an identifiable origin.
- The camera-original source may not be online. You may not be able to identify the initial source for a picture, a high quality variant, or even an authoritative source.
- For composite pictures, the individual components may not be identifiable.
- Perceptual searches return similar pictures. Similar does not mean identical. A recreation of a photo should look similar to the original. People with similar physical qualities will likely look similar, and pictures of people in similar poses will look similar. Most search engines can sort results by the degree of similarity. Do not be surprised if only a few pictures look similar to you before diverging into visually dissimilar pictures. If the algorithm focuses on color, then do not be surprised if pictures have similar colors but completely different content.
- "Visually similar" is not the same as object identification. If you copy a picture and apply minor alterations (e.g., minor edit, crop, scale, or recolor), then it can still be visually similar to the source image. However, if you take two pictures of an object from different angles (e.g., two pictures of a chair taken from different angles) are unlikely to match each other -- even though it is the same object. This is because a human can recognize the object, while the computer will recognize that they are not two versions of the same picture.
Caveats
Visually similar searches can be useful, but also have specific caveats. These include:
- Similar image search is only one evaluation approach. The interpretation of the results may be inconclusive. It is important to validate findings with other analysis techniques and algorithms.