Media files typically contain a combination of binary structures, binary data, and textual values. The strings analyzer extracts all potential textual values.
In technical terminology, a string is any sequence of text characters and spaces.
Binary or non-printable characters are not displayed when strings are extracted from a file.
The strings extractor parses the file. It displays every sequence of bytes that could possibly represent text characters.
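As a rough illustration of the idea, the sketch below scans a byte buffer for runs of printable ASCII characters. The function name `extract_strings`, the four-character minimum, and the sample bytes are assumptions made for this example; they are not taken from the FotoForensics implementation.

```python
import re

def extract_strings(data: bytes, min_len: int = 4):
    """Return (offset, text) pairs for runs of printable ASCII bytes."""
    # Printable ASCII spans 0x20 (space) through 0x7E (tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]

# Hypothetical sample: binary bytes surrounding two text runs.
sample = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01Hello world\x00\x9c"
for offset, text in extract_strings(sample):
    print(f"0x{offset:08x}: {text}")
```

The minimum length is a trade-off: a lower value reports more coincidental byte sequences, while a higher value can skip short but genuine text values.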
When to use Strings
String extraction is typically used when the file structure is unknown or a parser is unavailable. At FotoForensics, we already know the file format: JPEG, PNG, WebP, HEIC, AVIF, etc. Tools like metadata, JPEG %, and digest are available to parse the data structures and extract information about the file.
With pictures, a listing of all extracted strings is almost never needed. Typically, string extraction floods the analyst with useless information: random lines of characters and informative text sequences taken out of context. (See the Warnings and Caveats.) However, there are a few corner-case situations where string extraction may be very useful. For example:
Corner-case
Explanation
MakerNotes
Some camera vendors include proprietary MakerNotes data segments. The values that are displayed denote the parts of the proprietary format that the open source community has managed to reverse-engineer. Sometimes there are contents within the proprietary MakerNotes block that are not extracted. Strings can help identify extra content from these proprietary data segments.
Unknown metadata
On rare occasions, the metadata analyzer may flag a data structure as being unknown. For example, a JPEG might generate a metadata warning like "Unknown APP5 segment". Strings can help identify any textual values. On occasion, these include useful information.
Parasites
Files with parasitic attachments may display the parasite in the strings listing. If you know what you're doing, then the strings output may help you identify the type of parasite and any risks it might pose.
Padding
Some file formats include large blocks of padding. For example, Adobe sometimes stores hundreds of bytes of unused padding in XMP blocks.
Similarly, it's common to see an MP3 file with the word "LAME" written over and over. (LAME is an MP3 encoder that initializes unused space with its own software name.)
The strings listing may allow you to view any plain-text data stored in the padding areas.
Most of the time, this padding is harmless and uninformative.
Steganography
If you have a reason to suspect steganography, then viewing the strings listing might help you identify some of the hidden plain-text data.
Curiosity
If you're curious about what strings are contained in the raw file, then this listing might provide some insight.
Excluding "curiosity", most of these situations are uncommon. Unusual important data is flagged by other analyzers. Unless some other analyzer identifies one of these situations, relying on the strings listing as part of your regular analysis approach could result in the wrong conclusion about a string's intended purpose.
Risks from Raw Data
Consumer Advisory
Consuming raw data may increase your risk of missing important information and reaching the wrong conclusions.
Many forensic tools are built on the belief that "more data is more better." Those tools often flood the analyst with raw information, on the assumption that the human will know what to do with it.
However, that is usually not the case.
Overloading the analyst with lots of raw data usually leads to wasted time on unproductive work.
Too much data can bury real findings among the noise, potentially causing the analyst to miss something important.
Raw data taken out of context can cause an analyst to reach the wrong conclusions.
Raw textual data, such as XMP metadata, may be unformatted and difficult to read. An analyst is likely to miss important details from the unformatted raw text.
In most cases, strings extraction is not informative to the analyst.
Experienced analysts should assume that any reports that focus on raw strings -- without metadata decoding and extraction -- are inaccurate or incomplete.
With FotoForensics, the focus is on providing better information rather than "more data". This helps the analyst prioritize findings and rapidly focus on interesting aspects. While the strings listing is unlikely to be useful 99% of the time, it is provided because 1% of the time it might help an analyst evaluate a specific corner-case situation.
When NOT to use Strings
If you are not sure about what you are looking for, then the strings listing will likely only cause confusion. It is very common to see something in the raw data and misinterpret the cause, purpose, or significance.
It is unfortunately common to see people attempt to extract metadata or other fields from the raw strings listing. This can lead to false conclusions due to data being taken out of context. For example:
JPEG's Huffman tables often coincidentally look like sequences of text.
XMP data often includes 'adobe' in the namespace even when it was never processed by an Adobe application. (Since Adobe defined the standard, they got to define the namespace.)
There are common metadata blocks called 'Adobe' and 'Photoshop'. These do not mean that the picture was processed by an Adobe product or Photoshop. (Don't confuse the name of the metadata block with the name of the application.)
EXIF data includes binary fields and values that are either binary or text. The values are often stored separately from the fields and are not always in the same order as the fields. Do not assume that a visible text timestamp is actually a timestamp; it could be a creation date, modification date, comment field, or serve some other purpose.
The HEIC and AVIF file formats use an internal structure called 'meta' that defines the encoded image data stream. This does not define the metadata information. (If there is any metadata, it will be stored in internal structures, such as "Exif" for EXIF data and "mime" for XMP data.)
Instead of focusing on raw data without purpose, use the other analyzers -- such as metadata, digest, and JPEG % -- to decode and view the information in context. These decoding analyzers often omit uninformative information (e.g., EXIF padding or XMP namespace information) and help focus on the important aspects for analysis.
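The EXIF ordering issue mentioned above can be sketched with a tiny, synthetic TIFF-style structure (the container format that EXIF uses). Everything here is made up for illustration: the values "ExampleMake" and "ExampleModel", the layout, and the helper name `read_ascii`. The fields are listed as Make then Model, but the values are stored as Model then Make.

```python
import re
import struct

# Two ASCII values stored in the data area: Model first, Make second.
model = b"ExampleModel\x00"
make = b"ExampleMake\x00"

ifd_offset = 8
entry_count = 2
data_start = ifd_offset + 2 + entry_count * 12 + 4  # header + count + entries + next pointer
model_offset = data_start
make_offset = data_start + len(model)

tiff = struct.pack("<2sHI", b"II", 42, ifd_offset)  # little-endian TIFF header
tiff += struct.pack("<H", entry_count)
# Field order: Make (0x010F), then Model (0x0110). Type 2 means ASCII.
tiff += struct.pack("<HHII", 0x010F, 2, len(make), make_offset)
tiff += struct.pack("<HHII", 0x0110, 2, len(model), model_offset)
tiff += struct.pack("<I", 0)  # no further directories
tiff += model + make

def read_ascii(buf: bytes, offset: int, count: int) -> str:
    """Follow a field's pointer and return its text value."""
    return buf[offset:offset + count].rstrip(b"\x00").decode("ascii")

# Decoding in context: walk the fields and follow each pointer.
fields = []
for i in range(entry_count):
    tag, _type, count, offset = struct.unpack_from("<HHII", tiff, ifd_offset + 2 + i * 12)
    fields.append((hex(tag), read_ascii(tiff, offset, count)))

# Raw strings: printable runs in file order, with no field names attached.
strings = [m.group().decode("ascii") for m in re.finditer(rb"[\x20-\x7e]{4,}", tiff)]

print(fields)   # Make first, then Model
print(strings)  # Model first, then Make
```

The decoded view pairs each value with its field, while the raw listing shows only the values, in storage order.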
Skill Level
With the exception of a few corner-case instances, there is really only one more reason to know about strings: skill level.
Amateur sleuths will often omit or ignore metadata analyzers in lieu of extracting raw strings.
If you see a report that includes a strings listing without a specific reason to include it, then you should question the value of the evaluation.
Sample Strings Extraction
The following picture includes metadata and other useful information. (Click on the picture to view the full analysis page.) Below the image is the decoded metadata from the metadata analyzer.
The metadata analyzer validated the binary file structure, decoded the various fields, and provided the information in a human-readable format. Any textual content is associated with a specified purpose (field and data segment). However, the metadata does not identify the actual structural layout of data in the file.
In contrast to the metadata, the following listing shows the raw strings for the same picture. Each set of strings is listed after the top-level structural component within the file's layout.
The strings result displays a list of the raw text output, divided into sections based on the file format's internal structures. For example, this JPEG begins with a start-of-image (SOI) marker, includes two application blocks (APP0 and APP13), and defines the quantization tables (DQT) that are used for decoding the image. (See Structures for more information about the file structure.)
Each line with a string is preceded by the hexadecimal offset into the file. For example:
0x00000018: Photoshop 3.0
This line indicates that the raw string "Photoshop 3.0" is found at offset 0x18, 24 bytes from the start of the file. (0x18 in hexadecimal is 24 in decimal.)
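If you need to work with these offsets outside of the listing, the conversion is a one-liner. The helper name `parse_listing_line` is hypothetical, and the sketch assumes each line uses the "offset, colon, space, text" layout shown above.

```python
def parse_listing_line(line: str):
    """Split a line such as '0x00000018: Photoshop 3.0' into (offset, text)."""
    offset_text, _, text = line.partition(": ")
    # int() with base 16 accepts the leading "0x".
    return int(offset_text, 16), text

offset, text = parse_listing_line("0x00000018: Photoshop 3.0")
print(offset, text)  # 24 Photoshop 3.0
```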
Common Mistakes with Raw Strings
Although the raw string "Photoshop 3.0" is found in this sample file, it does not mean the picture was processed by Photoshop. Jumping to this conclusion means that the string's purpose has been misinterpreted.
Most JPEG application blocks begin with a text string that denotes the format. In this sample picture, APP0 contains JFIF data. All JFIF formatted information begins with the letters "JFIF". Since they are text characters, they are displayed by the strings extraction process.
This picture also contains an APP13 application block.
The APP13 block contains data in a format called "Photoshop 3.0". This format is a de-facto standard and used by many applications; it is not exclusive to the graphics program called Photoshop. (This is what happens when companies like Adobe define a general-purpose data structure but name it after their specific application.) This specific data format begins with the text string "Photoshop 3.0". Because it is written as text, it appears in the string extraction listing. However, the purpose is to define a binary structure and not to identify the generating application.
Within the APP13 block is a data structure that begins with the letters "8BIM". The four-character-code "8BIM" defines a data structure format nested within the "Photoshop 3.0" structure.
Within the 8BIM is a binary IPTC record. (As a binary structure, it does not appear here because it is not extracted as strings.) This identifies four levels of nesting with four different binary data encoding formats. (JPEG includes APP13 which includes Photoshop 3.0 which includes 8BIM which includes IPTC.)
Within the IPTC is a binary tag that specifies any special instructions. In this sample, the instructions field has a value containing a long string that begins with "FBMD". Facebook and Instagram are known to create a "Photoshop 3.0" application record that contains an 8BIM with an IPTC special instructions field that uses this value format. (The 'FB' stands for Facebook.)
The metadata analyzer decodes the file structure, extracts the IPTC record, determines the field's purpose, and lists the textual value. Using this metadata information, you can reliably conclude that this picture likely came from Facebook or Instagram.
However, the raw strings extraction shows text without context. While you can see the value, you cannot draw a conclusion about the cause or purpose. Moreover, the extracted text begins with "bFBMD". The initial "b" is part of a binary data field that precedes the value. It only coincidentally looks like a text character. While the strings extraction shows the raw text, the metadata analyzer shows the text in context and permits identifying the purpose.
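One plausible way for a stray "b" to appear is sketched below. An IPTC dataset starts with the byte 0x1C, a record number, a dataset number, and a two-byte big-endian length. If the value happens to be 98 bytes long, the final length byte is 0x62, which is the ASCII code for "b". This is an assumption about where the "b" comes from: the text above only says that it belongs to a preceding binary field. The 98-character value is hypothetical, and only the "FBMD" prefix comes from the sample.

```python
import re
import struct

def iptc_dataset(record: int, dataset: int, value: bytes) -> bytes:
    """Encode one IPTC dataset: 0x1C marker, record, dataset, 16-bit length, value."""
    return struct.pack(">BBBH", 0x1C, record, dataset, len(value)) + value

# Hypothetical 98-byte value; the padding after "FBMD" is made up.
value = b"FBMD" + b"0" * 94
raw = iptc_dataset(2, 40, value)  # 2:40 is the Special Instructions dataset

# Header bytes: 1c 02 28 00 62. The last length byte (0x62) is ASCII "b".
print(raw[:5].hex())

# A printable-run scan cannot tell that 0x62 belongs to the length field.
strings = [m.group().decode("ascii") for m in re.finditer(rb"[\x20-\x7e]{4,}", raw)]
print(strings[0][:5])  # bFBMD
```

A decoder that understands the structure reads the length as a number and returns only the value, which is why the metadata analyzer does not show the extra character.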
Other fields can be equally confusing. For example, one of the DQT blocks contains a long series of 9s. The text character "9" is represented by the ASCII code "57". If you view the quantization tables (using the 'JPEG %' analyzer), then you will see that one of the defined quantization tables (DQT) contains a large number of 57s. This is an instance of text characters appearing coincidentally in an otherwise-binary data structure.
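The coincidence is easy to reproduce. In this small sketch, the list of sixteen values is hypothetical; only the number 57 comes from the sample described above.

```python
# Hypothetical run of quantization values, each equal to 57.
table_values = [57] * 16
raw = bytes(table_values)

print(ord("9"))             # 57
print(raw.decode("ascii"))  # 9999999999999999
```

The bytes are numbers in a table, but a strings scan has no way to know that and reports them as a run of the character "9".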
As you scroll down through the sample text extraction, you will see other random character strings. These are not encrypted communications or hidden messages; these are from binary data structures that coincidentally contain some of the same characters that are found in text strings.
File Format Structures
Overview
PNG
JPEG
RIFF
ISO-BMFF
Different file formats have different internal structures. However, there are some common components. For example:
There is usually some kind of header (or multiple headers) that identifies the image dimensions and encoding options.
If there are any optional metadata segments, such as EXIF or XMP data, then those are usually encoded into separate data blocks that are self-contained. That is, all necessary data is usually found within the metadata block. E.g., an XMP metadata block does not need to access any data outside of the XMP block. (There are a few exceptions, such as MPF and a few MakerNotes formats, but self-containment is generally the case.)
Complex, standardized data structures typically contain some type of header that identifies the binary format. Often, this is the textual name of the data format.
The image data itself needs to be stored in the file.
A single file format usually includes a wide variety of internal data structures. These are often convoluted and assigned cryptic names.
Each file format has its own set of specific structure components.
There is only one cryptic element that is common to all file formats: EOF. This marks the "end of file". There is no data beyond the EOF. If a file has 12,233 bytes, then byte number 12,233 is the EOF.
Remember: The file layout acts as a container; it contains other types of data. Each file format uses different types of structures to store different types of data. Common internal data types, such as EXIF, XMP, and ICC Profile data, are usually stuffed into one or more of the format's structures.
This analyzer identifies the format's top-level structures, but not the internal data contents.
PNG File Structure
PNG files are mostly straightforward. Each data segment is stored in a chunk. Each chunk contains a short name, variable amount of data, and a checksum. The minimal PNG file contains a header (IHDR chunk), image data (one or more IDAT chunks), and an end tag (IEND chunk). If the encoding requires a color palette, then there will be a PLTE chunk.
Each PNG chunk is defined by a four-character name. The capitalization identifies how it is used:
Critical or Ancillary: If the first character is a capital letter, then the chunk is required for rendering the image. A lowercase first letter denotes ancillary information that is not required for decoding the image.
Public or Private: The second character is a capital letter when the chunk's format is publicly documented. A lowercase second letter denotes proprietary information.
Reserved: The third character is always a capital letter.
Retain or Remove: Some chunks should not be copied after a file is edited. A lowercase last letter indicates that an editor can safely copy the chunk, even if it does not recognize it. A capital letter tells the editor not to copy an unrecognized chunk after altering the image data.
A typical PNG layout looks like:
IHDR
The capital "I" indicates that it is a critical chunk for rendering the picture. The capital "H" means it is public and well-defined, and the final capital "R" means it is not safe to copy blindly; an editor must understand and rewrite it. The IHDR contains basic information about the image dimensions.
PLTE
The optional PLTE structure defines a color palette.
IDAT
One or more IDAT chunks store the encoded image data.
IEND
Identifies the end of the PNG file.
There is a wide range of optional data chunks that can be located between the IHDR and IEND. This includes "zTXt" and "zTXT" for storing compressed text, "iTXt" and "iTXT" for uncompressed text, "gAMA" for gamma correction information, "iCCP" for a color profile, and many more. This is not a comprehensive list. Common metadata structures, like XMP and EXIF, are often stored in text chunks, such as iTXt or zTXt.
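The chunk layout described in this section (length, four-character name, data, checksum) can be walked with a few lines of code. The following is a minimal sketch, not the analyzer's implementation: the function names and the synthetic one-pixel sample are assumptions for illustration. It relies on two facts from the PNG format: lengths are four-byte big-endian values, and the checksum is a CRC-32 over the chunk name and data.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_chunk(name: bytes, data: bytes = b"") -> bytes:
    """Build one chunk: length, name, data, CRC-32 over name and data."""
    crc = zlib.crc32(name + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + name + data + struct.pack(">I", crc)

def list_chunks(png: bytes):
    """Yield (name, length, is_critical, crc_ok) for each top-level chunk."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(png):
        length, name = struct.unpack_from(">I4s", png, pos)
        if pos + 12 + length > len(png):
            break  # truncated chunk
        data = png[pos + 8 : pos + 8 + length]
        (stored_crc,) = struct.unpack_from(">I", png, pos + 8 + length)
        crc_ok = stored_crc == (zlib.crc32(name + data) & 0xFFFFFFFF)
        # An uppercase first letter marks a critical chunk.
        yield name.decode("ascii"), length, name[:1].isupper(), crc_ok
        pos += 12 + length
        if name == b"IEND":
            break

# Synthetic sample: a 1x1 grayscale image with one text chunk.
sample = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + make_chunk(b"tEXt", b"Comment\x00hello")
    + make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
    + make_chunk(b"IEND")
)
for row in list_chunks(sample):
    print(row)
```

Only the first letter's case is interpreted here; the other three letters follow the naming rules listed above.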
JPEG File Structure
Compared to PNG, JPEG is a much more complicated file structure and has a wider range of data segments. The basic layout includes:
SOI
The start-of-image (SOI) marker identifies this as a JPEG formatted file.
APP0 ... APP15
Optional application blocks store auxiliary information, such as EXIF, XMP, and MakerNotes data. The internal format to each APP block is application specific. Many applications include a text string that helps to identify the APP data's format, but this is not always the case. In addition, some blocks have well-known assignments (e.g., JFIF data is typically found in APP0, while EXIF is usually in APP1). However, these are not strict requirements.
SOF
The start-of-frame (SOF) block specifies the image size, encoding method, and defines the color components.
DQT
One or more blocks are used to define the quantization tables (DQT) that are used for rendering the picture. Some applications use one DQT to define multiple quantization tables, while other applications use one DQT per quantization table.
DHT
Similar to DQT, one or more DHT blocks define the Huffman tables that are used for decompressing the picture's encoded binary data stream.
SOS
The start-of-scan (SOS) marker denotes the beginning of the picture's binary data stream. Depending on the encoding options, it may include periodic restarts (RST0...RST7). Some SOF encoding methods may include additional DHT and SOS segments within the data stream.
EOI
The end-of-image (EOI) marker identifies the end of the JPEG binary data stream. This is typically the end of the file.
Trailer
Some encoders include data after the EOI. This is often additional metadata or large preview images.
These are not the only segments and many segments are optional. Moreover, the ordering of these segments is flexible; the APP, SOF, DQT, and DHT blocks can appear in any order between the SOI and SOS.
JPEG segments often include nested information. For example, APP1 may contain EXIF data that can contain a preview image (another JPEG) that may contain another APP1 segment. This analyzer only identifies the outer-most segment and not any nested data.
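The outer-most segments can be listed by following the markers from the SOI to the SOS. The sketch below is illustrative only: the function names and the synthetic sample are assumptions, and it ignores rarer cases such as fill bytes between segments. Each segment after the SOI consists of 0xFF, a marker byte, and a two-byte big-endian length that counts itself but not the marker.

```python
import struct

MARKER_NAMES = {0xC0: "SOF0", 0xC2: "SOF2", 0xC4: "DHT", 0xDB: "DQT", 0xDA: "SOS"}

def list_segments(jpeg: bytes):
    """Return (offset, name, length) for each segment up to and including SOS."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    segments = [(0, "SOI", 0)]
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        (length,) = struct.unpack_from(">H", jpeg, pos + 2)
        if 0xE0 <= marker <= 0xEF:
            name = f"APP{marker - 0xE0}"
        else:
            name = MARKER_NAMES.get(marker, f"0x{marker:02X}")
        segments.append((pos, name, length))
        if marker == 0xDA:  # the encoded image stream follows; stop here
            break
        pos += 2 + length
    return segments

def segment(marker: int, payload: bytes) -> bytes:
    """Build a synthetic segment for the sample below."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload

# Synthetic sample with placeholder payloads (not a decodable picture).
sample = (
    b"\xff\xd8"
    + segment(0xE0, b"JFIF\x00" + bytes(9))
    + segment(0xED, b"Photoshop 3.0\x00")
    + segment(0xDB, bytes([0]) + bytes([57]) * 64)
    + segment(0xDA, bytes(10))
    + b"\xff\xd9"
)
for offset, name, length in list_segments(sample):
    print(f"0x{offset:08x}: {name} ({length} bytes)")
```

With this layout, the text "Photoshop 3.0" lands at offset 0x18, right after the APP13 marker and length. It identifies the block's data format, not the program that created the file.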
RIFF File Structure
The Resource Interchange File Format (RIFF) is used by a range of media files, including WebP (image), WAV (audio), and some AVI (video) files. It contains:
Header
A basic RIFF header. This identifies the total size of the file and the encoded format (WAVE, WEBP, etc.). All other chunks are used to support the encoded format.
Chunk 1
Each chunk has a four-character name that defines the chunk's purpose, like "LIST", "MDTA", "EXIF", "XMP " (with a space), "avih", "indx", etc. (There are dozens of well-defined chunk formats.) The name is followed by the chunk size and chunk data. Some chunks, like "LIST", contain a series of sub-chunks as the data; this permits nested chunks.
Chunk 2 ... Chunk n
The RIFF may contain multiple chunks, listed one after the other.
The purpose, format, and contents of each chunk are chunk-specific. The chunk's name informs the decoder which (of the dozens of chunk definitions) to use for processing it.
This analyzer only identifies the top-most chunk and not the nested components.
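A top-level RIFF listing can be sketched as follows. The function names and the synthetic WebP-style sample are assumptions for illustration. The sketch relies on three properties of RIFF: sizes are four-byte little-endian values, the size in the header covers everything after the first eight bytes, and chunk data is padded to an even number of bytes.

```python
import struct

def list_riff_chunks(data: bytes):
    """Return the encoded format and a list of (name, size) for top-level chunks."""
    tag, riff_size, form = struct.unpack_from("<4sI4s", data, 0)
    if tag != b"RIFF":
        raise ValueError("not a RIFF file")
    end = min(len(data), 8 + riff_size)
    pos, chunks = 12, []
    while pos + 8 <= end:
        name, size = struct.unpack_from("<4sI", data, pos)
        chunks.append((name.decode("ascii"), size))
        # Skip the data plus one pad byte when the size is odd.
        pos += 8 + size + (size & 1)
    return form.decode("ascii"), chunks

def riff_chunk(name: bytes, payload: bytes) -> bytes:
    """Build a synthetic chunk for the sample below."""
    pad = b"\x00" if len(payload) % 2 else b""
    return name + struct.pack("<I", len(payload)) + payload + pad

# Synthetic sample with placeholder payloads.
body = (
    b"WEBP"
    + riff_chunk(b"VP8X", bytes(10))
    + riff_chunk(b"EXIF", b"abc")
    + riff_chunk(b"XMP ", b"<x/>")
)
sample = b"RIFF" + struct.pack("<I", len(body)) + body
print(list_riff_chunks(sample))
```

Nested chunks, such as the sub-chunks inside "LIST", are deliberately not expanded here.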
ISO-BMFF File Structure
The International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) formalize technical standards. ISO/IEC standard #14496 defines a media file format. Commonly called the ISO Base Media File Format (ISO-BMFF), it is sometimes referred to as MPEG or ISO-14496 encoding. This standard defines a file structure that is used by HEIC, AVIF, MP4, 3GPP, JPEG2000, and many other file formats.
Similar to RIFF, this file format contains a series of data chunks. Each chunk begins with a length and a four-character name. This is followed by the chunk's data.
Header
The first chunk is always at least 12-bytes long. The name is "ftyp" and it is followed by the primary file format (e.g., "heic" or "avif") and an optional list of compatible formats. For example, 'heic' is typically compatible with 'mif1'.
Chunk 1 ... Chunk n
After the header are one or more chunks that follow the same format. This may include four-letter names like:
meta: For meta structural data, including image indexes, property lists, and data formatting information.
moov: For video structural data, including track indexes, property lists, and data formatting information.
mdat: A general listing of large media data. While 'meta' and 'moov' may have pointers to binary, EXIF, or XMP information, the actual data is likely stored in the mdat block. This is typically an unsorted heap. Whatever data is needed by 'meta' or 'moov' is placed here. The 'meta' and 'moov' chunks include offsets and sizes for the necessary data inside the mdat chunk.
free: Unused space typically used for padding. Often this is used to align subsequent data with a memory boundary point.
The meta and moov chunks contain a nested series of additional chunks. This analyzer lists all top-level chunks and all subchunks directly under the meta and moov chunks.
Many chunks contain nested subchunks. For example, a typical AVIF image may contain:
With ISO-BMFF, only the header (ftyp) is required to come first. The order of the remaining chunks at any given level is usually flexible. The top level could be ftyp→meta→mdat, or ftyp→mdat→meta. (The 'free' chunk is always optional.)
This analyzer only identifies the top-most chunk and not the nested components.
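A minimal sketch of a top-level ISO-BMFF listing is shown below. The function names and the synthetic AVIF-style sample are assumptions for illustration, and the contents of the sample chunks are placeholders. The sketch handles the two special size values in the format: a size of 1 means a 64-bit size follows the name, and a size of 0 means the chunk runs to the end of the file.

```python
import struct

def list_boxes(data: bytes, start: int = 0, end=None):
    """Return (offset, name, size) for each chunk between start and end."""
    end = len(data) if end is None else end
    boxes, pos = [], start
    while pos + 8 <= end:
        size, name = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:  # a 64-bit size follows the name
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:  # the chunk extends to the end of the file
            size = end - pos
        if size < header:
            break  # malformed size; stop rather than loop forever
        boxes.append((pos, name.decode("latin-1"), size))
        pos += size
    return boxes

def box(name: bytes, payload: bytes = b"") -> bytes:
    """Build a synthetic chunk for the sample below."""
    return struct.pack(">I", 8 + len(payload)) + name + payload

# Synthetic sample: header, padding, structural data, media data.
ftyp = box(b"ftyp", b"avif" + bytes(4) + b"mif1")
meta = box(b"meta", bytes(4) + box(b"hdlr", bytes(24)))
sample = ftyp + box(b"free", bytes(4)) + meta + box(b"mdat", bytes(16))

for offset, name, size in list_boxes(sample):
    print(f"0x{offset:08x}: {name} ({size} bytes)")
```

The same function could be pointed at the inside of a chunk by passing `start` and `end`, but the starting offset depends on the chunk type, so nested parsing is left out of this sketch.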
Caveats
Most of the time, raw string extraction is not more informative than metadata analysis.
Issue
Description
Random strings
Binary files often have byte sequences that coincidentally look like strings. There are usually lots of strings that appear to only contain random characters.
Random characters
Actual text strings may be preceded or followed by bytes that serve other purposes. If these adjacent bytes happen to look like text, then they will be included in the strings result. (On the sample page, the initial 'b' in "bFBMD01000a8a0100..." is coincidental.)
Random order
Do not assume that the order of the strings matches the order used by the metadata. For example, EXIF has a series of binary fields and pointers into a block of data that can contain strings. The pointers do not need to be in the same sequence as the fields.
No context
Metadata segments often contain text values. However, without viewing the decoded metadata structure (which displays the strings in context), you cannot determine the purpose of any value.
Misleading text
Many metadata structures have names like "Photoshop" or "Adobe". As a structure's name, these are not indicators of any specific application that created the file. Similarly, XMP objects often contain strings that look like URLs but actually define name spaces and are not representative of those online services. (Sometimes a string is just a string and not indicative of an application or service.)
Omitted text
Encoded or compressed metadata structures are not decoded by the strings extraction. There may be plenty of text values that are not detectable by strings. This includes non-ASCII text encoded using UTF-8, UTF-16, or other language encoding methods.
Obscure formatting
Some metadata structures, like XMP, contain text but may be difficult to read due to the raw formatting.
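The "Omitted text" caveat can be demonstrated with a short sketch. The message text and the function name `ascii_strings` are made up for this example, and the scan assumes a simple "four or more printable ASCII bytes" rule, which may differ from the analyzer's exact rule.

```python
import re
import zlib

def ascii_strings(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII bytes found in data."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, data)]

message = "Camera owner: Example Person. " * 4  # hypothetical text value

plain = message.encode("ascii")
utf16 = message.encode("utf-16-le")  # each character is followed by a zero byte
packed = zlib.compress(plain)        # deflate, as used by compressed PNG text chunks

print(ascii_strings(plain) == [message])                     # True: found as-is
print(ascii_strings(utf16))                                  # []: no run of four
print(any("Camera" in s for s in ascii_strings(packed)))     # False: text is hidden
```

In each case the same text is present in the data, but only the plain ASCII form is visible to a strings scan. A decoder that understands the structure can recover all three.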
When performing an evaluation, always start with the metadata analyzer to parse the file structure, extract values, and display them in context.
By itself, string extraction shows random textual elements without any context. However, because media files include known structures, the FotoForensics strings extractor labels any primary data types.
The file structure components may be complicated and nested. This strings analyzer only identifies the top-most structure and not the internal components or nesting.
The file structure components usually do not look like the metadata segments. This is because metadata segments are typically stored inside these structural components. With some formats, like JPEG, the metadata blocks may span multiple structural components. (E.g., a large XMP data block may be broken up and stored within multiple APP blocks.)