Embedded cabinets
Purpose
This guide explains how to find and extract cabinet files embedded within other files, such as executables, installers, and data files.
Understanding Embedded Cabinets
What Are Embedded Cabinets?
Many Windows installers and applications embed CAB files within executable files. Common scenarios include:
- Self-extracting installers: EXE files containing embedded CAB archives
- Software packages: MSI or setup files with embedded resources
- Game installers: Installation executables with asset archives
- Driver packages: Hardware driver installers with embedded CAB files
- Application resources: Programs storing data in embedded cabinets
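At the byte level, an embedded cabinet is a complete CAB archive stored somewhere inside the host file, and it begins with the four-byte signature MSCF. The following minimal sketch builds a fake host in memory (the stub and payload bytes are invented for illustration) and locates that signature. A real search also has to validate the header, because the same four bytes can occur by chance.

```ruby
# A fake "installer" built in memory: stub, then cabinet bytes, then trailing data.
# All three parts are invented placeholders, not real PE or CAB content.
stub    = 'MZ'.b + ("\x90".b * 510)   # 512 bytes standing in for executable code
cabinet = 'MSCF'.b + ("\x00".b * 32)  # signature plus zero padding
host    = stub + cabinet + 'trailing data'.b

# Every CAB archive starts with the four ASCII bytes "MSCF".
offset = host.index('MSCF'.b)
puts format('Signature found at offset 0x%X', offset)
# prints "Signature found at offset 0x200"
```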
Why Embed Cabinets?
Embedding provides several advantages:
- Single-file distribution: Everything in one executable
- Simpler installation: No separate archive files to manage
- Better branding: Custom installer interface
- Integrity checking: Combined executable and data validation
- Backward compatibility: Support for older Windows versions
Searching for Embedded Cabinets
Basic Search
Find embedded cabinets in a file:
require 'cabriolet'
# Search for embedded cabinets
results = Cabriolet::CAB::Decompressor.search('installer.exe')
results.each do |result|
  puts "Found CAB at offset: #{result[:offset]}"
  puts " Size: #{result[:size]} bytes"
  puts " Files: #{result[:file_count]}"
end
CLI Search
$ cabriolet cab search installer.exe
Found 2 embedded cabinet(s) in installer.exe:
Cabinet 1:
Offset: 0x12A400
Size: 2,458,624 bytes
Files: 156
Folders: 3
Set ID: 1234
Cabinet ID: 1
Cabinet 2:
Offset: 0x386C00
Size: 1,234,567 bytes
Files: 45
Folders: 1
Set ID: 5678
Cabinet ID: 1
Advanced Search Options
Search with specific criteria:
# Search with options
options = {
  min_size: 1000,   # Minimum cabinet size in bytes
  max_results: 10,  # Stop after finding 10 cabinets
  validate: true,   # Validate cabinet headers
  deep_scan: true   # Scan entire file (slower but thorough)
}
results = Cabriolet::CAB::Decompressor.search('large_file.bin', options)
Extracting Embedded Cabinets
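Extraction by offset is possible because a cabinet describes its own extent: in the CAB header layout, the 32-bit little-endian field at byte 8 (cbCabinet) holds the total size of the cabinet. As a rough illustration that does not depend on the library, the sketch below copies one embedded cabinet out of a host file into a standalone file. The method name carve_cabinet and all file names are hypothetical.

```ruby
# Copy an embedded cabinet out of a host file into its own file.
# carve_cabinet is a hypothetical helper, not part of Cabriolet.
def carve_cabinet(host_path, offset, output_path)
  File.open(host_path, 'rb') do |host|
    host.seek(offset)
    header = host.read(12)
    unless header && header.bytesize == 12 && header.start_with?('MSCF'.b)
      raise ArgumentError, format('no cabinet signature at offset 0x%X', offset)
    end

    # Bytes 8..11 of the header: total cabinet size, little-endian uint32.
    size = header.byteslice(8, 4).unpack1('V')

    host.seek(offset)
    data = host.read(size)
    raise IOError, 'cabinet is truncated' if data.nil? || data.bytesize < size

    File.binwrite(output_path, data)
    size
  end
end
```

Calling carve_cabinet('installer.exe', result[:offset], 'embedded.cab') with a result from a search would write that cabinet to its own file. The size comes from the file being examined, so it should be treated as untrusted for damaged or hostile inputs.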
Extract by Offset
Extract a specific embedded cabinet:
# Find cabinets
results = Cabriolet::CAB::Decompressor.search('installer.exe')
# Extract first embedded cabinet
if results.any?
  offset = results.first[:offset]
  decompressor = Cabriolet::CAB::Decompressor.new(
    'installer.exe',
    offset: offset
  )
  decompressor.extract_all('output')
end
Extract All Embedded Cabinets
Extract all found cabinets:
results = Cabriolet::CAB::Decompressor.search('installer.exe')
results.each_with_index do |result, index|
  output_dir = "cabinet_#{index + 1}"
  decompressor = Cabriolet::CAB::Decompressor.new(
    'installer.exe',
    offset: result[:offset]
  )
  puts "Extracting cabinet #{index + 1} to #{output_dir}..."
  decompressor.extract_all(output_dir)
end
Working with Multiple Embedded Archives
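Whether embedded cabinets belong together is recorded in each cabinet's header: a set ID, the cabinet's index within the set, and two flags saying whether a previous or next cabinet exists. The sketch below decodes those fields from the fixed 36-byte CAB header using plain Ruby. The helper name parse_cab_header and the sample values are hypothetical, and the library may number the parts differently in its results (in the CAB format itself the first cabinet of a set has index 0).

```ruby
# Fixed-size part of the CAB header (36 bytes), all integers little-endian:
# signature, reserved1, cbCabinet, reserved2, coffFiles, reserved3,
# versionMinor, versionMajor, cFolders, cFiles, flags, setID, iCabinet.
CFHEADER_FORMAT = 'a4 V5 C2 v5'
CFHEADER_SIZE   = 36

FLAG_PREV_CABINET = 0x0001 # an earlier cabinet in the set exists
FLAG_NEXT_CABINET = 0x0002 # a later cabinet in the set exists

# parse_cab_header is a hypothetical helper, not part of Cabriolet.
def parse_cab_header(bytes)
  return nil if bytes.nil? || bytes.bytesize < CFHEADER_SIZE

  signature, _res1, size, _res2, _files_offset, _res3,
    _minor, _major, folders, files, flags, set_id, cabinet_index =
    bytes.unpack(CFHEADER_FORMAT)
  return nil unless signature == 'MSCF'.b

  {
    size: size, folders: folders, files: files,
    set_id: set_id, cabinet_index: cabinet_index,
    has_prev: (flags & FLAG_PREV_CABINET) != 0,
    has_next: (flags & FLAG_NEXT_CABINET) != 0
  }
end

# Invented header: first cabinet of set 1234, with a continuation cabinet.
header = ['MSCF', 0, 4096, 0, 44, 0, 3, 1, 1, 2,
          FLAG_NEXT_CABINET, 1234, 0].pack(CFHEADER_FORMAT)
info = parse_cab_header(header)
puts "Set #{info[:set_id]}, index #{info[:cabinet_index]}, continues: #{info[:has_next]}"
```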
Identifying Cabinet Sets
When multiple cabinets are embedded, they may be part of a multi-part set:
results = Cabriolet::CAB::Decompressor.search('installer.exe')
# Group by set ID
sets = results.group_by { |r| r[:set_id] }
sets.each do |set_id, cabinets|
  puts "Cabinet Set #{set_id}:"
  cabinets.sort_by { |c| c[:cabinet_id] }.each do |cab|
    puts " Part #{cab[:cabinet_id]} at offset #{cab[:offset]}"
  end
end
Output:
Cabinet Set 1234:
Part 1 at offset 1222656
Part 2 at offset 3768320
Part 3 at offset 5242880
Extracting Multi-Part Embedded Sets
# Find all parts of set 1234
set_parts = results
  .select { |r| r[:set_id] == 1234 }
  .sort_by { |r| r[:cabinet_id] }
# Extract using first part
first_offset = set_parts.first[:offset]
decompressor = Cabriolet::CAB::Decompressor.new(
  'installer.exe',
  offset: first_offset
)
# Provide offsets for continuation cabinets
set_parts.drop(1).each do |part|
  decompressor.add_continuation_offset(part[:offset])
end
decompressor.extract_all('complete_set')
In-Memory Extraction
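Working in memory starts from the idea that an embedded region can be treated as an IO object of its own. As a library-independent sketch, the snippet below reads one region of a host into a StringIO, so that code expecting a file-like object can read the cabinet from position 0 without a temporary file. The helper name embedded_region_io and the sample bytes are invented for illustration.

```ruby
require 'stringio'

# Read `size` bytes starting at `offset` and wrap them in a StringIO.
# embedded_region_io is a hypothetical helper, not part of Cabriolet.
def embedded_region_io(host_io, offset, size)
  host_io.seek(offset)
  data = host_io.read(size)
  raise IOError, 'region extends past end of file' if data.nil? || data.bytesize < size

  StringIO.new(data)
end

# Demonstration with an in-memory "host" instead of a real installer:
# a 10-byte stub followed by 18 bytes standing in for a cabinet.
host = StringIO.new('stub-bytes'.b + 'MSCF-cabinet-bytes'.b)
region = embedded_region_io(host, 10, 18)
puts region.read(4) # prints "MSCF": the region starts at the signature
```

This holds the whole region in memory, which suits small and medium cabinets; very large ones are better read in pieces.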
Extract to Memory
Extract embedded cabinet contents without writing to disk:
# Find cabinet
results = Cabriolet::CAB::Decompressor.search('installer.exe')
offset = results.first[:offset]
# Open with memory I/O
memory_io = Cabriolet::System::IOSystem.new
handle = memory_io.open('installer.exe', 'rb')
handle.seek(offset)
decompressor = Cabriolet::CAB::Decompressor.new(handle)
# Extract files to memory
files = {}
decompressor.files.each do |file|
  files[file.filename] = decompressor.extract_to_memory(file.filename)
end
# Use extracted data
config_data = files['config.ini']
puts "Config: #{config_data}"
Stream Processing
Process embedded cabinet contents on-the-fly:
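# embedded_offset is the offset of a cabinet found by an earlier search;
# analyze_text and verify_signature stand for your own processing methods
decompressor = Cabriolet::CAB::Decompressor.new(
  'installer.exe',
  offset: embedded_offset
)
decompressor.each_file do |filename, io|
  case File.extname(filename)
  when '.txt'
    # Process text files
    content = io.read
    analyze_text(content)
  when '.dll'
    # Check the first two bytes (a DLL normally starts with "MZ")
    signature = io.read(2)
    verify_signature(signature)
  end
end
Advanced Techniques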
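The techniques in this section depend on telling a real cabinet header apart from a chance occurrence of the bytes MSCF, and the chunked search shown later calls a valid_cabinet_header? helper without defining it. One possible definition is sketched below, assuming the standard 36-byte CAB header layout: the three reserved fields are zero, the format version is the usual 1.3, and the file-table offset lies inside the declared cabinet size. These checks are a heuristic chosen for illustration and are not the library's own validation rules.

```ruby
HEADER_SIZE = 36 # fixed part of the CAB header

# Heuristic check that `data` holds a believable CAB header at `index`.
# One possible definition of the helper; the rules are assumptions.
def valid_cabinet_header?(data, index)
  header = data.byteslice(index, HEADER_SIZE)
  return false if header.nil? || header.bytesize < HEADER_SIZE

  signature, res1, size, res2, files_offset, res3, minor, major =
    header.unpack('a4 V5 C2')

  signature == 'MSCF'.b &&
    [res1, res2, res3].all?(&:zero?) &&  # reserved fields are normally zero
    major == 1 && minor == 3 &&          # usual format version
    size >= HEADER_SIZE &&               # cannot be smaller than its own header
    files_offset >= HEADER_SIZE && files_offset < size
end

# Invented sample data: one plausible header and one accidental match.
good  = ['MSCF', 0, 4096, 0, 44, 0, 3, 1].pack('a4 V5 C2') + ("\x00".b * 10)
noise = 'MSCF'.b + ('x'.b * 40)
puts valid_cabinet_header?(good, 0)  # prints "true"
puts valid_cabinet_header?(noise, 0) # prints "false"
```

Stricter rules reduce false positives but may reject unusual cabinets, so the thresholds are worth adjusting to the files being examined.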
Custom Search Patterns
Search for cabinets with specific characteristics:
class CustomCabinetSearch
  def self.find_signed_cabinets(filename)
    all_results = Cabriolet::CAB::Decompressor.search(filename)
    all_results.select do |result|
      # Check if cabinet has reserve data (often used for signatures)
      has_signature?(filename, result[:offset])
    end
  end
  def self.has_signature?(filename, offset)
    File.open(filename, 'rb') do |f|
      # The 16-bit flags field sits 30 bytes into the CAB header
      f.seek(offset + 30)
      flags = f.read(2).unpack1('v')
      # Check RESERVE_PRESENT flag
      (flags & 0x0004) != 0
    end
  end
end
signed_cabs = CustomCabinetSearch.find_signed_cabinets('installer.exe')
Extracting from Compressed Executables
Some installers are themselves compressed (e.g., UPX-packed):
# First, decompress the executable if needed
# (upx_packed? stands for your own detection method; it is not defined here)
if upx_packed?('installer.exe')
  system('upx', '-d', 'installer.exe', '-o', 'installer_unpacked.exe')
  search_file = 'installer_unpacked.exe'
else
  search_file = 'installer.exe'
end
# Then search for embedded cabinets
results = Cabriolet::CAB::Decompressor.search(search_file)
Handling Large Files
Efficiently search very large files:
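# Use chunked searching for large files
class ChunkedCabinetSearch
  CHUNK_SIZE = 10 * 1024 * 1024 # 10 MB chunks
  OVERLAP = 3 # Signature length minus one, so a match split across chunks is found
  def self.search(filename)
    results = []
    File.open(filename, 'rb') do |f|
      offset = 0 # File offset of the first byte in the current buffer
      tail = ''.b
      while (chunk = f.read(CHUNK_SIZE))
        buffer = tail + chunk
        results.concat(find_in_chunk(buffer, offset))
        # Carry the last few bytes over so a signature on the boundary is not missed
        tail = buffer.byteslice(-OVERLAP, OVERLAP) || buffer
        offset += buffer.bytesize - tail.bytesize
      end
    end
    results
  end
  def self.find_in_chunk(chunk, base_offset)
    results = []
    pos = 0
    while (index = chunk.index('MSCF', pos))
      # Validate and add result (valid_cabinet_header? is a helper you supply;
      # near the end of a chunk it may need to read more bytes from the file)
      if valid_cabinet_header?(chunk, index)
        results << {
          offset: base_offset + index,
          # ... other metadata
        }
      end
      pos = index + 1
    end
    results
  end
end
Forensic Analysis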
Analyzing Installer Structure
Examine installer structure without extraction:
results = Cabriolet::CAB::Decompressor.search('installer.exe')
results.each_with_index do |result, index|
  puts "\n=== Cabinet #{index + 1} ==="
  puts "Location: 0x#{result[:offset].to_s(16)}"
  # Open cabinet for analysis
  decompressor = Cabriolet::CAB::Decompressor.new(
    'installer.exe',
    offset: result[:offset]
  )
  # Analyze compression methods
  compression_stats = Hash.new(0)
  decompressor.folders.each do |folder|
    compression_stats[folder.compression_type] += 1
  end
  puts "Compression methods:"
  compression_stats.each do |type, count|
    puts " #{type}: #{count} folder(s)"
  end
  # List largest files
  puts "\nTop 5 largest files:"
  decompressor.files
    .sort_by { |f| -f.uncompressed_size }
    .first(5)
    .each do |file|
      puts " #{file.filename}: #{file.uncompressed_size} bytes"
    end
end
Detecting Malicious Content
Screen embedded cabinets for suspicious patterns:
def analyze_cabinet_safety(filename, offset)
  decompressor = Cabriolet::CAB::Decompressor.new(filename, offset: offset)
  suspicious = []
  decompressor.files.each do |file|
    # Check for suspicious filenames (\z anchors at the very end of the name,
    # so a name containing a newline cannot slip past the check)
    if file.filename =~ /\.(exe|dll|scr|bat|cmd|vbs|js)\z/i
      suspicious << "Executable file: #{file.filename}"
    end
    # Check for path traversal
    if file.filename.include?('..')
      suspicious << "Path traversal attempt: #{file.filename}"
    end
    # Check for unusual attributes
    if (file.attributes & 0x02) != 0 # Hidden attribute
      suspicious << "Hidden file: #{file.filename}"
    end
  end
  suspicious
end
# Scan all embedded cabinets
results = Cabriolet::CAB::Decompressor.search('suspicious.exe')
results.each_with_index do |result, index|
  issues = analyze_cabinet_safety('suspicious.exe', result[:offset])
  if issues.any?
    puts "Cabinet #{index + 1} - WARNINGS:"
    issues.each { |issue| puts " - #{issue}" }
  end
end
Best practices
Search Optimization
- Use deep_scan sparingly: Only when necessary, as it’s slower
- Set size limits: Use min_size to filter out false positives
- Validate results: Always verify cabinet headers
- Cache results: Store search results for repeated operations
- Check file size first: Skip files that are too small
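The last two points can be combined in a small wrapper. The sketch below caches results per file and skips files shorter than the 36-byte fixed CAB header, which cannot contain a cabinet. The class name CachedSearch is hypothetical, and the actual search is passed in as a block so the wrapper makes no assumptions about the search call itself.

```ruby
MIN_CABINET_SIZE = 36 # a file shorter than the fixed CAB header cannot hold a cabinet

# Hypothetical wrapper: caches results per file and skips files that are too small.
class CachedSearch
  def initialize(&searcher)
    @searcher = searcher
    @cache = {}
  end

  def search(path)
    stat = File.stat(path)
    return [] if stat.size < MIN_CABINET_SIZE

    # Key on size and modification time so a changed file is searched again.
    key = [File.expand_path(path), stat.size, stat.mtime]
    @cache[key] ||= @searcher.call(path)
  end
end
```

With the search call used throughout this guide, the wrapper could be created as CachedSearch.new { |path| Cabriolet::CAB::Decompressor.search(path) } and then queried with search('installer.exe') as often as needed.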
Extraction Safety
- Scan before extracting: Check for malicious content
- Use temporary directories: Extract to isolated locations first
- Validate paths: Prevent path traversal attacks
- Check disk space: Ensure sufficient space before extraction
- Verify checksums: Validate extracted files
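Path validation in particular is easy to get wrong with string checks alone. A more robust approach, sketched below, resolves each member name against the output directory and rejects any result that falls outside it. The helper name safe_output_path is hypothetical, and the backslash handling assumes member names use Windows-style separators.

```ruby
# Resolve an archive member name inside base_dir, or raise if it would escape.
# safe_output_path is a hypothetical helper, not part of Cabriolet.
def safe_output_path(base_dir, member_name)
  base = File.expand_path(base_dir)
  # Member names typically use backslashes; normalise them before joining.
  relative = member_name.tr('\\', '/')
  target = File.expand_path(relative, base)

  unless target.start_with?(base + File::SEPARATOR)
    raise ArgumentError, "unsafe path in archive: #{member_name.inspect}"
  end

  target
end

puts safe_output_path('output', 'docs\\readme.txt')

begin
  safe_output_path('output', '..\\..\\evil.dll')
rescue ArgumentError => e
  puts "Rejected: #{e.message}"
end
```

Comparing fully expanded paths also catches absolute member names, which a plain check for ".." would let through.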
Error handling
- Handle corrupted cabinets: Use salvage mode when needed
- Check offsets: Verify offset validity before extraction
- Handle partial data: Some embedded cabinets may be incomplete
- Log failures: Track which cabinets failed to extract
- Provide context: Include offset and size in error messages
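The logging and context points can be sketched as a loop that keeps going after a failure and records the offset and size of each cabinet that could not be processed. The method name extract_each is hypothetical, and the extraction step is supplied as a block, so the sketch does not assume any particular library call. The sample results reuse the :offset and :size keys shown in the search examples.

```ruby
# Try each search result in turn and collect failures instead of stopping.
# extract_each is a hypothetical helper; the block performs the real work.
def extract_each(results)
  failures = []
  results.each_with_index do |result, index|
    begin
      yield result, index
    rescue StandardError => e
      failures << format('cabinet %d at offset 0x%X (%d bytes): %s: %s',
                         index + 1, result[:offset], result[:size],
                         e.class, e.message)
    end
  end
  failures
end

# Invented results; the second one simulates a truncated cabinet.
results = [
  { offset: 0x12A400, size: 2_458_624 },
  { offset: 0x386C00, size: 1_234_567 }
]
failures = extract_each(results) do |result, _index|
  raise IOError, 'unexpected end of data' if result[:offset] == 0x386C00
end
failures.each { |line| warn line }
```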