Scenario
You are a forensic examiner in a large firm. David, a colleague of yours from the HR department, received two resumes for an open position within the firm. David viewed the resumes and listed the senders as possible candidates.
A few days later, the firewall administrator noticed a strange connection going from David's machine to outside the network. David is sure that he hasn't opened any suspicious executable and that he only opened the two resumes he received. However, David remembers something interesting. David says that one of the files popped up a "save as" window.
When he pressed "Cancel," another dialog window, which he had never seen before, appeared asking him to click "Open" in order to view encrypted content within the file.
Sadly, David doesn't remember which file it was, since the two names are similar to each other. You were called to examine those files and provide feedback.
Goals
Learn how to profile and examine a PDF file and be able to tell whether it's malicious or not.
Recommended tools
- PDFiD (pdfid.py)
- pdf-parser (pdf-parser.py)
- ExifTool
TASK 1: OBTAIN A GENERAL OVERVIEW OF THE SUSPICIOUS PDF FILES WITH PDFID
PDFiD: This tool scans a PDF document for a given list of strings and counts the occurrences (total and obfuscated) of each one.
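To make the idea of "total and obfuscated" counts concrete, here is a minimal Python sketch of that kind of scan. It is not PDFiD's actual code: the keyword list is a small illustrative subset I chose, and the function names are hypothetical. It treats a name as obfuscated when it only matches a keyword after its #xx hex escapes are decoded.

```python
import re
from collections import Counter

# Illustrative subset of names worth a closer look (an assumption, not PDFiD's full list).
KEYWORDS = [b"/JS", b"/JavaScript", b"/OpenAction", b"/AA", b"/Launch", b"/EmbeddedFile"]

# A PDF name: a slash followed by characters that are neither whitespace nor delimiters.
NAME_RE = re.compile(rb"/[^\x00\t\n\f\r /<>\[\](){}%]+")
HEX_ESCAPE_RE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_name(raw: bytes) -> bytes:
    """Undo #xx hex escapes, e.g. b'/J#53' becomes b'/JS'."""
    return HEX_ESCAPE_RE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def count_keywords(data: bytes) -> dict:
    """Return {keyword: (total, obfuscated)} for every keyword in KEYWORDS."""
    total = Counter()
    obfuscated = Counter()
    for match in NAME_RE.finditer(data):
        raw = match.group(0)
        name = decode_name(raw)
        if name in KEYWORDS:
            total[name] += 1
            if raw != name:  # the name only matched after decoding hex escapes
                obfuscated[name] += 1
    return {kw: (total[kw], obfuscated[kw]) for kw in KEYWORDS}


if __name__ == "__main__":
    # Hypothetical sample bytes, not taken from either resume.
    sample = b"1 0 obj << /Type /Action /S /JavaScript /J#53 (app.alert(1)) >> endobj"
    for keyword, (count, hidden) in count_keywords(sample).items():
        print(keyword.decode(), count, hidden)
```

A scan like this only sees names in uncompressed parts of the file, so anything hidden inside compressed streams would be missed.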
I will start by running pdfid.py on "Linda.pdf" to analyze its structure. It appears to be a standard PDF file.
Now, I will run pdfid.py on "Lucy2.pdf" to determine whether it is a malicious PDF.
Lucy2.pdf contains a JavaScript object! This is both interesting and suspicious, since a resume has very little use for JavaScript. Even though the two files are similar in structure and format, Linda.pdf doesn't contain any JavaScript objects.
TASK 2: EXTRACT THE FILES' METADATA
I will run exiftool on both files to examine their metadata. This will help us identify any clear differences between the files, such as variations in the creator tools used.
I ran exiftool on the "Linda.pdf" file, which appears to be normal based on the creator tool information.
I also ran exiftool on the "Lucy2.pdf" file. While everything seems normal, it's worth noting that Lucy's file contains less metadata. In contrast, Linda's file appears to be a template downloaded from a website. This discrepancy, along with the presence of a JavaScript object in Lucy's file, raises further concerns.
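The comparison above can also be scripted. The sketch below is a rough stand-in for reading the output by eye, not a replacement for exiftool: it only picks up document-information fields stored as plain, uncompressed literal strings without nested parentheses, and the function names and sample values are hypothetical.

```python
import re

# Matches "/Key (literal string)" pairs for a few well-known Info dictionary keys.
INFO_RE = re.compile(
    rb"/(Title|Author|Creator|Producer|CreationDate|ModDate)\s*\(([^()]*)\)"
)


def info_fields(data: bytes) -> dict:
    """Collect document-information fields stored as plain literal strings."""
    fields = {}
    for key, value in INFO_RE.findall(data):
        fields[key.decode()] = value.decode("latin-1")
    return fields


def compare_metadata(first: dict, second: dict) -> dict:
    """Map each key to its (first, second) values wherever the two files differ."""
    keys = sorted(set(first) | set(second))
    return {
        key: (first.get(key), second.get(key))
        for key in keys
        if first.get(key) != second.get(key)
    }


if __name__ == "__main__":
    # Made-up sample values for illustration; not the real metadata of either resume.
    doc_a = b"<< /Creator (Example Writer) /Producer (Example PDF Library) /Author (A) >>"
    doc_b = b"<< /Producer (Example PDF Library) >>"
    for key, (a, b) in compare_metadata(info_fields(doc_a), info_fields(doc_b)).items():
        print(f"{key}: {a!r} vs {b!r}")
```

A key that shows up for one file with None for the other is exactly the "less metadata" signal described above.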
TASK 3: LIST THE OBJECTS IN THE MALICIOUS FILE
PDF-parser.py: This tool will parse a PDF document to identify the fundamental elements used in the analyzed file.
I have strong reasons to suspect Lucy's file, so I will proceed with a more in-depth analysis.
I will begin by performing a general examination using the --stats option.
The --stats option displays statistics about the objects found in the PDF document. Use it to identify PDF documents with unusual or unexpected objects, or to classify PDF documents.
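As a rough illustration of what such statistics involve, the following Python sketch counts indirect objects by their /Type entry. It is a simplification under stated assumptions, not pdf-parser's implementation: it only sees objects delimited by "obj" and "endobj" in the uncompressed file body, and the function name is hypothetical.

```python
import re
from collections import Counter

# An indirect object: "<number> <generation> obj ... endobj".
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
TYPE_RE = re.compile(rb"/Type\s*/([A-Za-z0-9]+)")


def object_stats(data: bytes) -> Counter:
    """Count indirect objects by their /Type entry."""
    stats = Counter()
    for _number, _generation, body in OBJ_RE.findall(data):
        match = TYPE_RE.search(body)
        if match:
            label = "/" + match.group(1).decode()
        else:
            label = "(no /Type)"
        stats[label] += 1
    return stats


if __name__ == "__main__":
    # Hypothetical three-object sample, not taken from Lucy2.pdf.
    sample = (
        b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
        b"2 0 obj\n<< /Type /Action /S /Launch >>\nendobj\n"
        b"3 0 obj\n(plain string)\nendobj\n"
    )
    for label, count in object_stats(sample).most_common():
        print(label, count)
```

An unexpectedly high count under /Action in a document that should be static text is the kind of anomaly worth following up.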
There are two things worth mentioning. First, the total number of objects (150) and, more importantly, the number of objects that are related to actions.
Interestingly, the JavaScript code is one of the three objects that are related to actions.
Another of these objects is also worth examining.
Specifically, this object is of type /Action with a subtype /Launch, which is used to launch an application.
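To show how action objects and their subtypes can be singled out, here is a small Python sketch. It assumes the action dictionaries carry an explicit /Type /Action entry and sit uncompressed in the file body, which is not guaranteed for every PDF; the function name is hypothetical.

```python
import re

OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
ACTION_RE = re.compile(rb"/Type\s*/Action\b")
SUBTYPE_RE = re.compile(rb"/S\s*/([A-Za-z0-9]+)")


def action_objects(data: bytes) -> list:
    """Return (object number, action subtype, raw body) for each action object."""
    found = []
    for number, _generation, body in OBJ_RE.findall(data):
        if not ACTION_RE.search(body):
            continue
        match = SUBTYPE_RE.search(body)
        subtype = "/" + match.group(1).decode() if match else None
        found.append((int(number), subtype, body.strip()))
    return found


if __name__ == "__main__":
    # Hypothetical sample, not the real objects from Lucy2.pdf.
    sample = (
        b"1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
        b"2 0 obj\n<< /Type /Action /S /Launch /Win << /F (cmd.exe) >> >>\nendobj\n"
        b"3 0 obj\n<< /Type /Action /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
    )
    for number, subtype, _body in action_objects(sample):
        print(number, subtype)
```

The raw body returned for each hit can be written to a separate file for closer inspection.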
TASK 4: EXTRACTING THE EVIL CODE FROM THE OBJECT
Now that I have displayed the code in plain text, it is better to extract it to a separate file to make the analysis easier.
TASK 5: ANALYZING THE EVIL CODE FROM THE OBJECT
The PDF appears to launch CMD.exe on the victim's machine.
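A /Launch action usually names the application in an /F entry and its command-line parameters in a /P entry. The Python sketch below pulls those two values out of an extracted object body. It is a simplification: it handles escaped parentheses but not nested unescaped ones, the function name is hypothetical, and the sample is a harmless made-up command rather than the content found in Lucy2.pdf.

```python
import re


def launch_parameters(body: bytes) -> dict:
    """Pull /F (application) and /P (parameters) out of a /Launch action body."""
    params = {}
    for key in (b"F", b"P"):
        # "/F (...)" or "/P (...)", allowing backslash-escaped characters inside.
        pattern = rb"/" + key + rb"\s*\(((?:\\.|[^\\()])*)\)"
        match = re.search(pattern, body, re.DOTALL)
        if match:
            params["/" + key.decode()] = match.group(1).decode("latin-1")
    return params


if __name__ == "__main__":
    # Hypothetical, harmless sample body for illustration only.
    sample = b"<< /Type /Action /S /Launch /Win << /F (cmd.exe) /P (/c echo hello) >> >>"
    for key, value in launch_parameters(sample).items():
        print(key, "=", value)
```

Reading the /P value is where the real analysis happens, since it shows what the command interpreter was asked to do.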
TASK 6: DISARMING JAVASCRIPT AND AUTO LAUNCH WITH PDFID
PDFiD can disarm the JavaScript code and the auto-launch action with the -d parameter.
This created a new file, Lucy2.disarmed.pdf.
Now, I need to confirm that the JavaScript code has been disabled by running pdfid on the disarmed file that was created.
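For intuition about what disarming involves, here is a minimal Python sketch of one way to do it. It assumes, as I understand PDFiD's approach, that changing the case of risky names is enough to stop a reader from recognizing them, because PDF names are case sensitive. The name list and function names are my own choices, and hex-obfuscated names are not handled.

```python
import re

# Names to neutralize; an illustrative choice for this sketch.
RISKY_NAMES = (b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch")

# Match a risky name only when it is not the prefix of a longer name.
_ALTERNATION = b"|".join(re.escape(name) for name in RISKY_NAMES)
RISKY_RE = re.compile(rb"(" + _ALTERNATION + rb")(?![A-Za-z0-9#])")


def disarm(data: bytes) -> bytes:
    """Swap the case of risky names; the byte length stays the same."""
    return RISKY_RE.sub(lambda m: m.group(1).swapcase(), data)


def is_disarmed(data: bytes) -> bool:
    """True when none of the risky names remain in their original spelling."""
    return RISKY_RE.search(data) is None


if __name__ == "__main__":
    # Hypothetical sample bytes, not the real content of Lucy2.pdf.
    original = b"<< /Type /Action /S /JavaScript /JS (app.alert(1)) >>"
    cleaned = disarm(original)
    print(cleaned)
    print(is_disarmed(original), is_disarmed(cleaned))
```

Keeping the replacement the same length matters because the cross-reference table stores byte offsets, so the rest of the file remains valid after the change.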
Conclusion
Upon analyzing both "Linda.pdf" and "Lucy2.pdf," several key observations were made that distinguish the two files.
PDF-ID Analysis: The analysis revealed that while "Linda.pdf" appeared to be a standard PDF file, "Lucy2.pdf" contained a JavaScript object. Given the nature of the content, which is a resume, the presence of JavaScript in "Lucy2.pdf" is unusual and raises a red flag.
Metadata Examination: The metadata extracted from "Linda.pdf" indicated that it was created using common tools and contained typical metadata for a resume file. In contrast, "Lucy2.pdf" had less metadata and showed signs of being less conventional, which suggests it may have been tampered with or generated with a different intent.
Object Analysis: The PDF-parser analysis revealed that "Lucy2.pdf" contained 150 objects, three of which are related to actions. Notably, one of these objects was an /Action object with subtype /Launch, which is designed to launch applications. This feature is highly suspicious for a document of this type.
Malicious Code Extraction: The extracted code from the suspicious object in "Lucy2.pdf" indicated an attempt to launch CMD.exe on the victim’s machine. This is a clear indicator of malicious intent, as it aims to execute commands on the user's system without their consent.
In summary, the evidence strongly suggests that "Lucy2.pdf" is malicious. Its inclusion of JavaScript, abnormal metadata, and executable actions point towards it being designed to compromise the user's machine. On the other hand, "Linda.pdf" appears to be a legitimate file. Therefore, it is crucial to treat "Lucy2.pdf" with caution and take appropriate steps to prevent potential security breaches.