Exfiltrating data from remote access services via video and sound

Given the current situation, many of us are now working remotely all the time. Many of such arrangements are facilitated via tools like Citrix, RDP, VNC, LogMeIn, etc.  We have been researching some possibilities about how to exfiltrate data via such arrangements. Here are some obvious choices:

  • File connections – if enabled
  • Remote USB connections – if enabled
  • Remote printing connections – if enabled
  • Exfiltrating via email or Internet connections at the remote desktop level

Most of these have obvious controls that can be activated by an administrator, which would leave the attackers with very few channels left. The two in particular that we were interested in, are video and sound, since the user can view their remote screen and many tools allow hearing sound from their remote desktop.

For exfiltration of data via video, we originally considered encoding data with base-64 using an encoding tool such as the Windows certutil CLI command, then doing screen capture on the host and running some sort of OCR against it such as Tesseract. However, we ran across a much better tool from Pen Test Partners called PTP-RAT which flexes the pixels on the screen to transfer information (see their blog post and GitHub repo).

For exfiltration of data via sound, we originally considered using a tool that will modulate the data into a sound form that used to be used by modems back in the 1980s/1990s. However, we ran across a much better suited tool from Roman Zayde called amodem which is able to do this. While the tool is designed for exfiltrating data across a physical air gap, it should work the same way on a remote desktop by converting the data into sound via the soundcard and capturing it back on the host then decoding it.

P.S. For extra brownie points, you can also try enabling the webcam and microphone on the host, and transfer data from the host back to the remote desktop using the same mechanisms.