Writeups: Monster Inc The Middle

IHack2020 - Monster Inc The Middle

Yesterday, I successfully got into the Monster Inc. network. I performed a Monster In The Middle (MITM) attack against Mike Wasowski. He connected to his machine remotely using RDP and I was able to capture two sessions using an open source tool. However, before I could do anything more, the blue team found my foothold in the network and kicked me out. I really want to find juicy stuff about Monster Inc. and I need help! I attached the packet capture files and the secrets needed to decrypt the traffic:

https://drive.google.com/file/d/1ZrYnGKeReED4xVIdojLBer45YzgfwUsh/view?usp=sharing

Note: if you use the intended way to solve this track, you might want to use the various-fixes branch.

For this challenge, we have 2 network captures and an ssl.log file. The pcaps are mostly TLS traffic so we can’t do much without first decrypting it. Luckily, we have the ssl log.

Decrypting TLS data in Wireshark

The ssl.log file is an NSS key log, which contains 4 lines in the following format:

CLIENT_RANDOM <hex_encoded_client_random_value> <hex_encoded_master_secret>
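To make the format concrete, here is a minimal Python sketch that parses such lines. It assumes the file only contains CLIENT_RANDOM entries (as in this challenge), with a 32-byte client random and a 48-byte master secret. The `parse_keylog` helper and the sample values are made up for illustration; the real values are in ssl.log.

```python
import re

# One NSS key log entry: label, 32-byte client random, 48-byte master secret,
# both hex encoded (64 and 96 hex characters respectively).
KEYLOG_LINE = re.compile(r"^CLIENT_RANDOM ([0-9a-fA-F]{64}) ([0-9a-fA-F]{96})$")


def parse_keylog(text):
    """Return a dict mapping client random bytes to master secret bytes."""
    secrets = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no key material
        match = KEYLOG_LINE.match(line)
        if match is None:
            # Assumption: other key log labels are not expected in this file.
            raise ValueError(f"unexpected key log line: {line!r}")
        client_random, master_secret = (bytes.fromhex(g) for g in match.groups())
        secrets[client_random] = master_secret
    return secrets


# Made-up entry for illustration only.
sample = "CLIENT_RANDOM " + "ab" * 32 + " " + "cd" * 48
parsed = parse_keylog(sample)
```

Each TLS session in the capture is matched to its master secret through the client random it sent in its Client Hello, which is why one line per session is enough.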

In TLS, the session keys are derived from the master secret, which means we have the information we need to decrypt the traffic. Wireshark can do that for us by importing the file under Edit→Preferences→Protocols→SSL→(Pre)-Master-Secret log filename (the protocol entry is named TLS instead of SSL in Wireshark 3.0 and later). We now have network captures that definitely look like RDP sessions. Time to get to the actual challenges.

Monster Inc The Middle 1 (capture #2)

What are the dimensions (in mm) of Mike Wasowski’s main monitor? Use the second packet capture file to solve this challenge. Flag format: HF-{monitorWidth_monitorHeight} Example: if the monitor was 42mm horizontal and 500mm vertical, the flag would be HF-{42_500}

After digging through the MS-RDPBCGR specification, we know that this info can be contained in 2 different places:

The optional desktopPhysicalWidth and desktopPhysicalHeight properties of the clientCoreData structure inside the Client MCS Connect Initial PDU with GCC Conference Create Request packet aka ClientData
The physicalWidth and physicalHeight properties of a monitorAttributes entry inside the clientMonitorExtendedData struct of the ClientData packet.

The first one doesn’t seem to be present, but we do have a clientMonitorExtendedData in the ClientData packet. The attribute we want isn’t shown, but as we can see in the screenshot, there are 32 uninterpreted bytes left after the last field displayed by Wireshark. These correspond to 2 monitorAttributes entries (which makes sense considering the Monitor Count is 2).

clientMonitorExtendedData leftover bytes

Both monitors have the same properties, so if we decode one entry manually, we get:

Attribute	Hex Value	Human Value
physicalWidth	0x0256	598mm
physicalHeight	0x0150	336mm
orientation	0x0	ORIENTATION_LANDSCAPE
desktopScaleFactor	0x64	100%
deviceScaleFactor	0x64	100%
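The manual decoding can be sketched in Python. This assumes the TS_MONITOR_ATTRIBUTES layout from MS-RDPBCGR: five unsigned 32-bit little-endian fields in the order shown in the table. The raw bytes from the capture are not reproduced in this writeup, so `sample_entry` is rebuilt from the table values, and `decode_monitor_attributes` is just an illustrative helper name.

```python
import struct

# TS_MONITOR_ATTRIBUTES: five unsigned 32-bit little-endian integers.
MONITOR_ATTRIBUTES = struct.Struct("<5I")

ORIENTATIONS = {
    0: "ORIENTATION_LANDSCAPE",
    90: "ORIENTATION_PORTRAIT",
    180: "ORIENTATION_LANDSCAPE_FLIPPED",
    270: "ORIENTATION_PORTRAIT_FLIPPED",
}


def decode_monitor_attributes(entry):
    """Decode one monitorAttributes entry into a dict of named fields."""
    width, height, orientation, desktop_scale, device_scale = (
        MONITOR_ATTRIBUTES.unpack(entry)
    )
    return {
        "physicalWidth": width,            # millimetres
        "physicalHeight": height,          # millimetres
        "orientation": ORIENTATIONS.get(orientation, orientation),
        "desktopScaleFactor": desktop_scale,  # percent
        "deviceScaleFactor": device_scale,    # percent
    }


# Entry rebuilt from the table values (0x0256, 0x0150, 0x0, 0x64, 0x64),
# not copied from the capture.
sample_entry = bytes.fromhex("56020000" "50010000" "00000000" "64000000" "64000000")
attrs = decode_monitor_attributes(sample_entry)
flag = "HF-{%d_%d}" % (attrs["physicalWidth"], attrs["physicalHeight"])
```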

Flag: HF-{598_336}

Monster Inc The Middle 2 (capture #2)

What is the big corporate secret??? Note: do not use the --secrets option, it’s broken.

The question doesn’t tell us much, but judging from what we have, it’s very likely we have to somehow replay the RDP session. After a little bit of research, we came across PyRDP, a tool that does RDP MITM and RDP session replay. Surprise, surprise, the article presenting it was written by the challenge’s creator! Let’s venture a wild guess and say that this is the “intended way” of solving it.

We install PyRDP and follow the instructions on how to convert a pcap to a replay file:

Export the PDUs from Wireshark by selecting File > Export PDUs and selecting OSI Layer 7 and save that to a file
Convert it to a replay file using the provided pyrdp-convert utility: pyrdp-convert.py --src 192.168.110.1 -o MonsterIncTheMiddle2Replay MonsterIncTheMiddle2PDUs.pcapng
Open the replay file with pyrdp-player: pyrdp-player.py MonsterIncTheMiddle2Replay
Watch the replay

We see the user open an image file that contains a beautiful paint rendition of Mike Wasowski and the second flag!

the second flag is visible in the replayed session

Gotta say, I like this tool. We can see a video of the RDP session along with keypresses sent at the bottom of the window.

Flag: HF-{41aa33f61d71adfda}

Monster Inc The Middle 3 (capture #1)

Mike Wasowski is a busy monster

Let’s watch the other captured session. First, we need to repeat the steps from the previous challenge to export the network capture as a session replay file. Then, we see the user do a few things:

Open a TODO.txt file
Change keyboard layout
Mark a few things as done
Paste a bunch of emojis that we don’t see on screen
Open up a YouTube video of rainbow colors
Listen to a motivation_song (which we’ll revisit in the next part of the challenge)

The emojis sent through the clipboard look suspicious. We can’t see what the user is pasting them into. This is probably happening on his second monitor which we can’t see.

CLIPBOARD DATA: 🏯🌪🛵🤤💎🤡🐬👀👺🖍🎛👃👦🐇🦆🤜💎😏🍠👄👥🤐🏞🔧🏧🏄🚾📑

We could probably tweak the player to view the second monitor, but that isn’t necessary since we have a hint in the TODO file (gotta say, I missed it on my first watch):

Try ecoji

From ecoji’s GitHub:

Ecoji encodes data as 1024 emojis. It’s base1024 with an emoji character set. As a bonus, this repo includes code to decode emojis to the original data they represent.
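To see what base1024 means in practice: each symbol carries 10 bits, so 5 bytes (40 bits) map to exactly 4 symbols. The sketch below only shows that bit packing with numeric indices. It is not the Ecoji implementation: the real emoji table and the padding rules for inputs that are not a multiple of 5 bytes are left out, and both helper names are made up.

```python
def bytes_to_indices(chunk):
    """Split exactly 5 bytes (40 bits) into four 10-bit indices (0..1023)."""
    if len(chunk) != 5:
        raise ValueError("this sketch only handles full 5-byte groups")
    value = int.from_bytes(chunk, "big")
    # Most significant 10 bits first.
    return [(value >> shift) & 0x3FF for shift in (30, 20, 10, 0)]


def indices_to_bytes(indices):
    """Inverse of bytes_to_indices: four 10-bit indices back to 5 bytes."""
    if len(indices) != 4 or any(not 0 <= i < 1024 for i in indices):
        raise ValueError("expected four indices in the range 0..1023")
    value = 0
    for index in indices:
        value = (value << 10) | index
    return value.to_bytes(5, "big")


group = b"HF-{w"                    # first 5 bytes of a typical flag
indices = bytes_to_indices(group)   # each index would select one emoji
roundtrip = indices_to_bytes(indices)
```

With this ratio, the 28 emojis in the clipboard data can hold at most 35 bytes, which is about the size of a flag.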

Sounds just like a novelty encoding that would be used in a CTF challenge.

We tried installing the original Golang tool, but had issues making it work, so we went with the Python implementation, which is conveniently available through pip:

echo -n 🏯🌪🛵🤤💎🤡🐬👀👺🖍🎛👃👦🐇🦆🤜💎😏🍠👄👥🤐🏞🔧🏧🏄🚾📑 | ecoji -d

and we get

Flag: HF-{wowEmojisAreSoCoolBase64BTFO}

Monster Inc The Middle 4 (capture #1)

What is the name of Mike Wasowski’s motivational song? Flag format: HF-{songNameCamelCase} Example: If the song was “never gonna give you up”, the flag would be HF-{neverGonnaGiveYouUp}

We can’t tell anything about the song played from the video alone, so we need to get the session’s audio. PyRDP doesn’t support audio replay, so it’s time to dig into the protocol’s spec again!

RDP is a complex protocol which has many extensions. The one that interests us is Remote Desktop Protocol: Audio Output Virtual Channel Extension aka MS-RDPEA. This extension defines how audio data is transmitted in RDP sessions.

Audio data can be transmitted over 3 different channels:

A static virtual channel named RDPSND
A dynamic virtual channel named AUDIO_PLAYBACK_DVC or AUDIO_PLAYBACK_LOSSY_DVC
Plain UDP

Each virtual channel has its own channel ID, which we’ll use to filter packets to get only the audio data.

The client requests the static channels it wants for the current session in the initial ClientData packet under clientNetworkData > channelDefArray, with one channelDef entry per channel. In its initial ServerData packet, the server returns the ID for each of the requested channels, using an ID of 0 for an unallocated channel. The IDs are assigned in the same order as the channelDef entries were sent. In Wireshark, the virtual channel ID field is t124.channelId.

In our capture, we can see that rdpsnd is the second channel requested and that it was assigned the ID 1005.
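The name-to-ID pairing described above can be sketched as follows. Only the fact that rdpsnd is the second requested channel and received ID 1005 comes from the capture; the other channel names and IDs are placeholders, and `map_channel_ids` is a hypothetical helper.

```python
def map_channel_ids(requested_names, returned_ids):
    """Pair each requested static channel with the ID the server returned.

    IDs come back in the same order as the channelDef entries were sent;
    an ID of 0 means the channel was not allocated, so it is skipped.
    """
    if len(requested_names) != len(returned_ids):
        raise ValueError("expected one returned ID per requested channel")
    return {
        name: channel_id
        for name, channel_id in zip(requested_names, returned_ids)
        if channel_id != 0
    }


# Placeholder names and IDs, except for rdpsnd being second with ID 1005.
requested = ["placeholder_first", "rdpsnd", "placeholder_third"]
returned = [1004, 1005, 0]

channels = map_channel_ids(requested, returned)
display_filter = f"t124.channelId == {channels['rdpsnd']}"
```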

When watching the session replay, we see that the song starts playing about 1 minute and 15 seconds in.

We filter the exported PDUs from capture #1 on this channel ID and start time to get the audio packets. The virtual channel data is contained in the appropriately named rdp.virtualChannelData field. We can see that the packets around this time are SNDWAVE2 packets, which are used to transmit audio data.

Putting it all together:

tshark -Y "t124.channelId == 1005 && ip.src == 192.168.110.133 && frame.time_relative >= 75" -r MonsterIncTheMiddle2PDUs.pcapng -e rdp.virtualChannelData -Tfields | tr -d '\n' | xxd -r -p > monster_inc_audio

We filter by channel ID, source IP and time to get only the virtual channel data from packets sent by the server over the audio channel after the start of the song. Then we remove the newlines to turn the output into one continuous hex string and use xxd -r -p to convert that hex string to binary data.

This will contain some noise because of the header for the SNDWAVE2 PDUs, but it should not be significant enough to prevent us from recognizing the song.
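If the noise ever became a problem, the headers could be stripped before importing. The sketch below assumes the SNDWAVE2 layout from MS-RDPEA (a 4-byte RDPSND header with msgType 0x0D, followed by wTimeStamp, wFormatNo, cBlockNo, 3 padding bytes and dwAudioTimeStamp, 16 bytes in total) and that each payload is one complete PDU. Reassembling PDUs that were split over several virtual channel chunks is not handled, and `wave2_audio` is a made-up helper that we did not need for the flag.

```python
import struct

SNDC_WAVE2 = 0x0D  # msgType of a Wave2 PDU (assumed from MS-RDPEA)

# msgType, bPad, BodySize, wTimeStamp, wFormatNo, cBlockNo, 3 pad bytes,
# dwAudioTimeStamp: 16 bytes, little-endian, no alignment padding.
WAVE2_HEADER = struct.Struct("<BBHHHB3xI")


def wave2_audio(pdu):
    """Return the audio bytes of one complete SNDWAVE2 PDU, else b""."""
    if len(pdu) < WAVE2_HEADER.size:
        return b""
    msg_type, _pad, body_size, _ts, _format_no, _block_no, _audio_ts = (
        WAVE2_HEADER.unpack_from(pdu)
    )
    if msg_type != SNDC_WAVE2:
        return b""  # some other RDPSND message, no audio to keep
    # BodySize counts everything after the 4-byte RDPSND header.
    return pdu[WAVE2_HEADER.size:4 + body_size]


# Synthetic PDU for illustration: a header followed by 8 fake audio bytes.
fake_audio = bytes(range(8))
sample_pdu = WAVE2_HEADER.pack(SNDC_WAVE2, 0, 12 + len(fake_audio), 0, 0, 0, 0) + fake_audio
extracted = wave2_audio(sample_pdu)
```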

We then import that file as raw data in Audacity. We’re asked for the encoding and sample rate. These are actually defined in the CLIENT_AUDIO_VERSION_AND_FORMATS and SERVER_AUDIO_VERSION_AND_FORMATS packets, but let’s try importing it with Audacity’s default values (little-endian, 44100Hz, signed 32-bit PCM) for now and see how that sounds.

Just from the graph, this does look like a song’s audio!

When we play it, it’s got the characteristic chipmunk on speed sound of audio being played too fast. We go to Effect > Change Speed and reduce the speed by half and voilà!
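For anyone who prefers scripting over Audacity, the raw bytes can also be wrapped in a WAV header with Python’s standard wave module. This is a sketch we did not use during the CTF: the channel count, sample width and rate passed in are guesses to tune by ear, since we never decoded the negotiated format, and `raw_to_wav` is a made-up helper name.

```python
import wave


def raw_to_wav(raw, path, channels, sample_width, frame_rate):
    """Write raw little-endian PCM bytes to a WAV file; return the frame count."""
    frame_size = channels * sample_width
    usable = len(raw) - (len(raw) % frame_size)  # drop a trailing partial frame
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # in bytes: 4 means 32-bit samples
        wav.setframerate(frame_rate)
        wav.writeframes(raw[:usable])
    return usable // frame_size


# Possible usage (parameters are guesses; halving 44100 Hz mimics the
# "reduce the speed by half" step):
# with open("monster_inc_audio", "rb") as f:
#     raw_to_wav(f.read(), "monster_inc_audio.wav", 1, 4, 22050)
```

Halving the frame rate has the same effect as slowing playback down by half, so trying a few rates is a quick way to find one that sounds right.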

Flag: HF-{letItGo}

I really enjoy this kind of challenge: the ones that make you learn about a new tool and force you to really understand the technology you’re analyzing (RDP in this case). Kudos to res260!