One of the first things on any security auditor’s list is checking to see if a site is vulnerable to cross-site scripting (XSS).
The danger of an XSS vulnerability is that an attacker can inject a script into your page. Once the script is there, it can manipulate your page in any way it wants. It can add markup (e.g. reconfigure the page). It can intercept visitor interactions (e.g. capture usernames and passwords and send them to a remote site).
Or it can make your entire page do the Harlem Shake.
Ok, this is a pretty amazing XSS exploit: http://t.co/PObynTNTMV
— Owen 🎹 (@ringmaster) September 18, 2014
Owen linked here to a DNS lookup tool that, apparently, is suffering from a cross-site scripting exploit.[ref]The site was being actively mocked at the time of this writing, but late yesterday afternoon I noticed that the site host updated their system to properly escape DNS record output. Unfortunately you can’t see the active exploit any longer.[/ref] Shortly after the page loads, you hear a Harlem Shake track play in the background and, a moment later, see all of the content on the page dance along with the music.
It’s awesome, both for the humor and as a stellar example of how not to build a site.
The Exploit
It took a few minutes of digging to figure out what was actually going on with the site. I couldn’t track down exactly how the Harlem script was being injected … until I saw a very specific indicator.
The domain being looked at returns several TXT records. Two of them appear to be (intentionally) malicious.
One showed up as blank in the page, so I skipped over it. The second was a YouTube video embed.
Wait …
One of the DNS TXT records was a YouTube video embed. What’s worse, it was a functional YouTube video embed!
I used the DOM inspector to look at the seemingly blank TXT record, and found out it was, indeed, the script tag that was triggering the Harlem Shake takeover.
The DNS lookup tool rendered the raw content of the TXT records directly to the page. No validation. No escaping. So both the <iframe> and <script> TXT records became functional markup and part of the page.
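The missing step is small. Here is a minimal sketch in Python, not the lookup tool's actual code (which I haven't seen); the function name, the sample records, and the list markup are all assumptions made for illustration. Each TXT value is passed through the standard library's html.escape() before it is placed in the page.

```python
from html import escape


def render_txt_records(records):
    """Return an HTML list of TXT records with every value escaped.

    Hypothetical helper: the real tool's rendering code is unknown.
    """
    items = []
    for value in records:
        # escape() turns &, <, > and (by default) both quote characters
        # into entities, so the browser displays markup as plain text
        # instead of parsing it.
        items.append(f"<li><code>{escape(value)}</code></li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Made-up TXT values in the spirit of the ones described above.
records = [
    "v=spf1 -all",
    "<script src='//example.com/shake.js'></script>",
]
print(render_txt_records(records))
```

With escaping in place, the second record shows up on the page as the literal text of a script tag rather than as a running script.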
Lessons Learned
When developers are pulling information from external sources, it’s often easy to just print the information directly to the page. Often we’re the ones writing the information in the first place, so we assume that data will be safe when we read it back out.
External sources include remote APIs, databases, and filesystems. Basically any place we take data from and turn around to present it to visitors.
Always assume a malicious party can stand between you and your data.
This is why, with WordPress code, the accepted best practice is to escape everything. Core functions, custom functions, database reads, remote data fetches – everything must be escaped before we print it to the browser.
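WordPress ships helpers such as esc_html(), esc_attr(), and esc_url() for exactly this purpose. The same habit, sketched in Python so it stays independent of any one framework, looks roughly like the following. The safe_url() and render_link() helpers and the scheme allow-list are hypothetical names chosen for this example, not part of any library.

```python
from html import escape
from urllib.parse import urlsplit

# Assumption for this sketch: only absolute http(s) links are allowed.
ALLOWED_SCHEMES = {"http", "https"}


def safe_url(url):
    """Return the URL if its scheme is on the allow-list, else ''."""
    # Relative URLs have an empty scheme and are rejected here too.
    return url if urlsplit(url).scheme.lower() in ALLOWED_SCHEMES else ""


def render_link(url, label):
    """Build a link, escaping both values at the moment of output."""
    # It does not matter whether url and label came from a database,
    # a remote API, or a DNS record: they are escaped right here.
    href = escape(safe_url(url), quote=True)
    return f'<a href="{href}">{escape(label)}</a>'


print(render_link("https://example.com/?a=1&b=2", "<b>hi</b>"))
print(render_link("javascript:alert(1)", "click me"))
```

Escaping at the point of output, rather than when the data is stored or fetched, means the check cannot be skipped just because a value seemed trustworthy further upstream.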
After all, no one expects a TXT record to contain a <script> tag, so it’s safe to print without escaping, right?