WordPress comes with a lot of built-in helper functions for making a site’s code more secure. However, a lot of the code and legacy sites I’ve worked on, as well as some plugin documentation, rarely use them, if at all. So for a time, before I knew any better, I was just echoing plain text, simply because I didn’t understand the point of sanitizing data.
If you’ve ever downloaded or edited a WordPress theme, you may have noticed that a lot of its template files include code wrapped in many abstract, non-obvious functions, like the example below from _s:
<?php
the_content( sprintf(
	wp_kses(
		/* translators: %s: Name of current post. Only visible to screen readers */
		__( 'Continue reading<span class="screen-reader-text"> "%s"</span>', '_s' ),
		array(
			'span' => array(
				'class' => array(),
			),
		)
	),
	get_the_title()
) );
When you see something like this, it is very tempting to condense it all:
<?php

the_content( 'Continue reading <span class="screen-reader-text">' . get_the_title() . '</span>' );
Surprise, surprise, the wp_kses() wrapper is actually there for a reason. It’s a very specific function that filters the HTML so that only span elements, with nothing but a class attribute, are allowed through.
wp_kses() is just one of many helper functions WordPress provides to escape data.
To echo, or not to echo
Consider the following code:
<?php if( get_field('image') ): $img = get_field('image'); ?>

	<img src="<?php echo $img['url']; ?>" alt="<?php echo $img['alt']; ?>"/>

<?php endif; ?>
It is a simple ACF (Advanced Custom Fields) image field, echoing out a custom image URL and alt text for display.
Nothing stops this code from working as intended, but it prints whatever is stored in the field exactly as it is.
If you want to make the code more secure, however, you could do something like this:
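<?php if( get_field('image') ): $img = get_field('image'); ?>

	<img src="<?php echo esc_url( $img['url'] ); ?>" alt="<?php echo esc_attr( $img['alt'] ); ?>">

<?php endif; ?>
Notice that I wrapped the URL in esc_url() and the alt text in esc_attr() before echoing them. Neither takes a text domain: the optional second argument of esc_url() is a list of allowed protocols, and the translating variants such as esc_attr_e() are meant for literal strings in your theme, not for values stored in a field.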
What does this do exactly and how does it ‘clean’ the data?
esc_url() cleans URLs:
- Rejects URLs that do not have one of the provided whitelisted protocols (defaulting to http, https, ftp, ftps, mailto, news, irc, gopher, nntp, feed, and telnet), eliminates invalid characters, and removes dangerous characters. This function encodes characters as HTML entities: use it when generating an (X)HTML or XML document. Encodes ampersands (&) and single quotes (') as numeric entity references (&#038;, &#039;).
- If the URL appears to be an absolute link that does not contain a scheme, prepends http://. Please note that relative URLs (/my-url/parameter2/), as well as anchors (#myanchor) and parameter items (?myparam=yes) are also allowed and filtered as a special case, without prepending the default protocol to the filtered URL.
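To make the first point concrete, here is a minimal plain-PHP sketch of the protocol allowlist idea. It is not how esc_url() is implemented: sketch_filter_scheme() and its short list of protocols are made up for illustration, and it skips the character cleaning entirely. In a real theme you would simply call esc_url().

```php
<?php
// Hypothetical helper, NOT part of WordPress: returns the URL unchanged if its
// scheme is on the allowlist, or an empty string if it is not.
function sketch_filter_scheme( $url, array $allowed = array( 'http', 'https', 'mailto' ) ) {
	$scheme = parse_url( $url, PHP_URL_SCHEME );

	// Relative URLs, anchors and query strings have no scheme to check.
	if ( null === $scheme || false === $scheme ) {
		return $url;
	}

	return in_array( strtolower( $scheme ), $allowed, true ) ? $url : '';
}

var_dump( sketch_filter_scheme( 'https://example.com/photo.jpg' ) ); // kept as-is
var_dump( sketch_filter_scheme( '/my-url/parameter2/' ) );           // kept as-is
var_dump( sketch_filter_scheme( 'javascript:alert(1)' ) );           // rejected: empty string
```

The point is that a javascript: link never reaches the src or href attribute, while ordinary and relative links pass through.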
esc_attr() escapes text for HTML attributes, and esc_attr_e() is the variant that also translates the text and echoes it. The documentation for esc_attr_e() says:
- Displays translated text that has been escaped for safe use in an attribute. Encodes < > & " ' (less than, greater than, ampersand, double quote, single quote). Will never double encode entities.
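Why does encoding a quote matter? Here is a small sketch, assuming the alt text contains something nasty. The $alt value is invented, and the stand-in esc_attr() only exists so the snippet can run outside WordPress; it approximates the real function with htmlspecialchars().

```php
<?php
// Stand-in used ONLY when WordPress is not loaded, so the sketch runs on its own.
// This is an approximation: the real esc_attr() also checks for invalid UTF-8
// and passes the result through the 'attribute_escape' filter.
if ( ! function_exists( 'esc_attr' ) ) {
	function esc_attr( $text ) {
		return htmlspecialchars( (string) $text, ENT_QUOTES, 'UTF-8', false );
	}
}

// Hypothetical alt text, as it might come back from a custom field.
$alt = '" onerror="alert(1)';

// Unescaped: the quote inside the value closes the alt attribute early,
// and the rest is read by the browser as a new onerror attribute.
$unsafe = '<img src="photo.jpg" alt="' . $alt . '">';

// Escaped: the quotes become entities, so the whole value stays inside alt="".
$safe = '<img src="photo.jpg" alt="' . esc_attr( $alt ) . '">';

echo $unsafe, "\n"; // <img src="photo.jpg" alt="" onerror="alert(1)">
echo $safe, "\n";   // <img src="photo.jpg" alt="&quot; onerror=&quot;alert(1)">
```

The escaped version looks uglier in the page source, but the browser shows it as harmless text instead of running it.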
Why escape at all?
Escaping and sanitizing are both ways of filtering data: sanitizing cleans data on the way in, and escaping makes it safe on the way out, at the point where it is displayed. Both add a layer of security so your code isn’t as vulnerable to attack.
WordPress explains the importance of sanitization well, and has a long list of helper functions that covers just about every possible use case. To make a long story short, using WordPress’ built-in functions to clean data is one of the easiest ways to make your code more resilient and secure.
Why is it so complicated?
The thing that confused me most about how WordPress implements sanitization is that there are a few ways to escape and clean data. For example:
<?php

esc_html_e( 'Hello World', 'text_domain' );

// is the same as

echo esc_html( __( 'Hello World', 'text_domain' ) );

// which is the same as

echo esc_html__( 'Hello World', 'text_domain' );
- It’s the result of WordPress being almost too flexible, where there are many short helper functions that can be cobbled together to perform one whole action.
- Add to this the fact that a lot of these functions are named with single letters and non-obvious signifiers such as kses (a recursive acronym for “KSES Strips Evil Scripts”).
- Some functions are made for display, others are made for use in PHP only. For example, you can use esc_html_e() to clean data and immediately display it, whereas there is only an esc_url() function, where you need to manually echo it out.
- And to make it even more complex, many of these functions are also important for internationalisation, that is, making a website translatable into other languages.
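The naming starts to make sense once you see how the pieces compose. The sketch below checks that the three spellings from the earlier example print the same thing. The stand-in functions are simplified approximations that are only defined when WordPress is not loaded, and capture_output() is a helper I made up for the comparison.

```php
<?php
// Stand-ins used ONLY outside WordPress so the sketch runs on its own.
// They are simplified approximations, not the real implementations.
if ( ! function_exists( 'esc_html' ) ) {
	function __( $text, $domain = 'default' ) {
		return $text; // No translation files are loaded in this sketch.
	}
	function esc_html( $text ) {
		return htmlspecialchars( (string) $text, ENT_QUOTES, 'UTF-8', false );
	}
	function esc_html__( $text, $domain = 'default' ) {
		return esc_html( __( $text, $domain ) ); // translate, escape, return
	}
	function esc_html_e( $text, $domain = 'default' ) {
		echo esc_html( __( $text, $domain ) ); // translate, escape, echo
	}
}

// Hypothetical helper: runs a callback and returns whatever it printed.
function capture_output( callable $print ) {
	ob_start();
	$print();
	return ob_get_clean();
}

$a = capture_output( function () { esc_html_e( 'Fish & Chips', 'text_domain' ); } );
$b = capture_output( function () { echo esc_html( __( 'Fish & Chips', 'text_domain' ) ); } );
$c = capture_output( function () { echo esc_html__( 'Fish & Chips', 'text_domain' ); } );

var_dump( $a === $b && $b === $c ); // bool(true)
```

Read the suffixes as instructions: a double underscore means “translate and return”, _e means “translate and echo”, and no suffix means “just escape and return”.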
Getting to know these different functions does take time and it probably means you’ll be looking up the codex a lot. Luckily, the documentation is extremely well written and there are many examples and help online if you need it.
The most important thing to remember: if you are using data of any kind, whether in PHP or for display, it is prudent to use WordPress’ built-in escaping functions to add that extra layer of security.
It doesn’t matter how few plugins you’re using. I think it’s good practice to always make sure the data you are using has been cleaned, especially if it was returned from a function.
WordPress has a large audience with a good mix of hobbyists, tinkerers, front end developers, and PHP specialist developers. When these all mix it is inevitable that code can become mangled in the process, not through ill intent but simply because it is hard to enforce escaping when it is so easy to mix PHP and HTML.
That’s why I think it pays to know little, big things like this. You may not need to read all the theme developer docs WordPress provides, but this is definitely something I think every WordPress developer and tinkerer needs to know!