Sanitizing strings and data validation in WordPress
It’s very common in development to clean strings, or to check that string values are correctly formatted before they are inserted into the database. PHP offers a lot of great functions to sanitize strings, but most of the time you have to use regular expressions, and that’s not always easy. But guess what, WordPress provides many functions that do the job! Let’s have a look at the best of them.
sanitize_title() is mainly used to format a string for use in a URL (the post slug). The returned value is intended to be suitable for a URL, not to serve as a human-readable title.
$new_url = sanitize_title('This Long Title is what My Post or Page might be');
sanitize_title_with_dashes() sanitizes a title, replacing whitespace with dashes. It also limits the output to alphanumeric characters, underscores (_) and dashes (-). Note that it does not replace special accented characters. This function accepts 3 parameters:
- $title (required) : The title to be sanitized. Default: None
- $unused (optional) : Used to be the $raw_title, but is now unused. Default: None
- $context (optional) : The context for the sanitization. When set to ‘save’, additional entities are converted to hyphens or stripped entirely. Default: ‘display’
echo sanitize_title_with_dashes("I'm in LOVE with WordPress!!!1");
sanitize_email() simply strips the characters that are not allowed in an email address from a string that should contain one.
$sanitized_email = sanitize_email('éric@loremipsum.com!');
sanitize_file_name() is used to obtain a valid file name. I really recommend using it: depending on your hosting server, skipping it can lead to incorrect file paths because files get renamed on the fly.
$sanitized_file_name = sanitize_file_name( 'my image !.png' );
sanitize_html_class() is mainly useful for theme developers. It strips the string down to A-Z, a-z, 0-9, _ and -. If this results in an empty string, the function returns the fallback value supplied.
See sanitize_html_class() in the codex
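As a rough illustration of the fallback behaviour, a theme could build a CSS class from an arbitrary value like this. The helper name mytheme_color_class() and the 'default' fallback are hypothetical, and the sketch assumes it runs inside WordPress, where sanitize_html_class() is available.

```php
<?php
/**
 * Hypothetical theme helper: turn an arbitrary value into a safe CSS class.
 * Assumes WordPress is loaded so that sanitize_html_class() exists.
 */
function mytheme_color_class( $raw_color ) {
    // Everything outside A-Z, a-z, 0-9, _ and - is stripped.
    // If nothing is left, the second argument ('default') is used instead.
    $class = sanitize_html_class( $raw_color, 'default' );

    return 'color-' . $class;
}
```

The prefix is added after sanitizing so that an input made only of forbidden characters still falls back to the alternative value.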
sanitize_text_field() checks for invalid UTF-8, converts single < characters to entities, strips all tags, removes line breaks, tabs and extra whitespace, and strips octets.
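A typical place for this function is a form handler. The sketch below is only an example: the function name myplugin_get_submitted_city() and the 'city' field are hypothetical, it assumes WordPress is loaded, and nonce and capability checks are left out for brevity.

```php
<?php
/**
 * Hypothetical helper: read a single-line text field from a submitted form.
 * Assumes WordPress is loaded (wp_unslash() and sanitize_text_field()).
 */
function myplugin_get_submitted_city() {
    if ( ! isset( $_POST['city'] ) ) {
        return '';
    }

    // wp_unslash() removes the slashes WordPress adds to request data,
    // then sanitize_text_field() strips tags, line breaks and extra whitespace.
    return sanitize_text_field( wp_unslash( $_POST['city'] ) );
}
```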
sanitize_user() sanitizes a username by stripping out unsafe characters. If $strict is true, only alphanumeric characters plus these: _, space, ., -, *, and @ are returned.
It removes tags, octets and entities and, if strict is enabled, all non-ASCII characters. After sanitizing, it passes the username, the raw username (the value given as the parameter) and the strict flag to the sanitize_user filter. This function accepts these two parameters:
- $username (required) : The username to be sanitized. Default: None
- $strict (optional) : If set limits $username to specific characters. Default: false
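For instance, a registration form could enable strict mode as sketched below. The wrapper name myplugin_clean_login() is hypothetical and the snippet assumes WordPress is loaded.

```php
<?php
/**
 * Hypothetical wrapper: clean a login name typed into a registration form.
 * Assumes WordPress is loaded so that sanitize_user() exists.
 */
function myplugin_clean_login( $raw_login ) {
    // Passing true as the second argument enables strict mode,
    // which limits the username to the restricted character set.
    return sanitize_user( $raw_login, true );
}
```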
These are only some of the functions available. WordPress has many others for data validation, and you can find the whole list in the codex. It is important to sanitize all data before storing it in the database: it’s more secure and it can save you some time-consuming bug hunts.
I didn’t know about sanitize_email. I looked into core and it seems it’s a custom function.
Do you know if it’s better than using filter_var()?
Hi Paul, I think it’s better to use sanitize_email because quite a lot of tests are done within the function and it also uses many filters. The function is in wp-includes/formatting.php at line 1788.
look what I dug up
Seems like there’s a patch to use filter_var in the sanitize_email function.
Many thanks Paul, this is very interesting, I’ll have a deeper look into it.