When users submit data from a form they are free to enter anything they like into the form inputs - including corrupt, unwanted or just plain malicious (i.e. dangerous) content. While clientside validation can block user errors it is easily bypassed by turning JavaScript off and so does not provide any real protection for your site.
ChronoForms does no server filtering, sanitization or validation on submitted form data by default.
This has pluses and minuses. On the minus side, there is a risk of corrupt, unwanted or malicious content being posted. On the plus side, as a form developer you can choose to accept any content you want without ChronoForms messing it up.
The first line of defence is serverside validation. ChronoForms offers both auto validation for common input data types and custom validation for other inputs.
If you need more - and you probably do - then PHP has some quite good filters and sanitizers, and Joomla! makes many of them available as methods.
To implement them in ChronoForms use either the Serverside Validation action (if you want the ability to handle errors) or the Custom Code action if you just want to sanitize.
With sanitization, you have a choice of approaches: you can either (a) accept the values that ChronoForms has loaded into the $form->data array; or (b) re-load directly from the $_POST array and over-write the ChronoForms values in the $form->data array. The main difference is that (a) requires you to use the PHP sanitisers but protects any previous processing that may have been done, while (b) allows you to use the Joomla! methods. In most cases (b) is simpler and equally effective.
The Joomla! methods take the general form
JRequest::getVar('input_name', 'default_value', 'source', 'type', 'mask');
The input name is the name of the form input (or the URL query string entry). It is the only required parameter.
The default value will be used if there is no input with the specified input name and source.
The source will usually be 'post' (though 'get' and 'cookie' are also available). If you leave this empty then any source with a value will be used.
The type specifies the kind of data that is expected and hence the sanitisation that will be applied. Values are: INT or INTEGER, FLOAT, DOUBLE, BOOL or BOOLEAN, WORD, ALNUM, CMD, BASE64, STRING, ARRAY, PATH, USERNAME
There are shortcuts to using the most common types by using JRequest::getInt(), JRequest::getString(), etc. instead of JRequest::getVar().
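To get a feel for what the type argument does, here is a rough plain-PHP approximation of a few of the filters. This is a sketch only: the function name approx_clean() is made up for illustration, and the patterns are assumptions about the typical behaviour of each type rather than Joomla!'s own code, so check the Joomla! filter source for the exact rules.

```php
<?php
// Hypothetical helper, for illustration only: NOT part of Joomla! or ChronoForms.
// Approximates the effect of a few of the JRequest type filters in plain PHP.
function approx_clean($value, $type)
{
    $value = (string) $value;
    switch (strtoupper($type)) {
        case 'INT':
        case 'INTEGER':
            // Keep the first run of digits (with optional minus sign), else 0
            return preg_match('/-?[0-9]+/', $value, $m) ? (int) $m[0] : 0;
        case 'WORD':
            // Letters and underscores only
            return preg_replace('/[^A-Z_]/i', '', $value);
        case 'ALNUM':
            // Letters and digits only
            return preg_replace('/[^A-Z0-9]/i', '', $value);
        case 'CMD':
            // Letters, digits, underscore, dot and hyphen; no leading dots
            return ltrim(preg_replace('/[^A-Z0-9_.\-]/i', '', $value), '.');
        default:
            // Fallback for this sketch: trim whitespace and strip tags
            return trim(strip_tags($value));
    }
}

$age  = approx_clean('42 years', 'INT');      // integer 42
$word = approx_clean('hello world!', 'WORD'); // 'helloworld'
$cmd  = approx_clean('../etc/passwd', 'CMD'); // 'etcpasswd'
```

The point to notice is that these filters silently remove unwanted characters rather than reporting an error, so use validation as well if the user needs to be told that the input was wrong.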
Lastly, the Filter Masks are: JREQUEST_NOTRIM - prevents trimming of whitespace; JREQUEST_ALLOWRAW - bypasses filtering; JREQUEST_ALLOWHTML - allows most HTML. These are defined constants and should be used without quotes round them.
If JREQUEST_ALLOWHTML is not passed in, HTML is stripped out by default.
So, here are some examples:
<?php
$form->data['some_integer'] = JRequest::getInt('some_integer', 0);
$form->data['some_string'] = JRequest::getString('some_string', 'empty', 'post');
$form->data['some_textarea'] = JRequest::getString('some_textarea', '', 'post');
?>
Note: mostly from the Joomla! docs here
FYI: ChronoForms uses
$form->data = JRequest::get('post', JREQUEST_ALLOWRAW);
If you want to use the PHP approach the docs on Sanitize and Validate filters are here
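As a minimal sketch of approach (a), the snippet below applies PHP filters to values that are already in the data array. The stdClass stand-in for $form and the input names (age, email, website) are made up so that the snippet runs on its own; in a Custom Code action $form already exists and the stand-in lines should be removed.

```php
<?php
// Stand-in for the ChronoForms $form object so this sketch runs by itself.
// In a Custom Code action $form already exists: remove these lines.
$form = new stdClass();
$form->data = array(
    'age'     => '27abc',
    'email'   => ' someone@example.com ',
    'website' => 'not a url',
);

// Sanitize: strip everything except digits and plus/minus signs
$form->data['age'] = filter_var($form->data['age'], FILTER_SANITIZE_NUMBER_INT);

// Sanitize, then validate: filter_var() returns false if validation fails
$email = filter_var(trim($form->data['email']), FILTER_SANITIZE_EMAIL);
$form->data['email'] = filter_var($email, FILTER_VALIDATE_EMAIL);

// Validate only: an invalid URL becomes false
$form->data['website'] = filter_var($form->data['website'], FILTER_VALIDATE_URL);
```

Sanitize filters always return a cleaned string, while Validate filters return false on failure, so a false value is the place to handle an error if you are working in a Serverside Validation action.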
Advanced filtering
These filters are not perfect, though they are good enough for most sites. If you think that your site is at risk of attack - especially if it is very popular or the topic is contentious - then more thorough filtering may be needed.
There is an xss_clean() function from this Stack Overflow post that can be included in a Custom Code action. The function targets malicious code but does not remove 'normal' HTML. This code sample applies both a PHP Sanitize filter and xss_clean() to the form input values listed in the $clean_array at the beginning:
<?php
$clean_array = array(
'input_name_1',
'input_name_2',
'. . .',
'. . . '
);
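foreach ( $clean_array as $v ) {
// Skip placeholder names and inputs that were not submitted
if ( !isset($form->data[$v]) ) {
continue;
}
// Note: FILTER_SANITIZE_STRING is deprecated as of PHP 8.1
$form->data[$v] = filter_var($form->data[$v], FILTER_SANITIZE_STRING);
$form->data[$v] = xss_clean( $form->data[$v] );
}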
function xss_clean( $data ) {
// Fix &entity\n;
$data = str_replace( array( '&', '<', '>' ), array( '&amp;', '&lt;', '&gt;' ), $data );
$data = preg_replace( '/(&#*\w+)[\x00-\x20]+;/u', '$1;', $data );
$data = preg_replace( '/(&#x*[0-9A-F]+);*/iu', '$1;', $data );
$data = html_entity_decode( $data, ENT_COMPAT, 'UTF-8' );
// Remove any attribute starting with "on" or xmlns
$data = preg_replace( '#(<[^>]+?[\x00-\x20"\'])(?:on|xmlns)[^>]*+>#iu', '$1>', $data );
// Remove javascript: and vbscript: protocols
$data = preg_replace( '#([a-z]*)[\x00-\x20]*=[\x00-\x20]*([`\'"]*)[\x00-\x20]*j[\x00-\x20]*a[\x00-\x20]*v[\x00-\x20]*a[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2nojavascript...', $data );
$data = preg_replace( '#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*v[\x00-\x20]*b[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2novbscript...', $data );
$data = preg_replace( '#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*-moz-binding[\x00-\x20]*:#u', '$1=$2nomozbinding...', $data );
// Only works in IE: <span style="width: expression(alert('Ping!'));"></span>
$data = preg_replace( '#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?expression[\x00-\x20]*\([^>]*+>#i', '$1>', $data );
$data = preg_replace( '#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?behaviour[\x00-\x20]*\([^>]*+>#i', '$1>', $data );
$data = preg_replace( '#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:*[^>]*+>#iu', '$1>', $data );
// Remove namespaced elements (we do not need them)
$data = preg_replace( '#</*\w+:\w[^>]*+>#i', '', $data );
do {
// Remove really unwanted tags
$old_data = $data;
$data = preg_replace( '#</*(?:applet|b(?:ase|gsound|link)|embed|frame(?:set)?|i(?:frame|layer)|l(?:ayer|ink)|meta|object|s(?:cript|tyle)|title|xml)[^>]*+>#i', '', $data );
} while ( $old_data !== $data );
// we are done...
return $data;
}
?>
For even higher protection of fields where HTML input is possible you could use the HTML Purifier library which would need to be installed on your site.
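A minimal sketch of how that might look in a Custom Code action follows. The clean_fields() helper is hypothetical (it is not part of ChronoForms or HTML Purifier), and the snippet assumes that the library has already been loaded, for example by a require_once on its HTMLPurifier.auto.php file at whatever path you installed it to.

```php
<?php
// Hypothetical helper: applies a cleaning callback to the named inputs only.
function clean_fields(array $data, array $fields, $cleaner)
{
    foreach ($fields as $name) {
        // Leave missing inputs and non-string values (e.g. arrays) untouched
        if (isset($data[$name]) && is_string($data[$name])) {
            $data[$name] = call_user_func($cleaner, $data[$name]);
        }
    }
    return $data;
}

// Only runs when the HTML Purifier library has been loaded and $form exists
if (class_exists('HTMLPurifier') && isset($form)) {
    $purifier = new HTMLPurifier(HTMLPurifier_Config::createDefault());
    $form->data = clean_fields(
        $form->data,
        array('some_textarea'),
        array($purifier, 'purify')
    );
}
```

Passing the cleaner in as a callback means the same loop can be reused with xss_clean() or any other single-argument function.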