Update of "input"
Artifact ID: | 1da5ba9116eed22a301818c8f911aecfa297ad1d |
---|---|
Page Name: | input |
Date: | 2014-03-20 02:42:18 |
Original User: | mario |
Mimetype: | text/x-markdown |
Parent: | 3cd29bcf731bdd3f34b39d058a73f4c2285563a6 (diff) |
Next | 97d44d3ef9ec003c557c3223d9bfe526fbdd1f5e |
new input()
The input class wraps the superglobals $_REQUEST, $_GET, $_POST, $_SERVER and $_COOKIE. It provides streamlined sanitization with unobtrusive filter names and a unique semi-fluent syntax:
$_REQUEST->text["content"]
Filtering functions can also be chained, as in $_GET->text->html["title"]. There are various whitelisting and sanitizing methods for that.
This approach:
- Addresses input constraint validation at the earliest feasible entry point.
- Unifies access through a central verification mechanism.
- Allows reliable interpolation of input directly into many target contexts.
Additionally, it can still shadow/audit casual and unverified accesses. Its overall API simplicity is meant to encourage safety through minimal effort.
Available filters
There's a wide range of built-in methods. Often the basic filters are sufficient and best suited for combination.
Method | Type | Sample | Usage |
---|---|---|---|
int | cast | 123 | Only numeric characters, cast to integer. |
name | white | abc12_x3 | Alphanumeric symbols only. |
id | white | xy_2.1 | Alphanumeric chars, dot and underscore. |
words | white | abc def | Text with minimal punctuation (only spaces allowed). |
text | white | Hello, World! | Common natural text with basic punctuation (including quotes, but no < >). |
filename | filter | basename.txt | Replace all non-alphanumeric characters with underscores. |
float | cast | 3.14159 | Cast to float. |
boolean | cast | true, false | Converts "false/true" or "0/1" or "off/on" and "no/yes" to boolean. |
ascii | white | Aa#:",n0~ | Characters in the ASCII range 0 .. 127 |
nocontrol | white | Aa#:",n~ | Filters out control characters (< 32), except \r \n \t. |
spaces | filter | Single line | Turns linebreaks / whitespace (\r \n \t) into spaces only. |
q | black | "value" | Shorthand for addslashes. |
escape | black | []"{}'$`!´&?/><|*~;^ | Broader escaping of well-known meta characters (quotes and regex). |
html | filter | & | htmlspecialchars (shorthand) |
Structural | Following filters constrain specific input formats. | | |
datetime | white | 1999-12-31T23:59:59Z | HTML5 datetime values |
date | white | 2015-07-17 | Just a date string. |
time | white | 23:59:20.17 | Time specifier. |
color | white | #FF5022 | Hex color value. |
tel | white | +1-347-2214144 | International-format telephone number. |
iconv | filter | | Convert input to UTF-8 |
utf7 | black | | Filter some UTF-7 out. |
ip | white | ::1 | IPv4 or IPv6 address |
ipv4 | white | 134.22.7.207 | IPv4 address only |
public | white | 8.8.4.4 | Validate IP to be public. |
email | white | you@gmail.com | Syntactically valid email address. |
url | white | | Ensure URL syntax xxx:/// |
http | white | http://localhost/ | More conservative http:// URL constraint. |
uri | white | | More generic URI syntax. |
xml | cast | | Create a SimpleXML object from input. |
json | cast | {"key":"value"} | json_decode() |
purify | filter | <b>basic</b> | Utilizes HTMLPurifier |
NOP | Virtual / control filters. | | |
log | control | | Raw value access with logging. |
raw | control | | Raw access with E_NOTICE (is the default). |
disallow | control | | Disallow unfiltered variable access (configurable per INPUT_DIRECT). |
is | control | | A meta filter that applies the following filter chain, then checks whether the content would have passed unaffected. Returns a boolean indicating whether all constraints were matched. |
Parameterized | These filters require method access $_GET->default("id", "index") instead of the plain array key syntax. | | |
length(ID, 20) | filter | Hello Wo | Cuts strings to the given maximum length. |
range(ID, 1, 15) | white | 17 | Constrains numeric input to the given range. |
default | filter | … | Uses default value, if no input present. |
regex | white/black | … | Custom regular expression method ->regex("field", "/(abc)/") |
in_array | white | a,b,c | Can be used with an array parameter, or a simpler comma-separated list of allowed values. |
Multi-Apply | Following filters work on a set of input variables, instead of a single one. | | |
array | control | | Is automatically applied to input subarrays, so filters are run on each entry. |
list | control | | Combine multiple input variables per name (comma-separated list) and apply filtering collectively; finally return a named result array. |
multi | control | | Also grabs a list of input variables, but does not run filters on the scalars within; instead it passes the combined set to the filter functions. This is used in combination with e.g. http_build_query. |
Global functions | | | |
strtolower | filter | | Any global function can actually be chained. It just needs to accept one parameter, modify its input (string), and return a result. Custom userland functions can thus be utilized. |
urlencode | filter | | |
strip_tags | filter | | |
Ill-advised filters | Care should be taken here. Liberal application will lead to a false sense of security. | | |
sql | filter | | Configurable PDO::quote shorthand. |
mysql | filter | | Shorthand to mysql_real_escape_string (doubly discouraged). |
xss | black | | Minimal XSS blacklist |
As mentioned, any global function can be utilized implicitly. A few core string functions are useful in this context, but the intended targets are custom functions.
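A minimal sketch of what chaining one-parameter functions amounts to, independent of the class internals. The helpers apply_chain and only_id_chars are hypothetical names invented for this illustration (the character rule is an assumption modelled on the "id" row of the table); trim and strtolower are ordinary PHP functions.

```php
<?php
// Hypothetical helper, not part of the input class: run a value through
// a list of one-argument callables, in order, like a filter chain would.
function apply_chain(array $filters, $value)
{
    foreach ($filters as $filter) {
        $value = call_user_func($filter, $value);
    }
    return $value;
}

// A custom userland filter: keep alphanumerics, dot and underscore only.
// The exact character set is an assumption made for this sketch.
function only_id_chars($value)
{
    return preg_replace('/[^A-Za-z0-9._]/', '', $value);
}

// Two global functions and one custom function, applied left to right.
$clean = apply_chain(array("trim", "strtolower", "only_id_chars"), "  XY_2.1<script>  ");
// $clean is now "xy_2.1script"
```

Any function that takes one string and returns one string fits into such a chain, which is why custom functions slot in next to the built-in ones.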
Binding filters
One can even bind new functions or class methods using:
$_GET->_filtername = array("AppFilter", "validSessionID");
It's imperative to shadow the filter names using an underscore _ prefix, however. See input.inspekt.php for some examples. Such bound methods can be chained just as well:
$_GET->text->validSessionID["var"]
(Btw, to use some of the input filter methods statically and outside of their scope, one could use $value = input::_datetime($value); for instance.)
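The array("AppFilter", "validSessionID") value used for binding is a regular PHP callable naming a static class method. A sketch of what such a filter class could look like follows; the class body and the 32-hex-character rule are assumptions for illustration, only the callable form comes from the text.

```php
<?php
// Hypothetical filter class; its validation rule is invented for this sketch.
class AppFilter
{
    // Return the value if it looks like 32 lowercase hex characters,
    // otherwise an empty string.
    public static function validSessionID($value)
    {
        return preg_match('/^[0-9a-f]{32}$/', (string) $value) ? $value : "";
    }
}

// The same array form that gets bound as a filter is directly callable.
$callable = array("AppFilter", "validSessionID");
$ok  = call_user_func($callable, str_repeat("a", 32));
$bad = call_user_func($callable, "not-a-session");
// $ok holds the 32-character ID, $bad is ""
```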
Complex filters
With ->list and ->multi you can utilize some more crafty features. For instance:
$_GET->multi->http_build_query["id,name,title"]
This will rebuild a URL-encoded string from three input variables.
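In plain PHP, the effect is roughly the following. This is only a sketch: the $raw array stands in for $_GET, and how ->multi collects the variables internally is an assumption.

```php
<?php
// Stand-in for the request data; "other" is not part of the wanted list.
$raw = array("id" => "7", "name" => "a b", "title" => "x&y", "other" => "ignored");

// Pick the three named variables, keeping their original order.
$wanted = array_intersect_key($raw, array_flip(explode(",", "id,name,title")));

// Hand the combined set to http_build_query in one go.
$query = http_build_query($wanted, "", "&");
// $query is now "id=7&name=a+b&title=x%26y"
```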
Context targeting
The input wrappers primarily encapsulate early access to unvetted remote input. This avoids delayed sanitization and laborious data flow tracing through application layers.
But some filter combinations are perfectly suitable to skip the application logic, and combine input constraining and output context preparation.
For instance replaying form input becomes as simple as:
echo <<<FORM
<input name=title value="{$_POST->text->html['title']}">
<input name=email value="{$_POST->email->html['email']}">
FORM;
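For comparison, the ->html step corresponds to htmlspecialchars per the filter table. A plain-PHP sketch of the output-escaping half of that form replay follows; form_value and the $post array are hypothetical names, and the input-constraining half (->text, ->email) is left out here.

```php
<?php
// Hypothetical helper: escape a value for use inside an HTML attribute.
function form_value($value)
{
    return htmlspecialchars((string) $value, ENT_QUOTES, "UTF-8");
}

// Stand-in for $_POST.
$post = array("title" => 'Say "hi" <now>');

$field = '<input name=title value="' . form_value($post["title"]) . '">';
// $field is now: <input name=title value="Say &quot;hi&quot; &lt;now&gt;">
```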
While this is highly inadvisable (and ultimately more effort than just using parameterized queries!) one could do the same for SQL queries:
pdo_query("INSERT INTO comments VALUES ('{$_POST->id->mysql['name']}') ");
The complex curly ("var expression") syntax makes this utilization of input filters in string context suitable in quite a few cases.
With preset/default filters (see ->always()), one could even use the simple PHP3 syntax in double quoted string context.
Wrapper implementation
Basically the filters are initialized for all superglobals like:
$_GET = new input($_GET);
The original variables are stored in ->__vars[] internally. Each $_GET->filtername pseudo-method access is accumulated in a filter chain.
The first array ["key"] or method ("key") access then applies the filter chain to the named input variable and returns the constrained value.
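The mechanism can be sketched with PHP's __get and ArrayAccess. The following is a hypothetical miniature, not the actual input class: MiniInput, its property names and the two simplified filters are assumptions made for illustration.

```php
<?php
// Hypothetical miniature of the wrapper idea; NOT the actual input class.
class MiniInput implements ArrayAccess
{
    private $vars;             // original input values
    private $chain = array();  // filter names collected so far

    public function __construct(array $vars)
    {
        $this->vars = $vars;
    }

    // $in->text->html only records "text", then "html".
    public function __get($filter)
    {
        $this->chain[] = $filter;
        return $this;
    }

    // $in["key"] applies the recorded chain to one variable, then resets it.
    #[\ReturnTypeWillChange]
    public function offsetGet($key)
    {
        $value = isset($this->vars[$key]) ? $this->vars[$key] : null;
        foreach ($this->chain as $filter) {
            $value = $this->{"_" . $filter}($value);
        }
        $this->chain = array();
        return $value;
    }

    public function offsetExists($key): bool
    {
        return isset($this->vars[$key]);
    }

    public function offsetSet($key, $value): void
    {
        $this->vars[$key] = $value;
    }

    public function offsetUnset($key): void
    {
        unset($this->vars[$key]);
    }

    // Simplified stand-ins for two filters from the table.
    private function _text($value)
    {
        return preg_replace('/[<>]/', '', (string) $value);
    }

    private function _html($value)
    {
        return htmlspecialchars((string) $value, ENT_QUOTES, "UTF-8");
    }
}

$in = new MiniInput(array("title" => 'Tom & "Jerry" <b>'));
$title = $in->text->html["title"];
// $title is now: Tom &amp; &quot;Jerry&quot; b
```

Resetting the chain after each access keeps one lookup from leaking its filters into the next; whether the real class does the same is not stated here.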
Filter chain defaults
It's possible to define a default filter for remaining $_GET["old"] accesses with the INPUT_DIRECT constant.
- By default it uses "raw", which just prints a notice.
- It can also be set to "disallow" to prevent such uses.
- Another alternative would be "q" to emulate magic quotes (not recommended).
- Or use "sql" to securely use $_POST["fields"] in SQL strings, if that's the default target (also not recommended).
Another option is to predefine a filter chain on a particular superglobal with ->always():
$_POST->xss->nocontrol->always();
Then any $_RAW["access"] would still use these filters. Yet additional, more context-specific filters could also be intermixed.
It's equivalent to having the filter chain built up, before accessing an entry:
$_GET->filter->name->and->more;
$_GET["var"]
Btw, to reset a default filter chain, use ->__always = array();
Predeclaring filters for raw access
While this somewhat amounts to magic_quotes 2.0, you can also pre-define filter chains on a variable name basis:
$_GET->__rules["old_id"] = array("int", array());
This is suitable for bolting a minimum of safety onto old code, whose data flow is structurally hard to fix otherwise.
Differences to plain $_GET / $_POST / $_REQUEST
Because the whole ArrayAccess and Iterator interfaces are implemented, it's easy to transition existing code to new input(). There are few behavioural discrepancies.
One thing that won't work, for example, is the common older idiom:
if ($_POST) {
To probe for presence of input data, one should check one of the keys, or rather:
if (count($_POST)) {
Which has the same effect.
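The reason is that PHP treats objects as truthy regardless of their contents. A small sketch using the built-in ArrayObject as a stand-in for the wrapper (that the input class reports its size the same way is assumed from the count() advice above):

```php
<?php
// ArrayObject stands in for any object-based wrapper here.
$emptyInput = new ArrayObject(array());

$truthy = (bool) $emptyInput;   // objects are always truthy in PHP
$filled = count($emptyInput);   // Countable objects report their size

if ($emptyInput) {
    // reached even though nothing was submitted
}
if (count($emptyInput)) {
    // skipped for an empty wrapper
}
```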
Methods ->has(), ->no(), ->keys()
These three convenience methods make some idioms more readable. Instead of testing for isset($_GET["key"]) one can now write: $_GET->has("key"). Or to probe for the opposite: $_GET->no("sleep").
In place of array_keys() there's now $_REQUEST->keys(), also slightly shorter.
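The presumed semantics of the three helpers can be sketched in a few lines. KeyProbe is a hypothetical stand-in, and mapping has() to isset() is an assumption based on the comparison made in the text.

```php
<?php
// Hypothetical stand-in showing the presumed semantics of the three helpers.
class KeyProbe
{
    private $vars;

    public function __construct(array $vars)
    {
        $this->vars = $vars;
    }

    // Same test as isset($vars[$key]).
    public function has($key)
    {
        return isset($this->vars[$key]);
    }

    // The opposite of has().
    public function no($key)
    {
        return !$this->has($key);
    }

    // Same result as array_keys($vars).
    public function keys()
    {
        return array_keys($this->vars);
    }
}

$get = new KeyProbe(array("key" => "1", "page" => "2"));
$hasKey  = $get->has("key");    // true
$noSleep = $get->no("sleep");   // true
$names   = $get->keys();        // array("key", "page")
```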
Notice emission
Syntactic salt à la (isset($_GET["id"]) ? $_GET["id"] : "") for silent value substitution has become commonplace.
It's made redundant here, because input{} itself already probes for existence of variables. Notices for absent values are only generated afterwards, and only if requested. Thus they can be reenabled when needed, unlike with the irrevocable isset ?: super suppression syntax.
Rather utilize INPUT_QUIET to control it at incursion. Set this constant to 1 prior to loading input.php to eschew notices and just receive NULL for absent input data. For uncovering non-systemic or structural flow deviations you could then easily reenable them later.
Rewritten code might also utilize $_REQUEST->default("id", 123) for applying preset values. Because of its centralized role you could thus alternatively adapt ->default or even inject a different default handler when the need arises.
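A rough sketch of the two behaviours discussed in this section: a central lookup that decides whether to emit a notice, and a default-value lookup. The function names input_fetch and input_default and the $quiet parameter are hypothetical; the real class reads the INPUT_QUIET constant instead.

```php
<?php
// Hypothetical helpers; the names are invented for this sketch.

// Central lookup: the existence check happens once, here, and the notice
// stays optional instead of being suppressed at every call site.
function input_fetch(array $vars, $key, $quiet)
{
    if (array_key_exists($key, $vars)) {
        return $vars[$key];
    }
    if (!$quiet) {
        trigger_error("Undefined input variable: " . $key, E_USER_NOTICE);
    }
    return null;
}

// Default lookup: fall back to a preset value when the input is absent.
function input_default(array $vars, $key, $default)
{
    return array_key_exists($key, $vars) ? $vars[$key] : $default;
}

$request = array("page" => "2");
$id   = input_default($request, "id", 123);   // 123, since "id" is absent
$page = input_default($request, "page", 1);   // "2"
$none = input_fetch($request, "id", true);    // null, no notice in quiet mode
```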
Closing remarks
Using such an input filter does not mean one can forgo database escaping (or parameterization) et al. It just adds another layer of format constraining and thus a bit of security atop.
And it's a very simple and convenient layer! (Complexity seldom abets security.)