Status
Not open for further replies.

el_jentel1

One of the biggest concerns for any developer or webmaster running a custom script (one with no outside support) is security. Since hackers probe for every hole, we need to make sure each one is closed before they reach it.

Whether you pass your inputs through a database, display them temporarily on your website, or use them in shell commands, you need to make sure that your entries are escaped, or in other words, clean.

Some of the most common functions to clean or escape in PHP are:

These functions are very useful most of the time, but they are very general. It is always best to validate your data exactly as required. For example, say you have a public input that takes the page number to display. The entry needs to be an integer, so we validate it as one.

There are many ways to validate entries, whether with functions already made by PHP, type juggling or using regular expressions (regex).

Some of the useful functions are:

Or in type juggling:

  • (int)
  • (bool)
  • (float)
  • (string)
  • ...
You should make use of all of these. They will save you a lot of time and make sure that your entry is clean. For example, say you have a public $_GET or $_POST input for a poll or survey that asks the user, "How many times have you visited my site today?"

If you pass this into the database without any validation, malicious code can be executed. For example:

PHP:
$entry = $_GET['vote']; // ?vote={MALICIOUS_CODE}

That code will be executed and can damage your database. Depending on the malicious code, it may delete, drop, or edit entries in the database.

If we add (int) right before $_GET, the value is cast to an integer, so $entry can only ever hold an integer.

PHP:
$entry = (int) $_GET['vote'];
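
To see what the cast does with hostile input, here is a small sketch. The helper name clean_vote() and the sample strings are made up for illustration; the only point is how (int) treats a string.

```php
<?php
// Hypothetical helper: force a raw request value to an integer.
function clean_vote($raw)
{
    // (int) keeps the leading digits and drops everything after them;
    // a string with no leading digits becomes 0.
    return (int) $raw;
}

$samples = array('3', '3; DROP TABLE votes', 'abc');
foreach ($samples as $raw) {
    echo var_export($raw, true), ' => ', clean_vote($raw), "\n";
}
```

Whatever the visitor sends, only a number is left by the time the value gets near a query.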

If you want to make your validation public and display errors, you could do something like:

PHP:
if ( !is_numeric($_GET['vote']) )
{
    // Display error
}

Notice the "!" before is_numeric(). It means "NOT", so the condition reads "if the vote is not numeric". Keep in mind that is_numeric() also accepts decimals such as 1.5, so it checks for a number, not strictly for an integer.
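
If the vote really has to be a whole number, one option is filter_var() with FILTER_VALIDATE_INT. This is only a sketch: the helper name validate_vote() and the 0 to 1000 range are assumptions for the example, not part of the original code.

```php
<?php
// Hypothetical helper: return the vote as an int, or null when it is invalid.
function validate_vote($raw)
{
    // FILTER_VALIDATE_INT returns false unless the value is a whole number
    // inside the given range (the range here is an assumed example).
    $vote = filter_var($raw, FILTER_VALIDATE_INT, array(
        'options' => array('min_range' => 0, 'max_range' => 1000),
    ));
    return ($vote === false) ? null : $vote;
}

var_dump(is_numeric('1.5'));    // is_numeric() accepts decimals too
var_dump(validate_vote('1.5')); // rejected: not a whole number
var_dump(validate_vote('12'));  // accepted
```

When the helper returns null, that is the place to display your error.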

What about validating regular strings? For example, say your script has a registration system. Of course you should always use mysql_real_escape_string() for that (mysqli_real_escape_string() on newer PHP versions, since the old mysql extension was removed in PHP 7), but why not do extra validation to be extra safe?

In this situation regex (regular expressions) is very useful. Let's look at this example:

PHP:
if ( !preg_match('/^\w+$/i', trim($str)) )
{
    // Display error
}

A couple of things to notice here. First, the preg_match() function: it matches the regex you defined against the entry. In this case we're asking PHP, "Does the entered string consist only of a-z, A-Z, 0-9 or _?"

This is very useful for user name validation. You could code another version that accepts all types of characters (using some of the functions we mentioned earlier), but a user name mostly only needs those characters.

Another thing to notice is the ^ and $ signs: ^ means "start" and $ means "end", so the whole string, from start to end, has to match the pattern.

The trim() function removes extra whitespace from the left and right of the entry, so " string " becomes "string".

Here is a basic list of the most common expressions used:

  • \d: Matches digits, equivalent to [0-9]
  • \w: Matches word characters and underscore, equivalent to [a-zA-Z0-9_]
  • \s: Matches space, new line and tab.
With regex the possibilities are endless. You can validate a URL, an email address, a user name, or any string, including how many characters are allowed in each one (minimum and maximum).
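
As a rough sketch of the minimum and maximum idea, the check below limits a user name to between 3 and 20 characters. The helper name is_valid_username() and the 3 to 20 limits are assumptions picked for the example.

```php
<?php
// Hypothetical helper: 3 to 20 characters from a-z, A-Z, 0-9 and _.
function is_valid_username($str)
{
    // {3,20} sets the minimum and maximum length.
    // The D modifier stops $ from matching just before a trailing newline.
    return preg_match('/^[a-zA-Z0-9_]{3,20}$/D', $str) === 1;
}

var_dump(is_valid_username('el_jentel1'));
var_dump(is_valid_username('ab'));
var_dump(is_valid_username('bad name'));
```

The character class is written out in full here so it is obvious which characters are allowed.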

You can easily find detailed tutorials about regex by searching Google.

Other areas that need to be secured are calls to system functions, whether they are triggered publicly or internally. The input passed to functions like system() and exec(), which execute system commands, also needs to be escaped.

Of course you can use the same methods we mentioned before (i.e. regex) to validate data passed to system functions. However, PHP also provides special functions to escape shell input, for example:

  • escapeshellarg()
  • escapeshellcmd()
The function escapeshellcmd() escapes shell metacharacters across a whole command string, while escapeshellarg() wraps a single argument in quotes so the shell treats it as one value.

Even if you use those functions, it's always a good idea to do additional validation to make sure that the entry passed is exactly what you want.

Conclusion: always make sure that you secure your data. This is for your own good and of course for the users viewing your website. And if a hacker does manage to get into your scripts, don't give up. It just gives you more power, because you'll be learning new things and making your scripts stronger.
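
Here is a minimal sketch of escaping a single argument before it reaches the shell. The helper name build_ls_command() and the choice of the ls program are assumptions for the example; the command is only built as a string, not executed.

```php
<?php
// Hypothetical helper: build a command line for the "ls" program.
function build_ls_command($dir)
{
    // escapeshellarg() quotes the value so the shell sees one argument,
    // even if the visitor adds characters such as ; or |.
    return 'ls -l ' . escapeshellarg($dir);
}

echo build_ls_command('/tmp; rm -rf /'), "\n";
```

The exact quoting in the output depends on the operating system, so compare it on your own server before relying on it.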

Thank you.

Article by: el_jentel1
 
17 comments
wow Gr8 Article.
Thaaanks :)
I already use some of these now will use these tips too.
It's really important to validate form contents with PHP and put some security measures in place.
Some people validate forms only with JavaScript and don't validate them server-side. That's a real security threat, because a hacker can still send anything malicious. :|
 
The way I prefer to do things is to check all data that can be inputted via GET/POST.

I do this by creating a class that recursively checks the inputted user data before we use it anywhere within the application.

A simple class can do this. Note that the class below is an example and is for informational purposes only.

PHP:
class Input
{
    private $get = array(), $post = array(), $cookie = array();    // Cleaned (not DB-escaped)
    private $_get = array(), $_post = array(), $_cookie = array(); // Uncleaned / raw

    public function __construct()
    {
        $this->clean();
    }

    private function clean()
    {
        // Keep the raw input in its designated properties.
        $this->_get    = $_GET;
        $this->_post   = $_POST;
        $this->_cookie = $_COOKIE;

        // Clean the input and assign it to the designated properties.
        $this->get    = $this->escape($_GET);
        $this->post   = $this->escape($_POST);
        $this->cookie = $this->escape($_COOKIE);
    }

    // Read-only access to the private properties; usage: $input->get['some_key']
    public function __get($type)
    {
        return isset($this->{$type}) ? $this->{$type} : array();
    }

    public function escape($var)
    {
        $return = array();
        foreach ($var as $key => $val)
        {
            if (is_array($val))
            {
                $return[$key] = $this->escape($val); // Recurse into nested arrays
            }
            else
            {
                // Encode HTML special characters, including ' and ". MORE WORK HERE
                $return[$key] = htmlentities($val, ENT_QUOTES, 'UTF-8');
            }
        }
        return $return;
    }
}

So from now on, if you use this class to get your GET/POST/COOKIE vars, all the values are pretty safe to output, although you still need to use a database escape function. Read El_j's thread above.

Doing your escaping this way reduces the amount of code you need to write, as it's all done for you.

Peace
 
i dont mind saying i have no idea what all this is about, but its very nice to see the community pull together and teach each other tricks of the trade.

wJ <3
 
Basically, jay, it's general application security.

In PHP there are several security factors that ALWAYS need to be addressed.

Of the 2 main ones, the first is XSS (Cross Site Scripting), which can be prevented by turning characters such as ' and " into their HTML entities (&quot; etc.). This is what my other post deals with: it turns all user-inputted data such as index.php?key=some_user'data"with\'SomeChars into a safely escaped string, which helps prevent XSS.
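
A quick sketch of that entity conversion, using htmlspecialchars(). The helper name html_escape() is made up for the example, and UTF-8 is assumed as the page charset.

```php
<?php
// Hypothetical helper: make a value safe to print inside HTML.
function html_escape($value)
{
    // ENT_QUOTES encodes both ' and "; the charset is assumed to be UTF-8.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// prints &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
echo html_escape('<script>alert("xss")</script>'), "\n";
```

The browser then shows the script as plain text instead of running it.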

The other is database injection. This is when you take, let's say, an id from a URL such as index.php?id=2 and use $_GET['id'] in PHP to query the database. That id=2 could easily be changed into something that can get you hacked, i.e. (id=-1' OR DELETE xxx FROM xxx\).

If you want to look up more information on this, just google SQL Injection and you will find plenty of basic outlines.
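
To make the problem visible without touching a real database, this sketch only builds the query string. The function build_query() and the table name posts are hypothetical. Prepared statements are another common way to avoid the problem, but they are not shown here.

```php
<?php
// Hypothetical query builder, shown only to illustrate the problem.
function build_query($id)
{
    return 'SELECT * FROM posts WHERE id = ' . $id;
}

// The attacker's text becomes part of the SQL.
echo build_query('2 OR 1=1'), "\n";

// Cast first: only the number is left.
echo build_query((int) '2 OR 1=1'), "\n";
```

The first query would match every row; the second one only asks for id 2.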
 
So in your XSS example, you're ensuring everything is HTML-encoded?

Forgive me for my ignorance, but why are these not hard-implemented? Is there any given situation where you wouldn't want this functionality?
 
Because some systems, especially BBCode, use the raw characters to convert to valid HTML, e.g. a BBCode B tag gets converted into a <strong> HTML tag. The BBCode system parses the submitted post, extracts all allowed tags from it, turns everything that remains into entities, and then adds the converted code back to the post. This way it does not produce messed-up BBCode.

That's the best example I can give you.
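
A simplified sketch of the idea. Note that it swaps the order described above: it encodes the whole post first and then converts the one allowed tag, which is easier to show in a few lines. The function name render_bbcode() is made up, and only [b] is handled.

```php
<?php
// Hypothetical renderer that supports a single BBCode tag, [b].
function render_bbcode($text)
{
    // Encode everything first so raw HTML in the post is neutralised.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Then turn the allowed tag, [b]...[/b], into <strong>.
    return preg_replace('/\[b\](.*?)\[\/b\]/s', '<strong>$1</strong>', $safe);
}

// prints <strong>hi</strong> &lt;script&gt;
echo render_bbcode('[b]hi[/b] <script>'), "\n";
```

If PHP encoded all input automatically, the parser would have to undo that work before it could do its own.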
 
It depends on what I'm coding. If it's just a small simple thing I'll do it like el_j above but if it's a fairly big script with a few pages and tables I'll use something like Litewarez. It's far less coding and work in the long run to just clean everything at the start.

If I'm super paranoid, like in an admin area, I like to do stuff like replace example.com?go=delete with example.com?go=3 and example.com?go=reply with example.com?go=4 etc.
And then make sure it's an integer. Same thing for $_POST. Basically I try to make every transferred variable a number. It also confuses people a bit, which all helps.
Interested to know people's thoughts on this.
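
Roughly what that looks like in code. The codes 3 and 4 come from the example above; the function name resolve_action() is made up for the sketch.

```php
<?php
// Hypothetical lookup: numeric codes from ?go= mapped to admin actions.
function resolve_action($raw)
{
    $actions = array(3 => 'delete', 4 => 'reply');

    // Cast first, then accept only codes that are in the map.
    $code = (int) $raw;
    return isset($actions[$code]) ? $actions[$code] : null;
}

var_dump(resolve_action('3'));
var_dump(resolve_action('99'));
```

Anything that is not a known code comes back as null, so the script can just show an error.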
 
Mr Happy, I don't really think using integers is a great advantage apart from sanitization (i.e. typecasting). You can lose yourself in your code once you get past 100+ actions, so I would stick with reply, edit, delete. What I tend to do is develop an MVC framework and create a sort of REST system, so:

Instead of topic.php?mode=reply&id=22 i would do

/topic/reply/22/

This way you only have to sanitize the topic/reply values so that they can only be called IF the function exists within the topic class, which gives better security. Then within the reply method I just cast the 3rd param to an integer.

This gives a lot more structure and security to your code.
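
A bare-bones sketch of that kind of dispatch. The Topic class, its methods, and the dispatch() function are all invented for the example, and for brevity the integer cast happens in the dispatcher instead of inside the reply method; a real MVC framework does a lot more.

```php
<?php
// Hypothetical controller; the class and method names are examples only.
class Topic
{
    public function reply($id) { return "reply to topic $id"; }
    public function edit($id)  { return "edit topic $id"; }
}

// Hypothetical dispatcher for paths such as /topic/reply/22/
function dispatch($path)
{
    $parts = explode('/', trim($path, '/'));
    if (count($parts) !== 3) {
        return null;
    }
    list($controller, $action, $id) = $parts;

    // Only known controllers are allowed.
    $controllers = array('topic' => 'Topic');
    if (!isset($controllers[$controller])) {
        return null;
    }
    $class  = $controllers[$controller];
    $object = new $class();

    // Only plain lowercase names that are public methods of the class.
    if (!preg_match('/^[a-z]+$/', $action) || !is_callable(array($object, $action))) {
        return null;
    }

    // The third segment is always treated as an integer.
    return $object->$action((int) $id);
}

echo dispatch('/topic/reply/22/'), "\n"; // prints reply to topic 22
```

A request for an action that is not a method of the class never reaches any code at all.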
 
hmm

But wouldn't having to decode be a better option than having to encode? I mean, everything gets encoded whether you like it or not in the interest of security, and if you don't like it, decode it.

Just seems far more logical to me..
 
I also think the major reason for leaving the entities down to the end user is to give the user more control over their code.

Personally I think it's good that it's not auto-encoded, because users then learn about HTML entities and character codes. This is an important part of web development, and personally I don't expect the PHP dev team to do all my work for me!
 
I agree with litewarez, in a way, PHP has made sure to provide encoding and escaping functions, so they are doing it, just not directly.

I don't really think it's just about giving control. PHP, like I always say, is almost becoming a noob-friendly language, which is great: it means anyone can learn it fast.

If you implement the things mentioned in the article (or other methods) as functions or classes, it'll definitely save you a lot of time. But first you need to know how to handle functions and classes, and what to put in them. That way you don't end up escaping or encoding the wrong characters, which would render the function useless.

The most common "hacks" are SQL injection and HTML injection. I never really saw many reports about system/shell injection, but thought I'd include it anyway.

I still say, for direct small inputs, regex is the man.

The other good thing about not having everything encoded automatically is that if you're saving the data into a database, you don't really need to encode all characters, provided you code a strong script.

Storing the input directly also saves space: the decoded character is inserted instead of &quot; or &amp;. In a large database that will save you a lot of bytes (i.e. &amp; vs &).

It's much better than the early days, when using base64 to encode data before storing it in the database was standard, which increased the data size by 30-40%.
 
@ el_jentel1

Thank you for posting that very detailed information. Even knowing as much as I do when it comes to PHP, there is always more to learn. I really appreciate having an admin around who actually devotes their knowledge to their forum :)
 