Basics of cookies and sessions

Teaching of a Samurai Engineer 9: Cookies and sessions, part 3 (Implementation: sessions)

Last time, we looked at implementing cookies. This time, I’ll write about implementing sessions. As I wrote in my 7th article, ‘Cookies and sessions, part 1’, the word ‘session’ covers many different implementations, and using it loosely gets confusing. In this text, ‘session’ refers to using the session functions that PHP provides.

This time, I’m deliberately using a similar structure to last time. Comparing the two articles should be a useful way to learn the differences and similarities between cookies and sessions.

First of all…
Whether you use cookies or sessions, in the end you output a cookie (an HTTP response header).
So, depending on the output_buffering setting in your php.ini, code like the following can produce an error such as ‘Warning: session_start(): Cannot start session when headers already sent in’ or ‘Warning: Cannot modify header information - headers already sent by’.

var_dump(ini_get('output_buffering'));
session_start();

If ini_get('output_buffering') returns 0, you will get a warning. This is because the content (the HTTP response body) has already been output before the headers: session_start() sends an HTTP response header as part of its internal processing.
output_buffering is PHP_INI_PERDIR, so you can configure it in php.ini, .htaccess, httpd.conf or .user.ini (PHP 5.3 or later).

You can check this setting each time you develop in a new environment, but you can also prevent the problem in the program itself by writing the following.

ob_start();
var_dump(ini_get('output_buffering'));
session_start();
ob_end_flush();

ob_start() and ob_end_flush() are Output Control Functions; getting into the habit of using them like this is an easy way to prevent accidents.
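
If you would rather detect the situation than buffer around it, PHP’s headers_sent() reports whether output has already begun. The following is only a minimal sketch; the helper name start_session_safely() is a hypothetical one chosen for illustration.

```php
<?php
// Hypothetical helper: start the session only while headers can still be sent.
function start_session_safely(): bool
{
    if (session_status() === PHP_SESSION_ACTIVE) {
        return true; // a session is already running
    }
    if (headers_sent($file, $line)) {
        // Output has already started, so session_start() would emit a warning.
        error_log("Cannot start session: output began at {$file}:{$line}");
        return false;
    }
    return session_start();
}
```

Calling this in place of a bare session_start() turns the warning into a logged message that names the file and line where the output began.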

First, inserting a value into the session.

To set a value for your session, you just need to assign it to the $_SESSION superglobal.

$_SESSION can be used in the same way as a standard associative array.

ob_start();
session_start();
$_SESSION['name'] = 'value';
// and so on

Additionally, $_SESSION is ultimately just an associative array, so you can read a value back immediately after assigning it.

ob_start();
session_start();
$_SESSION['hoge'] = 'foo';
var_dump($_SESSION['hoge']); // This runs without any problem
// and so on

Now for the values you can store. Sessions can basically hold any type of value aside from resources.
However, instances (objects) can cause problems depending on how the code is written, so you will typically have fewer problems if you stick to numbers (integers/floats), strings, arrays, booleans (true/false) or NULL.

ob_start();
session_start();
// The following can all be stored with no problem
$_SESSION['string'] = 'sss';
$_SESSION['int'] = 111;
$_SESSION['bool_true'] = true;
$_SESSION['bool_false'] = false;
$_SESSION['array'] = [0, 1, 2, "abc", 'key' => 'value'];

Furthermore, the length limit that exists for cookies typically does not apply to sessions. The one caveat is that the data is saved to storage somewhere, so extreme sizes (several megabytes or gigabytes) will cause slowdowns; exercise some restraint. One more point: you can write session code on the assumption that the data does not leave the server, so users will have trouble tampering with it. This is very different from cookies, so it’s good to keep it in mind from the design phase.

Now let’s take a more detailed look at instances (objects) stored in sessions. First, a simple look through code.

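The following code works with no problem. (The result shown is what var_dump() prints from the second request onward, once the object has been stored.)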

ob_start();
session_start();
// Class declaration
class hoge {
}
var_dump($_SESSION);
$_SESSION['object'] = new hoge();

<result>
array(1) {
  ["object"]=>
  object(hoge)#1 (0) {
  }
}
</result>

However, if the class is defined in a separate file, the result changes.

// hoge.php
class hoge {
}

// Main script
ob_start();
session_start();
require_once('./hoge.php');
var_dump($_SESSION);
$_SESSION['object'] = new hoge();

<result>
array(1) {
  ["object"]=>
  object(__PHP_Incomplete_Class)#1 (1) {
    ["__PHP_Incomplete_Class_Name"]=>
    string(4) "hoge"
  }
}
</result>

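You would expect a hoge instance, but you get a __PHP_Incomplete_Class instance instead, and the data is not restored properly.

This is down to the timing of loading the class definition: session_start() restores the session data before require_once has loaded the class, so PHP does not yet know about hoge at that point.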

If you use an autoloader (I plan to write about this some day), this problem won’t occur, so that’s one method (if you use Composer, you very likely already have an autoloader).
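
As a rough sketch of the autoloader approach, the code below registers a loader with spl_autoload_register() before calling session_start(), so PHP can load a class at the moment it meets it in the stored session data. The helper name register_class_autoloader() and the ‘one class per file, named after the class’ layout are assumptions made for this illustration; Composer’s autoloader follows its own rules.

```php
<?php
// Hypothetical helper: map a class name to "<baseDir>/<ClassName>.php".
function register_class_autoloader(string $baseDir): void
{
    spl_autoload_register(function (string $class) use ($baseDir): void {
        // Namespace separators become directory separators.
        $path = $baseDir . '/' . str_replace('\\', '/', $class) . '.php';
        if (is_file($path)) {
            require_once $path;
        }
    });
}

// Register the loader first, then start the session, so that classes
// can be loaded while the session data is being restored.
register_class_autoloader(__DIR__);
session_start();
```

With this in place, the explicit require_once for hoge.php is no longer needed before the session is read.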

If that’s not the case, using serialize() / unserialize() avoids the problem.

To break it down to its basics:

  • When putting the instance into $_SESSION, serialize() turns it into a string.
  • When taking it out of $_SESSION, unserialize() turns it back into an instance.

Keep this pattern in mind going forward.

ob_start();
session_start();
require_once('./tt.php');
// When storing
$_SESSION['object'] = serialize(new hoge());
// Looking at the stored value
var_dump($_SESSION['object']);
// When you use it
$obj = unserialize($_SESSION['object']);
var_dump($obj);

<result>
string(15) "O:4:"hoge":0:{}"
object(hoge)#1 (0) {
}
</result>

PHP’s session data stays inside the server, so as long as no external input enters this flow, you can use serialize() / unserialize() here without concern. As such, if you store instances (objects) in your session (and you’re not using an autoloader), it’s good to use serialize/unserialize.

Moving on, let’s look at how to erase values set in a session.
Session data can be removed with unset().

ob_start();
session_start();
if (isset($_SESSION['hoge'])) {
    unset($_SESSION['hoge']);
}

As you see here, inserting or clearing session values works just the same way as with a standard associative array, so it should be easy to handle.

Finally, let’s go over two points about sessions that do not apply to cookies.
The first is the vulnerability of the session itself. Three ways of attacking a session are well known:

  • Session ID prediction
  • Session ID eavesdropping
  • Session ID fixation

Session ID prediction is more likely if you’re using easily predicted session IDs. From PHP 5.4, session.entropy_file defaults to /dev/urandom, and in PHP 7.1.0 the setting itself was removed. Unless your PHP version is extremely old, or you have deliberately introduced weak settings of your own, it shouldn’t be a big problem.
Session ID eavesdropping tends to happen by exploiting other vulnerabilities, and is prevented by not building those in. Considering eavesdropping on the network, using https instead of http is also important.
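
As one small illustration of the https point, PHP can mark the session cookie so that browsers only send it over https. The sketch below passes options to session_start() (available since PHP 7.0). Which options suit your site is an assumption here; in particular, cookie_secure only makes sense if the whole site is served over https.

```php
<?php
// A sketch, assuming the whole site is served over https.
session_start([
    'cookie_secure'   => true, // send the session cookie over https only
    'cookie_httponly' => true, // keep the cookie out of reach of JavaScript
    'use_strict_mode' => true, // reject session IDs the server did not issue
]);
```

The same settings can also be placed in php.ini as session.cookie_secure, session.cookie_httponly and session.use_strict_mode.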

Session ID fixation can be prevented by, alongside the above points, changing the session ID to a new one at certain points, typically right after authentication succeeds.
The session_regenerate_id() function is good for this. When you call it, explicitly pass true as the argument so that the old session data is deleted.
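
A minimal sketch of that step might look like the following. The function name complete_login() and the ‘user_id’ key are hypothetical, and the sketch assumes the password check itself has already succeeded elsewhere.

```php
<?php
// Hypothetical login step: $userId comes from your own authentication check.
function complete_login(int $userId): void
{
    if (session_status() !== PHP_SESSION_ACTIVE) {
        session_start();
    }
    // true = delete the old session data, so the old ID can no longer be used.
    session_regenerate_id(true);
    $_SESSION['user_id'] = $userId;
}
```

Because the ID changes at the moment of login, an ID that an attacker planted beforehand no longer points at the authenticated session.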

When I talk about implementing login functions at a later point, I hope to explain this in further detail.

The second point is what happens when a load balancer is involved. Sessions, by default, save their data to local files.
Meanwhile, modern web systems often put a load balancer in front of two or more web servers. When these two facts combine, a problem occurs.

Example:

  • The first access connects to server A and begins a session. It is assigned ‘session ID: 0123’, and its data is stored in server A’s files.
  • The second access connects to server A and presents ‘session ID: 0123’, so server A reads its own files, processes the request and stores the data in its files again.
  • The third access connects to server B and presents ‘session ID: 0123’, but server B has no files for that session, so it can’t read the session data.

This happens all too often.
One way to avoid it is to enable sticky sessions on your load balancer, which routes a given user’s requests to the same server every time. That gets around the problem to some extent, but if that server is removed for some reason, it will still cause problems.

In PHP, a common solution is to use the session_set_save_handler() function to register user-defined session storage functions. With this, session data is saved not to local files but somewhere else.
session_set_save_handler() can take multiple callback functions, but these days it’s probably more common to register an instance of a class that implements SessionHandlerInterface.

Using this, you can start by saving to an RDB (MySQL etc.) so that whichever server the load balancer sends a request to, the session is available without issue. However, once the number of users increases, the RDB starts to slow down. When that happens, you can rewrite the class to use a lighter store such as a KVS (key-value store).
Apart from the gc() method, every method operates on a single key. If you can handle gc(), which deletes old sessions, you can move away from the RDB.
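
To make the shape of such a class concrete, here is a rough sketch of a SessionHandlerInterface implementation, assuming PHP 8.0 or later. ArrayStore and StoreSessionHandler are hypothetical names, and the in-memory array only stands in for a real shared store (an RDB table or a KVS); it does not persist anything between requests.

```php
<?php
// Hypothetical stand-in for a shared store such as an RDB table or a KVS.
// It lives in memory only, so it shows the shape, not real persistence.
final class ArrayStore
{
    /** @var array<string, array{data: string, time: int}> */
    public array $rows = [];
}

final class StoreSessionHandler implements SessionHandlerInterface
{
    public function __construct(private ArrayStore $store) {}

    public function open(string $path, string $name): bool { return true; }

    public function close(): bool { return true; }

    // Single-key lookup; an unknown ID yields an empty session.
    public function read(string $id): string|false
    {
        return $this->store->rows[$id]['data'] ?? '';
    }

    // Single-key write; the timestamp is what gc() checks later.
    public function write(string $id, string $data): bool
    {
        $this->store->rows[$id] = ['data' => $data, 'time' => time()];
        return true;
    }

    // Single-key delete.
    public function destroy(string $id): bool
    {
        unset($this->store->rows[$id]);
        return true;
    }

    // The only method that has to look across many keys.
    public function gc(int $max_lifetime): int|false
    {
        $limit = time() - $max_lifetime;
        $before = count($this->store->rows);
        $this->store->rows = array_filter(
            $this->store->rows,
            fn (array $row): bool => $row['time'] >= $limit
        );
        return $before - count($this->store->rows);
    }
}

// Register the handler before starting the session.
session_set_save_handler(new StoreSessionHandler(new ArrayStore()), true);
session_start();
```

Swapping the storage later means rewriting only the bodies of read(), write(), destroy() and gc(); the code that uses $_SESSION stays the same.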

As shown here, session implementation follows a pattern to some degree. Understand the pattern first, then customise it to meet your needs; this will let you learn safely and relatively easily.
Next time, we’ll be looking at the various functions relating to output control (the ob_* functions).


Michiaki Furusho
