Basics of cookies and sessions

Teachings of a Samurai Engineer 7: Cookies and sessions, part 1

This time, I’d like to talk a bit about cookies and sessions, two mechanisms that look quite similar at a glance.

First, let’s go over the basics on HTTP communication.

HTTP’s simplest form is HTTP/0.9, so let’s start there.

To break it down to its basics:

  • A simple request is made
  • A simple response is given

HTTP/0.9 really has nothing going on past this, and that makes it perfect for understanding HTTP in its most basic form.
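
To make that concrete, here is a minimal sketch of the exchange in Python. It is not a real server: the DOCUMENTS table, the handle_http09 name and the error bodies are all made up for illustration. What it does reflect is the shape of HTTP/0.9, where the request is a single "GET <path>" line and the response is just the document body.

```python
# Hypothetical in-memory "site": path -> document body.
DOCUMENTS = {"/index.html": "<html>Hello</html>"}


def handle_http09(request_line: str) -> str:
    """Answer a one-line HTTP/0.9 style request with the bare document body."""
    parts = request_line.strip().split()
    # The request is only "GET <path>": no version string, no headers.
    if len(parts) != 2 or parts[0] != "GET":
        return "<html>Bad request</html>"  # illustrative error body
    # Nothing but the body goes back: no status line, no headers.
    return DOCUMENTS.get(parts[1], "<html>Not found</html>")


print(handle_http09("GET /index.html\r\n"))
```

Notice that nothing in the function remembers the previous request. Each call is answered on its own.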

This request->response flow is repeated in every later version of HTTP. From HTTP/1.0, to 1.1, to 2, the core is built on the same simple pairing: meet a request with a response.

This simple system is a major point in favour of HTTP. However, this simple structure also leads to HTTP being called a ‘stateless protocol’. In other words, it gives you some trouble when you want to carry state from one request to the next. The short version is that in scenarios where you want to maintain an authenticated state, being stateless is a problem.

What often comes up in these conversations is something called Basic Authentication (and its stronger relative, Digest Authentication). From here on, I will refer to both collectively as Basic Authentication. Basic Authentication works by sending the ID and password each time. In other words, the authentication discussed above is performed with every communication.
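
As a rough illustration of "sending the ID and password each time", the sketch below builds the Authorization header for the Basic scheme only, which is the ID and password joined by a colon and Base64-encoded. The function name, the sample credentials and the paths are assumptions for the example.

```python
import base64


def basic_auth_header(user_id: str, password: str) -> str:
    """Build the Authorization header value for the Basic scheme."""
    credentials = f"{user_id}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# The same credentials have to accompany every single request.
for path in ("/inbox", "/settings"):
    print(f"GET {path}  Authorization: {basic_auth_header('alice', 'secret')}")
```

Base64 is an encoding, not encryption, so anyone who sees the header can recover the ID and password.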

That said, sending the ID and password over and over is a rather brute-force approach, so it would be better to be able to authenticate once and maintain that state over later communications.
This need keeps coming up, not only for authentication as described above, but in any scenario that requires some sort of state or user identification.

Thus, to maintain a state, cookies and sessions are used.
On a related note, because cookies are used to identify users, it may be worth looking into recent stories about the GDPR (General Data Protection Regulation) requiring active user permission to use cookies.

Having explained the importance of maintaining a state, let’s get into the main point. That is, the difference between sessions and cookies.

Firstly, cookies are rigidly defined by an RFC (RFC 6265). Thus, every environment handles cookies the same way, whatever the language.
Sessions, on the other hand, vary by language, framework and implementation, and are not a structure defined by an RFC or equivalent.

Cookies are always stored in browser-side storage, and travel in a set format as part of HTTP request/response communications. Session data, meanwhile, is usually saved in server-side storage, but that is only the majority case, and there are implementations where the storage is browser-side. In other words, the implementation decides whether session data is part of the HTTP request/response communication or stays entirely on the server.
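
The ‘set format’ of cookies can be sketched with Python's standard http.cookies module. The cookie name theme and its value are made up for the example, and the "browser" here is just a line of code that echoes the stored pair back.

```python
from http.cookies import SimpleCookie

# Server side: ask the browser to store a cookie via a response header.
response_cookie = SimpleCookie()
response_cookie["theme"] = "dark"
response_cookie["theme"]["path"] = "/"
set_cookie_header = response_cookie.output()
print(set_cookie_header)

# Browser side (simulated): later requests send the stored pair back.
cookie_header = "; ".join(
    f"{name}={morsel.value}" for name, morsel in response_cookie.items()
)
print("Cookie: " + cookie_header)

# Server side again: parse the Cookie request header to read the value.
request_cookie = SimpleCookie()
request_cookie.load(cookie_header)
print(request_cookie["theme"].value)
```

The data itself makes the round trip in the headers of every request, which is exactly what separates cookies from sessions kept server-side.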

Cookies are part of the communication, and so there are various numerical restrictions on length, the number of cookies per domain, and so on. RFC 6265 asks browsers to support ‘at least 4096 bytes per cookie’, ‘at least 50 cookies per domain’, and ‘at least 3000 cookies total’. These are minimums that a browser should provide rather than sizes a cookie must have, so it is safest to stay within them. Meanwhile, sessions saved server-side have no real restrictions except for the storage capacity of the server. Sessions saved client-side effectively have their data riding on cookies, and so their upper limit follows that of cookies.
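
If you want to stay inside those numbers, a check along the following lines may help. This is a simplified sketch: the function name is made up, and it counts only the name and value of each cookie, whereas attributes such as Path or Expires also count toward the per-cookie size.

```python
# Minimums that RFC 6265 asks browsers to support.
MIN_BYTES_PER_COOKIE = 4096
MIN_COOKIES_PER_DOMAIN = 50


def within_minimum_limits(cookies: dict) -> bool:
    """Return True if the cookies for one domain fit the RFC 6265 minimums."""
    if len(cookies) > MIN_COOKIES_PER_DOMAIN:
        return False
    for name, value in cookies.items():
        # Simplification: attributes are left out of the size in this sketch.
        size = len(name.encode("utf-8")) + len(value.encode("utf-8"))
        if size > MIN_BYTES_PER_COOKIE:
            return False
    return True


print(within_minimum_limits({"theme": "dark"}))
print(within_minimum_limits({"blob": "x" * 5000}))
```

The second call fails the check because a single 5000-byte value is already over the 4096-byte minimum.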

Cookies have their data saved client-side, and so they work just fine even if the web server is spread across multiple machines behind load balancers, etc. Meanwhile, for sessions where data is stored server-side (for example, the session functions provided by PHP), you need to be careful about where the data is saved.
In PHP, sessions started with session_start() save their data under ‘files’ by default, that is, as files on the server that handled the request.

Thus if:

  • You have two or more web servers, with access distributed among them by a load balancer,
  • Session persistence (sticky sessions) is not set, and
  • Your PHP session settings remain set to default…

…then a user’s request may reach a different server from the one that holds the session file from their previous access. When this happens, you get hard-to-replicate errors where session info sometimes cannot be retrieved.
You can configure persistence on the server/load balancer side as well, but personally, I think it’s more common, and thus preferable, to change where PHP saves its session data to somewhere else.
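
The failure is easy to reproduce in miniature. The sketch below is a toy model, not PHP: the WebServer class, the session ID abc123 and the plain dictionaries standing in for session storage are all assumptions for illustration. A round-robin balancer without persistence sends the first request to web1 and the second to web2.

```python
import itertools


class WebServer:
    """One web server that looks sessions up in whatever store it is given."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, session_id, user):
        self.store[session_id] = {"user": user}

    def current_user(self, session_id):
        session = self.store.get(session_id)
        return session["user"] if session else None


def login_then_read(servers):
    """Send two requests through a round-robin balancer without persistence."""
    balancer = itertools.cycle(servers)
    next(balancer).login("abc123", "alice")       # first request hits web1
    return next(balancer).current_user("abc123")  # second request hits web2


# Default-like setup: each server keeps session data in its own local store.
local = [WebServer("web1", {}), WebServer("web2", {})]
print(login_then_read(local))   # None: web2 never saw the session

# Alternative: both servers read and write one store they can both reach.
shared_store = {}
shared = [WebServer("web1", shared_store), WebServer("web2", shared_store)]
print(login_then_read(shared))  # alice
```

With real traffic the balancer does not alternate this neatly, which is why the error only shows up some of the time.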

Finally, whichever way a session stores its data, it will ultimately hold identifying information in a cookie (in PHP, by default, a cookie named PHPSESSID whose value is the session ID), and use that value to identify the user.
Thus, if you communicate with a client who either cannot use cookies or has not permitted them, then obviously not only can you not use cookies, but sessions are out of the question as well.
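
Here is a minimal sketch of a server-side session that relies on a cookie in this way. It is not how PHP implements sessions internally; the SESSIONS dictionary, the start_session function and the cookie attributes are assumptions, and only the cookie name PHPSESSID is borrowed from PHP's default.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical server-side storage: session ID -> session data.
SESSIONS = {}


def start_session(cookie_header: str):
    """Return (session_id, data, set_cookie_header or None) for one request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("PHPSESSID")
    if morsel is not None and morsel.value in SESSIONS:
        # Known ID: the cookie carries only the key, the data stays server-side.
        return morsel.value, SESSIONS[morsel.value], None
    # No usable cookie: issue a fresh, hard-to-guess ID and ask the
    # browser to store it.
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {}
    header = f"Set-Cookie: PHPSESSID={session_id}; Path=/; HttpOnly"
    return session_id, SESSIONS[session_id], header


# First request: no cookie yet, so a session is created.
sid, data, header = start_session("")
data["user"] = "alice"

# Second request: the browser sends the ID back and the data is found again.
_, same_data, _ = start_session(f"PHPSESSID={sid}")
print(same_data["user"])  # alice
```

A client that never sends the cookie back gets a brand-new, empty session on every request, which is the situation described above.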
This has been a short discussion on the theory and background of cookies and sessions.
Next time, we’ll go into advice and common pitfalls for writing the actual code of cookies and sessions.

Michiaki Furusho