Tom clears up the mystery of why your browser and the sites you visit love cookies so much.
Featuring Tom Merritt.
A special thanks to all our supporters–without you, none of this would be possible.
Thanks to Kevin MacLeod of Incompetech.com for the theme music.
Thanks to Garrett Weinzierl for the logo!
Thanks to our mods, Kylde, Jack_Shid, KAPT_Kipper, and scottierowland on the subreddit
Send us email to [email protected]
Cookies, they sound delicious…
But my friends in the tech know say that they are evil and must be purged?
But that nice shopping site I go to really wants me to agree to use them?
Are you confused?
Let’s help you Know A Little More about HTTP cookies.
Cookies for browsers all started because of online shopping.
A cookie, in this instance, is a small piece of data stored on a device by a web browser to help websites remember things between a user’s visits to them. Something usually referred to as “stateful information.” That means it can remember the “state” the website was in the last time you pulled it up in the browser. As in “look at the state of your room, Tom! You need to clean up right now!” That kind of state.
You will also hear them called a Web cookie, internet cookie, browser cookie and most properly an HTTP cookie, but usually it’s just called a cookie.
And in the past few years it’s been referred to with swear words, as it can be used to track people. It doesn’t need to be used to track people and there are a lot of legitimate and useful things cookies do that aren’t bad. Still, most people know about cookies from a headline screaming about ad tracking or a popup asking you to approve all the cookies and please don’t read the small letters too carefully.
But like most tools, a cookie is neither good nor bad, it all depends on what it is used for. Keeping you logged in is one of the more useful things it can do for instance. Preserving a record of all the links you clicked on is not one of its more popular uses. But in the end it’s just a little piece of data that a website asks a browser to store away so it can read it the next time a page from that domain is loaded in the browser.
First off though, why is it called a cookie? The term actually predates the Web. Early UNIX programmers referred to short opaque packets of data passed between systems as “magic cookies.” Think of it like a coat check ticket. The ticket on its own doesn’t mean anything. It’s passed to the owner while the coat is held for them and then when they hand in the ticket, they get their coat back. Why not call these little data packets tickets? Or coat checks? The answer may be lost in the mists of 1970s UNIX programmers. Wikipedia cites the earliest reference to a magic cookie in the 1979 man page for the fseek routine in the C Standard Library. They were most often used as identifying tokens. A way for a networked program to know it’s talking to the same system it talked to before.
It came to the Web thanks to Netscape programmer Lou Montulli in June 1994.
MCI, yeah the big 1990s telecom, was developing a way for customers to buy things and check out online. But it didn’t want to store all the transaction data on its servers. MCI’s Vint Cerf (yes, the internet-inventing Vint Cerf) and John Klensin (yes, the FTP-inventing John Klensin) went to Netscape to see if there was a way to store the transaction state on the user’s computer. That way thousands of abandoned carts didn’t pile up on their servers, among other things.
Montulli and John Giannandrea (who would later become head of Machine Learning at Apple) wrote the spec for a shopping cart that used a small data token to save the transaction state between web visits. Montulli apparently suggested calling it a cookie, after the UNIX Magic Cookies. Version 0.9beta of Netscape included support for cookies when it was released on October 13, 1994.
A cookie is set with a Set-Cookie line in the HTTP response header. Every subsequent request to that server causes the browser to send back all previously stored cookies that match that server’s domain and path.
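If you want to see that back-and-forth spelled out, here is a minimal sketch using Python’s standard http.cookies module. The cookie name cart_id, the value abc123 and both helper functions are made up for illustration. A real browser also checks domain, path and expiry before sending anything back.

```python
from http.cookies import SimpleCookie


def server_set_cookie(name: str, value: str) -> str:
    """Return the Set-Cookie header line a server would put in its response."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    return cookie.output()


def browser_cookie_header(set_cookie_line: str) -> str:
    """Return the Cookie header a browser would send back on later requests."""
    stored = SimpleCookie()
    # Drop the "Set-Cookie:" header name and parse the rest, like a browser storing it.
    stored.load(set_cookie_line.split(":", 1)[1].strip())
    # Only name=value pairs go back to the server, never the attributes.
    pairs = "; ".join(f"{morsel.key}={morsel.value}" for morsel in stored.values())
    return "Cookie: " + pairs


set_line = server_set_cookie("cart_id", "abc123")  # hypothetical name and value
print(set_line)
print(browser_cookie_header(set_line))
```

Notice that the attributes only travel one way. The server sends them along with the cookie, and the browser keeps them to itself.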
A cookie’s structure is simple. It has a name, a value and a number of optional attributes. The name is so the browser keeps the cookies straight. The value is so the server knows what the cookie refers to. The attributes have a few different purposes. The most common attributes are the domain and path. A website can only set a domain name for itself. Tommerritt.com cannot set a cookie with the domain name veronicabelmont.com. A cookie can also include an Expires or Max-Age attribute. If used, this tells the browser when to delete the cookie. And there are the Secure and HttpOnly attributes. When Secure is set, the cookie is only sent over encrypted connections. When HttpOnly is set, scripts running in the page, like JavaScript, can’t read the cookie.
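To make the name, value and attributes concrete, here is a rough sketch that builds one cookie with Python’s standard http.cookies module. The cookie name session, the domain example.com and the one-hour lifetime are assumptions picked for the example, not anything from a real site. Secure keeps the cookie on encrypted connections and HttpOnly hides it from scripts in the page.

```python
from http.cookies import SimpleCookie


def build_login_cookie(token: str) -> str:
    """Build the text that would follow "Set-Cookie:" for a login cookie."""
    cookie = SimpleCookie()
    cookie["session"] = token            # name = "session", value = the token
    morsel = cookie["session"]
    morsel["domain"] = "example.com"     # hypothetical site setting the cookie
    morsel["path"] = "/"                 # valid for every page on that site
    morsel["max-age"] = 3600             # browser deletes it after one hour
    morsel["secure"] = True              # only sent over HTTPS
    morsel["httponly"] = True            # not readable by page scripts
    return morsel.OutputString()


print(build_login_cookie("tok"))
```

The printed line is what a server would place after Set-Cookie: in its response header.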
Cookies can be as large as 4,096 bytes. Without a cookie each visit to a web page is as if for the first time. Even clicking a link inside the same domain name.
Even though the cookie was set up for shopping carts, the first use of a cookie was a kind of tracking. Netscape set a cookie on its own website to see if a visitor had already visited the Netscape site.
So Netscape supported cookies starting in October 1994 and Internet Explorer added cookie support in October 1995.
Originally cookies were always accepted and users were not notified. But the Financial Times published an article about them on February 12, 1996 raising awareness and they became the subject of their first, but not last, US Federal Trade Commission hearings in 1996.
Meanwhile the Internet Engineering Task Force had been debating two official ways of saving “state” in browsers. One proposal from Brian Behlendorf and another from David Kristol. A working group headed by Kristol and Lou Montulli decided to use the Netscape spec for cookies instead. That group anticipated the problem of third-party cookies. A third-party cookie is one where the domain name in the cookie data does not match the domain name of the page you are on, because an element from the third-party site is embedded in the page. Like an ad. The cookie is technically set and read by the third-party element, which is served from a different place than the page it is included in, so it doesn’t violate the rules about servers and domain attributes matching.
This was a cool new trick for advertisers. Let’s say you visited infoseek.com and there was a banner ad shown there from linkexchange.com. Linkexchange.com would set a cookie to show that its ad was delivered. Then you head over to excite.com and there’s another banner ad from linkexchange. Linkexchange sets another cookie, but even though you’re on excite.com, linkexchange can see that a cookie from it is already in the browser from when you were at infoseek.com. It now knows that this user has seen two linkexchange ads. It can also know lots more, like which websites the user visited before they saw the ad.
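Here is a toy model of that trick, just to show why one cookie is enough. Everything in it is invented for illustration: the AdServer class, the visitor_id cookie name and the plain dictionary standing in for the browser’s cookie jar for the ad domain. It is not how any real ad network was built.

```python
import uuid
from collections import defaultdict


class AdServer:
    """Toy third-party ad server that recognizes browsers by its own cookie."""

    def __init__(self) -> None:
        # visitor id -> list of sites where this browser was shown an ad
        self.sightings = defaultdict(list)

    def serve_ad(self, ad_domain_cookies: dict, page_site: str) -> None:
        """Serve a banner on page_site, reading or setting the ad domain's cookie."""
        visitor_id = ad_domain_cookies.get("visitor_id")
        if visitor_id is None:
            # First time this browser has loaded an ad from us: set a cookie.
            visitor_id = uuid.uuid4().hex
            ad_domain_cookies["visitor_id"] = visitor_id
        self.sightings[visitor_id].append(page_site)


ad_server = AdServer()
jar = {}  # cookies the browser holds for the ad server's domain

ad_server.serve_ad(jar, "infoseek.com")  # the page embeds a banner from the ad server
ad_server.serve_ad(jar, "excite.com")    # different page, same browser, same cookie

print(dict(ad_server.sightings))
```

The two sites never talk to each other. The ad server connects the visits because the browser hands back the same cookie both times.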
The working group realized the privacy implications of a third party cookie, so in the RFC published on cookies in February 1997, third party cookies were either not to be allowed or if allowed not enabled by default.
And here’s where everything could have been different. You see, ad companies had already picked up on Netscape’s cookie idea and were using it for third-party tracking. So, instead of following the official RFC, Netscape and IE ignored it and kept letting advertisers set third-party cookies.
There are however other cookies besides third-party tracking cookies.
A session cookie does not have an expiration date or Max-Age, so it expires as soon as you close the browser. This can be useful for keeping track of things while you’re on a site that neither you nor the site will care about later. Like pagination in a sequential story maybe.
A persistent cookie is more common. It’s stored between browsing sessions. It is sent every time you visit the site listed in its domain and path attributes, until it expires. Yes, persistent cookies can be used for tracking of course, but they can also be used for keeping you logged in between browsing sessions. They can also store preferences like a user theme or other settings.
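The difference between the two comes down to whether an expiry is present. Here is a small sketch of that check using Python’s standard http.cookies module. The function name is_session_cookie and the sample cookie strings are made up for the example, and it assumes the string holds exactly one well-formed cookie.

```python
from http.cookies import SimpleCookie


def is_session_cookie(set_cookie_value: str) -> bool:
    """Return True if the cookie carries neither Expires nor Max-Age."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_value)
    morsel = next(iter(cookie.values()))  # the single cookie in the string
    # Attributes that were not present come back as empty strings.
    return not (morsel["expires"] or morsel["max-age"])


# Page position in a story: gone when the browser closes.
print(is_session_cookie("page=3; Path=/story"))
# Theme preference kept for 30 days (2,592,000 seconds).
print(is_session_cookie("theme=dark; Max-Age=2592000; Path=/"))
```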
Modern browsers also support something called a SameSite cookie, meant to stop cross-site request forgery. The SameSite attribute takes one of three values: Strict, Lax or None. Strict only sends the cookie with requests that come from the same site that set the cookie. That way another site can’t trick your browser into making a request that carries your cookie. Lax also sends the cookie when you arrive from a different site, but only on a top-level navigation with a safe method like a GET request, such as clicking a link. This lets you stay logged in when you follow a link to a site, without the same risk of forgery. None lets the cookie be sent from anywhere, but most browsers require the Secure attribute to be on to set a SameSite=None cookie. This lets you do easy third-party tracking without having to change how you get the cookie in your code, but the encryption requirement at least makes sure the cookie only travels over HTTPS. It does not protect against forgery the way Strict and Lax do.
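The browser’s decision can be sketched as a small function. This is a simplified model and not any browser’s actual code. The function name, its parameters and the list of safe methods are assumptions for illustration, and it ignores details like the Secure requirement for SameSite=None.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def browser_sends_cookie(samesite: str, same_site_request: bool,
                         top_level_navigation: bool, method: str) -> bool:
    """Decide whether a cookie rides along with a request (simplified)."""
    if same_site_request:
        return True  # requests from the cookie's own site always carry it
    if samesite == "Strict":
        return False  # never sent on requests started by another site
    if samesite == "Lax":
        # Sent cross-site only when the user navigates to the site with a safe method.
        return top_level_navigation and method.upper() in SAFE_METHODS
    if samesite == "None":
        return True  # sent everywhere (real browsers also require Secure)
    raise ValueError(f"unknown SameSite value: {samesite}")


# Clicking a link from another site to this one:
print(browser_sends_cookie("Lax", False, True, "GET"))
# A hidden form on another site posting to this one:
print(browser_sends_cookie("Lax", False, False, "POST"))
```

The second case is the forgery scenario, and with Lax or Strict the cookie stays home.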
There are a couple other terms you may have encountered out there. A super cookie used to be a cookie that could be set for the whole top-level domain. Like a cookie served at any .com for instance. And that could be used for a lot of malicious things. Browsers block these now but new top level domains are regularly created, so older browsers may not have an up to date list.
And a zombie cookie, or evercookie, is stored in odd locations like Flash, and if it sees it has been deleted it copies itself back in. This used to be common with Flash but has declined with the decline of Flash, though HTML5 Web Storage can be used for them.
And speaking of Web Storage, the existence of the Web Storage API, IndexedDB, JSON Web Tokens, HTTP Authentication and more means there are other technologies that do some of the things, like session management and login, that cookies used to be used for. In fact the original reason for cookies? Shopping carts? Those are mostly done server side now. Because why leave all that valuable information about you on YOUR computer, not theirs?
Because of the prevalence of using cookies to track users so they can be served more effective ads, multiple laws around cookie use have also been created, including Europe’s GDPR and the California Consumer Privacy Act, among others. Regulations around cookies differ in the particulars, but in general they require a site to notify users that it is using cookies, allow users to opt out of receiving some or all cookies and allow users to use the service without receiving cookies. Usually with an exception for “required” cookies like login.
So now you know what all the fuss is about. It’s a little piece of data stored in your browser with little bits of data about you.
In other words, now you Know A Little More about HTTP cookies.