Independent Analytics WordPress Plugin is not GDPR Compliant out of the box.

A plugin that is gaining traction among WordPress users is the analytics plugin Independent Analytics. It’s actually a very nice plugin, and we’re using it as inspiration for our own analytics for our clients. One of the claims of Independent Analytics is that because it uses cookie-less tracking and the data is stored on your own servers, it’s GDPR compliant out of the box and therefore you don’t need a cookie consent form.

There are various ways to track website visitors without using cookies. We can gather these all under the umbrella term “fingerprinting”. The most common way is to gather lots of information about the user’s browser and connection and hash it into some sort of id. If we look at the Independent Analytics code, we can see they do a very basic version of this:

    private function calculate_id(string $ip, string $user_agent) : string
    {
        $salt = Salt::visitor_token_salt();
        $result = $salt . $ip . $user_agent;
        return \md5($result);
    }

Let’s come back to the “salt” in a bit. First we should sidetrack a little and mention the flaw in using the IP address and user agent as a session id. Many companies, and particularly large institutions (e.g. in education), use proxy servers or other methods (e.g. NAT) to route their web traffic. This results in all their users sharing the same IP address. They also tend to manage their devices, meaning that all their users will have the same versions of the same browsers. It’s just good security practice. Combining the IP address and the user agent (what the browser identifies itself as) will therefore produce the same id in this code for all those users. The same is often true for families on their home internet, who tend to have the same devices, such as iPads, and so end up with the same id. This has an obvious effect on the reliability and accuracy of the data: the plugin will simply miscount visitors. It’s a problem with fingerprinting in general, but it is made much worse in this case by using only two components to fingerprint a device.
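To make the miscounting concrete, here is a minimal sketch that mirrors the plugin’s id calculation in Python rather than PHP. The salt, IP addresses and user agent strings are made up for illustration, and the three “visitors” are hypothetical.

```python
import hashlib


def calculate_id(salt: str, ip: str, user_agent: str) -> str:
    # Python mirror of the plugin's md5($salt . $ip . $user_agent).
    return hashlib.md5((salt + ip + user_agent).encode("utf-8")).hexdigest()


# All values below are invented for this example.
SALT = "example-salt"
OFFICE_IP = "203.0.113.7"  # one public address shared by everyone behind the NAT
MANAGED_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0"

# Two different people in the same office, on identically managed machines.
alice = calculate_id(SALT, OFFICE_IP, MANAGED_UA)
bob = calculate_id(SALT, OFFICE_IP, MANAGED_UA)

# A third person with the same browser build but a different network.
carol = calculate_id(SALT, "198.51.100.23", MANAGED_UA)

print(alice == bob)    # True: two people are counted as one visitor
print(alice == carol)  # False: only the IP address tells them apart
```

Nothing in the inputs distinguishes the two office users, so no amount of hashing can separate them afterwards.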

Back on track. Let’s talk about md5’ing. md5’ing something simply turns it into a fixed-length string of characters. If we md5 the same thing, we will always get the same string of characters. This is called hashing. What’s fundamental to know about hashing is that it is theoretically one-way, meaning that it is difficult (sometimes impossible) to determine what the original data was from the hashed string of characters. For example, say we hash the word “test” and, for argument’s sake, assume the output from some hashing function is 5f3a9d. There is nothing in the output that tells us what the input is. As “test” will always become 5f3a9d, the only way we could learn the input is to keep guessing, hash each guess and see if it equals 5f3a9d. From this we can observe a key point – hashed data is easier to reverse if the original is easily guessable.
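The guess-and-check idea can be sketched in a few lines of Python. This is only an illustration: the word list is made up, and `guess_input` is a hypothetical helper, not something taken from the plugin.

```python
import hashlib


def md5_hex(text: str) -> str:
    # The same input always produces the same hex digest.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def guess_input(target_hash: str, candidates):
    # Hash each guess in turn and compare it with the hash we want to reverse.
    for candidate in candidates:
        if md5_hex(candidate) == target_hash:
            return candidate
    return None  # none of our guesses matched


target = md5_hex("test")  # pretend we only know this hash, not the word

# A tiny, invented list of guesses. Real attacks use far larger lists.
wordlist = ["hello", "password", "test", "admin"]

recovered = guess_input(target, wordlist)
print(recovered)
```

The hash itself gives nothing away. What gives the input away is that it was on our list of guesses.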

In this case, hashing such a limited amount of data, data that is very likely to exist in other logs, is obviously insufficient to count as anonymization. In most cases, connecting the id back to the original IP address and user agent using logged data would be trivial. But setting that aside for the moment, it’s an identifier that is intended to be linked to a person (via their browser). The relevant GDPR guidance on identifiers like this is as follows:

These identifiers refer to information that is related to an individual’s tools, applications, or devices, like their computer or smartphone. The above is by no means an exhaustive list. Any information that could identify a specific device, like its digital fingerprint, are identifiers.

https://gdpr.eu/eu-gdpr-personal-data/

and…

‘Personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

https://gdpr.eu/eu-gdpr-personal-data/

So this is pretty clearly an identifier, but we’ve not addressed the question of the salt. We’d argue that the developers must’ve seen this point about it being an identifier and added the salt in. So what is a salt? A salt is a random bit of data. Earlier we mentioned that the easier the data is to guess, the easier it is to get from the hash to the original data. There’s another method too: there are precomputed databases that map known hashes back to their original text. A way to often defeat both of these methods is to add random data into the string being hashed. Random data is virtually impossible to guess, and the salted strings are unlikely to be in such databases because the databases would have to be ridiculously huge.

Independent Analytics uses a salt. By default, however, it doesn’t change the salt, and the salt is stored in the database. As the salt is then always available to the site owner, and the visitor’s id never changes in the default case, this is arguably the same as having no salt at all. There is an option in the settings to refresh the salt daily. In effect, if that option is turned on, the data is anonymized after a day. That is, presuming the salt is not preserved in any way, either in the WordPress database itself or in website backups. If it is then, again, it’s the same as no salt at all.
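A rough sketch of why a stored salt protects so little, again mirroring the plugin’s id calculation in Python. Everything else here is an assumption for illustration: the access log entries are invented, `reidentify` is a hypothetical helper, and generating the salt with `secrets.token_hex` is our stand-in, not necessarily how the plugin creates its salt.

```python
import hashlib
import secrets


def calculate_id(salt: str, ip: str, user_agent: str) -> str:
    # Python mirror of the plugin's md5($salt . $ip . $user_agent).
    return hashlib.md5((salt + ip + user_agent).encode("utf-8")).hexdigest()


def reidentify(stored_ids, access_log, salt):
    # Recompute the id for every logged request and keep the ones that match.
    matches = {}
    for ip, user_agent in access_log:
        visitor_id = calculate_id(salt, ip, user_agent)
        if visitor_id in stored_ids:
            matches[visitor_id] = (ip, user_agent)
    return matches


# Hypothetical data: one visitor, and a web server access log that
# happened to record the same request.
visitor = ("203.0.113.7", "Mozilla/5.0 ExampleBrowser/1.0")
access_log = [("198.51.100.23", "Mozilla/5.0 OtherBrowser/2.0"), visitor]

# Default case: one long-lived salt that can be read from the database.
stored_salt = secrets.token_hex(16)  # assumed salt format, for illustration
stored_ids = {calculate_id(stored_salt, *visitor)}
linked = reidentify(stored_ids, access_log, stored_salt)

# Rotation case: the old salt was discarded and only today's salt is known.
todays_salt = secrets.token_hex(16)
unlinked = reidentify(stored_ids, access_log, todays_salt)

print(linked)    # the stored id maps straight back to the IP and user agent
print(unlinked)  # empty, unless the old salt survives somewhere such as a backup
```

With the salt to hand, the id is just a lookup key into the access log. It only stops being one when the salt that produced it is genuinely gone.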

Arguably, in nearly all circumstances the use of a salt here has very little effect on how anonymous the id is. It’s an attempt to anonymize data, but it doesn’t work. It certainly doesn’t conform to Independent Analytics’ claim:

If there’s one claim of theirs we dislike the most, it’s that one specifically: they coded a feature in an attempt to make the plugin more compliant, left it off by default, and then, knowing that, told users it is compliant “without configuration”.

If we give them the benefit of the doubt, then we think that, as a non-European LLC, they misunderstand the difference between the older EU cookie consent law and newer laws, including the GDPR. Their other FAQ seems to imply that misunderstanding:

Whereas,

In order to comply with the GDPR, companies must ensure that the data collected through browser fingerprinting is necessary for the specific purpose and that the user has expressly consented to the data collection. 

https://legalweb.io/en/news-en/browser-fingerprinting-and-the-gdpr/#:~:text=In%20order%20to%20comply%20with,consented%20to%20the%20data%20collection.

So without configuration it certainly is not GDPR compliant. Whilst not using a cookie does indeed mean you don’t need to ask for cookie consent specifically, the plugin still tracks personally identifiable data as defined by the GDPR and so needs consent. A consent form is still required; one consent form is simply replaced by another.

This all makes the third FAQ claim they make, unfortunately, very misleading for users:

Because, to summarise, the plugin uses a personal identifier that needs consent and, by default, does not anonymize data for quite some time (we’ve not checked; it could be forever).

Now it may seem that we’re being harsh on this plugin. So let’s repeat: it’s a great little plugin and a source of inspiration for us. But the marketing is very deceptive.

More recently the same fingerprinting techniques have also been referred to as Server Side Tracking.

Regarding GDPR, there is no difference between client side tracking and server side tracking.

https://legalweb.io/en/news-en/server-side-tracking-gdpr/

The law and all these consent form requirements may be a little silly. But there’s no way around it. It is the conclusion we came to ourselves when we looked at fingerprinting techniques. If you want to collect data and that data includes more than just very simple things like number of page views, and your audience/market is in Europe, then you are meant to ask for consent. There’s no outsmarting it I’m afraid. Because fingerprinting comes with disadvantages in terms of accuracy, and because consent is a requirement anyway, we ended up deciding to use the more accurate cookie method as a solution for our clients. We’re looking at our own consent functionality. But for users of Independent Analytics – just add a consent plugin. We don’t want to see you sent to Azkaban!
