If you’re reading this, you’re among the few who understand how damaging Google Analytics 4 PII can be. Many companies collect it without realizing the risks it poses to their clients.
The reason PII is a big deal is that many countries classify it as highly sensitive information and require it to be handled with care.
In this blog post, we’ll walk through what PII means, what it looks like in Google Analytics 4 (GA4), and how you can remove it.
Let’s start!
What is Personally Identifiable Information (PII)?
PII refers to any information that could potentially identify an individual. It could be as simple as a name, or as specific as a social security number.
Some examples of PII:
- Full Name
- D.O.B (Date of Birth)
- Telephone Number
- Home Address
- Email Address
- SSN (Social Security Number)
- Passport Number
- Driver’s License Number
- Login Information
Does Google Analytics 4 contain PII?
The short answer is, yes, it does.
Let’s see what Google Analytics 4 PII examples might be:
What Does Google Consider PII?
What is the meaning of PII in Google Analytics 4? According to Google, PII is any data that may be independently used to personally identify, get in touch with, or precisely locate an individual. Google Analytics 4 PII examples:
- Emails, postal addresses, and telephone numbers
- Precise locations (such as GPS coordinates)
- A username or full name
What Does Google Consider Non-PII?
- Pseudonymous cookie IDs
- Pseudonymous advertising IDs
- IP addresses
- Other pseudonymous end-user identifiers
Important Note: Under the GDPR and other privacy laws, data that Google excludes from its own definition of PII in Google Analytics may still be deemed personal data or Google Analytics PII.
How Do I Know If My Website Is Collecting PII?
It is your responsibility to make sure your site does not collect any Google Analytics PII that violates Google’s Terms.
Here are some examples:
- PII violation per Google
If your contract as a publisher restricts you from disclosing PII in Google Analytics, the URLs of the pages on your website where Google ads appear must not contain email addresses, because Google collects those URLs along with every ad request.
- Not a PII violation per Google
Submitting a user’s IP address with an ad request does not violate Google’s policies against sending personally identifiable information (PII) to Google Analytics.
Now, let’s see how you can find out if you’re collecting PII in your Google Analytics 4 property:
How to Look for PII in Your Google Analytics 4?
There are several ways to locate Google Analytics 4 PII. Let’s see how:
- Go to Google Analytics 4 > Reports > Engagement > Pages and screens
- Then, search for @ to surface any pages whose URLs contain email addresses. You can also filter with regular expressions (regex) for a more thorough analysis.
Here are the formats:
- For email address: ([a-zA-Z0-9_.-]+)@([\da-zA-Z.-]+)\.([a-zA-Z.]{2,6})
- For SSN: (\d{3}-?\d{2}-?\d{4})
- For addresses: (drive|street|road|dr\.|po box|rd\.)
- For phone numbers: (\d{3}-?\d{3}-?\d{4}) (the “-?” makes each hyphen optional, so numbers written without hyphens also match)
- For names: (fn|ln|lastname|firstname|name|fullname)
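If you export your page paths from GA4, you can also run these patterns over the whole list at once instead of filtering one by one. Here is a minimal Python sketch of that idea. The sample paths and the find_pii helper are hypothetical, and the patterns are deliberately broad, so expect some false positives (for example, a 10-digit phone number also matches the SSN pattern).

```python
import re

# Patterns adapted from the list above. Dots that must match a literal "."
# are escaped. Matching is case-insensitive for the keyword-based patterns.
PII_PATTERNS = {
    "email": re.compile(r"[a-zA-Z0-9_.-]+@[\da-zA-Z.-]+\.[a-zA-Z.]{2,6}"),
    "ssn": re.compile(r"\d{3}-?\d{2}-?\d{4}"),
    "address": re.compile(r"drive|street|road|dr\.|po box|rd\.", re.IGNORECASE),
    "phone": re.compile(r"\d{3}-?\d{3}-?\d{4}"),
    "name": re.compile(r"fn|ln|lastname|firstname|name|fullname", re.IGNORECASE),
}


def find_pii(page_path):
    """Return the labels of every pattern that matches the given page path."""
    return [label for label, pattern in PII_PATTERNS.items()
            if pattern.search(page_path)]


# Hypothetical page paths, as they might appear in an exported report.
sample_paths = [
    "/thank-you?email=jane.doe@example.com",
    "/checkout?firstname=jane&lastname=doe",
    "/products/blue-shirt",
]

for path in sample_paths:
    print(path, "->", find_pii(path))
```

Any path that returns a non-empty list is worth a manual look before you decide how to fix the page that produces it.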
How Do I Exclude PII in Google Analytics 4?
Your business should be running on all cylinders to ensure that it is never sending PII in Google Analytics to unauthorized recipients. The good news is that fixing leaks in your Google Analytics PII isn’t hard, and knowing where your riskiest exposure points are makes them easier to find and fix.
Google’s policies prohibit sending any data that could be recognized as personally identifiable information (PII) to Google Analytics. So, let’s see how to remove PII data from Google Analytics:
How To Avoid Sending PII to Google?
Google publishes official guidelines on avoiding the transmission of Personally Identifiable Information (PII) in Google Analytics 4 data. To stay compliant, follow the best practices below in your analytics setup.
Data Redaction
Google Analytics 4 can redact two kinds of sensitive information that may appear in your data: email addresses and URL query parameters.
Redact Email Data
Email redaction is enabled if there is a blue checkmark next to the email feature. If there is not, click the slider to enable it and then click the “Save” button.
Now, Google Analytics 4 (GA4) will identify email addresses in all event parameters using common text patterns and automatically redact this information.
One thing to note is that data may be falsely identified as an email address simply because it follows the typical pattern of one.
For example, any text containing “@mydomain.com” will be removed from the data, whether or not it is an email address.
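To see why pattern-based redaction produces false positives, here is a small Python illustration. It is not GA4’s implementation: the EMAIL_LIKE pattern, the redact_email_like helper, and the “(redacted)” placeholder are all assumptions made for this sketch.

```python
import re

# An assumed, simplified "looks like an email" pattern. GA4's actual
# detection rules may differ; this only illustrates the general idea.
EMAIL_LIKE = re.compile(r"[A-Za-z0-9_.+-]*@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def redact_email_like(value, placeholder="(redacted)"):
    """Replace every email-like substring with a placeholder."""
    return EMAIL_LIKE.sub(placeholder, value)


# A real email address is redacted, as intended.
print(redact_email_like("contact=jane@mydomain.com"))
# -> contact=(redacted)

# A social handle that merely looks like an email is redacted too.
print(redact_email_like("Follow us @mydomain.com for updates"))
# -> Follow us (redacted) for updates
```

The second call shows the trade-off: a pattern loose enough to catch most emails will also catch some text that is not an email at all.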
Redact URL Query Parameters
Query parameters are additional details in page URLs that follow a ‘?’ and are separated by ‘&’ symbols. These parameters can unintentionally collect personally identifiable information (PII), such as an email address, first name, or last name.
- website.com?email=analyzify@gmail.com
- website.com?firstname=analyzify&lastname=test
Here, type each query parameter you would like to have redacted (such as the email, firstname, and lastname parameters shown above), press Enter on your keyboard, and then click the “Save” button.
There is a feature at the bottom of the page that allows you to test what happens when you select certain query parameters to be redacted.
Enter a page URL containing the query parameters you want to exclude and click “Preview redacted data”.
You can see how the data will appear after it has been redacted.
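If you want to reason about the result outside the GA4 interface, or clean URLs yourself before they are sent, the same idea can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the redact_query_params helper and the “(redacted)” placeholder are hypothetical, and the exact output GA4 produces may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def redact_query_params(url, params_to_redact, placeholder="(redacted)"):
    """Return the URL with the values of the listed query parameters replaced."""
    names = {name.lower() for name in params_to_redact}
    parts = urlsplit(url)
    # Keep parameters with empty values so the URL shape is preserved.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    redacted = [
        (key, placeholder if key.lower() in names else value)
        for key, value in pairs
    ]
    # safe="()" keeps the parentheses of the placeholder readable.
    return urlunsplit(parts._replace(query=urlencode(redacted, safe="()")))


url = "https://website.com/?firstname=analyzify&lastname=test&page=2"
print(redact_query_params(url, ["firstname", "lastname"]))
# -> https://website.com/?firstname=(redacted)&lastname=(redacted)&page=2
```

Only the values of the listed parameters change. Everything else in the URL, including unrelated parameters such as page, is left intact.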
To Sum Up
Ultimately, one key principle is often overlooked when analyzing user data: respect users’ privacy in how you collect and store their data.
So, are you ready to start cleaning up your Google Analytics 4 (GA4)?