Tracking AI / LLM Web Traffic To Your Website

with Benj Arriola, Senior SEO Director at 85SIXTY

Not so long ago, if you said “digital marketing,” you’d hear buzzwords like SEO, Google Ads, social campaigns, and email blasts. These channels built the foundation for how brands reach their audience online. They are all still important, but in the past year Large Language Model (LLM) assistants, or AI tools as many just call them, with ChatGPT probably the most popular right now, have started to be viewed as a growing digital marketing channel in their own right.

Today, millions of people ask ChatGPT for recommendations, search Perplexity for products, consult Gemini for advice before ever visiting a website, or use Claude for vibe coding web apps, phone apps, or any other coding need. More and more, these AI platforms act as digital gatekeepers, filtering, summarizing, and even recommending businesses before users hit those traditional marketing touchpoints.

For marketers, this shift is both a challenge and an opportunity. Suddenly, the path a customer takes to reach your brand might start with a chatbot or AI-powered search assistant instead of a Google search or Instagram post. Companies who spot these new journeys are asking: Can we measure that? Should we treat AI referrals like social, search, or something totally fresh?

This is the era of AI as a marketing channel, a space where visibility and engagement depend on whether your brand is “discovered” and recommended by intelligent bots. It’s a new world, and as always, those who adapt early will set the pace for what’s next in digital marketing.

Web Analytics Don't Track AI as a Separate Channel

Web analytics tools, like Google Analytics or Adobe Analytics, love putting your traffic into neat little buckets: organic search, paid ads, social, email, and so on. These channels help marketers see where their visitors are coming from and what’s driving conversions. But AI platforms and LLM assistants haven’t landed their own seat at the table yet.

Despite how fast AI-powered assistants like ChatGPT or Perplexity are taking off, most, if not all, analytics dashboards don’t label this traffic under an “AI” or “LLM” channel. If someone visits your site after getting a link from an AI assistant, the visit will probably be filed under “Referral” in your software. However, I have also seen these AI platforms appear under “Organic Search” and “Direct” for some reason. Basically, there’s no default grouping that marks out these new, AI-driven visits separately.

For businesses and digital strategists, this blind spot presents a real problem. How can you measure the rising influence of AI tools in your marketing stack if your analytics stack can’t even recognize them? You end up with murky data that is hard to analyze and harder still to act on with confidence.

Until analytics providers catch up, you have to roll up your sleeves and get creative if you want true visibility into your AI-generated website visits. This means thinking outside the usual channel groupings, and using custom filters and segments to tease out what standard reporting doesn’t show. And as AI platforms become more important to customer journeys, this tracking gap will be front and center for anyone serious about data-driven marketing.

Tracking AI Assistants in Different GA4 Reporting Views

With so many eyeballs starting their web journey on an AI assistant, you might open Google Analytics 4 and look for an “AI Platform” tab, but it doesn’t exist. GA4 doesn’t call out visits from ChatGPT, Perplexity, or Gemini in the stock reporting dashboards. They quietly blend in with all the other sources, but they are at least saved within the source dimension.

If someone clicks a link inside ChatGPT or another large language model platform, there’s often a referral like “chat.openai.com” or “www.perplexity.ai” tagged along with the visit. GA4 captures this but not with any special highlight or grouping. These visits show up under the standard “Traffic acquisition” reports, generally as part of “Referral.”

To keep an eye on AI assistant traffic, you need to use GA4’s built-in filters or create segments. For example, in the default reports, you can apply a filter on the “Session source” dimension. This picks out pageviews or events where the referrer matches known AI sources, letting you measure their contribution.

It isn’t fancy, but it’s a workable baseline. As AI-generated journeys shape up to be a major force in marketing, knowing how to tease out these referrals in GA4’s default views means you won’t miss the first waves of this traffic, important intelligence for anyone aiming to stay ahead in the digital game.

With multiple flavors of AI assistants and new ones popping up, a comprehensive filter can be built with regex. Below are the different regex filters we use for different situations when tracking AI/LLM traffic within web analytics, starting with the stricter, include-only pattern:

    
     (?i)^.*(gpt|openai|gemini|google.*bard|bard.*google|copilot|perplexity|edge\s*services|deepseek|grok|meta\.ai|claude|writesonic|neeva|nimble).*$|\.ai$|^ai[abcefghijklnopqstuvwxyz.-].*|.*\.?ai\..*
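Before pasting the pattern into a report, it can help to try it on a few source values. The following is a minimal Python sketch, not a GA4 feature: the sample sources are made up for illustration, and Python's `re` engine may not behave exactly like the regex engine in your analytics tool. It uses `re.search`, which behaves like a "contains"-style match; a filter that requires the whole value to match could treat the unanchored `\.ai$` branch differently.

```python
import re

# Strict include-only pattern, copied from the article.
STRICT_INCLUDE = re.compile(
    r"(?i)^.*(gpt|openai|gemini|google.*bard|bard.*google|copilot|perplexity"
    r"|edge\s*services|deepseek|grok|meta\.ai|claude|writesonic|neeva|nimble).*$"
    r"|\.ai$|^ai[abcefghijklnopqstuvwxyz.-].*|.*\.?ai\..*"
)

# Hypothetical "Session source" values, for illustration only.
SAMPLE_SOURCES = [
    "chat.openai.com",
    "www.perplexity.ai",
    "character.ai",
    "google",
    "mail.yahoo.com",
    "airbnb.com",
]


def matches_strict(source: str) -> bool:
    """True when the pattern is found anywhere in the source value."""
    return STRICT_INCLUDE.search(source) is not None


if __name__ == "__main__":
    for source in SAMPLE_SOURCES:
        print(f"{source:<20} {matches_strict(source)}")
```

In this sketch the first three sources match and the last three do not. The `airbnb.com` case shows why the character class after `^ai` leaves out the letter "r": without that gap, every source starting with "air" would be counted as AI traffic.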
    
   

If there is an option to add an exclude filter, then use the include and exclude filters below instead. The difference is that this include filter is less strict: it finds “ai” anywhere in the referral source, so it tends to capture new AI platforms that decide to use “ai” within their domain name. But since many dictionary words and proper nouns contain “ai”, we pair it with an exclude regex filter.

    
     (?i)^.*(gpt|openai|gemini|google.*bard|bard.*google|copilot|perplexity|edge\s*services|deepseek|grok|meta\.ai|claude|writesonic|neeva|nimble).*$|.*ai.*
    
   

And use that in combination with this exclude filter:

    
     (?i)^.*(campaign|air|mail|chain|baidu|fair|laid|sail|gain|main|lain|rain|pain|sain|tail|tain|rail|hair|nail|paid|wait|waist|gait|raid|maid|faint|claim|pair|stair|hail|fail|bail|jail|pail|avail|daily|daisy|dairy|kuwait|haiti|taiwan|hainan|jaipur|jamaica|faisalabad|cain|lair|craig|isaiah|kait|zain|sai|shai|maia|gail|jaime|haines|tait|nair|jain).*
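How the two filters work together can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `is_ai_source` and the sample sources are hypothetical, and the sketch uses Python's `re.search` rather than any analytics tool's own matching. A source is kept only when the relaxed include pattern matches and the exclude pattern does not.

```python
import re

AI_KEYWORDS = (
    r"gpt|openai|gemini|google.*bard|bard.*google|copilot|perplexity"
    r"|edge\s*services|deepseek|grok|meta\.ai|claude|writesonic|neeva|nimble"
)
# Relaxed include: any known keyword, or "ai" anywhere in the source.
INCLUDE = re.compile(rf"(?i)^.*({AI_KEYWORDS}).*$|.*ai.*")

# Exclude list from the article: common words and names that contain "ai".
EXCLUDE_WORDS = (
    "campaign|air|mail|chain|baidu|fair|laid|sail|gain|main|lain|rain|pain|sain"
    "|tail|tain|rail|hair|nail|paid|wait|waist|gait|raid|maid|faint|claim|pair"
    "|stair|hail|fail|bail|jail|pail|avail|daily|daisy|dairy|kuwait|haiti|taiwan"
    "|hainan|jaipur|jamaica|faisalabad|cain|lair|craig|isaiah|kait|zain|sai|shai"
    "|maia|gail|jaime|haines|tait|nair|jain"
)
EXCLUDE = re.compile(rf"(?i)^.*({EXCLUDE_WORDS}).*")


def is_ai_source(source: str) -> bool:
    """Keep a source only if the include filter matches and the exclude filter does not."""
    return bool(INCLUDE.search(source)) and not EXCLUDE.search(source)


if __name__ == "__main__":
    # Hypothetical source values, for illustration only.
    for source in ["chatgpt.com", "character.ai", "mail.google.com", "baidu.com", "bing.com"]:
        print(f"{source:<18} {is_ai_source(source)}")
```

Here `mail.google.com` and `baidu.com` pass the relaxed include filter because they contain “ai”, and are then dropped by the exclude filter. Note that in this sketch the exclude filter always wins, so a source that contains both an AI keyword and an excluded word is dropped as well.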

    
   

Setting Up a Report Filter for Default Reports in GA4

Google Analytics 4 (GA4), the latest version of Google Analytics, has many reporting views, and almost any of them lets you add filters.

In the video, you’ll see I wasn’t able to add an exclude regex filter, since GA4 would not accept a second filter on the same dimension. In this case, we will only use the first regex filter.

    
     (?i)^.*(gpt|openai|gemini|google.*bard|bard.*google|copilot|perplexity|edge\s*services|deepseek|grok|meta\.ai|claude|writesonic|neeva|nimble).*$|\.ai$|^ai[abcefghijklnopqstuvwxyz.-].*|.*\.?ai\..*
    
   

GA4 Freeform Explore Report Filter for AI Assistants

If the standard GA4 dashboards feel a bit stiff for tracking AI assistant-driven visitors, the Freeform Explore report steps in like the Swiss Army knife of data slicing. Here, you aren’t locked into canned views. Instead, you can build custom filters that hunt for traffic and conversions originating from the likes of ChatGPT, Perplexity, or Gemini.

To zero in, you want to use both “include” and “exclude” filters. The include filter will use “Session Source” and use this Regex filter:

    
     (?i)^.*(gpt|openai|gemini|google.*bard|bard.*google|copilot|perplexity|edge\s*services|deepseek|grok|meta\.ai|claude|writesonic|neeva|nimble).*$|.*ai.*
    
   

When the regex above is used, you may notice it is more relaxed: it picks up any source domain that has “ai” in it, even if it is not an AI platform, simply because many words contain “ai” naturally. This is why we need the “exclude” filter. And as shown in the video above, there is a length limit on the exclude regex field, so in this specific case, within the GA4 Explore report filters, we needed to split the regex into two separate filters. Here are the two exclude filters we used:

    
     (?i)^.*(campaign|air|mail|chain|baidu|fair|laid|sail|gain|main|lain|rain|pain|sain|tail|tain|rail|hair|nail|paid|wait|waist|gait|raid|maid|faint|claim|pair|stair|hail|fail|bail|jail|pail|avail|daily|daisy|dairy|kuwait|haiti|taiwan|hainan|jaipur).*

    
   

The first exclude regex above ends with “jaipur”, and the second one below picks up from “jamaica”.

    
     (?i)^.*(jamaica|faisalabad|cain|lair|craig|isaiah|kait|zain|sai|shai|maia|gail|jaime|haines|tait|nair|jain).*

    
   

And if you are using a different analytics platform that limits the length of each regex but lets you add multiple regex filters, adjust the cutoff to fit the limits of your analytics software.
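Splitting by hand gets tedious when the limit changes, so the cutoff can be computed instead. Below is a minimal Python sketch of one way to do it. The helper `split_alternation` is hypothetical, and `MAX_LEN = 200` is a placeholder, not a documented limit of GA4 or any other tool; replace it with whatever your platform actually accepts.

```python
def split_alternation(words, max_len, prefix="(?i)^.*(", suffix=").*"):
    """Greedily pack words, in order, into patterns of at most max_len characters.

    A single word that is too long on its own still gets its own pattern,
    so check the result if your list contains very long entries.
    """
    patterns = []
    current = []
    for word in words:
        candidate = prefix + "|".join(current + [word]) + suffix
        if current and len(candidate) > max_len:
            # Adding this word would exceed the limit: close the current pattern.
            patterns.append(prefix + "|".join(current) + suffix)
            current = [word]
        else:
            current.append(word)
    if current:
        patterns.append(prefix + "|".join(current) + suffix)
    return patterns


# Exclude words from the article, in their original order.
EXCLUDE_WORDS = (
    "campaign|air|mail|chain|baidu|fair|laid|sail|gain|main|lain|rain|pain|sain"
    "|tail|tain|rail|hair|nail|paid|wait|waist|gait|raid|maid|faint|claim|pair"
    "|stair|hail|fail|bail|jail|pail|avail|daily|daisy|dairy|kuwait|haiti|taiwan"
    "|hainan|jaipur|jamaica|faisalabad|cain|lair|craig|isaiah|kait|zain|sai|shai"
    "|maia|gail|jaime|haines|tait|nair|jain"
).split("|")

MAX_LEN = 200  # placeholder limit, an assumption for this example

if __name__ == "__main__":
    for pattern in split_alternation(EXCLUDE_WORDS, MAX_LEN):
        print(len(pattern), pattern)
```

Each printed pattern can then go into its own exclude filter. Because every chunk keeps the same prefix and suffix, the chunks together exclude the same sources as the single long pattern.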

Setting Up a Custom Channel for AI/LLM Platforms in GA4 to Compare Against Other Marketing Channels

If you’re tired of forcing AI assistant traffic into awkward “Referral” or “Other” buckets, GA4 allows you to set up a custom channel grouping just for AI/LLM platforms. This is where you start getting true apples-to-apples comparisons against paid search, organic, or email traffic, and it’s a must for any data-driven marketer chasing the next big channel.

The process is pretty straightforward: in GA4 Admin, create a new channel group and give it a name, maybe “AI Platforms” or “LLM Referrals” or whatever is meaningful for you. Next, specify the rules that funnel known AI referrer domains into this new channel. It will be very similar to the previous task, still using Session Source. But this time, the whole exclude regex fits in a single rule. You will also notice that the condition is not labeled “Exclude”; it says “does not match regex” instead. Additionally, for some reason, the channel settings would NOT accept the case-insensitive flag (?i) in the include rule, although it works in the exclude rule, so we adjusted the rules accordingly. The include regex is:

    
     ^.*(gpt|openai|gemini|google.*bard|bard.*google|copilot|perplexity|edge\s*services|deepseek|grok|meta\.ai|claude|writesonic|neeva|nimble).*$|\.ai$|^ai[abcefghijklnopqstuvwxyz.-].*|.*\.?ai\..*
    
   

And the “does not match regex” rule is:

    
     (?i)^.*(campaign|air|mail|chain|baidu|fair|laid|sail|gain|main|lain|rain|pain|sain|tail|tain|rail|hair|nail|paid|wait|waist|gait|raid|maid|faint|claim|pair|stair|hail|fail|bail|jail|pail|avail|daily|daisy|dairy|kuwait|haiti|taiwan|hainan|jaipur|jamaica|faisalabad|cain|lair|craig|isaiah|kait|zain|sai|shai|maia|gail|jaime|haines|tait|nair|jain).*
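What dropping the `(?i)` flag from the include rule changes can be seen in a short Python sketch. This is an illustration only: the mixed-case source values are hypothetical (for example, a hand-typed campaign tag), and Python's `re` may not behave exactly like the channel-group matcher. Referrer hostnames are normally lowercase, so in practice the difference mostly matters for manually tagged sources.

```python
import re

# Include pattern for the custom channel, without the (?i) flag.
BODY = (
    r"^.*(gpt|openai|gemini|google.*bard|bard.*google|copilot|perplexity"
    r"|edge\s*services|deepseek|grok|meta\.ai|claude|writesonic|neeva|nimble).*$"
    r"|\.ai$|^ai[abcefghijklnopqstuvwxyz.-].*|.*\.?ai\..*"
)
CASE_SENSITIVE = re.compile(BODY)
CASE_INSENSITIVE = re.compile("(?i)" + BODY)


def compare(source: str) -> tuple[bool, bool]:
    """Return (match without the flag, match with the flag) for one source value."""
    return (
        CASE_SENSITIVE.search(source) is not None,
        CASE_INSENSITIVE.search(source) is not None,
    )


if __name__ == "__main__":
    # Hypothetical source values, for illustration only.
    for source in ["chatgpt.com", "ChatGPT", "Perplexity.AI"]:
        print(f"{source:<15} {compare(source)}")
```

In this sketch the lowercase source matches either way, while the two mixed-case values match only when the flag is present. That is a reason to double-check a test report for capitalized source values after creating the channel.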

    
   

Going Beyond GA4, Using Regex in Looker Studio (formerly Data Studio) Filters to Report on AI/LLM Traffic and Conversions

For analysts who want more freedom or prettier data displays, Google Looker Studio (formerly Data Studio) comes to the rescue. It lets you pull in your GA4 data and set up custom filters using regex, allowing you to slice out exactly the AI/LLM platform traffic and conversions you want to see, right in interactive charts and tables. In the charts within Looker Studio, simply add two filters, one include and one exclude, applied on the Session Source. In the different GA4 reports we needed to modify the filters a bit, but here in Looker Studio you can use the main template regex as is.

Include Filter in Looker Studio for AI / LLM Platforms:

    
     (?i)^.*(gpt|openai|gemini|google.*bard|bard.*google|copilot|perplexity|edge\s*services|deepseek|grok|meta\.ai|claude|writesonic|neeva|nimble).*$|.*ai.*
    
   

Exclude Filter in Looker Studio for AI / LLM Platforms:

    
     (?i)^.*(campaign|air|mail|chain|baidu|fair|laid|sail|gain|main|lain|rain|pain|sain|tail|tain|rail|hair|nail|paid|wait|waist|gait|raid|maid|faint|claim|pair|stair|hail|fail|bail|jail|pail|avail|daily|daisy|dairy|kuwait|haiti|taiwan|hainan|jaipur|jamaica|faisalabad|cain|lair|craig|isaiah|kait|zain|sai|shai|maia|gail|jaime|haines|tait|nair|jain).*

    
   

The video below shows where this is done within Looker Studio. This video is relatively short, so it also includes how to set this up in Adobe Analytics.

Setting Up Segments in Adobe Analytics to Filter AI/LLM Visits and Conversions

For companies running Adobe Analytics, isolating AI and LLM-specific traffic isn’t rocket science; it’s just a matter of building the right segments. Since different analytics platforms use different terminology and different definitions of dimensions and metrics, we did not use Source here in Adobe Analytics. Instead, we used Referring Domain. The challenge was that we were not successful in using any regex in Adobe Analytics, and found it easier to use “Contains Any Of” for the include rules and “Does Not Contain Any Of” for the exclude rules. The sources are simply separated by spaces, which turns our regex string into a plain list of AI sources. This is what we used in the “Contains Any Of” filter:

    
     .ai ai. openai copilot chatgpt gemini gpt neeva writesonic nimble outrider perplexity bard edgeservices astastic copy.ai bnngpt deepseek grok metaai
    
   

And in the “Does Not Contain Any Of” exclude filter, we used:

    
     air avail baidu bail bailey brain cain campaign chain claim craig daily dairy daisy fail faint fair faisal gain gait hail hainan haines hair haiti isaiah jail jaime jain jaipur kait wait laid lain lair maia maid mail main nail nair paid pail pain pair raid rail rain sai sail sain shai stair tail tain tait taiwan waist zain
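The logic of these two rules is plain substring matching, which can be sketched in Python as follows. This is an approximation under assumptions, not Adobe's implementation: the function name `is_ai_referrer` and the sample domains are hypothetical, and lowercasing the domain before comparing is a choice made for this example.

```python
# Tokens from the "Contains Any Of" rule, split on spaces.
INCLUDE_TOKENS = (
    ".ai ai. openai copilot chatgpt gemini gpt neeva writesonic nimble outrider "
    "perplexity bard edgeservices astastic copy.ai bnngpt deepseek grok metaai"
).split()

# Tokens from the "Does Not Contain Any Of" rule, split on spaces.
EXCLUDE_TOKENS = (
    "air avail baidu bail bailey brain cain campaign chain claim craig daily dairy "
    "daisy fail faint fair faisal gain gait hail hainan haines hair haiti isaiah "
    "jail jaime jain jaipur kait wait laid lain lair maia maid mail main nail nair "
    "paid pail pain pair raid rail rain sai sail sain shai stair tail tain tait "
    "taiwan waist zain"
).split()


def is_ai_referrer(domain: str) -> bool:
    """True when the domain contains an include token and no exclude token."""
    d = domain.lower()  # assumption: compare case-insensitively
    included = any(token in d for token in INCLUDE_TOKENS)
    excluded = any(token in d for token in EXCLUDE_TOKENS)
    return included and not excluded


if __name__ == "__main__":
    # Hypothetical referring domains, for illustration only.
    for domain in [
        "chatgpt.com",
        "www.perplexity.ai",
        "mail.google.com",
        "campaign.gemini.example",
    ]:
        print(f"{domain:<26} {is_ai_referrer(domain)}")
```

In this sketch the first two domains are kept. `mail.google.com` never matches an include token, and `campaign.gemini.example` matches “gemini” but is dropped because it also contains “campaign”.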
    
   

AI/LLM Tracking in Web Analytics

Tracking the impact of AI assistants and LLM platforms on your web traffic comes down to one unglamorous trick: filters and regex. Everything from GA4’s default reports to more polished platforms like Looker Studio or Adobe relies on identifying AI sources with smart filtering; there is no magic “AI Channel” just yet.

In GA4’s standard dashboards, the main route is to filter by session source, typically using an include-only Regex on known AI domains. This approach gives some visibility but is basic compared to custom reporting tools.

GA4 Freeform Explore reports level things up, letting you chain together both include and exclude Regex filters. This means you can fine-tune what you see, exclude junk, or build deeper comparisons, though the character limit can be a hurdle if you track lots of platforms.

Custom channel configuration in GA4 helps permanently group AI/LLM assistants by building dedicated definitions. One limitation to note: case sensitivity quirks in the Regex engine can throw a wrench in global rules, so always double-check your patterns in test reports.

Outside GA4, Looker Studio streamlines reporting with widget-level regex filters, offering fast visuals and easier ad-hoc filtering, which is great for live dashboards. Adobe Analytics makes catching AI/LLM users a breeze via “Contains Any Of” filters, bypassing some of the technical fuss of regex altogether.

Altogether, while web analytics hasn’t caught up to AI referrals with native channels, these workarounds let you keep AI traffic and conversions in sight. The growth of AI as a marketing player means these tricks are quickly going from “nice to know” to “must have” in every analytics playbook.

TLDR Version - Regex for AI/LLM Sources in Web Analytics

If you need a Regex filter in your web analytics, generally this should work, and should be somewhat ready for future newer AI platform assistants that have “.ai” as a ccTLD, and as a subdomain:

    
     (?i)^.*(gpt|openai|gemini|google.*bard|bard.*google|copilot|perplexity|edge\s*services|deepseek|grok|meta\.ai|claude|writesonic|neeva|nimble).*$|\.ai$|^ai[abcefghijklnopqstuvwxyz.-].*|.*\.?ai\..*
    
   

But if you have the option to have both an include filter and an exclude filter then use these:

LLM/AI Platforms Traffic Source Include Filter:

    
     (?i)^.*(gpt|openai|gemini|google.*bard|bard.*google|copilot|perplexity|edge\s*services|deepseek|grok|meta\.ai|claude|writesonic|neeva|nimble).*$|.*ai.*
    
   

LLM/AI Platforms Traffic Source Exclude Filter:

    
     (?i)^.*(campaign|air|mail|chain|baidu|fair|laid|sail|gain|main|lain|rain|pain|sain|tail|tain|rail|hair|nail|paid|wait|waist|gait|raid|maid|faint|claim|pair|stair|hail|fail|bail|jail|pail|avail|daily|daisy|dairy|kuwait|haiti|taiwan|hainan|jaipur|jamaica|faisalabad|cain|lair|craig|isaiah|kait|zain|sai|shai|maia|gail|jaime|haines|tait|nair|jain).*
    
   

The version that uses both include and exclude is better because it will also capture any referring source that uses “ai” in it but exclude almost every common word that has “ai” like rain, campaign, and many other words and proper nouns. 

Always check the limitations of different web analytics tools, as some of the filters may not accept a pattern this long, and in that case you may need to split it up into more filters.

Will these AI / LLM Regex Filters Work Forever?

Most likely they will work for a long time. But as the industry comes up with new AI platforms, new tools with new domain names and new brands will appear, and one day one of these newer tools will probably not be captured by this regex pattern. We will continue to update the patterns along the way as much as we can, so you can bookmark this blog post. Eventually, the web analytics platforms themselves will probably start maintaining the list and offer a built-in channel. If there is a platform that you think should be added to the regex, let us know: tag me (Benj Arriola) on X or LinkedIn or anywhere you find me, and I’ll make sure we update this regex. Also note that some web analytics platforms limit the string length of a regex field, so if the pattern gets too long, we may need to split one regex into two in the future. If these do not work for your web analytics platform, let me know what the limitations are, and maybe we can modify these regex patterns for you, update this post, and give you a shout-out too.

If you would rather not deal with any of this and want an agency to help with all the tracking and reporting, as well as optimizing traffic from search engines and AI engines, 85SIXTY is here to help. Contact us.

85SIXTY is a data-driven digital marketing agency helping global brands optimize performance in SEO, AEO, GEO, and all other AI optimization acronyms the industry is currently using.

Written by

Senior Director SEO @ 85SIXTY
