Because of Heap’s data model, there is currently no fully accurate way to analyze new vs. existing users within Heap. However, the examples below show analyses you can run to understand how many new users you have within a given time range.
In Heap, new and existing users are defined as follows:
- New: A user whose first event / initial touch (such as their first pageview) was within that time frame (day, week, month, etc.).
- Existing: A user who had more than one Session during that time frame.
Via the New Users Segment
We have a default segment called New Users, which you can graph to compare new and existing user activity over a specific time range. To use this segment, complete the following steps:
1. Navigate to Analyze > Graph and select Number of Users > Users who are in segment > Size of Segment > New Users.
2. Next, turn this into a multigraph by clicking the + Add Graph button below the graph, then selecting Count Unique > Session.
3. Click Run Query to generate a line graph with your new and existing users.
This graph allows you to see the total number of unique users who had a visit and the number of those users who were “new”. You can subtract your new users from your unique users to get the total count of returning users.
If you graph your new users immediately after installing Heap, your New Users count will be inflated, because Heap is seeing all of these users for the first time.
You might be tempted to run this analysis with a group by on Count of Sessions at any time. However, behavioral group bys cover all time and cannot be restricted to a particular time range.
Via Heap Connect
If you’re using Heap Connect, you can use the following SQL query to conduct true new vs. existing user analysis:
-- CTE to determine if a given session_id is a new or returning visit
WITH new_vs_returning AS (
  SELECT
    user_id,
    session_id,
    CASE
      WHEN row_number() OVER (PARTITION BY user_id ORDER BY time) = 1 THEN true
      ELSE false
    END AS new_visit
  FROM heap_production.sessions
)
-- your query, including the new_visit column from the CTE
SELECT
  time::date,
  p.user_id,
  p.session_id,
  path,
  nr.new_visit
FROM heap_production.pageviews p
-- join the CTE as a segment
INNER JOIN new_vs_returning nr USING (user_id, session_id)
ORDER BY "time"
LIMIT 50
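If you want daily totals rather than a per-pageview flag, the same idea can be rolled up into counts. The query below is a minimal sketch, not an official Heap query: it assumes the same heap_production.sessions table with user_id and time columns and a Redshift/Postgres-style warehouse, and the names sessions_with_first_seen, session_date, first_seen_date, new_users, and returning_users are illustrative. It treats a user as new on the date of their earliest synced session and as returning on any later date.

```sql
-- Sketch: daily counts of new vs. returning users.
-- Assumes heap_production.sessions has user_id and time columns;
-- all CTE and column aliases below are illustrative names.
WITH sessions_with_first_seen AS (
  SELECT
    user_id,
    CAST("time" AS date) AS session_date,
    -- date of the user's earliest session in the synced data
    CAST(MIN("time") OVER (PARTITION BY user_id) AS date) AS first_seen_date
  FROM heap_production.sessions
)
SELECT
  session_date,
  -- users whose first session falls on this date
  COUNT(DISTINCT CASE WHEN session_date = first_seen_date THEN user_id END) AS new_users,
  -- users active on this date who were first seen on an earlier date
  COUNT(DISTINCT CASE WHEN session_date > first_seen_date THEN user_id END) AS returning_users
FROM sessions_with_first_seen
GROUP BY session_date
ORDER BY session_date
```

Because each user has a single first_seen_date, a user is counted in exactly one of the two columns on any given day. As with the New Users segment, counts for the days immediately after installing Heap will skew toward new users, since earlier sessions are not in the data.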