Some duplicate (user_id, event_id) pairs can be expected in the all_events table, as long as the event_table_name value is different for the rows that share an event ID.
If you have two event definitions that overlap, it’s possible for a single event with a unique event ID to qualify for multiple event definitions.
Let’s say you have two click event definitions: one that captures every click on your platform and one that captures clicks on a specific element. A click event that qualifies for the second definition will automatically qualify for the first, because the first definition captures any click.
If both event definitions are synced to Heap Connect, they will each have their own event table downstream, and a single event that qualifies for both definitions will appear in both tables with the same event ID. An event will always have just one event ID, even if it qualifies for multiple event definitions (and, by extension, is included in multiple event tables in your data warehouse).
The all_events table is a UNION of every individual event table that has been synced to your data warehouse. If both individual event tables include the same event, then the all_events table will show duplicate event IDs. However, these two rows will have different event_table_name values.
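The behavior described above can be reproduced in a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a data warehouse. The table names all_clicks and signup_button_clicks and the sample rows are hypothetical, and the view below only approximates all_events with the three columns discussed here.

```python
import sqlite3

# In-memory database standing in for the warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical event tables: one per synced event definition.
    CREATE TABLE all_clicks (user_id INTEGER, event_id INTEGER);
    CREATE TABLE signup_button_clicks (user_id INTEGER, event_id INTEGER);

    -- Event 101 is a click on the specific element, so it matches both definitions.
    INSERT INTO all_clicks VALUES (1, 100), (1, 101);
    INSERT INTO signup_button_clicks VALUES (1, 101);

    -- Simplified stand-in for all_events: a UNION of the individual event tables.
    CREATE VIEW all_events AS
        SELECT user_id, event_id, 'all_clicks' AS event_table_name
        FROM all_clicks
        UNION
        SELECT user_id, event_id, 'signup_button_clicks' AS event_table_name
        FROM signup_button_clicks;
""")

# Find (user_id, event_id) pairs that appear on more than one row,
# and count how many distinct event tables they come from.
duplicates = conn.execute("""
    SELECT user_id, event_id,
           COUNT(*) AS row_count,
           COUNT(DISTINCT event_table_name) AS table_count
    FROM all_events
    GROUP BY user_id, event_id
    HAVING COUNT(*) > 1
""").fetchall()

print(duplicates)  # [(1, 101, 2, 2)]
```

When row_count equals table_count, as it does here, every duplicate row comes from a different event table, which is the expected kind of duplication.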
The primary key for all_events should be a composite of user_id, event_id, and event_table_name. In individual event tables, a composite of user_id and event_id should suffice.
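One way to check a candidate key is to count how many key values appear on more than one row. The sketch below again uses sqlite3 as a stand-in warehouse; the helper count_key_violations, the event table names, and the sample rows are hypothetical and only illustrate the idea.

```python
import sqlite3

def count_key_violations(conn, table, key_columns):
    """Return how many distinct key values appear on more than one row of `table`."""
    # Identifiers are interpolated directly, so pass only trusted table/column names.
    cols = ", ".join(key_columns)
    query = f"""
        SELECT COUNT(*) FROM (
            SELECT {cols} FROM {table}
            GROUP BY {cols}
            HAVING COUNT(*) > 1
        )
    """
    return conn.execute(query).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-in for all_events with hypothetical sample rows.
    CREATE TABLE all_events (user_id INTEGER, event_id INTEGER, event_table_name TEXT);
    INSERT INTO all_events VALUES
        (1, 100, 'all_clicks'),
        (1, 101, 'all_clicks'),
        (1, 101, 'signup_button_clicks');
""")

# (user_id, event_id) alone is not unique in all_events...
two_col = count_key_violations(conn, "all_events", ["user_id", "event_id"])
# ...but adding event_table_name makes every row unique.
three_col = count_key_violations(
    conn, "all_events", ["user_id", "event_id", "event_table_name"]
)

print(two_col, three_col)  # 1 0
```

A non-zero result for the three-column key would point to duplication that overlapping event definitions do not explain.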