Beta Feature
This feature is in beta. Please reach out to your Customer Success Manager or Sales if you’re interested in participating or if you have any feedback.
This integration is Destination only, meaning you can send data out of Heap but not send data into Heap.
Overview
Heap’s Databricks integration allows you to sync Heap data to Databricks to leverage Heap behavioral data in other tools.
Prerequisites
To connect this integration, you’ll need the following permissions:
- Admin or Architect privileges in Heap
- Access to an AWS-hosted Databricks account that uses the Unity Catalog
Setup
To get started, navigate to Integrations > Directory, search for Databricks, then select it where it appears.
You’ll be prompted to provide the following information:
- Hostname: The ID of your Databricks account, which you can find in the account URL.
- Path: The path of the warehouse you are connecting via this integration.
- Catalog (optional): The catalog that this Heap data should sync to; if left blank, this integration will create a new catalog.
- Schema (optional): The schema that this Heap data should sync to; if left blank, this integration will create a new schema.
- Token: This is required to allow Heap to write to the schema. The token must be a Personal Access Token (PAT) rather than an OAuth Token.
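For reference, a completed form typically looks like the following. These values are illustrative only; your warehouse's hostname and HTTP path appear on the warehouse's Connection details tab in Databricks, and personal access tokens begin with the `dapi` prefix:

```
Hostname: dbc-a1b2c3d4-e5f6.cloud.databricks.com
Path:     /sql/1.0/warehouses/1234567890abcdef
Catalog:  heap
Schema:   main
Token:    dapi************************
```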
Once all those fields are populated, click the Connect button.
That’s it! Once setup is complete, the first sync will arrive within 48 hours and will include the following built-in tables:
- Pageviews
- Sessions
- Users
- user_migrations
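Once the first sync lands, a quick way to confirm that data is flowing is to count rows in one of the built-in tables. The catalog and schema names below are hypothetical placeholders; substitute the ones you configured during setup:

```sql
-- Sanity check: confirm the synced users table is populated
-- (my_catalog and my_schema are placeholder names)
SELECT COUNT(*) AS user_count
FROM my_catalog.my_schema.users;
```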
You can create an all_events view in Databricks with a query like this one:
```sql
CREATE VIEW all_events AS
SELECT
  event_id,
  time,
  user_id,
  session_id,
  'test_event_table' AS event_table_name
FROM TEST_CATALOG.TEST_SCHEMA.TEST_EVENT_TABLE
UNION ALL
SELECT
  event_id,
  time,
  user_id,
  session_id,
  'click_event_table' AS event_table_name
FROM TEST_CATALOG.TEST_SCHEMA.CLICK_EVENT_TABLE
UNION ALL
SELECT
  event_id,
  time,
  user_id,
  session_id,
  'pageview_event_table' AS event_table_name
FROM TEST_CATALOG.TEST_SCHEMA.PAGEVIEW_EVENT_TABLE;
```
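With the view in place, you can query all event types together. For example, this sketch counts recent event volume per source table (it assumes the `time` column is a timestamp and that the view was created as `all_events`):

```sql
-- Event volume per source table over the last 7 days
SELECT
  event_table_name,
  COUNT(*) AS events
FROM all_events
WHERE time >= current_date() - INTERVAL 7 DAYS
GROUP BY event_table_name
ORDER BY events DESC;
```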
Limitations
Please note the following limitations for this integration:
- The All Events table is not synced to Databricks. As a workaround, you can create your own all_events view, as shown above.
- Defined properties syncing is not supported during beta.
- Segments syncing is not supported during beta.