Overview
If your organization uses VPC Service Controls to secure your Google Cloud environment, you'll need to configure additional settings to enable Heap's BigQuery data sync.
By default, Heap syncs data to BigQuery by staging files in our Google Cloud Storage bucket (gs://heap-bigquery) before loading them into your BigQuery dataset. However, VPC Service Controls block access from resources outside your service perimeter, so loads from Heap's bucket into a protected BigQuery dataset are rejected. The solution is to use a staging bucket within your own Google Cloud project, connected to your BigQuery dataset via a VPC Service Controls perimeter bridge.
Requirements
To set up BigQuery sync with VPC Service Controls, you'll need to create:
- GCS Staging Bucket - A Google Cloud Storage bucket in your project that Heap can write to
- Perimeter Bridge - A VPC Service Controls configuration that allows data to flow from your staging bucket to your BigQuery dataset
Setup Instructions
Step 1: Create a Staging Bucket
Create a Google Cloud Storage bucket in your project that allows Heap to write data from outside your VPC perimeter. This bucket will temporarily store data files before they're loaded into BigQuery.
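As a rough illustration of this step, the sketch below creates a bucket and grants one principal write access using the google-cloud-storage Python client. The project ID, bucket name, location, writer identity, and the roles/storage.objectAdmin role are all placeholder assumptions, not values from Heap; confirm the actual identity and permissions with Heap Support before granting anything.

```python
from typing import Any

# Hypothetical placeholders: replace with your own values and with the
# writer identity that Heap Support confirms for your account.
PROJECT_ID = "my-gcp-project"
BUCKET_NAME = "my-heap-staging-bucket"
LOCATION = "US"
HEAP_WRITER_MEMBER = "serviceAccount:HEAP_WRITER_IDENTITY"
WRITER_ROLE = "roles/storage.objectAdmin"  # assumed role, not confirmed by Heap


def writer_binding(member: str, role: str = WRITER_ROLE) -> dict[str, Any]:
    """Build an IAM binding that grants `role` on the bucket to one member."""
    return {"role": role, "members": {member}}


def create_staging_bucket() -> None:
    """Create the staging bucket and attach the writer binding.

    Requires `pip install google-cloud-storage` and application default
    credentials with permission to create buckets in PROJECT_ID.
    """
    from google.cloud import storage  # imported lazily so the helper above works without it

    client = storage.Client(project=PROJECT_ID)
    bucket = client.create_bucket(BUCKET_NAME, location=LOCATION)

    # Read the current policy, add the binding, and write it back.
    policy = bucket.get_iam_policy(requested_policy_version=3)
    policy.bindings.append(writer_binding(HEAP_WRITER_MEMBER))
    bucket.set_iam_policy(policy)
```

Calling create_staging_bucket() once is enough; the same result can be reached through the Google Cloud console if you prefer not to script it.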
Step 2: Configure a Perimeter Bridge
Set up a VPC Service Controls perimeter bridge to connect your staging bucket to your BigQuery dataset. This bridge creates a secure pathway for data to move from the staging bucket into your protected BigQuery environment.
To learn more, see Google Cloud's article about sharing across perimeters with bridges.
Note: Perimeter bridges can take up to 24 hours to become fully operational after creation.
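For reference, a perimeter bridge is represented in the Access Context Manager API as a service perimeter whose type is PERIMETER_TYPE_BRIDGE and whose resources are whole projects, listed by project number. The sketch below only builds that request body so you can see its shape; the policy ID, bridge name, and project numbers are hypothetical, and it assumes the staging bucket and the BigQuery dataset live in two separate projects.

```python
import json


def bridge_perimeter_body(policy_id: str, bridge_name: str, project_numbers: list[str]) -> dict:
    """Return a ServicePerimeter body describing a bridge between projects."""
    return {
        "name": f"accessPolicies/{policy_id}/servicePerimeters/{bridge_name}",
        "title": bridge_name,
        "perimeterType": "PERIMETER_TYPE_BRIDGE",
        # Bridges list projects (by number), not individual buckets or datasets.
        "status": {"resources": [f"projects/{n}" for n in project_numbers]},
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    body = bridge_perimeter_body(
        policy_id="000000000000",
        bridge_name="heap_staging_bridge",
        project_numbers=["111111111111", "222222222222"],
    )
    print(json.dumps(body, indent=2))
```

Because the bridge operates at the project level, it helps to note the project numbers for both the staging bucket and the BigQuery dataset before you start.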
Step 3: Contact Heap Support
Once your staging bucket and perimeter bridge are configured, contact Heap Support with the following information:
- Your staging bucket name and location
- Your Heap Environment
- Your BigQuery dataset information
Our team will configure Heap to use your staging bucket for data sync.
Step 4: Test Your Configuration
After Heap completes the configuration on our side, we'll run a test sync to verify everything is working correctly. If any issues arise, we'll work with you to review your VPC Service Controls audit logs.
Troubleshooting
Identifying VPC Service Controls Issues
If the BigQuery sync fails due to VPC Service Controls configuration, you may see an error similar to:
Error loading batch into BigQuery
message=VPC Service Controls: Request is prohibited by organization's policy
This error indicates that your VPC Service Controls policies are blocking the data transfer. The most common causes are:
- The perimeter bridge is not fully configured
- The staging bucket is not properly included in the perimeter bridge
- Heap's service account (heap-204122) is not authorized in your VPC policies
- The perimeter bridge has not finished provisioning (can take up to 24 hours)
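If you collect sync errors in your own tooling, a simple text match can separate VPC Service Controls denials from other load failures. The helper below is a minimal sketch: the function name is hypothetical, and the pattern is based only on the example message shown above, so the exact wording of real errors may differ.

```python
import re

# Pattern taken from the example error message above; real messages may vary.
VPC_SC_DENIAL = re.compile(
    r"VPC Service Controls:\s*Request is prohibited by organization's policy",
    re.IGNORECASE,
)


def is_vpc_sc_denial(message: str) -> bool:
    """Return True if the error text looks like a VPC Service Controls denial."""
    return bool(VPC_SC_DENIAL.search(message))


if __name__ == "__main__":
    sample = (
        "Error loading batch into BigQuery\n"
        "message=VPC Service Controls: Request is prohibited by organization's policy"
    )
    print(is_vpc_sc_denial(sample))
```

A match only tells you that a policy blocked the request; the audit logs are still needed to find out which policy and which resource were involved.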
Reviewing Audit Logs
VPC Service Controls audit logs can help identify the specific policy blocking the sync. You can access these logs through Google Cloud's Logs Explorer.
To review VPC Service Controls logs:
- Navigate to Google Cloud's Logs Explorer in your project
- Filter by the Audited Resource resource type to view relevant VPC Service Controls events
- Look for messages indicating policy violations or access denials
- Reference Google's troubleshooting guide to interpret and resolve common issues
For more information on debugging VPC Service Controls, see Google's troubleshooting guide.
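The filtering step above can also be written as a Logs Explorer query. The sketch below assembles one possible query string: it assumes the events carry the audited_resource resource type and VpcServiceControlAuditMetadata in their payload metadata, and the function name and default 24-hour window are illustrative choices. Adjust the clauses if your logs look different.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def vpc_sc_log_filter(hours: int = 24, now: Optional[datetime] = None) -> str:
    """Build a Logs Explorer query for recent VPC Service Controls events."""
    now = now or datetime.now(timezone.utc)
    since = (now - timedelta(hours=hours)).strftime("%Y-%m-%dT%H:%M:%SZ")
    clauses = [
        'resource.type="audited_resource"',
        'protoPayload.metadata.@type:"VpcServiceControlAuditMetadata"',
        f'timestamp>="{since}"',
    ]
    # Logs Explorer treats clauses on separate lines as joined by AND.
    return "\n".join(clauses)


if __name__ == "__main__":
    print(vpc_sc_log_filter())
```

Paste the printed query into the Logs Explorer query box, then narrow it further by time range or by the staging bucket name if the result set is large.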
Getting Help
If you need assistance interpreting the logs or resolving any issues, please contact Heap Support with:
- The error message you're encountering
- Relevant log entries from your VPC Service Controls audit logs
- Your staging bucket and BigQuery dataset configuration details