BigQuery integration

While mabl provides many in-app dashboards for reviewing test output and activity, we recognize that your application may benefit from bespoke reports that are unique to your business. Using the mabl integration for BigQuery, you can extract, transform, and view such reports in real time in tools like BigQuery and Looker Studio.

With the mabl BigQuery integration, you can enable real-time custom reporting and make deeper analyses of testing patterns. This article provides instructions for setting up the mabl BigQuery integration and an overview of tables in the mabl BigQuery dataset.

📘

The BigQuery integration is available for users on Enterprise plans.

Setup

In order to use the mabl BigQuery integration, you must use a paid GCP BigQuery account.

The BigQuery Sandbox feature that is available to unpaid trial Google Cloud accounts cannot be used with the mabl BigQuery feed export because streaming inserts are not supported by the BigQuery Sandbox.

Create a service account for mabl

Add the service account [email protected] as a BigQuery Data Editor to the destination Google Cloud Platform project. Learn more.

📘

Alternative IAM permission setup

If you want to constrain the mabl [email protected] user to a greater extent, create a custom IAM role with the following permissions:

  • bigquery.datasets.create
  • bigquery.datasets.get
  • bigquery.tables.create
  • bigquery.tables.get
  • bigquery.tables.update
  • bigquery.tables.updateData

Assign the custom IAM role to the mabl service account on the target GCP project.

Configure the BigQuery integration

In the mabl app, visit Settings > Integrations and click on the + Setup button for the BigQuery integration.

On the Add BigQuery integration page, add the destination project ID, dataset name, and a prefix for the export table. Optionally, you may toggle on React personal information fields. Click save to confirm your settings.

Redact personal information fields

If you toggle on React personal information fields, mabl excludes the following fields from the BigQuery Export Feed:

  • trigger_user_email
  • actor_email

Mabl replaces the entries for these fields with the value PII_REDACTED on export. Other free text fields, such as plan_name or labels, are still included. Mabl does not inspect the content of these fields.

BigQuery table layout

The BigQuery integration provides a streaming export of mabl data. Upon running the first mabl test through the BigQuery Export Feed, the mabl_export dataset and following tables are created:

  • Plan runs: <table_prefix>_plan_run
  • Test runs: <table_prefix>_journey_run
  • Failure reasons: <table_prefix>_run_categorization
  • Performance test runs: <table_prefix>_performance_test_run
  • Activity feed: <table_prefix>_activity_feed

📘

Tests are referred to as "journeys" in the mabl BigQuery Export Feed.

For example, if you used the default names when setting up the BigQuery integration in mabl, the following BigQuery tables are created:

  • <your-project>:mabl_export.mabl_plan_run
  • <your-project>:mabl_export.mabl_journey_run
  • <your-project>:mabl_export.mabl_run_categorization
  • <your-project>:mabl_export.mabl_performance_test_run
  • <your-project>:mabl_export.mabl_activity_feed

Partitions

The tables are partitioned by day, based on the columns:

  • <table_prefix>_plan_run on column start_time
  • <table_prefix>_journey_run on column start_time
  • <table_prefix>_run_categorization on column created_time
  • <table_prefix>_performance-test_run on column start_time
  • <table_prefix>_activity_feed on column created_time

If you want to discard test data after a set amount of time, configure a partition expiration in your BigQuery data warehouse.

Table schema

Plan run table

The *_plan_run table contains every plan run on the integrated workspace, written upon plan run completion.

Column NameTypeDescription
idStringplan run id
plan_idStringplan id
plan_nameStringplan name
plan_urlStringplan UI URL
statusStringplan run status
successBooleanoverall plan run success
start_timeTimestampplan run start time UTC
stop_timeTimestampplan run stop time UTC
runtime_millisIntegertotal plan run time milliseconds
tagsString, Repeatedplan tags
labelsString, Repeatedplan labels
application_idStringapplication under test id
application_nameStringapplication under test name
application_urlStringapplication under test UI URL
starting_urlStringapplication under test starting url
trigger_typeStringexecution trigger type
trigger_user_idStringexecution triggering user (optional)
trigger_user_emailStringexecution triggering user email (optional)
deployment_idStringexecution trigger deployment id (optional)
workspace_idStringworkspace name
workspace_nameStringworkspace name

Test run table

The *_journey_run table contains every test run in the integrated workspace, written upon test run completion. If the test is part of a plan run, the test run may be written to the test run table before the plan run is complete.

TypeColumn NameDescription
Stringidtest run id
Stringjourney_idtest id
Stringjourney_nametest name
Stringjourney_urltest UI URL
Stringview_output_urltest run output UI URL
Stringstatustest run status
Booleansuccessoverall test run success
Timestampstart_timetest run start time UTC
Timestampstop_timetest run stop time UTC
Integerruntime_millistotal test run time milliseconds
String, Repeatedtagstest tags
String, Repeatedlabelstest labels
Stringmabl_branchmabl branch test ran against
Stringenvironment_idtest run environment
Stringenvironment_nametest run environment name
Stringenvironment_urltest run environment UI URL
Stringbrowser_typebrowser under test type
Stringbrowser_versionbrowser under test version
Stringplan_idplan id
Stringplan_nameplan name
Stringplan_urlplan UI URL
Stringplan_run_idplan run id
Stringworkspace_idworkspace id
Stringworkspace_nameworkspace name

Failure reason table

The *_run_categorization table contains every test run failure reason for the integrated workspace, written when a user sets a failure reason on a given test run output.

Column nameTypeDescription
idStringunique id
categoryStringfailure reason (NULL on removed)
journey_idStringtest id
journey_nameStringtest name
journey_run_idStringtest run id
created_timeTimestampgrading time UTC
start_timeTimestamptest run start time UTC
stop_timeTimestamptest run stop time UTC
grader_user_idStringgrading user id
grader_user_emailStringgrading user email (optional)
workspace_idStringworkspace id
workspace_nameStringworkspace name
tombstoneBooleanIndicator that failure reason has been removed from test run. Will be true if test run no longer has a failure reason and category will be NULL

Because BigQuery exports are done in a "data warehouse" fashion, tables are never updated or deleted. Instead, tables are only appended. As a result, a new row is written for each change to a test run's failure reason, which is stored in the category dimension.

Group by journey_run_id and take the latest create_time value to find the most recent category given to a test run. If a user clears the test run of a failure reason, the category is NULL.

You can use the following sample query to determine the latest failure reason. Replace my-project.mabl_export.mabl_run_categorization with the table name in your BigQuery system.

SELECT 
  failure_reason.* 

FROM ( 
  -- Get latest update time 
  SELECT 
    journey_run_id, 
    MAX(created_time) AS last_updated_time 
  FROM 
    `my-project.mabl_export.mabl_run_categorization` 
  GROUP BY 
    journey_run_id ) AS latest_categories 

JOIN `my-project.mabl_export.mabl_run_categorization` AS failure_reason 
ON failure_reason.created_time = latest_categories.last_updated_time 

-- Ignore removed failure reasons 
WHERE failure_reason.category IS NOT NULL

Performance test run table

The *_performance_test_run table contains metrics for every performance test run in the integrated workspace, written upon test completion. If the performance test is part of a plan run, the performance test run may be written to the performance test run table before the plan run is complete.

You may join the *_performance_test_run table with the *_journey_run table on id.

Column NameTypeDescription
idStringtest run id
journey_idStringtest id
journey_nameStringtest name
journey_urlStringtest UI URL
journey_run_idStringtest run output UI URL
start_timeTimestamptest run start time UTC
stop_timeTimestamptest run stop time UTC
api_virtual_user_hours_consumedFloatvirtual user hours consumed of performance test run
api_concurrent_user_countIntegerconcurrent users of performance test run
api_average_response_timeIntegeraverage response time in milliseconds of performance test run
api_requests_per_secondIntegeraverage requests per second of performance test run
api_error_rate_percentFloataverage error rate as a percent [0-100] of performance test run
workspace_idStringworkspace id
workspace_nameStringworkspace name

Activity feed table

The *_activity_feed table contains every activity feed entry in the integrated workspace, including activity data for all types supported by the activity feed in the mabl UI.

Column NameTypeDescription
idStringactivity feed entry id
entity_idStringid of the entity changed
entity_nameStringname of the entity changed
entity_typeStringtype of the entity changed
entity_canonical_idStringtest run output UI URL
entity_version_numberIntegerversion number for versioned entities
action_typeStringtype of action taken
action_timestampTimestamptime the action was taken
created_timeTimestampactivity feed entry creation time
actor_idStringid of the person or API key that took the action
actor_emailStringemail of the actor, if applicable
entity_previous_idStringid of the entity before this action
entity_previous_nameStringname of the entity before this action
entity_previous_version_numberIntegerversion of the entity before this action
workspace_idStringworkspace id
workspace_nameStringworkspace name

Next steps

After setting up the BigQuery integration, you can start querying the data or creating custom dashboards and charts using Looker Studio, Google Sheets, or another tool.