Explaining Firestore to BigQuery Integration | Coupler.io Blog (2023)

When building a demanding application, it’s important to use the right tools so it runs as efficiently as possible. While NoSQL databases provide the mechanisms required to store and retrieve non-relational data, some cases call for more specialized analytics tools like BigQuery.

This article covers Cloud Firestore, a NoSQL cloud database, and the ways to move your data from Firestore to BigQuery. Understanding them will make your data analysis processes a lot easier.

Table of Contents

What is Firestore?

How Firestore stores data

Different methods to load data from Firestore to BigQuery

Export Firestore data to BigQuery via UI

How to export specific collections to BigQuery

Move data from Firestore to BigQuery via the command line

How to automate Firestore to BigQuery export

How to stream Firestore data to BigQuery

How to backfill your BigQuery dataset

Wrapping up – Firestore to BigQuery

What is Firestore?

Cloud Firestore is a NoSQL cloud database service from Google, offered as part of the Firebase platform. Mobile, web, and server applications use it to store and sync data. It offers great flexibility and scalability, and it keeps your application data in sync across multiple client applications.

How Firestore stores data

But how does it work, you may ask? Cloud Firestore is a cloud-hosted database that stores data in documents, each containing fields that map to values.

These documents are stored in collections, which help you organize your data. You can think of collections as database tables and documents as the rows within a table. The difference is that documents within a collection can hold many different data types and don’t have to contain the same fields. The Cloud Firestore data model supports whatever data structure works best for your app.
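
To illustrate, here is a rough sketch of how such data might be laid out (the collection, document IDs, and field names are hypothetical examples):

```
Restaurants (collection)
 ├─ restaurant_001 (document)
 │    name:        "Blue Olive"
 │    city:        "Athens"
 │    rating:      4.5
 │    price_range: "$$"
 └─ restaurant_002 (document)
      name:  "Corner Deli"
      city:  "Berlin"
      tags:  ["casual", "takeaway"]
```

Note how the two documents don’t share an identical set of fields: Firestore doesn’t enforce a schema within a collection.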

Different methods to load data from Firestore to BigQuery

While Cloud Firestore can support the scalability of your application, it’s not ideal when you want to analyze data. This is where BigQuery comes in: a data warehouse designed precisely for large-scale analytics.

There are two main approaches to loading data from Firestore to BigQuery:

  • Export Firestore data to Google Cloud storage and then import to Google BigQuery
  • Stream data from Firestore directly to BigQuery.

The first approach can be performed in two ways – via the UI and via the BigQuery command line.

We’ll demonstrate each of these methods. For these examples we will use a Firestore database with two collections:

  • Restaurants: a set of documents with various restaurant information (e.g., city, rating, price range)
  • Visitors: a set of documents with visitor information (e.g., name, time of visit)

Export Firestore data to BigQuery via UI

The process of exporting data is straightforward, though there are a few things to pay attention to. Before using the managed export/import service, make sure that:

  • Billing is enabled for the Google Cloud project that contains your Firestore database. The export process itself doesn’t cost anything, but Cloud Storage requires billing details in case your exports exceed the free tier threshold.
  • You have a Google Cloud Storage bucket, located in the same region as your Cloud Firestore database, where the exports can be stored.
  • You have the required permissions in both Cloud Firestore (Owner, Cloud Datastore Owner, or Cloud Datastore Import Export Admin) and Cloud Storage (Owner or Storage Admin).

Once these prerequisites are in place, the process is simple:

  • In the Google Cloud console, open the Cloud Firestore Import/Export page.
  • Click the Export entire database option.
  • Under Choose Destination, enter the name of a Cloud Storage bucket or use the Browse button to select one.
  • Click Export.

This will create a folder within your selected Cloud Storage bucket containing:

  • A metadata file named after the folder and ending in overall_export_metadata
  • A folder (all_namespaces) containing all the namespaces and kinds.
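
As a rough sketch (the bucket name and timestamp here are hypothetical), the resulting layout in Cloud Storage looks something like this:

```
gs://my-firestore-exports/2021-10-11T12:56:00_12345/
├── 2021-10-11T12:56:00_12345.overall_export_metadata
└── all_namespaces/
    └── kind_Restaurants/
        ├── all_namespaces_kind_Restaurants.export_metadata
        └── output-0
```

The per-kind export_metadata files inside all_namespaces are the ones BigQuery can import.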

To load the data into BigQuery, we need an export_metadata file. The import works as follows:

  • In the BigQuery console, select your dataset and click Create table.
  • Under Create table from, select Google Cloud Storage and choose an export_metadata file generated during the export. The overall_export_metadata file created outside of the all_namespaces folder cannot be used, as it’s not recognized by BigQuery. Make sure the file format is Cloud Datastore Backup.
  • Provide a table name for the destination table.
  • Leave everything else as is and click “Create table”.

How to export specific collections to BigQuery

In some cases you might need to export only a specific collection. To do that, follow the process above but, in the export step, select “Export one or more collection groups” instead of “Export entire database”.

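
If you prefer the terminal, the same collection-scoped export can be started with the gcloud CLI. This is a sketch assuming a hypothetical bucket name and our Restaurants collection:

```shell
# Export only the Restaurants collection group to a Cloud Storage bucket
gcloud firestore export gs://my-firestore-exports \
  --collection-ids=Restaurants
```

Omitting the --collection-ids flag exports the entire database.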
Move data from Firestore to BigQuery via the command line

As we mentioned earlier, you can also perform the above via the command line, importing your Firestore export with Google Cloud’s bq tool.

First, you need to gather five pieces of information:

  • LOCATION: the region of your BigQuery dataset.
  • FORMAT: always DATASTORE_BACKUP, which is the correct source format for Firestore exports.
  • DATASET: the dataset containing the table into which you are going to load the data.
  • TABLE: the name of the destination table. If the table doesn’t exist, it will be created automatically.
  • PATH_TO_SOURCE: the Cloud Storage URI of the export_metadata file, in a format like gs://firestore_example/20211011T1256/default_namespace/kind_restaurants/default_namespace_kind_Restaurant.export_metadata

As soon as you have this information, you can execute the following in the Google Cloud command line:

```shell
bq --location=europe-west3 load \
  --source_format=DATASTORE_BACKUP \
  firestore_example.restaurant_data \
  gs://firestore_example/20211011T1256/default_namespace/kind_restaurants/default_namespace_kind_Restaurant.export_metadata
```

Voilà! Your Firestore data is now exported to BigQuery.

How to automate Firestore to BigQuery export

The options above are a great way to export your Firestore data and import it into BigQuery as a one-off. But in some cases you may need to automate the Firestore to BigQuery export process and continuously refresh the exported data.

Automating Firestore to BigQuery exports removes an extra process from your pipeline and keeps your BigQuery dataset up to date. Every time you need to perform an analysis, you can run it on the latest version of your data.

Exporting your data and its updates in real time is called streaming, and luckily there is an extension that streams your Firestore data directly to BigQuery.

How to stream Firestore data to BigQuery

In order to stream Firestore data to BigQuery and update your BigQuery data when a document or a collection changes, you can use the “Export Collections to BigQuery” Firebase extension.

This extension incrementally exports the documents in a Cloud Firestore collection to BigQuery in real time. It listens for document changes in the collection (creation, update, or deletion) and automatically records each change in BigQuery.

To install this extension:

  • Go to the extension page.
  • If you have already set up billing, hit “Next”.
  • Review which APIs will be enabled and hit “Next”.
  • Review the access rights granted to the extension and hit “Next”.
  • Enter the configuration information:
    1. Cloud functions location: Usually the same location as your Firestore database and BigQuery dataset
    2. BigQuery dataset location: The location where BigQuery dataset can be found.
    3. Collection path: The collection you want to stream. You may use {wildcard} notation to match a subcollection of all documents in a collection, for example, businesses/{locationid}/restaurants.
    4. Dataset Id: The BigQuery dataset where your table will be placed.
    5. Table Id: The BigQuery table where your data will be streamed.
    6. BigQuery SQL table partitioning option: set this if you need your BigQuery table partitioned.
  • Click “Install extension”.
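
Once installed, the extension writes each change event to a changelog table named after your Table Id and maintains a companion view reflecting the latest state of each document. As a sketch, assuming hypothetical dataset and table IDs of firestore_export and restaurants, you could inspect the streamed data like this:

```shell
# Query the latest state of each streamed document
# (dataset and table names are hypothetical)
bq query --use_legacy_sql=false \
  'SELECT document_name, data FROM `firestore_export.restaurants_raw_latest` LIMIT 10'
```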

How to backfill your BigQuery dataset

The extension starts sending information to BigQuery only after it’s installed, which means your pre-existing documents are not exported. To backfill your BigQuery dataset with all the documents already in the collection, you can run a custom import script created by Google specifically for this case; some familiarity with the command line is required.

Please note that the import script should run after you install the extension; otherwise, any writes to the database that happen during the import won’t be captured.
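
The import script ships as an npm package, so running it looks like this (it will prompt you for your project ID, collection path, dataset ID, and table ID):

```shell
# Google's backfill script for the Firestore-to-BigQuery extension;
# it interactively prompts for project, collection, dataset, and table
npx @firebaseextensions/fs-bq-import-collection
```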

Wrapping up – Firestore to BigQuery

In this article, we went through the available options for moving your data from Google Firestore to Google BigQuery. We saw what Firestore is, how it stores data, and where it is frequently used. Then we covered the different methods to move your data, both as a one-off and through streaming.

Now that your Firestore data is already in BigQuery, you may be eager to couple it with more of your business information.

Coupler.io with its BigQuery integrations offers an easy way to automate the import process into BigQuery. You can fetch data from Airtable, Pipedrive, Hubspot, QuickBooks, Shopify, and many other sources.

The imports run automatically, on a schedule that works best for you. We don’t have a Firestore integration yet but we may add it in the future if there’s an interest in our community. Would you use it? If so, let us know via this form.

    Dimitris Vogiatzis

    Technology, Data & Analytics Lead at Amplifyd
