cloud sql

BigQuery Short Notes

  • Devops, Misc

Dataset

Datasets are top-level containers that are used to organize and control access to your tables and views.

location

BigQuery processes queries in the same location.

Location cannot be changed after creation.

Use BigQuery Data Transfer Service or Cloud Composer to transfer data accross different locations.

There are two types of locations:

  • A region is a specific geographic place, such as us-central1.
  • A multi-region is a large geographic area, such as EU, US

considerations

  • Colocate with external data source
    • If dataset is in the US multi-regional location, GCS bucket must be in a multi-regional bucket in the US.
    • If dataset is in Cloud Bigtable, your dataset must be in the US or the EU multi-regional location.
  • Colocate with GCS buckets for loading and exporting data
    • The GCS bucket must be in a regional or multi-regional bucket in the same location. Except US multi-regional location, you can load data from a GCS bucket in any regional or multi-regional location.

availability and durability

Failure domains

  • Machine-level
  • Zonal
  • Regional

Failure types

  • Soft: power failure, network partition, or a machine crash, should never lose data.
  • Hard: damage from floods, terrorist attacks, earthquakes, and hurricanes, data might be lost.

Single region

  • No backup or replication to another region.
  • Better considering create cross-region backups.

Multi region

  • Data is stored in a single region but is backed up in a geographically-separated region to provide resilience to a regional disaster.

Read More »BigQuery Short Notes