Building an API with FastAPI to Access Google BigQuery Data

Introduction

A crucial part of any data pipeline is the final step, often referred to as "Serving Data." This process can take many forms, including data analysis, machine learning models, or exposing the data through an API. This tutorial is dedicated to the latter.

We will explore how to construct an API using FastAPI, a high-performance web framework for building APIs in Python. The dataset we will utilize has been previously scraped and stored in BigQuery. We will create endpoints that allow access to stock data stored in a Google BigQuery dataset.

You can download the code for this article here.

Feel free to subscribe to my substack, where this article was originally published.

Prerequisites and Data Structure

Before diving in, make sure you have the following:

Python installed on your computer (ideally version 3.7 or above)
A Google Cloud Platform (GCP) project with BigQuery enabled and access to the dataset
A basic understanding of Python and REST API principles

We can examine the dataset structure by looking at the previous article's illustration.

Installing Required Libraries and Credentials

Ensure you have the necessary libraries from prior articles; the main libraries for this one can be installed using the following command:

pip install fastapi uvicorn google-cloud-bigquery

Also, refer to this article for guidance on setting up Google Cloud credentials and the dataset.

Defining Our API

We can now begin defining our endpoints using FastAPI and querying data with a SQL-like syntax.

Importing Required Libraries

We start by importing the needed libraries and initializing our FastAPI application along with the BigQuery client:

from fastapi import FastAPI, HTTPException from google.cloud import bigquery from dotenv import load_dotenv import os

load_dotenv()

app = FastAPI()

# Initialize BigQuery client client = bigquery.Client()

PROJECT_ID = os.environ.get("PROJECT_ID") DATASET_ID = os.environ.get("DATASET_ID") TABLE_ID = os.environ.get("TABLE_ID") dataset_str = f"{PROJECT_ID}.{DATASET_ID}.{TABLE_ID}"

Make sure your .env file has these IDs, including the GOOGLE_APPLICATION_CREDENTIALS variable pointing to the JSON file. We define all the IDs since this is a personal database (only accessible from your account). Although we could make it public, we aim to learn the fundamentals of accessing data in BigQuery and building an API. Let's begin defining the endpoints one by one.

Retrieving Stock Data by Name

We can start by fetching data for a specific stock based on its name. This involves querying the data where stock_name matches the provided stock name and converting the results into JSON format:

# Define endpoint to retrieve stock data by stock name @app.get("/stock/{stock_name}") def get_stock_data(stock_name: str):

query = f"""

SELECT *

FROM {dataset_str}

WHERE stock_name = '{stock_name}'

"""

try:

# Execute the query

query_job = client.query(query)

results = query_job.result()

# Convert results to list of dictionaries

data = [dict(row) for row in results]

return data

except Exception as e:

raise HTTPException(status_code=500, detail=f"An error occurred: {str(e)}")

Before querying the endpoint, start the FastAPI server with the following command (ensure your file is named main.py or adjust the command accordingly):

uvicorn main:app --reload

The FastAPI server should now be running locally, and you can access the endpoints at http://localhost:8000. To check the stock data for "DAX," navigate to http://localhost:8000/stock/DAX.

If you encounter an error, it may be due to trailing spaces in the stock_name column of your dataset. You can either eliminate it from the database or use the URL with a trailing space.

You can also test the endpoint using:

curl http://localhost:8000/stock/DAX

The output should resemble the following:

Congratulations! You have successfully created your first endpoint to serve data. We can now build more complex queries.

Fetching Stock Data Within a Date Range

To accomplish this, we will execute an SQL query that filters data based on a date range specified by URL parameters.

# Endpoint to retrieve stock data within a date range @app.get("/stock/date_range/") async def get_stock_data_by_date_range(start_date: str, end_date: str):

query = f"""

SELECT *

FROM {dataset_str}

WHERE Date BETWEEN '{start_date}' AND '{end_date}'

"""

try:

# Execute the query

query_job = client.query(query)

results = query_job.result()

# Convert results to list of dictionaries

data = [dict(row) for row in results]

return data

except Exception as e:

raise HTTPException(status_code=500, detail=f"An error occurred: {str(e)}")

Now, let's query this endpoint:

curl http://localhost:8000/stock/date_range/?start_date=2024-02-01&end_date=2024-03-01

The output should look like this:

Searching for Stocks Within a Price Range

We can also filter stocks based on a specified price range:

# Endpoint to retrieve stock data by price range @app.get("/stock/price_range/") async def get_stock_data_by_price_range(min_price: float = Query(None), max_price: float = Query(None)):

condition = ""

if min_price is not None and max_price is not None:

condition = f"Price BETWEEN {min_price} AND {max_price}"

elif min_price is not None:

condition = f"Price >= {min_price}"

elif max_price is not None:

condition = f"Price <= {max_price}"

query = f"""

SELECT *

FROM {dataset_str}

WHERE {condition}

"""

try:

# Execute the query

query_job = client.query(query)

results = query_job.result()

# Convert results to list of dictionaries

data = [dict(row) for row in results]

return data

except Exception as e:

raise HTTPException(status_code=500, detail=f"An error occurred: {str(e)}")

Users can specify either a minimum or maximum price, or both.

curl http://localhost:8000/stock/price_range/?min_price=5000&max_price=10000

The result should look something like this:

Alternatively, you can query it with just the max_price:

curl http://localhost:8000/stock/price_range/?max_price=100

Only the “S&P 500 VIX” falls within my budget!

API Documentation

FastAPI automatically generates API documentation using your code (via Swagger UI), accessible at http://localhost:8000/docs:

Conclusion

This article serves as a tutorial for building an API with FastAPI to serve data from Google BigQuery. It covers the installation of necessary libraries, setting up credentials, and defining API endpoints for accessing stock data by name, within a date range, and within a price range. Basic knowledge of Python and REST API concepts is assumed, along with having Python 3.7 or higher and a Google Cloud Platform project with BigQuery enabled.

Thank You

Thank you for being part of the In Plain English community! Before you leave:

Be sure to clap and follow the author.
Follow us on: X | LinkedIn | YouTube | Discord | Newsletter
Explore our other platforms: Stackademic | CoFeed | Venture | Cubed
Discover more content at PlainEnglish.io

zhaopinboai.com

Building an API with FastAPI to Access Google BigQuery Data

Introduction

Prerequisites and Data Structure

Installing Required Libraries and Credentials

Defining Our API

Importing Required Libraries

Retrieving Stock Data by Name

Fetching Stock Data Within a Date Range

Searching for Stocks Within a Price Range

API Documentation

Conclusion

Thank You

Share the page:

Recent Post:

Koalas and Microbiomes: The Hidden Struggle for Survival

Let’s Reassess Our Plastic Dependency for a Sustainable Future

Finding Inner Peace in Everyday Life: A Spiritual Approach

# Transforming Perspectives: Beyond Features in Agile Teams

Effective Ways to Replace Substrings in Python Strings

Efficiently Parsing Large JSON Files in .NET: A Comprehensive Guide

Mastering Data Cleaning and Transformation for Web Scraping

Navigating Chaos: The Interplay of Weather and the Universe