Building an API with FastAPI to Access Google BigQuery Data
Written on
Introduction
A crucial part of any data pipeline is the final step, often referred to as "Serving Data." This process can take many forms, including data analysis, machine learning models, or exposing the data through an API. This tutorial is dedicated to the latter.
We will explore how to construct an API using FastAPI, a high-performance web framework for building APIs in Python. The dataset we will utilize has been previously scraped and stored in BigQuery. We will create endpoints that allow access to stock data stored in a Google BigQuery dataset.
You can download the code for this article here.
Feel free to subscribe to my substack, where this article was originally published.
Prerequisites and Data Structure
Before diving in, make sure you have the following:
- Python installed on your computer (ideally version 3.7 or above)
- A Google Cloud Platform (GCP) project with BigQuery enabled and access to the dataset
- A basic understanding of Python and REST API principles
We can examine the dataset structure by looking at the previous article's illustration.
Installing Required Libraries and Credentials
Ensure you have the necessary libraries from prior articles; the main libraries for this one can be installed using the following command:
pip install fastapi uvicorn google-cloud-bigquery
Also, refer to this article for guidance on setting up Google Cloud credentials and the dataset.
Defining Our API
We can now begin defining our endpoints using FastAPI and querying data with a SQL-like syntax.
Importing Required Libraries
We start by importing the needed libraries and initializing our FastAPI application along with the BigQuery client:
from fastapi import FastAPI, HTTPException from google.cloud import bigquery from dotenv import load_dotenv import os
load_dotenv()
app = FastAPI()
# Initialize BigQuery client client = bigquery.Client()
PROJECT_ID = os.environ.get("PROJECT_ID") DATASET_ID = os.environ.get("DATASET_ID") TABLE_ID = os.environ.get("TABLE_ID") dataset_str = f"{PROJECT_ID}.{DATASET_ID}.{TABLE_ID}"
Make sure your .env file has these IDs, including the GOOGLE_APPLICATION_CREDENTIALS variable pointing to the JSON file. We define all the IDs since this is a personal database (only accessible from your account). Although we could make it public, we aim to learn the fundamentals of accessing data in BigQuery and building an API. Let's begin defining the endpoints one by one.
Retrieving Stock Data by Name
We can start by fetching data for a specific stock based on its name. This involves querying the data where stock_name matches the provided stock name and converting the results into JSON format:
# Define endpoint to retrieve stock data by stock name @app.get("/stock/{stock_name}") def get_stock_data(stock_name: str):
query = f"""
SELECT *
FROM {dataset_str}
WHERE stock_name = '{stock_name}'
"""
try:
# Execute the query
query_job = client.query(query)
results = query_job.result()
# Convert results to list of dictionaries
data = [dict(row) for row in results]
return data
except Exception as e:
raise HTTPException(status_code=500, detail=f"An error occurred: {str(e)}")
Before querying the endpoint, start the FastAPI server with the following command (ensure your file is named main.py or adjust the command accordingly):
uvicorn main:app --reload
The FastAPI server should now be running locally, and you can access the endpoints at http://localhost:8000. To check the stock data for "DAX," navigate to http://localhost:8000/stock/DAX.
If you encounter an error, it may be due to trailing spaces in the stock_name column of your dataset. You can either eliminate it from the database or use the URL with a trailing space.
You can also test the endpoint using:
curl http://localhost:8000/stock/DAX
The output should resemble the following:
Congratulations! You have successfully created your first endpoint to serve data. We can now build more complex queries.
Fetching Stock Data Within a Date Range
To accomplish this, we will execute an SQL query that filters data based on a date range specified by URL parameters.
# Endpoint to retrieve stock data within a date range @app.get("/stock/date_range/") async def get_stock_data_by_date_range(start_date: str, end_date: str):
query = f"""
SELECT *
FROM {dataset_str}
WHERE Date BETWEEN '{start_date}' AND '{end_date}'
"""
try:
# Execute the query
query_job = client.query(query)
results = query_job.result()
# Convert results to list of dictionaries
data = [dict(row) for row in results]
return data
except Exception as e:
raise HTTPException(status_code=500, detail=f"An error occurred: {str(e)}")
Now, let's query this endpoint:
curl http://localhost:8000/stock/date_range/?start_date=2024-02-01&end_date=2024-03-01
The output should look like this:
Searching for Stocks Within a Price Range
We can also filter stocks based on a specified price range:
# Endpoint to retrieve stock data by price range @app.get("/stock/price_range/") async def get_stock_data_by_price_range(min_price: float = Query(None), max_price: float = Query(None)):
condition = ""
if min_price is not None and max_price is not None:
condition = f"Price BETWEEN {min_price} AND {max_price}"elif min_price is not None:
condition = f"Price >= {min_price}"elif max_price is not None:
condition = f"Price <= {max_price}"query = f"""
SELECT *
FROM {dataset_str}
WHERE {condition}
"""
try:
# Execute the query
query_job = client.query(query)
results = query_job.result()
# Convert results to list of dictionaries
data = [dict(row) for row in results]
return data
except Exception as e:
raise HTTPException(status_code=500, detail=f"An error occurred: {str(e)}")
Users can specify either a minimum or maximum price, or both.
curl http://localhost:8000/stock/price_range/?min_price=5000&max_price=10000
The result should look something like this:
Alternatively, you can query it with just the max_price:
curl http://localhost:8000/stock/price_range/?max_price=100
Only the “S&P 500 VIX” falls within my budget!
API Documentation
FastAPI automatically generates API documentation using your code (via Swagger UI), accessible at http://localhost:8000/docs:
Conclusion
This article serves as a tutorial for building an API with FastAPI to serve data from Google BigQuery. It covers the installation of necessary libraries, setting up credentials, and defining API endpoints for accessing stock data by name, within a date range, and within a price range. Basic knowledge of Python and REST API concepts is assumed, along with having Python 3.7 or higher and a Google Cloud Platform project with BigQuery enabled.
Thank You
Thank you for being part of the In Plain English community! Before you leave:
- Be sure to clap and follow the author.
- Follow us on: X | LinkedIn | YouTube | Discord | Newsletter
- Explore our other platforms: Stackademic | CoFeed | Venture | Cubed
- Discover more content at PlainEnglish.io