This is a full guide to scraping FRED, our first large goal as the data collection team. As I mentioned during our last meeting, last week I scraped the metadata for all 825,000+ economic time series on FRED. I have uploaded this file to the Humun Google Cloud Storage space so that you all have access to it. You will need it mainly to obtain the series ID and other relevant metadata for each series you are going to collect.
Start by downloading the metadata CSV from GC storage. To do this, you will need the setup file and the script described below.
First run:
pip install google-cloud-storage
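If you want to confirm the package installed correctly, this optional one-liner should print the library version without errors:
python -c "from google.cloud import storage; print(storage.__version__)"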
Create a new folder called “fred_dataset_download” to house the scripts/files.
Then, download these files:
service-account-key.json (a setup file containing the service account credentials that give you access to our GCS bucket)
{
"type": "service_account",
"project_id": "teamcore-409617",
"private_key_id": "07b94c3a8bc1ecab1188fd7ccf300a1885def1b1",
"private_key": "-----BEGIN PRIVATE KEY-----\\nMIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQCvWVFelRcvAESX\\nJWndAbuyQQsSG5L3zH5vU22uwN0B4R6t+UG7oHL/6TelnI3t4kMZ+WnHAulsP277\\n8YcXOgPhkSu4IdiUy56gbpHIlthSchscI520aU5pi0TOq7pI19xKchQtUcSBNrLm\\neR+y8edEAdY3dINiWE7VDf2HetlQmWrPoqA5wqntrH2+VOoYuMElIjXbgWWD+KoH\\nd9eZ+/VCL07mJ1N7wUsLqljCupOtvQnvN4Yx8WoWg29CPKtGgQ2hIF0eXNjwBhqt\\nvO1QKI0SqnaRBwBHtk6y8EikNLCth+mgGm8cTxGFvrHrZXFxEcR6pK3dQ2uCN/GJ\\n1ZtxEXjdAgMBAAECggEAH9CGF7HXzL6Q81aFtGRr3H36Jv0rR3wKJaM+t5IFF2Hz\\nyc97enI0Y1O6dbkntDVVBOmwpDvWQ75nod0y7EcprJvFEanMbMTcAVJGb51U3vKW\\nkh6xLqpboIE1CQV17WEC9lvn5sga7fHReEkaNAK5eeiWaCXi761DklrxOMtUkg+S\\n3LKdHAabPsHkfFm91PUzvOwFamgemdSfHdAbAEigS4lVjiZM809gULZc886IVqfe\\n1SZiOcTwBt9gdO59UvkV0sm6yfW3yhpWrb5B7pvK9XHC6FYK5zd746gFiWfYEe9C\\nJCeSwinpjbouaAMCKQtuzXfY+JVziyatvEFTkIbTOQKBgQDhvjsM5AHstDm6K7ry\\nzn5UdpKb10DWtql5s7L7XTd+nR7wMrfZ6LnsZd9SkhU0wA/HpEbawyMQpMgZQEGO\\nxVAB05VuKRS7SgsWVae0Cut0Rn4KF0tTY1772NILQkYoryTcJiz0eipvPIHJ+CCL\\nCl8CABOa9lhMAadnSEkTLMB3dQKBgQDG2fIOb83a/z2ilD+4yTaB4rcv0x8YfnQN\\ntD4YlMQQP2vQVqhpl3/2r5iIyenmpTRcd3KGzCB42VEf6JwXa023gIq4zAgF4kKK\\nyaQKSntJzdI4D2wbyLSLHE0dojtQ7ool9l+WHwlYoDYdtdOEdSn7UB+uG8ewOyIR\\n8NWJap02yQKBgEXBPYf3MK0O58OiXatHqXu6BAWJ1yxB106W+5h2rn4+WOAKHAuG\\nwWTN+dsO7uSU8ItVNNvGbqBm+rnqxBc020slMUiQAyr4b0Kghyi4MxeD7NB7cDg9\\nPY1+6zC1cu6BaFdqqHuHAHPM86IQPSYZt0/r7CL3OkOKQ0tD5+i37GU9AoGAAazE\\nRSrb6QRNWJk3EC9hriZitJxqnqIyCAuEmmBmZlyiY9bXBEyqX0GLX1uUBMVPc5ft\\n9wSxIVNzQ3mKFwhoVytV/8h4KNSHCvQ31X5bG3wIUUCQAIvoOWO7ooxDQ6M+tqMk\\nmvcX9Q8kZYuqhGsYN22tVqIVRH67AruskMO9H0ECgYA3wsCH3IXPOP5iH+KBrVec\\ny2zRYtQW/Sl975vGofWBxhavdkGKg38jlZ0E0kAjXMysjrhkrYsj7tIP0+xKptuc\\nBFpo/HEN6SHfCcG6VVRWGj5ZwWyt1sMNyNAwvLcfbNBM4apWKp1/ha7dEVhvLHBZ\\n2lWQ0zw0SXQV+s+qTMogKQ==\\n-----END PRIVATE KEY-----\\n",
"client_email": "[email protected]",
"client_id": "110041470125125095856",
"auth_uri": "<https://accounts.google.com/o/oauth2/auth>",
"token_uri": "<https://oauth2.googleapis.com/token>",
"auth_provider_x509_cert_url": "<https://www.googleapis.com/oauth2/v1/certs>",
"client_x509_cert_url": "<https://www.googleapis.com/robot/v1/metadata/x509/bucket-access%40teamcore-409617.iam.gserviceaccount.com>",
"universe_domain": "googleapis.com"
}
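The script in the next step points the storage client at this key file through the GOOGLE_APPLICATION_CREDENTIALS environment variable. If that ever gives you trouble, the client can also be built directly from the key file; a minimal alternative sketch (not required, the script below works as written):

from google.cloud import storage

# Alternative: load the service account key explicitly instead of
# relying on the GOOGLE_APPLICATION_CREDENTIALS environment variable.
client = storage.Client.from_service_account_json("service-account-key.json")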
downloadGC.py (the script that downloads the metadata CSV from the GCS bucket)

from google.cloud import storage
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "PATH TO 'service-account-key.json' FROM ABOVE"  #!! make sure to edit this

def download_from_gcs(gcs_path, local_directory):
    """Downloads a file from GCS to a local directory using google-cloud-storage."""
    # Initialize a storage client
    client = storage.Client()

    # Extract the bucket name and blob name from the GCS path (strip the "gs://" prefix)
    bucket_name, blob_name = gcs_path[5:].split("/", 1)

    # Construct the full local file path (local directory + filename)
    filename = os.path.basename(blob_name)
    local_path = os.path.join(local_directory, filename)

    try:
        # Get the bucket and blob
        bucket = client.bucket(bucket_name)
        blob = bucket.blob(blob_name)

        # Download the blob to the local path
        blob.download_to_filename(local_path)
        print(f"Successfully downloaded {gcs_path} to {local_path}")
    except Exception as e:
        print(f"Failed to download {gcs_path} to {local_path}: {e}")

# Usage example:
gcs_path = "gs://humun-storage/path/in/bucket/all_fred_metadata.csv"  # METADATA FILE PATH ON GCS - DO NOT CHANGE!
local_directory = r"C:\Users\shren\Downloads"  # CHANGE TO THE LOCATION WHERE YOU WANT THE METADATA FILE TO BE SAVED
download_from_gcs(gcs_path, local_directory)
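One small caveat about the script: blob.download_to_filename will fail if the target folder does not exist yet. If you point local_directory at a folder you have not created, you can add a guard like this inside download_from_gcs before the download (optional; an existing folder such as Downloads needs no change):

    os.makedirs(local_directory, exist_ok=True)  # create the target folder if it is missing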
Next, from the new directory that you created, run:
python ./downloadGC.py
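Once the download finishes, it is worth a quick sanity check that the file loads and a look at which columns it has, since you will be pulling series IDs out of it. A rough sketch with pandas (I am not listing the exact column names here, so check the header row yourself before relying on any of them):

import os
import pandas as pd

# Adjust this to wherever you pointed local_directory in downloadGC.py.
metadata_path = os.path.join(r"C:\Users\shren\Downloads", "all_fred_metadata.csv")

df = pd.read_csv(metadata_path)
print(df.shape)     # expect on the order of 825,000+ rows, one per FRED series
print(df.columns)   # inspect the actual column names before relying on them
print(df.head())    # peek at the first few series records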