Obtain Historical Weather Forecast Data in CSV Format Using Python

Introduction

Historical weather data is crucial for a wide range of applications, from climate change analysis and agricultural planning to energy demand forecasting and risk assessment. For data scientists and developers, having access to comprehensive historical weather records opens up possibilities for predictive modeling, pattern discovery, and deriving actionable insights.

However, obtaining historical weather data, especially in a structured format like CSV that is suitable for analysis, can be challenging. Many public datasets are limited in terms of time range or geographical coverage, while commercial APIs often come with usage restrictions and costs.

In this article, we will explore how to retrieve historical weather forecast data using Python and convert it into CSV format for further analysis. We will use open-source libraries and a weather API to automate the data retrieval process and create a reusable pipeline for obtaining historical weather data.

Weather APIs and Datasets

Before diving into the implementation, let's take a look at some of the available weather APIs and datasets:

  1. OpenWeatherMap API: OpenWeatherMap provides a free API for accessing current weather data, forecasts, and historical data. The free plan allows for limited requests per minute and provides data for the past 5 days. Paid plans offer more extensive historical data access.

  2. Dark Sky API: Dark Sky offered a powerful weather API with detailed historical data, including hourly and daily records. However, Apple announced its acquisition of Dark Sky on March 31, 2020, new signups were closed at that point, and the API was shut down on March 31, 2023, so it is no longer an option for new projects.

  3. NOAA Climate Data Online: The National Oceanic and Atmospheric Administration (NOAA) provides access to a vast collection of climate and weather data through their Climate Data Online (CDO) service. The data is available in various formats, including CSV, but may require some preprocessing.

  4. WorldWeatherOnline API: WorldWeatherOnline offers a comprehensive weather API with historical data access. They provide a free trial with limited requests per day and paid plans for more extensive usage.

For the purpose of this article, we will use the OpenWeatherMap API to retrieve historical weather forecast data, as it provides a free plan and has good documentation.

Retrieving Historical Weather Data

Let's start by installing the necessary libraries. We will use the requests library to make API calls and the pandas library for data manipulation and CSV export.

pip install requests pandas

Next, sign up for a free API key from OpenWeatherMap by creating an account on their website: https://openweathermap.org/. Once you have the API key, you can start making requests to retrieve historical weather data.

Here's an example of how to retrieve historical weather data for a specific city using Python:

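import requests
import pandas as pd

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.openweathermap.org/data/2.5/forecast"

def get_historical_data(city, start_date, end_date):
    # Pass the query as a dict so requests URL-encodes values such as "New York".
    # Note: this endpoint returns the 5-day / 3-hour forecast; start and end
    # are sent as in the original request but may be ignored by the server.
    params = {
        "q": city,
        "appid": API_KEY,
        "start": start_date,
        "end": end_date,
        "units": "metric",
    }

    response = requests.get(BASE_URL, params=params, timeout=10)
    data = response.json()

    if response.status_code == 200:
        return data
    else:
        # Error responses normally carry a "message" field; fall back if absent
        print(f"Error fetching data: {data.get('message', 'unknown error')}")
        return None

# Specify the city and date range
city = "London"
start_date = "2023-05-01"
end_date = "2023-05-31"

# Retrieve historical data
historical_data = get_historical_data(city, start_date, end_date)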

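In this code snippet, we define a function get_historical_data that takes the city name, start date, and end date as parameters. It builds the request with the necessary query parameters: the API key, city name, date range, and units (metric in this case). Note that the /data/2.5/forecast endpoint returns a 5-day forecast in 3-hour steps rather than an archive of past observations, and OpenWeatherMap documents the start and end parameters for its separate History API. To build a historical record with this endpoint, run the script on a schedule and keep each export; to query past dates directly, check which history endpoints your plan includes.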

We then make a GET request to the API using the requests.get function and parse the JSON response. If the request is successful (status code 200), we return the data. Otherwise, we print an error message.
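If your plan includes OpenWeatherMap's History API, the date range is sent as Unix timestamps rather than date strings. The following is a minimal sketch of that conversion. The endpoint URL and the parameter names (q, type, start, end, units, appid) reflect my reading of the History API documentation and should be treated as assumptions to verify; to_unix and build_history_params are hypothetical helper names, not part of any library.

```python
from datetime import datetime, timezone

# Assumed History API endpoint; confirm it against the current documentation.
HISTORY_URL = "https://history.openweathermap.org/data/2.5/history/city"

def to_unix(date_str):
    """Convert a YYYY-MM-DD string to a Unix timestamp at 00:00 UTC."""
    day = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(day.timestamp())

def build_history_params(city, start_date, end_date, api_key):
    """Assemble query parameters for a history request (names are assumptions)."""
    return {
        "q": city,
        "type": "hour",
        "start": to_unix(start_date),
        "end": to_unix(end_date),
        "units": "metric",
        "appid": api_key,
    }

params = build_history_params("London", "2023-05-01", "2023-05-31", "YOUR_API_KEY")
print(params["start"], params["end"])
```

The resulting dictionary can be passed to requests.get(HISTORY_URL, params=params). Pinning the timestamps to UTC keeps the range independent of the machine's local time zone.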

Converting JSON Data to CSV

The historical weather data retrieved from the API is in JSON format. To convert it into a structured CSV format suitable for analysis, we can use the pandas library. Here's how we can process the JSON data and export it to CSV:

if historical_data:
    # Extract relevant data from JSON
    data = []
    for entry in historical_data['list']:
        date_time = entry['dt_txt']
        temperature = entry['main']['temp']
        humidity = entry['main']['humidity']
        pressure = entry['main']['pressure']
        wind_speed = entry['wind']['speed']

        data.append({
            'date_time': date_time,
            'temperature': temperature,
            'humidity': humidity,
            'pressure': pressure,
            'wind_speed': wind_speed
        })

    # Create a DataFrame from the data
    df = pd.DataFrame(data)

    # Export DataFrame to CSV
    df.to_csv(f"{city}_historical_weather.csv", index=False)
    print(f"Historical weather data for {city} exported to CSV.")
else:
    print("Failed to retrieve historical weather data.")

In this code, we iterate over the list of forecast entries in the JSON data. For each entry, we extract the relevant information such as date/time, temperature, humidity, pressure, and wind speed. We store this data in a list of dictionaries.
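Some fields in the response are optional. For example, the forecast response includes a rain object only when precipitation is reported, so indexing it directly would raise a KeyError. Below is a minimal sketch of a more defensive extraction using dict.get. The helper name flatten_entry and the sample entries are made up for illustration and are not real API output.

```python
def flatten_entry(entry):
    """Flatten one forecast entry into a flat dict, tolerating missing fields."""
    main = entry.get("main", {})
    return {
        "date_time": entry.get("dt_txt"),
        "temperature": main.get("temp"),
        "humidity": main.get("humidity"),
        "pressure": main.get("pressure"),
        "wind_speed": entry.get("wind", {}).get("speed"),
        # "rain" is present only when precipitation is reported; default to 0.0
        "rain_3h": entry.get("rain", {}).get("3h", 0.0),
    }

# Illustrative entries only (values are invented, not real observations)
sample_entries = [
    {"dt_txt": "2023-05-01 00:00:00",
     "main": {"temp": 10.5, "humidity": 80, "pressure": 1012},
     "wind": {"speed": 3.1}},
    {"dt_txt": "2023-05-01 03:00:00",
     "main": {"temp": 9.8, "humidity": 85, "pressure": 1011},
     "wind": {"speed": 2.4},
     "rain": {"3h": 0.6}},
]

rows = [flatten_entry(e) for e in sample_entries]
print(rows)
```

The resulting list of dictionaries can be passed to pd.DataFrame in the same way as the list built in the loop above.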

Next, we create a pandas DataFrame from the extracted data using pd.DataFrame(). This allows us to work with the data in a structured tabular format.

Finally, we export the DataFrame to a CSV file using the to_csv() function, specifying the desired filename and setting index=False to exclude the row index from the CSV file. The resulting CSV file will have columns for date/time, temperature, humidity, pressure, and wind speed.
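As a quick check that the exported file is usable for analysis, the sketch below reads rows in the same column layout and computes a mean temperature per day. It uses only the standard library so it can run without pandas; the sample rows are invented for illustration, and daily_mean_temperature is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the column layout produced by the export step
sample_csv = """date_time,temperature,humidity,pressure,wind_speed
2023-05-01 00:00:00,10.0,80,1012,3.1
2023-05-01 03:00:00,12.0,75,1011,2.8
2023-05-02 00:00:00,15.0,70,1009,4.0
"""

def daily_mean_temperature(csv_file):
    """Group rows by calendar day and average the temperature column."""
    temps = defaultdict(list)
    for row in csv.DictReader(csv_file):
        day = row["date_time"].split(" ")[0]  # keep the YYYY-MM-DD part
        temps[day].append(float(row["temperature"]))
    return {day: sum(values) / len(values) for day, values in temps.items()}

means = daily_mean_temperature(io.StringIO(sample_csv))
print(means)
```

To run this against a real export, pass an open file handle instead of the io.StringIO object, for example open("London_historical_weather.csv", newline="").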

Automating Data Retrieval

To automate the process of retrieving historical weather data for multiple cities or date ranges, we can create a simple script that iterates over the desired parameters and saves the data to separate CSV files.

Here's an example script that retrieves historical weather data for multiple cities and date ranges:

import requests
import pandas as pd

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.openweathermap.org/data/2.5/forecast"

def get_historical_data(city, start_date, end_date):
    # Pass the query as a dict so requests URL-encodes values such as "New York".
    # Note: this endpoint returns the 5-day / 3-hour forecast; start and end
    # are sent as in the original request but may be ignored by the server.
    params = {
        "q": city,
        "appid": API_KEY,
        "start": start_date,
        "end": end_date,
        "units": "metric",
    }

    response = requests.get(BASE_URL, params=params, timeout=10)
    data = response.json()

    if response.status_code == 200:
        return data
    else:
        # Error responses normally carry a "message" field; fall back if absent
        print(f"Error fetching data for {city}: {data.get('message', 'unknown error')}")
        return None

def export_to_csv(data, filename):
    # Extract relevant data from JSON
    processed_data = []
    for entry in data['list']:
        date_time = entry['dt_txt']
        temperature = entry['main']['temp']
        humidity = entry['main']['humidity']
        pressure = entry['main']['pressure']
        wind_speed = entry['wind']['speed']

        processed_data.append({
            'date_time': date_time,
            'temperature': temperature,
            'humidity': humidity,
            'pressure': pressure,
            'wind_speed': wind_speed
        })

    # Create a DataFrame from the processed data
    df = pd.DataFrame(processed_data)

    # Export DataFrame to CSV
    df.to_csv(filename, index=False)
    print(f"Historical weather data exported to {filename}")

# Specify the cities and date ranges (paired by position)
cities = ["London", "New York", "Tokyo"]
start_dates = ["2023-05-01", "2023-06-01", "2023-07-01"]
end_dates = ["2023-05-31", "2023-06-30", "2023-07-31"]

# Retrieve and export historical data for each city and its date range
for city, start_date, end_date in zip(cities, start_dates, end_dates):
    historical_data = get_historical_data(city, start_date, end_date)
    if historical_data:
        filename = f"{city}_historical_weather_{start_date}_{end_date}.csv"
        export_to_csv(historical_data, filename)

This script defines two functions: get_historical_data for retrieving data from the API and export_to_csv for processing the JSON data and exporting it to CSV.

We specify the cities and corresponding date ranges in separate lists. The script then iterates over them using the zip function, which pairs the lists by position: London is requested for May, New York for June, and Tokyo for July. For each pairing, it retrieves the historical weather data using get_historical_data and exports it to a CSV file using export_to_csv.

The resulting CSV files will be named in the format {city}_historical_weather_{start_date}_{end_date}.csv, making it easy to identify the data for each city and date range.
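Because zip pairs the lists by position, each city is requested for only one date range. If you want every city for every date range instead, itertools.product generates the full set of combinations. The sketch below only builds the list of jobs and filenames. The build_jobs helper is a hypothetical name, and replacing spaces in city names with underscores is my own choice for tidier filenames, not something the original script does.

```python
from itertools import product

cities = ["London", "New York", "Tokyo"]
date_ranges = [
    ("2023-05-01", "2023-05-31"),
    ("2023-06-01", "2023-06-30"),
    ("2023-07-01", "2023-07-31"),
]

def build_jobs(cities, date_ranges):
    """Return one (city, start, end, filename) tuple per city/date-range combination."""
    jobs = []
    for city, (start, end) in product(cities, date_ranges):
        safe_city = city.replace(" ", "_")  # avoid spaces in filenames
        filename = f"{safe_city}_historical_weather_{start}_{end}.csv"
        jobs.append((city, start, end, filename))
    return jobs

jobs = build_jobs(cities, date_ranges)
for job in jobs:
    print(job)
```

Each tuple can then be fed to get_historical_data and export_to_csv in a loop. Keep in mind that three cities and three ranges produce nine requests rather than three, which matters for rate limits.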

Conclusion

Obtaining historical weather forecast data in CSV format using Python is a straightforward process thanks to the availability of weather APIs and powerful libraries like requests and pandas. By following the steps outlined in this article, you can create a reusable pipeline for retrieving and exporting historical weather data for various cities and date ranges.

However, it's important to keep in mind the limitations and usage restrictions of the chosen weather API. Make sure to review the API documentation, pricing plans, and terms of service before integrating it into your projects.
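One practical way to stay within a requests-per-minute limit is to space out the calls. Here is a minimal sketch of a generator that enforces a minimum interval between iterations. The name paced and the 1.5 second interval are placeholders I chose for illustration; take the real limit from your plan's documentation.

```python
import time

def paced(items, min_interval, sleep=time.sleep, clock=time.monotonic):
    """Yield items so that consecutive yields are at least min_interval seconds apart."""
    last = None
    for item in items:
        if last is not None:
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)  # pause only for the time still remaining
        last = clock()
        yield item

# Example: iterate over cities with at least 1.5 seconds between requests
for city in paced(["London", "New York", "Tokyo"], min_interval=1.5):
    print(f"Would request data for {city} here")
```

The sleep and clock parameters default to the standard library functions and exist only so the pacing logic can be tested without actually waiting.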

With the historical weather data in CSV format, you can perform various data analysis tasks, visualizations, and predictive modeling. This data can provide valuable insights for applications such as climate change analysis, agricultural planning, energy demand forecasting, and more.

Remember to handle API keys and sensitive information securely and follow best practices for data storage and processing when working with large datasets.
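A simple way to keep the key out of your source code is to read it from an environment variable. The sketch below assumes a variable named OPENWEATHER_API_KEY; that name and the load_api_key helper are my own choices for illustration, not something OpenWeatherMap requires.

```python
import os

def load_api_key(var_name="OPENWEATHER_API_KEY", environ=os.environ):
    """Read the API key from the environment and fail early if it is missing."""
    key = environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable to your API key.")
    return key

# Example with an explicit mapping so the snippet runs without real credentials
print(load_api_key(environ={"OPENWEATHER_API_KEY": "demo-key"}))
```

In the scripts above, API_KEY = load_api_key() would replace the hard-coded placeholder, and the key would be set in the shell (for example, export OPENWEATHER_API_KEY=... on Linux or macOS) rather than committed to version control.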

I hope this article has provided you with a solid foundation for retrieving historical weather forecast data using Python. Feel free to adapt the code examples to suit your specific requirements and explore additional features offered by the weather API.

Happy coding and data analysis!
