Python Project – How to Create a Horoscope API with Beautiful Soup and Flask

As a full-stack developer, I'm always looking for interesting projects to practice my skills and create something useful. Recently, I had the idea to build a horoscope API that would allow users to retrieve their daily, weekly, or monthly horoscope programmatically. In this in-depth tutorial, I'll walk you through the process of creating a horoscope API using Python, the Beautiful Soup library for web scraping, and the Flask framework for building the API.

We'll cover everything from setting up the project environment to deploying the final API so it can be used by other developers. Along the way, I'll share best practices and tips to help you create a robust and reliable API. Let's get started!

Setting up the Project Environment

To begin, we'll create a new directory for our project and set up a virtual environment to isolate our project dependencies. This is an important step to ensure that our project has all the required libraries without interfering with other Python projects on our system.

First, create a new directory and navigate into it:

mkdir horoscope-api
cd horoscope-api

Next, create a virtual environment using Python's built-in venv module:

python -m venv env

Activate the virtual environment:

  • On Windows:
    env\Scripts\activate
  • On macOS and Linux:
    source env/bin/activate

Now, we can install the required libraries for our project:

pip install beautifulsoup4 requests flask flask-restx python-decouple

Here's a brief overview of each library:

  • BeautifulSoup4: A library for parsing HTML and XML documents, which we'll use for web scraping
  • Requests: A simple HTTP library for making web requests
  • Flask: A lightweight web framework for building APIs
  • Flask-RESTX: An extension for Flask that adds support for building REST APIs with Swagger documentation
  • Python-decouple: A library for separating configuration settings from the codebase

Project Structure

Before we dive into the code, let's define a clear structure for our project. We'll create separate files for configuration, routes, and utility functions. This modular approach makes the codebase more maintainable and easier to understand.

Here's the project structure we'll use:

horoscope-api/
├── .env
├── config.py
├── main.py
└── core/
    ├── __init__.py
    ├── routes.py
    └── utils.py

  • .env: This file will store our environment variables and configuration settings
  • config.py: This file will define our application configuration classes
  • main.py: This will be the entry point of our Flask application
  • core/: This directory will contain the core components of our application
    • __init__.py: This file will initialize our Flask application and API
    • routes.py: This file will define our API routes and resource classes
    • utils.py: This file will contain utility functions for web scraping and data processing

Scraping Horoscope Data with Beautiful Soup

To provide horoscope data through our API, we first need to scrape it from a reliable source. For this tutorial, we'll be scraping data from Horoscope.com.

We'll create separate functions for scraping daily, weekly, and monthly horoscopes based on the zodiac sign and timeframe provided by the user.

Open the core/utils.py file and add the following code:

import requests
from bs4 import BeautifulSoup


def _scrape_horoscope(url: str) -> str:
    # Fetch the page; the timeout keeps a slow upstream from hanging the API.
    response = requests.get(url, timeout=10)
    # Fail loudly on HTTP errors instead of parsing an error page.
    response.raise_for_status()
    soup = BeautifulSoup(response.content, "html.parser")
    # The horoscope text is the first paragraph inside div.main-horoscope.
    return soup.find("div", class_="main-horoscope").p.text


def get_horoscope_by_day(zodiac_sign: int, day: str) -> str:
    # Relative days (today, yesterday, tomorrow) are part of the page name.
    url = f"https://www.horoscope.com/us/horoscopes/general/horoscope-general-daily-{day}.aspx?sign={zodiac_sign}"
    if "-" in day:
        # Specific dates (YYYY-MM-DD) use the archive page, which expects YYYYMMDD.
        day = day.replace("-", "")
        url = f"https://www.horoscope.com/us/horoscopes/general/horoscope-archive.aspx?sign={zodiac_sign}&laDate={day}"
    return _scrape_horoscope(url)


def get_horoscope_by_week(zodiac_sign: int) -> str:
    url = f"https://www.horoscope.com/us/horoscopes/general/horoscope-general-weekly.aspx?sign={zodiac_sign}"
    return _scrape_horoscope(url)


def get_horoscope_by_month(zodiac_sign: int) -> str:
    url = f"https://www.horoscope.com/us/horoscopes/general/horoscope-general-monthly.aspx?sign={zodiac_sign}"
    return _scrape_horoscope(url)

In each function, we construct the URL based on the zodiac sign and timeframe (daily, weekly, or monthly). For daily horoscopes, we handle both current dates (today, yesterday, tomorrow) and specific dates in the format YYYY-MM-DD.
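To make the two daily URL forms easier to see, here is a minimal sketch that isolates the URL construction in a pure function. The names BASE_URL and build_daily_url are hypothetical helpers introduced only for illustration; the URL patterns themselves are the ones used in utils.py.

```python
# Illustrative only: BASE_URL and build_daily_url are not part of the tutorial's code.
BASE_URL = "https://www.horoscope.com/us/horoscopes/general"


def build_daily_url(zodiac_sign: int, day: str) -> str:
    """Return the page URL for a relative day or a YYYY-MM-DD date."""
    if "-" in day:
        # Specific dates go to the archive page, which takes the date as YYYYMMDD.
        compact_date = day.replace("-", "")
        return f"{BASE_URL}/horoscope-archive.aspx?sign={zodiac_sign}&laDate={compact_date}"
    # "today", "yesterday" and "tomorrow" are embedded in the page name.
    return f"{BASE_URL}/horoscope-general-daily-{day}.aspx?sign={zodiac_sign}"


print(build_daily_url(1, "today"))
print(build_daily_url(1, "2021-12-25"))
```

Keeping this logic free of network calls means it can be checked without hitting Horoscope.com.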

We then make a GET request to the URL using the requests library and parse the HTML content using Beautiful Soup. Finally, we locate the horoscope text within the HTML using the appropriate CSS selector and return it.
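The parsing step can be tried offline. The sketch below runs the same find call against a small, invented HTML snippet (the live page's markup may differ or change over time) and returns None instead of raising when the expected element is missing. The function name extract_horoscope and the sample markup are assumptions made for this example.

```python
from typing import Optional

from bs4 import BeautifulSoup

# Invented sample markup for illustration; not copied from the real site.
SAMPLE_HTML = """
<html><body>
  <div class="main-horoscope">
    <p>A calm day is ahead.</p>
  </div>
</body></html>
"""


def extract_horoscope(html: str) -> Optional[str]:
    """Return the first paragraph inside div.main-horoscope, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_="main-horoscope")
    # find() returns None when nothing matches, so guard before touching .p
    if container is None or container.p is None:
        return None
    return container.p.get_text().strip()


print(extract_horoscope(SAMPLE_HTML))
print(extract_horoscope("<html><body></body></html>"))
```

A guard like this is worth considering in the real scraper, because a layout change on the source site would otherwise surface as an AttributeError.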

Building the API with Flask and Flask-RESTX

Now that we have our web scraping functions in place, let's build the API endpoints using Flask and Flask-RESTX.

Open the core/__init__.py file and add the following code:

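from flask import Flask
from decouple import config
from flask_restx import Api

app = Flask(__name__)
# APP_SETTINGS holds the import path of a configuration class in config.py.
app.config.from_object(config("APP_SETTINGS"))

api = Api(
    app,
    version='1.0',
    title='Horoscope API',
    description='Get daily, weekly, and monthly horoscopes using this simple API',
    doc='/'  # serve the Swagger UI at the site root
)

# Imported last because routes.py needs the api object defined above.
from core import routes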

Here, we create a Flask application instance and load the configuration settings from the APP_SETTINGS environment variable defined in the .env file. We then initialize the Flask-RESTX API with the appropriate metadata.
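The contents of config.py and .env are not shown in this tutorial, so the following is only one plausible layout. The class names Config, DevelopmentConfig, and ProductionConfig are assumptions; Flask's from_object accepts an import path string and copies the uppercase attributes of the class it points to.

```python
# config.py (hypothetical contents; the class names are assumptions)
#
# A matching .env file would then contain a line such as:
#   APP_SETTINGS=config.DevelopmentConfig


class Config:
    """Settings shared by every environment."""
    DEBUG = False
    TESTING = False


class DevelopmentConfig(Config):
    """Local development: enable debug features."""
    DEBUG = True


class ProductionConfig(Config):
    """Deployed environment: keep debug features off."""
    DEBUG = False
```

With a layout like this, switching environments only requires changing the APP_SETTINGS value, not the code.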

Next, open the core/routes.py file and add the following code:

from flask_restx import Resource, reqparse
from core import api
from core.utils import get_horoscope_by_day, get_horoscope_by_week, get_horoscope_by_month

ns = api.namespace("horoscopes", description="Horoscope operations")

# Parser shared by every endpoint; location='args' reads from the query string only.
parser = reqparse.RequestParser()
parser.add_argument('sign', type=str, required=True, location='args', help='Zodiac sign')

# The daily endpoint additionally requires a day.
day_parser = parser.copy()
day_parser.add_argument('day', type=str, required=True, location='args', help='Date in the format YYYY-MM-DD or today, yesterday, tomorrow')

# Sign names mapped to the numeric IDs that Horoscope.com uses in its URLs.
ZODIAC_SIGNS = {
    'aries': 1, 'taurus': 2, 'gemini': 3, 'cancer': 4, 'leo': 5, 'virgo': 6,
    'libra': 7, 'scorpio': 8, 'sagittarius': 9, 'capricorn': 10, 'aquarius': 11, 'pisces': 12
}

@ns.route('/daily')
class DailyHoroscopeResource(Resource):
    @ns.expect(day_parser)
    def get(self):
        args = day_parser.parse_args()
        zodiac_sign = args['sign'].lower()
        day = args['day']

        if zodiac_sign not in ZODIAC_SIGNS:
            # Flask-RESTX serializes a returned dict (plus status code) to JSON.
            return {"error": "Invalid zodiac sign"}, 400

        sign_number = ZODIAC_SIGNS[zodiac_sign]
        horoscope_data = get_horoscope_by_day(sign_number, day)

        return {"horoscope": horoscope_data}

@ns.route('/weekly')
class WeeklyHoroscopeResource(Resource):
    @ns.expect(parser)
    def get(self):
        args = parser.parse_args()
        zodiac_sign = args['sign'].lower()

        if zodiac_sign not in ZODIAC_SIGNS:
            return {"error": "Invalid zodiac sign"}, 400

        sign_number = ZODIAC_SIGNS[zodiac_sign]
        horoscope_data = get_horoscope_by_week(sign_number)

        return {"horoscope": horoscope_data}

@ns.route('/monthly')
class MonthlyHoroscopeResource(Resource):
    @ns.expect(parser)
    def get(self):
        args = parser.parse_args()
        zodiac_sign = args['sign'].lower()

        if zodiac_sign not in ZODIAC_SIGNS:
            return {"error": "Invalid zodiac sign"}, 400

        sign_number = ZODIAC_SIGNS[zodiac_sign]
        horoscope_data = get_horoscope_by_month(sign_number)

        return {"horoscope": horoscope_data}

In this code, we define a namespace for our horoscope endpoints and create request parsers to validate and extract the query parameters (sign and day). We also define a dictionary mapping zodiac signs to their corresponding numbers used by the Horoscope.com website.

For each endpoint (daily, weekly, monthly), we create a resource class that handles GET requests. We parse the query parameters, validate the zodiac sign, and call the appropriate web scraping function from utils.py. Finally, we return the horoscope data as a JSON response.
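The resource classes validate the zodiac sign but pass the day value straight through to the scraper. As a possible addition, here is a sketch of a day validator; normalize_day and RELATIVE_DAYS are hypothetical names, and the accepted values simply follow the help text of the day argument.

```python
import re
from datetime import datetime
from typing import Optional

RELATIVE_DAYS = {"today", "yesterday", "tomorrow"}
DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def normalize_day(day: str) -> Optional[str]:
    """Return a cleaned day value, or None if it is not an accepted form."""
    candidate = day.strip().lower()
    if candidate in RELATIVE_DAYS:
        return candidate
    # Require zero-padded YYYY-MM-DD so the scraper can turn it into YYYYMMDD.
    if not DATE_PATTERN.fullmatch(candidate):
        return None
    try:
        # strptime rejects impossible dates such as 2021-02-30.
        datetime.strptime(candidate, "%Y-%m-%d")
    except ValueError:
        return None
    return candidate


print(normalize_day("Today"))
print(normalize_day("2021-02-30"))
```

In the daily resource, a None result could be answered with a 400 response in the same way as an invalid sign.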

Running and Testing the API

We're now ready to run our Flask application and test the API endpoints. Open the main.py file and add the following code:

from core import app

if __name__ == '__main__':
    app.run(debug=True)

This code imports the Flask application instance from the core package and runs it in debug mode.

To start the application, run the following command in your terminal:

python main.py

You should see output similar to the following:

 * Serving Flask app "core" (lazy loading)
 * Environment: development
 * Debug mode: on
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 123-456-789

Open your web browser and navigate to http://localhost:5000/. You should see the Swagger documentation for your API, which allows you to interactively test the endpoints.

[Image: Horoscope API Swagger documentation]

Try making requests to the different endpoints with various zodiac signs and dates to ensure everything is working as expected.
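Besides the Swagger page, the endpoints can be called from a script. The sketch below uses only the standard library and assumes the development server is running at http://localhost:5000; build_request_url and fetch_horoscope are hypothetical helper names. The /horoscopes/... paths follow from the namespace and routes defined in routes.py.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"  # assumes the local development server


def build_request_url(timeframe: str, sign: str, day: Optional[str] = None) -> str:
    """Build the URL for /horoscopes/daily, /weekly or /monthly."""
    params = {"sign": sign}
    if day is not None:
        params["day"] = day
    return f"{BASE_URL}/horoscopes/{timeframe}?{urlencode(params)}"


def fetch_horoscope(timeframe: str, sign: str, day: Optional[str] = None) -> dict:
    """Call the API and decode the JSON body (requires the server to be running)."""
    with urlopen(build_request_url(timeframe, sign, day), timeout=10) as response:
        return json.load(response)


print(build_request_url("daily", "aries", "today"))
```

With the server running, fetch_horoscope("daily", "aries", "today") should return a dictionary with a horoscope key.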

Deploying the API

To make your horoscope API accessible to other developers, you'll need to deploy it to a public server. There are many options for deploying Flask applications, such as Heroku, AWS, or DigitalOcean.

For this tutorial, we'll use Heroku, a popular platform for deploying web applications. Follow these steps to deploy your API:

  1. Create a Heroku account at https://signup.heroku.com/
  2. Install the Heroku CLI by following the instructions at https://devcenter.heroku.com/articles/heroku-cli
  3. Create a new file named Procfile in your project root directory with the following content:
    web: gunicorn main:app
  4. Create a new file named requirements.txt in your project root directory and add the following dependencies:
    beautifulsoup4==4.9.3
    requests==2.25.1
    flask==1.1.2
    flask-restx==0.3.0
    python-decouple==3.4
    gunicorn==20.1.0
  5. Initialize a Git repository in your project root directory and commit your code:
    git init
    git add .
    git commit -m "Initial commit"
  6. Log in to Heroku using the CLI:
    heroku login
  7. Create a new Heroku app:
    heroku create your-app-name

    Replace your-app-name with a unique name for your application.

  8. Push your code to Heroku:
    git push heroku main
  9. Open your deployed application in a web browser:
    heroku open

Your horoscope API is now deployed and accessible to other developers!

Conclusion

In this tutorial, we've covered the process of creating a horoscope API using Python, Beautiful Soup for web scraping, and Flask for building the API. We've also deployed our API to Heroku, making it accessible to other developers.

You can now extend this API by adding more features, such as error handling, rate limiting, or caching. You can also explore other web scraping techniques and data sources to provide more comprehensive horoscope data.
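As one example of such an extension, here is a minimal sketch of a time-based cache that could sit in front of the scraping functions, so repeated requests for the same sign and timeframe do not hit the source site every time. The TTLCache class is an assumption made for this example, not part of the tutorial's code, and it is not thread-safe.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Keep fetched values for ttl_seconds before fetching them again."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._entries: Dict[tuple, Tuple[float, str]] = {}

    def get_or_fetch(self, key: tuple, fetch: Callable[[], str]) -> str:
        now = self.clock()
        entry = self._entries.get(key)
        # Reuse the stored value while it is younger than the TTL.
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value


# Usage with a stand-in fetch function instead of a real scraper call.
cache = TTLCache(ttl_seconds=3600)
text = cache.get_or_fetch(("weekly", 1), lambda: "sample horoscope text")
print(text)
```

In the API, the fetch argument could be a lambda wrapping a call such as get_horoscope_by_week(sign_number), keyed by timeframe and sign number.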

Remember to always respect the terms of service and robots.txt files of the websites you scrape, and give credit to the original sources of the data.

I hope this tutorial has been helpful and informative. Happy coding!
