Chapter 6: When Things Go Wrong - The Exception Hierarchy

In Chapter 5: Authentication Handlers, we learned how to prove our identity to websites that require login or API keys. We assumed our requests would work if we provided the correct credentials.

But what happens when things don’t go as planned? The internet isn’t always reliable. Websites go down, networks have hiccups, URLs might be typed incorrectly, or servers might just be having a bad day. How does requests tell us about these problems, and how can we handle them gracefully in our code?

The Problem: Dealing with Request Failures

Imagine you’re building a script to check the weather using an online weather API. You use requests.get() to fetch the weather data. What could go wrong?

  • Your internet connection might be down.
  • The weather API website might be temporarily offline.
  • You might have mistyped the URL.
  • The website might take too long to respond (a timeout).
  • The website might respond, but with an error message (like “404 Not Found” or “500 Server Error”).

If any of these happen, requests will encounter an error. If you don’t prepare for these errors, your script might crash! We need a way to:

  1. Detect that an error occurred.
  2. Understand what kind of error it was (network issue? timeout? bad URL?).
  3. React appropriately (e.g., print a helpful message, try again later, use a default value).

The Solution: A Family Tree of Errors

Requests helps us by using a system of specific error types called exceptions. When something goes wrong, requests doesn’t just give up silently; it raises an exception.

Think of it like a doctor diagnosing an illness. A doctor doesn’t just say “You’re sick.” They give a specific diagnosis: “You have the flu,” or “You have a broken arm,” or “You have allergies.” Each diagnosis tells you something specific about the problem and how to treat it.

Requests does something similar with its exceptions. It has a main, general exception called requests.exceptions.RequestException. All other specific requests errors are “children” or “descendants” of this main one, forming an Exception Hierarchy (like a family tree).

Analogy: The “Sickness” Family Tree 🌳

  • RequestException (The Grandparent): This is the most general category, like saying “Sickness.” If you catch this, you catch any problem related to requests.
  • ConnectionError, Timeout, HTTPError, URLRequired (The Parents): These are more specific categories under RequestException.
    • ConnectionError is like saying “Infection.”
    • Timeout is like saying “Fatigue.”
    • HTTPError is like saying “External Injury.”
    • URLRequired is like saying “Genetic Condition” (problem with the input itself).
  • ConnectTimeout, ReadTimeout (The Children): These are even more specific.
    • ConnectTimeout (child of Timeout, and also of ConnectionError) is like “Trouble Falling Asleep.”
    • ReadTimeout (child of Timeout) is like “Waking Up Too Early.” Both are types of “Fatigue” (Timeout).

This hierarchy allows you to decide how specific you want to be when handling errors.
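
One way to see the family tree for yourself is to ask Python about it. The sketch below walks each class's method resolution order and keeps only the ancestors that belong to requests. The helper name requests_ancestors is made up for this illustration; it is not part of the library.

```python
import requests

def requests_ancestors(exc_class):
    """Return the names of exc_class's ancestors that are requests exceptions."""
    base = requests.exceptions.RequestException
    return [
        cls.__name__
        for cls in exc_class.__mro__[1:]  # skip the class itself
        if issubclass(cls, base)
    ]

for name in ("ConnectionError", "Timeout", "HTTPError", "ConnectTimeout", "ReadTimeout"):
    cls = getattr(requests.exceptions, name)
    print(f"{name}: {' -> '.join(requests_ancestors(cls))}")
```

Running it shows that every class ends at RequestException, and that ConnectTimeout lists both ConnectionError and Timeout among its ancestors.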

Key Members of the Exception Family

All requests exceptions live inside the requests.exceptions module. You usually import the main requests library and access them like requests.exceptions.ConnectionError.

Here are some of the most common ones you’ll encounter:

  • requests.exceptions.RequestException: The base exception. Catching this catches all exceptions listed below.
  • requests.exceptions.ConnectionError: Problems connecting to the server. This could be due to:
    • DNS failure (can’t find the server’s address).
    • Refused connection (server is there but not accepting connections).
    • Network is unreachable.
  • requests.exceptions.Timeout: The request took too long. This is a parent category for:
    • requests.exceptions.ConnectTimeout: Timeout occurred while trying to establish the connection. It also inherits from ConnectionError, so an except ConnectionError block will catch it too.
    • requests.exceptions.ReadTimeout: Timeout occurred after connecting, while waiting for the server to send data.
  • requests.exceptions.HTTPError: Raised when the server returns a “bad” status code (4xx for client errors like “404 Not Found”, or 5xx for server errors like “500 Internal Server Error”). Important: requests does not automatically raise this just because the status code is bad. You typically need to call the response.raise_for_status() method to trigger it.
  • requests.exceptions.TooManyRedirects: The request exceeded the maximum number of allowed redirects (30 by default, configurable through Session.max_redirects).
  • requests.exceptions.URLRequired: You tried to make a request without providing a URL.
  • requests.exceptions.MissingSchema: The URL was missing the scheme (like http:// or https://).
  • requests.exceptions.InvalidURL: The URL was malformed in some way.
  • requests.exceptions.InvalidSchema: The URL scheme was not recognized (e.g., ftp://, for which requests has no built-in connection adapter).
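
The URL-related exceptions in this list are easy to try out, because for inputs like the two below requests rejects the URL before it contacts any server. This is a minimal sketch; the function name check_url and the example URLs are assumptions chosen for illustration.

```python
import requests

def check_url(url):
    """Attempt a request and return a short label for URL-level problems."""
    try:
        requests.get(url, timeout=5)
    except requests.exceptions.MissingSchema:
        return "missing scheme"
    except requests.exceptions.InvalidSchema:
        return "unsupported scheme"
    except requests.exceptions.InvalidURL:
        return "malformed URL"
    except requests.exceptions.RequestException as e:
        return f"other requests error: {type(e).__name__}"
    return "ok"

print(check_url("example.com/weather"))     # no http:// or https:// prefix
print(check_url("ftp://example.com/file"))  # a scheme with no adapter mounted
```

Because these three classes are siblings rather than parent and child, the order of their except blocks does not matter here; only the catch-all RequestException block has to come last.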

Handling Exceptions: The try...except Block

How do we use this hierarchy in our code? We use Python’s try...except block.

  1. Put the code that might cause an error (like requests.get()) inside the try: block.
  2. Follow it with one or more except: blocks. Each except: block specifies the type of exception it’s designed to catch.

Example 1: Catching Any requests Error

Let’s try fetching a URL that doesn’t exist and catch the most general exception.

import requests

# A URL that might cause a connection error (e.g., non-existent domain)
bad_url = 'https://this-domain-probably-does-not-exist-asdfghjkl.com'
good_url = 'https://httpbin.org/get'

url_to_try = bad_url # Change to good_url to see success case

print(f"Trying to fetch: {url_to_try}")

try:
    response = requests.get(url_to_try, timeout=5) # Add timeout
    response.raise_for_status() # Check for 4xx/5xx errors
    print("Success! Status Code:", response.status_code)
    # Process the response... (e.g., print response.text)

except requests.exceptions.RequestException as e:
    # This will catch ANY error originating from requests
    print("\nOh no! A requests-related error occurred:")
    print(f"Error Type: {type(e).__name__}")
    print(f"Error Details: {e}")

print("\nScript continues after handling the error.")

Possible Output (if url_to_try = bad_url):

Trying to fetch: https://this-domain-probably-does-not-exist-asdfghjkl.com

Oh no! A requests-related error occurred:
Error Type: ConnectionError
Error Details: HTTPSConnectionPool(host='this-domain-probably-does-not-exist-asdfghjkl.com', port=443): Max retries exceeded with url: / (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x...>: Failed to resolve 'this-domain-probably-does-not-exist-asdfghjkl.com' ([Errno ...)"))

Script continues after handling the error.

Explanation:

  • We put requests.get() and response.raise_for_status() inside the try block.
  • If requests.get() fails (e.g., due to ConnectionError or Timeout), or if raise_for_status() detects a 4xx/5xx code (HTTPError), an exception is raised.
  • The except requests.exceptions.RequestException as e: block catches it because ConnectionError, Timeout, and HTTPError are all descendants of RequestException.
  • We print a helpful message and the details of the error (e). Crucially, the script doesn’t crash.
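
To see that a bad status code alone does not raise anything, the sketch below builds a Response object by hand so that it runs without a network connection. In real code requests.get() creates this object for you; the URL here is a placeholder.

```python
import requests

# Build a Response manually so the example runs offline.
response = requests.Response()
response.status_code = 404
response.reason = "Not Found"
response.url = "https://example.com/missing"

# Having a 404 response is not an error by itself...
print("response.ok:", response.ok)

# ...the exception only appears once we ask for it.
try:
    response.raise_for_status()
except requests.exceptions.HTTPError as e:
    print("HTTPError:", e)
    print("Status code from the exception:", e.response.status_code)
```

The response.ok attribute offers a way to check for 4xx/5xx codes without exceptions, which can be handy when a bad status is an expected outcome rather than a failure.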

Example 2: Catching Specific Errors

Sometimes, you want to react differently based on the type of error. Was it a temporary network glitch, or did the server permanently remove the page?

import requests

# URL that gives a 404 error
not_found_url = 'https://httpbin.org/status/404'
# URL that is slow and might time out
timeout_url = 'https://httpbin.org/delay/5' # Delays response by 5 seconds

url_to_try = timeout_url # Change to not_found_url to see HTTPError

print(f"Trying to fetch: {url_to_try}")

try:
    # Set a short timeout to demonstrate Timeout exception
    response = requests.get(url_to_try, timeout=2)
    response.raise_for_status() # Check for 4xx/5xx status codes
    print("Success! Status Code:", response.status_code)
    # Process response...

except requests.exceptions.ConnectTimeout as e:
    print("\nError: Could not connect to the server in time.")
    print(f"Details: {e}")
    # Maybe retry later?

except requests.exceptions.ReadTimeout as e:
    print("\nError: Server took too long to send data.")
    print(f"Details: {e}")
    # Maybe the server is slow, could try again?

except requests.exceptions.ConnectionError as e:
    print("\nError: Network problem (e.g., DNS error, refused connection).")
    print(f"Details: {e}")
    # Check internet connection?

except requests.exceptions.HTTPError as e:
    print("\nError: Bad HTTP status code received from server.")
    print(f"Status Code: {e.response.status_code}")
    print(f"Details: {e}")
    # Was it a 404 Not Found? 500 Server Error?

except requests.exceptions.RequestException as e:
    # Catch any other requests error that wasn't specifically handled above
    print("\nAn unexpected requests error occurred:")
    print(f"Error Type: {type(e).__name__}")
    print(f"Details: {e}")

print("\nScript continues...")

Possible Output (if url_to_try = timeout_url):

Trying to fetch: https://httpbin.org/delay/5

Error: Server took too long to send data.
Details: HTTPSConnectionPool(host='httpbin.org', port=443): Read timed out. (read timeout=2)

Script continues...

Possible Output (if url_to_try = not_found_url):

Trying to fetch: https://httpbin.org/status/404

Error: Bad HTTP status code received from server.
Status Code: 404
Details: 404 Client Error: NOT FOUND for url: https://httpbin.org/status/404

Script continues...

Explanation:

  • We have multiple except blocks, ordered from most specific (ConnectTimeout, ReadTimeout) to more general (ConnectionError, HTTPError) and finally the catch-all RequestException.
  • Python tries the except blocks in order. When an exception occurs, the first matching block is executed.
  • If a ReadTimeout occurs, the except requests.exceptions.ReadTimeout block handles it and no later block runs, even though ReadTimeout is also a RequestException. The order matters most for ConnectTimeout: it is also a ConnectionError, so if the ConnectionError block came first, it would catch connect timeouts and the ConnectTimeout block would never run.
  • This allows us to provide specific feedback or recovery logic for different error scenarios.
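
As one possible form of recovery logic, the sketch below retries on timeouts and connection problems but lets HTTP errors through immediately. Everything about it is an assumption made for illustration: the name fetch_with_retry, the retry count, the delays, and the get and sleep parameters, which exist only so the function can be exercised without a network.

```python
import time
import requests

def fetch_with_retry(url, attempts=3, delay=1.0, get=requests.get, sleep=time.sleep):
    """Retry on timeouts and connection problems; do not retry HTTP errors."""
    for attempt in range(1, attempts + 1):
        try:
            # timeout may be a (connect, read) tuple, in seconds
            response = get(url, timeout=(3.05, 10))
            response.raise_for_status()
            return response
        except (requests.exceptions.Timeout, requests.exceptions.ConnectionError) as e:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            print(f"Attempt {attempt} failed ({type(e).__name__}); retrying...")
            sleep(delay * attempt)  # wait a little longer each time
        # HTTPError is deliberately not caught, so it propagates at once
```

Whether a 5xx response deserves a retry as well is a design choice that depends on the API; this sketch treats every HTTPError as final.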

Inheritance Benefit: If you write except requests.exceptions.Timeout as e:, this block will catch both ConnectTimeout and ReadTimeout because they inherit from Timeout.
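
This can be checked offline by raising the exceptions ourselves instead of waiting for a real network failure. The sketch below is only a demonstration; the function name which_handler and the "simulated" messages are made up.

```python
import requests

def which_handler(exc):
    """Raise exc and report which except clause caught it."""
    try:
        raise exc
    except requests.exceptions.Timeout:
        return "Timeout handler"
    except requests.exceptions.ConnectionError:
        return "ConnectionError handler"
    except requests.exceptions.RequestException:
        return "catch-all handler"

for exc in (
    requests.exceptions.ConnectTimeout("simulated"),
    requests.exceptions.ReadTimeout("simulated"),
    requests.exceptions.ConnectionError("simulated"),
    requests.exceptions.TooManyRedirects("simulated"),
):
    print(f"{type(exc).__name__:>16} -> {which_handler(exc)}")
```

Both timeout types land in the Timeout handler. Because ConnectTimeout is also a ConnectionError, swapping the first two except clauses would send it to the ConnectionError handler instead.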

How It Works Internally: Wrapping Lower-Level Errors

Requests doesn’t handle network connections directly. It uses a lower-level library called urllib3 under the hood (managed via Transport Adapters). When urllib3 encounters a network problem (like a connection error or timeout), it raises its own specific exceptions (e.g., urllib3.exceptions.MaxRetryError, urllib3.exceptions.NewConnectionError, urllib3.exceptions.ReadTimeoutError).

Requests catches these urllib3 exceptions inside its Transport Adapters (specifically, the HTTPAdapter.send method) and then raises its own corresponding exception from the requests.exceptions hierarchy. This simplifies things for you – you only need to worry about catching requests exceptions, not the underlying urllib3 ones.

sequenceDiagram
    participant UserCode as Your Code
    participant ReqAPI as requests.get()
    participant Adapter as HTTPAdapter
    participant Urllib3 as urllib3 library
    participant Network

    UserCode->>ReqAPI: requests.get(bad_url, timeout=1)
    ReqAPI->>Adapter: send(prepared_request)
    Adapter->>Urllib3: urlopen(method, url, ..., timeout=1)
    Urllib3->>Network: Attempt connection...
    Network-->>Urllib3: Fails (e.g., DNS lookup fails)
    Urllib3->>Urllib3: Raise NewConnectionError, wrapped in MaxRetryError
    Urllib3-->>Adapter: Propagate MaxRetryError
    Adapter->>Adapter: Catch MaxRetryError
    Adapter->>Adapter: Raise requests.exceptions.ConnectionError(original_error)
    Adapter-->>ReqAPI: Propagate ConnectionError
    ReqAPI-->>UserCode: Propagate ConnectionError
    UserCode->>UserCode: Catch requests.exceptions.ConnectionError
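
When debugging, it can help to look at the urllib3 error that was wrapped. The sketch below simulates the wrapping by hand so that it runs offline. Reading e.args[0] relies on the adapter passing the original error as the first constructor argument, which is an implementation detail rather than a documented guarantee, so treat this as a diagnostic aid only.

```python
import requests
from urllib3.exceptions import MaxRetryError, NewConnectionError

# Simulate the chain: NewConnectionError inside MaxRetryError
# inside requests' own ConnectionError.
root_cause = NewConnectionError(None, "Failed to establish a new connection")
retry_error = MaxRetryError(None, "/", reason=root_cause)

try:
    raise requests.exceptions.ConnectionError(retry_error)
except requests.exceptions.ConnectionError as e:
    inner = e.args[0]                        # the urllib3 error given to the constructor
    reason = getattr(inner, "reason", None)  # MaxRetryError keeps its cause here
    print("requests error:       ", type(e).__name__)
    print("wrapped urllib3 error:", type(inner).__name__)
    print("root cause:           ", type(reason).__name__)
```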

Let’s look at the definitions in requests/exceptions.py. You can see the inheritance structure clearly:

# File: requests/exceptions.py (Simplified View)

from urllib3.exceptions import HTTPError as BaseHTTPError

# The base class for all requests exceptions
class RequestException(IOError):
    """There was an ambiguous exception that occurred while handling your request."""
    # ... (stores request/response objects) ...

# Specific exceptions inheriting from RequestException or other requests exceptions
class HTTPError(RequestException):
    """An HTTP error occurred.""" # Typically raised by response.raise_for_status()

class ConnectionError(RequestException):
    """A Connection error occurred."""

class ProxyError(ConnectionError): # Inherits from ConnectionError
    """A proxy error occurred."""

class SSLError(ConnectionError): # Inherits from ConnectionError
    """An SSL error occurred."""

class Timeout(RequestException): # Inherits directly from RequestException
    """The request timed out."""

class ConnectTimeout(ConnectionError, Timeout): # Inherits from BOTH ConnectionError and Timeout!
    """The request timed out while trying to connect to the remote server."""

class ReadTimeout(Timeout): # Inherits from Timeout
    """The server did not send any data in the allotted amount of time."""

class URLRequired(RequestException):
    """A valid URL is required to make a request."""

class TooManyRedirects(RequestException):
    """Too many redirects."""

# ... other specific errors like MissingSchema, InvalidURL, etc. ...

# Some exceptions might also inherit from standard Python errors
class JSONDecodeError(RequestException, ValueError): # Simplified; see note below
    """Couldn't decode the text into json"""
    # In the real source this class inherits from InvalidJSONError (a RequestException)
    # and from the json library's JSONDecodeError (itself a ValueError)
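
One practical consequence of these definitions is that RequestException derives from IOError, which is the same class as OSError in Python 3. A broad except OSError written for file handling will therefore also catch requests errors. The sketch below demonstrates this with hand-raised exceptions; the function name caught_as_oserror is made up for the example.

```python
import requests

def caught_as_oserror(exc):
    """Raise exc and report whether an `except OSError` clause catches it."""
    try:
        raise exc
    except OSError:
        return True
    except Exception:
        return False

print(caught_as_oserror(requests.exceptions.HTTPError("simulated")))  # a requests error
print(caught_as_oserror(FileNotFoundError("simulated")))              # a file error
print(caught_as_oserror(ValueError("simulated")))                     # unrelated to OSError
```

If a try block mixes file access and HTTP calls, listing the requests exceptions before OSError keeps the two kinds of failure apart.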

And here’s a simplified view of how requests/adapters.py (HTTPAdapter.send) catches urllib3 errors and raises requests errors:

# File: requests/adapters.py (Simplified View in HTTPAdapter.send method)

from urllib3.exceptions import (
    MaxRetryError, ConnectTimeoutError, NewConnectionError, ResponseError,
    ProxyError as _ProxyError, SSLError as _SSLError, ReadTimeoutError,
    ProtocolError, ClosedPoolError, InvalidHeader as _InvalidHeader
)
from ..exceptions import (
    ConnectionError, ConnectTimeout, ReadTimeout, SSLError, ProxyError,
    RetryError, InvalidHeader, RequestException # And others
)

class HTTPAdapter(BaseAdapter):
    def send(self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None):
        # ... (prepare connection using self.get_connection_with_tls_context) ...
        conn = self.get_connection_with_tls_context(...)
        # ... (verify certs, prepare URL, add headers) ...

        try:
            # === Make the actual request using urllib3 ===
            resp = conn.urlopen(
                method=request.method,
                url=url,
                # ... other args like body, headers ...
                retries=self.max_retries,
                timeout=timeout,
            )

        # === Catch specific urllib3 errors and raise corresponding requests errors ===

        except (ProtocolError, OSError) as err: # General network/protocol errors
            raise ConnectionError(err, request=request)

        except MaxRetryError as e: # urllib3 retried but failed
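            if isinstance(e.reason, ConnectTimeoutError):
                # NewConnectionError subclasses ConnectTimeoutError, so exclude it here;
                # otherwise refused connections and DNS failures would become ConnectTimeout
                if not isinstance(e.reason, NewConnectionError):
                    raise ConnectTimeout(e, request=request)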
            if isinstance(e.reason, ResponseError): # Errors related to retry logic
                raise RetryError(e, request=request)
            if isinstance(e.reason, _ProxyError):
                raise ProxyError(e, request=request)
            if isinstance(e.reason, _SSLError):
                raise SSLError(e, request=request)
            # Fallback for other retry errors
            raise ConnectionError(e, request=request)

        except ClosedPoolError as e: # Connection pool was closed
            raise ConnectionError(e, request=request)

        except _ProxyError as e: # Direct proxy error
            raise ProxyError(e)

        except (_SSLError, ReadTimeoutError, _InvalidHeader) as e: # Other specific errors
            if isinstance(e, _SSLError):
                raise SSLError(e, request=request)
            elif isinstance(e, ReadTimeoutError):
                raise ReadTimeout(e, request=request)
            elif isinstance(e, _InvalidHeader):
                raise InvalidHeader(e, request=request)
            else:
                # Should not happen, but raise generic RequestException if needed
                raise RequestException(e, request=request)

        # ... (build and return the Response object if successful) ...
        return self.build_response(request, resp)

This wrapping makes your life easier by providing a consistent set of exceptions (requests.exceptions) to handle, regardless of the underlying urllib3 details.
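
The translation step can be tried in isolation with a small stand-alone function. This is a sketch of the pattern covering only a subset of the cases, not the adapter's actual code; the function name translate and the sample errors are assumptions.

```python
import requests
from urllib3.exceptions import (
    ConnectTimeoutError, MaxRetryError, NewConnectionError, ReadTimeoutError,
)

def translate(err):
    """Map a urllib3 error to a requests exception instance (illustrative subset)."""
    exc = requests.exceptions
    if isinstance(err, MaxRetryError):
        reason = err.reason
        # NewConnectionError subclasses ConnectTimeoutError, so rule it out first
        if isinstance(reason, ConnectTimeoutError) and not isinstance(reason, NewConnectionError):
            return exc.ConnectTimeout(err)
        return exc.ConnectionError(err)
    if isinstance(err, ReadTimeoutError):
        return exc.ReadTimeout(err)
    return exc.RequestException(err)

samples = [
    MaxRetryError(None, "/", reason=ConnectTimeoutError("connect timed out")),
    MaxRetryError(None, "/", reason=NewConnectionError(None, "connection refused")),
    ReadTimeoutError(None, "/", "Read timed out."),
]
for err in samples:
    print(f"{type(err).__name__} -> {type(translate(err)).__name__}")
```

The two MaxRetryError samples end up as different requests exceptions because the function inspects the reason attribute, just as the MaxRetryError branch in the adapter does.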

Conclusion

You’ve learned about the requests Exception Hierarchy – a family tree of error types that requests raises when things go wrong.

  • You saw that all requests exceptions inherit from the base requests.exceptions.RequestException.
  • You learned about key specific exceptions like ConnectionError, Timeout (and its children ConnectTimeout, ReadTimeout), and HTTPError (raised by response.raise_for_status()).
  • You practiced using try...except blocks to catch both general (RequestException) and specific exceptions, allowing for tailored error handling.
  • You understood that requests wraps lower-level errors (from urllib3) into its own exception types, simplifying error handling for you.

Understanding this hierarchy is crucial for writing robust Python code that can gracefully handle the inevitable problems that occur when dealing with networks and web services.

So far, we’ve mostly used the default way requests handles connections. But what if we need more control over how connections are made, maybe to configure retries differently, or use different SSL settings? That’s where Transport Adapters come in.

Next: Chapter 7: Transport Adapters

