Spotipy stops extracting API data without giving an error code and renders credentials unusable #1085

omendo-galeo · 2024-04-09T10:19:12Z

I am trying to extract information about artists, albums and songs. After one iteration (approximately 1000 songs, 1000 albums and 1200 artists), the API stops responding. It does not return a 429 error code or anything else; the call just waits, so there is no exception for me to handle.

I have tried passing a custom requests session to keep it from staying in the retry state indefinitely, but it has no effect.

This code only extracts the track info:

import csv
import os
import json
import random
import spotipy
import time

from datetime import datetime
from dateutil import parser as date_parser
from dateutil.parser import ParserError
from tqdm import tqdm


def open_chunk_id(n_json):
    csv_file_path = f'output_chunks/chunk_{n_json}.csv'
    with open(csv_file_path, newline='') as csvfile:
        track_ids_reader = csv.reader(csvfile)
        next(track_ids_reader)  # Skip the header row
        track_ids = [row[0] for row in track_ids_reader]
    return track_ids


def fetch_track_info_from_csv(track_ids, access_token):
    track_data = []
    sp = spotipy.Spotify(auth=access_token)
    scrapped_time = datetime.now()
    for track_id in tqdm(track_ids, desc='Fetching track information', position=0):
        try:
            time.sleep(random.uniform(0, 0.5))
            track_info = sp.track(str(track_id))
            track_data.append({
                'Track ID': track_id,
                'Track Name': track_info['name'],
                'Artist(s)': [artist['name'] for artist in track_info['artists']],
                'Album': track_info['album']['name'],
                'Release Date': track_info['album']['release_date'],  
                'Popularity': track_info['popularity'],
                'Duration (ms)': track_info['duration_ms'],
                'Explicit': track_info['explicit'],
                'Track Number': track_info['track_number'],
                'URI': track_info['uri'].replace('spotify:track:', ''),
                'Album ID': track_info['album']['id'],
                'Artist ID(s)': [artist['id'] for artist in track_info['artists']],
                'Scrapped Time': scrapped_time.strftime('%Y-%m-%d %H:%M:%S')
            }) 
        except Exception as e:
            print(f"Failed to fetch information for track ID {track_id}: {str(e)}")
    
    return track_data


def save_track_data_to_json(track_data, output_json_file):
    # Restructure data to match JSON format
    json_data = []
    for track_info in track_data:
        try:
            parsed_date = date_parser.parse(track_info['Release Date'])
            if parsed_date.day == 1:
                # Case 1: Only year provided
                release_date = parsed_date.strftime('%Y-01-01')
            else:
                # Case 3: All info provided
                release_date = parsed_date.strftime('%Y-%m-%d')
        except (ParserError, ValueError):
            # ParserError subclasses ValueError, so one handler covers both cases.
            # parsed_date is unbound (or stale from the previous track) when parsing
            # fails, so take the year from the raw string instead.
            print(f"Failed to parse release date for track ID {track_info['Track ID']}. Setting release date to only year.")
            raw_date = str(track_info['Release Date'])
            has_year = len(raw_date) >= 4 and raw_date[:4].isdigit()
            release_date = f"{raw_date[:4]}-01-01" if has_year else None
        except Exception as e:
            print(f"Failed to parse release date for track ID {track_info['Track ID']}: {str(e)}")
            # Default to None if parsing fails
            release_date = None

        json_data.append({
            'Track ID': track_info['Track ID'],
            'Track Name': track_info['Track Name'],
            'Artist(s)': track_info['Artist(s)'],
            'Album': track_info['Album'],
            'Release Date': release_date,
            'Popularity': track_info['Popularity'],
            'Duration (ms)': track_info['Duration (ms)'],
            'Explicit': track_info['Explicit'],
            'Artist ID(s)': track_info['Artist ID(s)'],
            'Track Number': track_info['Track Number'],
            'Album ID': track_info['Album ID'],
            'URI': track_info['URI'],
            'Scrapped Time': track_info['Scrapped Time']
        })

    # Save data to JSON file
    with open(output_json_file, 'w', encoding='utf-8') as json_file:
        json.dump(json_data, json_file, ensure_ascii=False, indent=4)

    print(f"Track information saved to {output_json_file}")

I need to know how to keep getting data from the API without the credentials becoming unusable. By my calculations, I am not exceeding the rate limit that Spotify documents. Failing that, I need the API to return an error code so that I can handle the exception and switch between multiple credentials.

After one complete and successful execution, I re-launch the script, and at some point the API stops responding without returning any error. If I then re-launch the script with the same credentials, it does not start at all: they remain unusable for at least 24 hours.
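One way to reduce the number of requests, sketched here as a suggestion rather than a confirmed fix for this issue, is to fetch tracks in batches. Spotify's "Get Several Tracks" endpoint accepts up to 50 IDs per call, and spotipy exposes it as `sp.tracks()`. The helper names `chunked` and `fetch_tracks_batched` are hypothetical and not part of spotipy.

```python
from typing import Iterator, List, Sequence

MAX_IDS_PER_CALL = 50  # documented maximum for "Get Several Tracks"


def chunked(items: Sequence[str], size: int = MAX_IDS_PER_CALL) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def fetch_tracks_batched(sp, track_ids: Sequence[str]) -> List[dict]:
    """Fetch track objects in batches; `sp` is assumed to be a spotipy.Spotify client."""
    results = []
    for batch in chunked(track_ids):
        response = sp.tracks(batch)  # one HTTP request per batch of up to 50 IDs
        # IDs that Spotify cannot find come back as None entries, so skip them.
        results.extend(t for t in response['tracks'] if t is not None)
    return results
```

With this approach, 1000 track IDs would need 20 requests instead of 1000. Whether that is enough to stay under the limit is not something this thread establishes.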

[Screenshot, 2024-04-09 at 11:15:58]

Environment:

macOS Sonoma 14.2.1
Python 3.11.2
spotipy (latest version)
PyCharm


tedwenn · 2024-04-21T19:49:09Z

I'm having the same issue with sp.album_tracks(). It just stops without returning any error. When I try stepping into the function with the debugger, it never actually gets there. It just waits.

dieser-niko · 2024-04-22T07:58:36Z

If the script just freezes out of the blue, then it is probably because of Spotify's rate limit.

tedwenn · 2024-04-24T20:07:52Z

Yeah, it's a rate limit issue. When I try just calling the API directly using requests, I get a 429. Something about how spotipy is wrapping the API is stalling, rather than returning the 429.
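A plausible explanation for the stall, offered as an assumption and not confirmed in this thread, is that the retry layer underneath spotipy honors the `Retry-After` header of the 429 response and sleeps for that many seconds before retrying. If the header asks for a wait of many hours, that looks like a freeze. The sketch below shows how a direct caller could read the header and refuse to wait past a threshold. All names (`RateLimited`, `retry_after_seconds`, `check_response`, `max_wait`) are hypothetical, and only the delta-seconds form of `Retry-After` is handled, not the HTTP-date form.

```python
from typing import Mapping, Optional


class RateLimited(Exception):
    """Raised when the server asks for a wait longer than we accept."""

    def __init__(self, wait_seconds: int):
        super().__init__(f"rate limited, retry after {wait_seconds} s")
        self.wait_seconds = wait_seconds


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[int]:
    """Return Retry-After as whole seconds, or None if absent or not numeric."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("retry-after")
    if value is None:
        return None
    try:
        seconds = int(value.strip())
    except ValueError:
        return None  # e.g. the HTTP-date form, which this sketch ignores
    return seconds if seconds >= 0 else None


def check_response(status: int, headers: Mapping[str, str], max_wait: int = 60) -> int:
    """Return seconds to sleep before retrying; raise if the wait is too long."""
    if status != 429:
        return 0
    wait = retry_after_seconds(headers)
    if wait is None:
        return 1  # arbitrary fallback pause when the server gives no hint
    if wait > max_wait:
        raise RateLimited(wait)
    return wait
```

With a direct requests call like the one described above, this could be used as `check_response(resp.status_code, resp.headers)`, so that a very long wait surfaces as an exception instead of a silent sleep.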

dieser-niko · 2024-04-24T20:40:37Z

If you want to stop the freezing, you can do something similar to this: #766 (comment)

Edit: I've gotta admit, I don't know how the script will behave. My guess is that it's going to raise some kind of rate limit error.
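For illustration, here is a minimal sketch of one way to turn off automatic retries. It may or may not match the linked comment. It assumes the `retries`, `status_retries` and `requests_timeout` keyword arguments of the `spotipy.Spotify` constructor, and that a rate-limited call then raises an exception (typically `spotipy.SpotifyException` with an `http_status` attribute) instead of sleeping. The exact behavior can differ between spotipy versions, and `make_client` and `call_or_report` are hypothetical names.

```python
from typing import Any, Callable, Optional, Tuple


def make_client(access_token: str):
    """Build a spotipy client that does not retry on its own."""
    import spotipy  # imported lazily so call_or_report works without spotipy

    return spotipy.Spotify(
        auth=access_token,
        requests_timeout=10,  # timeout in seconds passed on to requests
        retries=0,            # no automatic retries overall
        status_retries=0,     # no automatic retries on 429/5xx responses
    )


def call_or_report(
    func: Callable[..., Any], *args: Any, **kwargs: Any
) -> Tuple[Any, Optional[Exception]]:
    """Call func; return (result, None) on success or (None, exception) on failure."""
    try:
        return func(*args, **kwargs), None
    except Exception as exc:  # a spotipy.SpotifyException would land here
        # http_status exists on spotipy.SpotifyException; other errors lack it.
        status = getattr(exc, "http_status", None)
        print(f"Request failed (HTTP status: {status}): {exc}")
        return None, exc
```

A call such as `track_info, err = call_or_report(sp.track, track_id)` would then let the loop stop or pause as soon as `err` is set, rather than hanging.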

omendo-galeo added the bug label Apr 9, 2024
