Spotipy stops extracting API data without giving an error code and renders credentials unusable #1085

Open
omendo-galeo opened this issue Apr 9, 2024 · 4 comments

@omendo-galeo

I am trying to extract information about artists, albums and songs. After one full run (approximately 1,000 songs, 1,000 albums and 1,200 artists), the API stops responding. It does not return a 429 or any other error code; the call just waits, so there is no exception for me to handle.

I have tried passing a custom requests session to keep it from sitting in the retry state indefinitely, but that has no effect.

This code only extracts track info:

import csv
import os
import json
import random
import spotipy
import time

from datetime import datetime
from dateutil import parser as date_parser
from dateutil.parser import ParserError
from tqdm import tqdm


def open_chunk_id(n_json):
    csv_file_path = f'output_chunks/chunk_{n_json}.csv'
    with open(csv_file_path, newline='') as csvfile:
        track_ids_reader = csv.reader(csvfile)
        next(track_ids_reader)  # Skip the header row
        track_ids = [row[0] for row in track_ids_reader]
    return track_ids


def fetch_track_info_from_csv(track_ids, access_token):
    track_data = []
    sp = spotipy.Spotify(auth=access_token)
    scrapped_time = datetime.now()
    for track_id in tqdm(track_ids, desc='Fetching track information', position=0):
        try:
            time.sleep(random.uniform(0, 0.5))
            track_info = sp.track(str(track_id))
            track_data.append({
                'Track ID': track_id,
                'Track Name': track_info['name'],
                'Artist(s)': [artist['name'] for artist in track_info['artists']],
                'Album': track_info['album']['name'],
                'Release Date': track_info['album']['release_date'],  
                'Popularity': track_info['popularity'],
                'Duration (ms)': track_info['duration_ms'],
                'Explicit': track_info['explicit'],
                'Track Number': track_info['track_number'],
                'URI': track_info['uri'].replace('spotify:track:', ''),
                'Album ID': track_info['album']['id'],
                'Artist ID(s)': [artist['id'] for artist in track_info['artists']],
                'Scrapped Time': scrapped_time.strftime('%Y-%m-%d %H:%M:%S')
            }) 
        except Exception as e:
            print(f"Failed to fetch information for track ID {track_id}: {str(e)}")
    
    return track_data


def save_track_data_to_json(track_data, output_json_file):
    # Restructure data to match JSON format
    json_data = []
    for track_info in track_data:
        raw_date = track_info['Release Date']
        try:
            # Spotify returns 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD'. Passing default=
            # pins a missing month/day to 01; dateutil would otherwise fill the
            # gaps from today's date.
            parsed_date = date_parser.parse(raw_date, default=datetime(1900, 1, 1))
            release_date = parsed_date.strftime('%Y-%m-%d')
        except (ParserError, ValueError):
            # ParserError is a subclass of ValueError. parsed_date is not set when
            # parsing fails, so take the year from the raw string instead.
            print(f"Failed to parse release date for track ID {track_info['Track ID']}. Setting release date to only year.")
            year = str(raw_date)[:4]
            release_date = f'{year}-01-01' if year.isdigit() else None
        except Exception as e:
            print(f"Failed to parse release date for track ID {track_info['Track ID']}: {str(e)}")
            # Default to None if parsing fails
            release_date = None

        json_data.append({
            'Track ID': track_info['Track ID'],
            'Track Name': track_info['Track Name'],
            'Artist(s)': track_info['Artist(s)'],
            'Album': track_info['Album'],
            'Release Date': release_date,
            'Popularity': track_info['Popularity'],
            'Duration (ms)': track_info['Duration (ms)'],
            'Explicit': track_info['Explicit'],
            'Artist ID(s)': track_info['Artist ID(s)'],
            'Track Number': track_info['Track Number'],
            'Album ID': track_info['Album ID'],
            'URI': track_info['URI'],
            'Scrapped Time': track_info['Scrapped Time']
        })

    # Save data to JSON file
    with open(output_json_file, 'w', encoding='utf-8') as json_file:
        json.dump(json_data, json_file, ensure_ascii=False, indent=4)

    print(f"Track information saved to {output_json_file}")

I need to know how to keep pulling data from the API without the credentials becoming unusable; by my calculations, I am not exceeding Spotify's documented rate limit. Failing that, I need the API to return an error code so that I can handle the exception and rotate between multiple credentials.
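One way to reduce the number of requests, sketched here under assumptions rather than as a confirmed fix: spotipy's `sp.tracks()` accepts a list of up to 50 track IDs per call, so about 1,000 tracks need roughly 20 requests instead of 1,000. The helper names `chunked` and `fetch_tracks_batched` are hypothetical, and `sp` is assumed to be a `spotipy.Spotify` client like the one created in the script above.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int = 50) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def fetch_tracks_batched(sp, track_ids: Sequence[str]) -> List[dict]:
    """Fetch track objects in batches instead of one request per ID.

    `sp` is assumed to be a spotipy.Spotify client; only its `tracks()`
    method is used here.
    """
    results: List[dict] = []
    for batch in chunked(track_ids, 50):
        response = sp.tracks(batch)
        # Entries for IDs that could not be resolved may come back as None,
        # so they are filtered out before the caller indexes into them.
        results.extend(t for t in response['tracks'] if t is not None)
    return results
```

Each returned item has the same shape as the result of `sp.track()`, so the dictionary built in `fetch_track_info_from_csv` can be filled from it unchanged. Batching lowers the request count but does not by itself guarantee staying under the limit.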

After one complete, successful execution, I re-launch the script and at some point the API stops responding without returning any error. If I then re-launch the script with the same credentials, it does not even start: they have remained unusable for at least 24 hours.

Screenshot 2024-04-09 at 11 15 58

Environment:

  • macOS Sonoma 14.2.1
  • Python 3.11.2
  • spotipy (latest version)
  • PyCharm


@tedwenn

tedwenn commented Apr 21, 2024

I'm having the same issue with sp.album_tracks(). It just stops without returning any error. When I try stepping into the function with the debugger, it never actually gets there. It just waits.

@dieser-niko
Member

If the script just freezes out of the blue, then it is probably because of Spotify's rate limit.

@tedwenn

tedwenn commented Apr 24, 2024

Yeah, it's a rate limit issue. When I try just calling the API directly using requests, I get a 429. Something about how spotipy is wrapping the API is stalling, rather than returning the 429.
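A minimal sketch of that kind of direct check, assuming the standard `https://api.spotify.com/v1/tracks/{id}` endpoint and a valid bearer token. The function names are hypothetical. It makes a single request with no retry layer and reports the status code together with the `Retry-After` header, which a 429 response normally carries. A likely explanation for the stall, offered here as an assumption, is that the retry layer underneath spotipy sleeps for that many seconds before trying again, so a very large value looks like a freeze.

```python
from typing import Mapping, Optional, Tuple

API_URL = "https://api.spotify.com/v1/tracks/{track_id}"


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[int]:
    """Return the Retry-After value in seconds, or None if absent or non-numeric."""
    value = headers.get("Retry-After")
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def probe_track(access_token: str, track_id: str,
                timeout: float = 10.0) -> Tuple[int, Optional[int]]:
    """Call the endpoint once, without retries, and report what came back."""
    import requests  # third-party; imported here so the helper above stays stdlib-only

    response = requests.get(
        API_URL.format(track_id=track_id),
        headers={"Authorization": f"Bearer {access_token}"},
        timeout=timeout,  # never wait indefinitely on a stuck connection
    )
    return response.status_code, retry_after_seconds(response.headers)
```

If `probe_track` returns a 429 with a wait of many hours, that would be consistent with credentials that stay unusable for about a day.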

@dieser-niko
Member

dieser-niko commented Apr 24, 2024

If you want to stop the freezing, you can do something similar to this: #766 (comment)

Edit: I've gotta admit, I don't know how the script will behave. My guess is that it's going to raise some kind of ratelimit error
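For illustration only, and not necessarily what the linked comment proposes: `spotipy.Spotify` accepts `retries`, `status_retries` and `requests_timeout` arguments, and setting both retry counts to 0 should make a 429 surface as a `spotipy.SpotifyException` with `http_status == 429` instead of a silent wait. Whether `Retry-After` is available on the exception's `headers` appears to depend on the retry settings and spotipy version, so this sketch treats it as optional. `make_client`, `retry_after` and `call_with_short_waits` are hypothetical names.

```python
import time
from typing import Any, Callable, Optional


def make_client(access_token: str):
    """Build a client that gives up immediately instead of retrying."""
    import spotipy  # third-party; imported lazily so the helpers below do not need it

    return spotipy.Spotify(
        auth=access_token,
        retries=0,            # no automatic retries
        status_retries=0,     # no retries on bad status codes
        requests_timeout=10,  # seconds before a stuck request is abandoned
    )


def retry_after(exc: BaseException) -> Optional[int]:
    """Read Retry-After (seconds) from an exception's `headers`, if present."""
    headers = getattr(exc, "headers", None) or {}
    try:
        return int(headers["Retry-After"])
    except (KeyError, TypeError, ValueError):
        return None


def call_with_short_waits(func: Callable[..., Any], *args: Any,
                          max_wait: int = 60, max_attempts: int = 3,
                          sleep: Callable[[float], None] = time.sleep,
                          **kwargs: Any) -> Any:
    """Call `func`; on a 429, wait only if the server asks for <= max_wait seconds."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            # SpotifyException carries `http_status`; it is matched by attribute
            # here so that anything else is re-raised untouched.
            if getattr(exc, "http_status", None) != 429:
                raise
            wait = retry_after(exc)
            if wait is None or wait > max_wait or attempt == max_attempts:
                raise  # let the caller log it or switch credentials
            sleep(wait)
```

With this in place, `call_with_short_waits(sp.track, track_id)` either returns the track or raises, and the existing `except` block in the loop can decide whether to stop or move to another set of credentials.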

3 participants