1 year ago

#261005

YannP

How to solve this "list index out of range" error?

I want to obtain a clean list of boat makes and models from two datasets (one "lambda" dataset and one reference dataset) using fuzzywuzzy (Levenshtein-based matching in Python), but I have an issue in my code that I don't understand.

The two datasets:

https://www.transfernow.net/dl/202203070QxpVjYJ

Here is my code:

    #%%
    from fuzzywuzzy import process
    import pandas as pd
    
    #%%
    BASE_LAMBDA_PATH = '../ressources/marques_modeles_lambda_entier.csv'
    BASE_REF_PATH = '../ressources/marques_modeles_ref_entier.csv'
    #%%
    lambda_df = pd.read_csv(BASE_LAMBDA_PATH, sep=";")
    #%%
    ref_df = pd.read_csv(BASE_REF_PATH, sep=";")
    
    #%% create the result DataFrame (initially empty)
    df_result = pd.DataFrame(columns=['marque', 'lambda','ref','score'])
    
    #%% iterate over the lambda models table
    for ind in lambda_df.index:
        marque = lambda_df['MARQUE_REF'][ind]
        modele_lambda = lambda_df['MODELE'][ind]
        ref_list = (ref_df[(ref_df['lib_marque'] == marque)]['lib_model']).to_list()
        choices = process.extract(modele_lambda, ref_list, limit=1)
        approx = choices[0][0]
        score = choices[0][1]
        df2 = pd.DataFrame(data = [(marque, modele_lambda, approx, score)],\
             columns=['marque', 'lambda','ref','score'])
        df_result = pd.concat([df_result, df2], axis=0, ignore_index=True)
    
    df_result.to_csv('output_matching_groupe.csv', sep=';', index=False)
    
    '''
    tdep = time.time()
    tfin = time.time()
    print(f"duree de {tfin-tdep} secondes")
    '''
    # %%

The error:

    IndexError                                Traceback (most recent call last)
    c:\Users\boats\src\list_matching_groupe.py in <cell line: 1>()
          20 ref_list = (ref_df[(ref_df['lib_marque'] == marque)]['lib_model']).to_list()
          21 choices = process.extract(modele_lambda, ref_list, limit=1)
    ----> 22 approx = choices[0][0]
          23 score = choices[0][1]
          24 df2 = pd.DataFrame(data = [(marque, modele_lambda, approx, score)],\
          25      columns=['marque', 'lambda','ref','score'])
    
    IndexError: list index out of range

I don't understand this, because choices[0][0] actually works: I obtain 'Guy Couach 1401'.
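
One possible explanation, offered as a sketch rather than a confirmed diagnosis: the loop may reach a row whose marque has no matching lib_marque in ref_df. In that case ref_list is empty and, assuming process.extract returns an empty list when given no choices, choices[0] raises the IndexError. That would also be consistent with choices[0][0] working for other rows. The helper below guards the lookup using only plain Python; first_choice is a hypothetical name, and the sample lists merely stand in for process.extract results.

```python
def first_choice(choices, default=(None, 0)):
    """Return the top (match, score) pair, or `default` when the list is empty."""
    return choices[0] if choices else default


# Stand-ins for process.extract results (the score 90 is made up for illustration).
found = [("Guy Couach 1401", 90)]   # a brand that has reference models
missing = []                        # a brand with no reference models at all

approx, score = first_choice(found)
print(approx, score)

approx, score = first_choice(missing)
print(approx, score)                # falls back to the default instead of raising
```

In the loop, approx, score = first_choice(choices) would replace the two indexing lines. Rows where approx is None then point to brands that are missing from the reference file.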

python

python-3.x

pandas

fuzzywuzzy

0 Answers