I have two datasets, "Y_N" and "data". "Y_N" has 8 thousand records and "data" has 1.6 million records. In both datasets, each record is a string. My task is to match each record of "Y_N" against each record of "data" and calculate a similarity index for each combination.
I did this with a for loop, but it takes far too long (probably one week).
How can I speed up my code? Is there another way to do this instead of using a for loop?
import pandas as pd
from fuzzywuzzy import fuzz
from fuzzywuzzy import process

# Score every Y_N description against the first 7000 descriptions in data
sm = [
    (
        Y_N['priceGuideDescription'][i],
        data['priceGuideDescription'][j],
        fuzz.ratio(Y_N['priceGuideDescription'][i], data['priceGuideDescription'][j]),
    )
    for i in range(len(Y_N))
    for j in range(0, 7000)
]

df = pd.DataFrame(sm)
df.head()
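
For a sense of scale, the full comparison is 8,000 × 1,600,000 = 12.8 billion pairs, so any per-pair overhead adds up quickly. The following is only a minimal sketch of one possible direction, not a definitive fix: de-duplicate the strings first, stream the results lazily instead of building one huge list, and keep only pairs above a cutoff. The helper names `ratio` and `pairwise_scores`, the `min_score` cutoff, and the toy data are all hypothetical, and `difflib.SequenceMatcher` is used as a stand-in for `fuzz.ratio` on the assumption that a 0-100 integer score is what is needed.

```python
from difflib import SequenceMatcher
from itertools import product


def ratio(a, b):
    # Hypothetical stand-in for fuzz.ratio: integer similarity from 0 to 100.
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def pairwise_scores(left, right, min_score=0):
    """Lazily yield (left_string, right_string, score) for unique strings."""
    # De-duplicate while preserving order, so repeated descriptions
    # are scored only once.
    left_unique = list(dict.fromkeys(left))
    right_unique = list(dict.fromkeys(right))
    for a, b in product(left_unique, right_unique):
        score = ratio(a, b)
        # Drop weak matches early (assumed cutoff) to keep the output small.
        if score >= min_score:
            yield a, b, score


# Toy data standing in for the two 'priceGuideDescription' columns.
y_n = ["red apple", "green pear", "red apple"]
data = ["red apples", "blue car", "green pears"]

results = list(pairwise_scores(y_n, data, min_score=80))
for row in results:
    print(row)
```

This does not change the number of unique comparisons, so on its own it only helps with duplicates and memory. fuzzywuzzy itself warns that it falls back to a slow pure-Python matcher unless python-Levenshtein is installed, so installing that, or splitting the work across processes, is likely where most of the speedup would come from.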