"""
In data analysis, cosine similarity is a measure of similarity between two non-zero
vectors defined in an inner product space. Cosine similarity is the cosine of the
angle between the vectors; that is, it is the dot product of the vectors divided by
the product of their lengths. It follows that the cosine similarity does not depend
on the magnitudes of the vectors, but only on their angle. The cosine similarity
always belongs to the closed interval [-1, 1].
For example, two proportional vectors (with a positive scale factor) have a cosine
similarity of +1, two orthogonal vectors have a similarity of 0, and two opposite
vectors have a similarity of -1. In some contexts, the component values of the
vectors cannot be negative, in which case the cosine similarity is bounded in [0, 1].

Video explanation on YouTube: https://www.youtube.com/watch?v=P3j70Qw-bPY
"""



import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity
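import numpy as np


# Illustrative sketch, not part of the original pipeline: the definition from the
# docstring (dot product divided by the product of the lengths) written out with
# NumPy. The helper name `cosine_from_definition` is hypothetical and is only here
# to make the formula concrete; the script itself relies on scikit-learn below.
def cosine_from_definition(a, b):
    """Return dot(a, b) / (|a| * |b|) for two non-zero 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# The three cases named in the docstring, checked on small hand-picked vectors.
assert np.isclose(cosine_from_definition([1, 2], [2, 4]), 1.0)    # proportional
assert np.isclose(cosine_from_definition([1, 0], [0, 1]), 0.0)    # orthogonal
assert np.isclose(cosine_from_definition([1, 2], [-1, -2]), -1.0)  # opposite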

# Load the user–movie matrix
matrix = pd.read_csv(
    "user_movie_matrix_10x100.csv",
    index_col=0
)

# Replace missing ratings with 0 (an unrated movie is treated as a rating of 0)
matrix_filled = matrix.fillna(0)

# Compute pairwise cosine similarity between users (the rows of the matrix)
similarity = cosine_similarity(matrix_filled)

# Convert to a DataFrame labelled by user on both axes
similarity_df = pd.DataFrame(
    similarity,
    index=matrix.index,
    columns=matrix.index
)

# Save to CSV
similarity_df.to_csv(
    "user_cosine_similarity.csv"
)

print(similarity_df)
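

# Illustrative sketch, not part of the original script: one possible way to read a
# user-by-user similarity table is to look up each user's closest neighbour.
# The helper name `most_similar_user` is hypothetical. It assumes a square
# DataFrame with the same labels on rows and columns, such as the one built above.
import numpy as np
import pandas as pd


def most_similar_user(sim_df):
    """For each user, return the other user with the highest similarity."""
    values = sim_df.to_numpy(dtype=float, copy=True)
    # Every user has similarity 1 with themselves, so mask the diagonal out.
    np.fill_diagonal(values, -np.inf)
    masked = pd.DataFrame(values, index=sim_df.index, columns=sim_df.columns)
    return pd.DataFrame({
        "most_similar": masked.idxmax(axis=1),
        "similarity": masked.max(axis=1),
    })


# Possible usage with the table computed above:
# print(most_similar_user(similarity_df))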