This is a report on the MovieLens dataset, which contains movie ratings from the GroupLens site. MovieLens helps you find movies you will like. Users were selected at random for inclusion. Some versions provide additional information such as user info or tags.

In the dataset, users and movies are represented with integer IDs, while ratings range from 1 to 5 in steps of 0.5. The user and item IDs are non-negative long (64-bit) integers, and the rating value is a double (64-bit floating-point number).

MovieLens 10M, the 10 million ratings dataset (ML-10M), was released 1/2009. A 20M dataset was also released in 2016. As a network dataset, MovieLens 10M is in the category of Heterogeneous Networks. Where a small version of the dataset is used, you can quickly download it and run Spark code on it.

Dataset         Items            Users     Ratings       Density (%)   Ratings scale
MovieLens 1M    3,883 movies     6,040     1,000,209     4.26          [1-5]
MovieLens 10M   10,682 movies    71,567    10,000,054    1.31          [1-5]
MovieLens 20M   27,278 movies    138,493   20,000,263    0.53          [1-5]
Netflix         17,770 movies    480,189   100,480,507   1.18          [1-5]
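The Density (%) column in the dataset comparison can be checked from the other columns. The sketch below assumes that density means the number of ratings divided by the number of possible user-item pairs, which the source does not state explicitly but which matches the listed figures. The names `DATASETS` and `density_percent` are illustrative and not part of any MovieLens tooling.

```python
# Counts copied from the dataset comparison: (items, users, ratings).
DATASETS = {
    "MovieLens 1M": (3_883, 6_040, 1_000_209),
    "MovieLens 10M": (10_682, 71_567, 10_000_054),
    "MovieLens 20M": (27_278, 138_493, 20_000_263),
    "Netflix": (17_770, 480_189, 100_480_507),
}


def density_percent(items: int, users: int, ratings: int) -> float:
    """Share of the user-item matrix that holds a rating, in percent.

    Assumption: density = ratings / (items * users).
    """
    return 100.0 * ratings / (items * users)


if __name__ == "__main__":
    for name, (items, users, ratings) in DATASETS.items():
        print(f"{name}: {density_percent(items, users, ratings):.2f}%")
```

Under this definition, even the densest of the four datasets has a rating for fewer than 5% of its user-item pairs.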