
MyMediaLite: How to use the command line programs

MyMediaLite is mainly a library, meant to be used by other applications.
We provide two command-line tools that offer much of MyMediaLite's functionality. They let you find out how MyMediaLite handles a dataset without having to integrate the library into your application or develop your own program.

Rating Prediction

The programs expect the data to be in a simple text format:

  user_id item_id rating
  
where user_id and item_id are integers referring to users and items, respectively, and rating is a floating-point number expressing how much a user likes an item. The values can be separated by spaces, tabs, or commas. If there are more than three columns, all additional columns will be ignored. A small example dataset:
  1       1       5
  1       2       3
  1       3       4
  1       4       3
  1       5       3
  1       7       4
  

The general usage of the rating prediction program is as follows:

    rating_prediction --training-file=TRAINING_FILE --test-file=TEST_FILE --recommender=METHOD [OPTIONS]
    
METHOD is the recommender to use, which will be trained using the contents of TRAINING_FILE. The recommender will then predict the data in TEST_FILE, and the program will display the RMSE (root mean square error) and MAE (mean absolute error) of the predictions.
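
For reference, both measures compare the predicted rating with the actual rating for every (user, item) pair in the test set. Writing r_ui for the actual rating, r̂_ui for the prediction, and n for the number of test ratings, they are defined as

    \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{(u,i)} (\hat{r}_{ui} - r_{ui})^2}
    \qquad
    \mathrm{MAE} = \frac{1}{n} \sum_{(u,i)} |\hat{r}_{ui} - r_{ui}|

Lower values mean more accurate predictions; RMSE penalizes large errors more strongly than MAE.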

If you call rating_prediction without arguments, it will provide a list of recommenders to choose from, plus their arguments and further options.

You can download the MovieLens 100k ratings dataset from the GroupLens Research website and unzip it to have something to play with. If you have downloaded the MyMediaLite source code, you can do this automatically by entering

      make download-movielens
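
Alternatively, you can fetch and unpack the dataset by hand. A minimal sketch, assuming the usual GroupLens download location (an assumption; adjust the URL if it has moved) and standard Unix tools:

      wget http://files.grouplens.org/datasets/movielens/ml-100k.zip   # download URL is an assumption
      unzip ml-100k.zip
      cd ml-100k                                                       # contains u1.base, u1.test and the other splits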
    

To try out a simple baseline method on the data, you just need to enter

    rating_prediction --training-file=u1.base --test-file=u1.test --recommender=UserAverage
    
which should give a result like
    UserAverage training_time 00:00:00.000098 RMSE 1.063 MAE 0.85019 testing_time 00:00:00.032326
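
UserAverage predicts each user's mean rating from the training data. To get a feeling for these values you can compute them yourself; an illustrative awk one-liner (not part of MyMediaLite), assuming whitespace- or tab-separated columns:

    # print each user's mean training rating, sorted by user id
    awk '{ sum[$1] += $3; count[$1]++ } END { for (u in sum) printf "%d\t%.3f\n", u, sum[u]/count[u] }' u1.base | sort -n | head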
    

To use a more advanced recommender, enter

    rating_prediction --training-file=u1.base --test-file=u1.test --recommender=BiasedMatrixFactorization
    
which yields better results than the user average:
    BiasedMatrixFactorization num_factors=10 regularization=0.015 learn_rate=0.01 num_iter=30
                                init_mean=0 init_stdev=0.1
    training_time 00:00:03.3575780 RMSE 0.96108 MAE 0.75124 testing_time 00:00:00.0159740
    
The key-value pairs after the method name represent arguments to the recommender that may be modified to get even better results. For instance, we could use more latent factors per user and item, which leads to a more complex (and hopefully more accurate) model:
    rating_prediction --training-file=u1.base --test-file=u1.test --recommender=BiasedMatrixFactorization --recommender-options="num_factors=20"
    ...
    ... RMSE 0.98029 MAE 0.76558
    
Wait a second. The RMSE actually got worse!
This may be because we are not training the model with optimal arguments. One thing to look at is the number of iterations: if we iterate for too long, the learning process overfits the training data, which means the resulting model does not generalize well to unseen data. The options --find-iter=A, --num-iter=B and --max-iter=C help us to find the right number of iterations. Together, the three options mean "Starting from iteration B, report the evaluation results on the test set every A iterations, until iteration C is reached."
    rating_prediction --training-file=u1.base --test-file=u1.test --recommender=BiasedMatrixFactorization --recommender-options="num_factors=20" --find-iter=1 --num-iter=0 --max-iter=25
    ...
    RMSE 1.17083 MAE 0.96918 iteration 0
    RMSE 1.01383 MAE 0.8143 iteration 1
    RMSE 0.98742 MAE 0.78742 iteration 2
    RMSE 0.97672 MAE 0.77668 iteration 3
    RMSE 0.9709 MAE 0.77078 iteration 4
    RMSE 0.96723 MAE 0.76702 iteration 5
    RMSE 0.96466 MAE 0.76442 iteration 6
    RMSE 0.96269 MAE 0.76241 iteration 7
    RMSE 0.96104 MAE 0.76069 iteration 8
    RMSE 0.95958 MAE 0.75917 iteration 9
    RMSE 0.95825 MAE 0.75783 iteration 10
    RMSE 0.95711 MAE 0.75667 iteration 11
    RMSE 0.95626 MAE 0.75569 iteration 12
    RMSE 0.95578 MAE 0.75501 iteration 13
    RMSE 0.95573 MAE 0.75467 iteration 14
    RMSE 0.95611 MAE 0.75467 iteration 15
    RMSE 0.9569 MAE 0.75499 iteration 16
    RMSE 0.95802 MAE 0.75551 iteration 17
    RMSE 0.95942 MAE 0.75623 iteration 18
    RMSE 0.96102 MAE 0.7571 iteration 19
    RMSE 0.96277 MAE 0.75806 iteration 20
    RMSE 0.96463 MAE 0.75909 iteration 21
    RMSE 0.96656 MAE 0.76017 iteration 22
    RMSE 0.96852 MAE 0.7613 iteration 23
    RMSE 0.9705 MAE 0.76246 iteration 24
    RMSE 0.97247 MAE 0.76364 iteration 25
    
This means that we should probably set --num-iter to around 15 to get better results.
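
For example, we could retrain the model with the number of iterations fixed accordingly (exact numbers will vary slightly because of the random initialization):

    rating_prediction --training-file=u1.base --test-file=u1.test --recommender=BiasedMatrixFactorization --recommender-options="num_factors=20" --num-iter=15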

Warning: Depending on the recommender, the choice of arguments (hyperparameters) may be crucial to the recommender's performance. Sometimes you may find suitable values by playing a bit with the arguments, starting from their default values. However, there is no guarantee that this will work! You should use a principled approach to find good hyperparameters, e.g. cross-validation and grid search.
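
A very simple version of such a search can be scripted around the command-line tool. The following sketch loops over a few arbitrary regularization values; for a clean protocol you would compare the results on a separate validation split rather than on the final test set:

    # try several regularization strengths and compare the reported RMSE values
    for reg in 0.005 0.015 0.05 0.1; do
        echo "regularization=$reg"
        rating_prediction --training-file=u1.base --test-file=u1.test \
            --recommender=BiasedMatrixFactorization \
            --recommender-options="num_factors=20 regularization=$reg"
    done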

Item Prediction from Positive-Only Feedback

The item recommendation program behaves similarly to the rating prediction program, so we concentrate on the differences here.

item_recommendation --training-file=TRAINING_FILE --test-file=TEST_FILE --recommender=METHOD [OPTIONS]

Again, if you call item_recommendation without arguments, it will provide a list of recommenders to choose from, plus their arguments and further options.

The third column in the data files may be omitted. If it is present, it will be ignored.

Example:

  1       1
  1       2
  1       3
  1       4
  1       5
  1       7
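
If you only have rating data, one common way of creating positive-only feedback is to keep just the highly rated items. A sketch; the threshold of 4 and the output file name are arbitrary, and the step is optional, since the rating files can also be fed to item_recommendation directly, as in the commands below:

  # keep only user-item pairs with rating >= 4 as positive feedback
  awk '$3 >= 4 { print $1 "\t" $2 }' u1.base > u1.positive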
  

Instead of RMSE and MAE, the evaluation measures are now prec@N (precision at N), AUC (area under the ROC curve), MAP (mean average precision), and NDCG (normalized discounted cumulative gain).
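
prec@N, for instance, is the fraction of the top-N recommended items that actually appear in the user's test set, averaged over all test users:

  \mathrm{prec@}N = \frac{|\{\text{top-}N\ \text{recommended items}\} \cap \{\text{items in the user's test set}\}|}{N}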

Let us start again with some baseline methods, Random and MostPopular:

  item_recommendation --training-file=u1.base --test-file=u1.test --recommender=Random
  random training_time 00:00:00.0001040
    AUC 0.49924 prec@5 0.02789 prec@10 0.02898 MAP 0.00122 NDCG 0.37214
    num_users 459 num_items 1650 testing_time 00:00:02.7115540

  item_recommendation --training-file=u1.base --test-file=u1.test --recommender=MostPopular
  MostPopular training_time 00:00:00.0015710
    AUC 0.8543 prec@5 0.322 prec@10 0.30458 MAP 0.02187 NDCG 0.57038
    num_users 459 num_items 1650 testing_time 00:00:02.3813790
  

User-based collaborative filtering:

  item_recommendation --training-file=u1.base --test-file=u1.test --recommender=UserKNN
  UserKNN k=80 training_time 00:00:05.6057200
    AUC 0.91682 prec@5 0.52505 prec@10 0.46776 MAP 0.06482 NDCG 0.68793
    num_users 459 num_items 1650 testing_time 00:00:08.8362840
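
The k=80 shown above is the neighborhood size, which can be changed like any other recommender argument; for example:

  item_recommendation --training-file=u1.base --test-file=u1.test --recommender=UserKNN --recommender-options="k=40"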
  

Note that item recommendation evaluation usually takes longer than the rating prediction evaluation, because for each user, scores for every candidate item (possibly all items) have to be computed. You can restrict the number of predictions to be made using the options --test-users=FILE and --candidate-items=FILE to save time.
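
A sketch of how such files could be produced with standard Unix tools, assuming they simply contain one ID per line (here: the first 100 test users, and all items that occur in the training data as candidate items):

  cut -f1 u1.test | sort -n | uniq | head -100 > test_users.txt
  cut -f2 u1.base | sort -n | uniq > candidate_items.txt
  item_recommendation --training-file=u1.base --test-file=u1.test --recommender=MostPopular \
    --test-users=test_users.txt --candidate-items=candidate_items.txt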

The item recommendation program supports the same options (--find-iter=N etc.) for iteratively trained recommenders like BPRMF and WRMF.
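
For example, to monitor BPRMF during training, evaluating only every fifth iteration since item recommendation evaluation is comparatively slow:

  item_recommendation --training-file=u1.base --test-file=u1.test --recommender=BPRMF --find-iter=5 --num-iter=0 --max-iter=30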
