Anonymous web data from www.microsoft.com

Task Type

Classification, Collaborative Filtering

Sources

Donor

Jack S. Breese, David Heckerman, Carl M. Kadie
Microsoft Research, Redmond WA, 98052-6399, USA
breese@microsoft.com, heckerma@microsoft.com, carlk@microsoft.com
Date Donated: November 30, 1998

Problem Description

Analysis Task

Predict the areas of www.microsoft.com that a user visited, given data on the other areas that he or she visited.
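The task above can be framed as ranking the areas a user has not yet been observed to visit. The following is a minimal sketch of that framing, using a simple popularity baseline rather than any method from the cited paper. The user IDs, area names, and the functions `popularity_ranking` and `predict_unseen` are hypothetical and chosen only for illustration; they do not reflect the actual file format or area identifiers of the dataset.

```python
from collections import Counter

# Hypothetical toy data: each user maps to the set of areas visited.
train_visits = {
    "u1": {"home", "support", "downloads"},
    "u2": {"home", "downloads"},
    "u3": {"home", "support"},
    "u4": {"support", "search"},
}

def popularity_ranking(visits):
    """Rank areas by how many training users visited them (most popular first)."""
    counts = Counter(area for areas in visits.values() for area in areas)
    # Break ties alphabetically so the ranking is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [area for area, _ in ordered]

def predict_unseen(observed, ranking, k=2):
    """Return the top-k ranked areas not already in the observed set."""
    return [area for area in ranking if area not in observed][:k]

ranking = popularity_ranking(train_visits)
# Given that a test user visited "downloads", guess two other areas.
predictions = predict_unseen({"downloads"}, ranking, k=2)
print(predictions)
```

A baseline like this ignores which specific areas the user visited, so it mainly serves as a reference point against which collaborative filtering methods can be compared.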

Evaluation Criteria and Constraints

Important solution characteristics are: predictive accuracy, learning time, and speed of predictions.

Preprocessing and Modifications

No additional preprocessing was applied to the data.

Other Relevant Information

Experimental procedures are described in:

J. Breese, D. Heckerman, C. Kadie. _Empirical Analysis of Predictive Algorithms for Collaborative Filtering_. Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, Madison, WI, July 1998.

The training and test sets used in this paper are provided as 'anonymous-mswebtrain.dst' and 'anonymous-mswebtest.dst'.

Results

Results for this dataset are reported in:

J. Breese, D. Heckerman, C. Kadie. _Empirical Analysis of Predictive Algorithms for Collaborative Filtering_. Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, Madison, WI, July 1998.

This paper compares several memory-based methods (correlation and vector similarity techniques) and model-based methods (cluster models and Bayesian networks). In terms of predictive accuracy, the results indicate that the authors' Bayesian network approach to collaborative filtering performs best on this dataset.
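To illustrate what a memory-based vector similarity technique looks like on visit data, here is a simplified sketch. It treats each user as a binary vector over areas, weights training users by cosine similarity to the active user, and scores each unvisited area by the summed weights of the users who visited it. This is an assumption-laden simplification, not the exact formulation evaluated in the paper, which may use different weighting and normalization. The toy data and the function names `cosine_similarity` and `score_areas` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two binary visit vectors, given as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def score_areas(active, train_visits):
    """Score areas the active user has not visited, highest score first."""
    scores = {}
    for areas in train_visits.values():
        weight = cosine_similarity(active, areas)
        if weight == 0.0:
            continue  # users with no overlap contribute nothing
        for area in areas - active:
            scores[area] = scores.get(area, 0.0) + weight
    # Sort by descending score, then alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical toy data: each user maps to the set of areas visited.
train_visits = {
    "u1": {"home", "support", "downloads"},
    "u2": {"home", "downloads"},
    "u3": {"support", "search"},
}

ranked = score_areas({"home"}, train_visits)
for area, score in ranked:
    print(f"{area}: {score:.3f}")
```

In this toy example, "downloads" ranks above "support" because both users who share "home" with the active user also visited "downloads", while only one of them visited "support".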

References and Further Information

An expanded version of the results on this dataset appears as Microsoft Research Technical Report MSR-TR-98-12.


The UCI KDD Archive
Information and Computer Science
University of California, Irvine
Irvine, CA 92697-3425
Last modified: July 12, 1999