The UC Irvine Knowledge Discovery in Databases (KDD) Archive is a new online repository of large data sets which encompasses a wide variety of data types, analysis tasks, and application areas. The primary role of this repository is to enable researchers in knowledge discovery and data mining to scale existing and future data analysis algorithms to very large and complex data sets.
Creation of this archive was supported by a grant from the Information and Data Management Program at the National Science Foundation. The archive is intended to serve as a permanent repository of publicly accessible data sets for research in KDD and data mining. It complements the original UCI Machine Learning Archive, which typically focuses on smaller classification-oriented data sets.
We are seeking submissions of large, well-documented data sets that can be made publicly available. Data types and tasks of interest include, but are not limited to:
Data Types | Tasks
---|---
multivariate, time series, sequential, relational, text, image, spatial, multimedia, transactional, heterogeneous, sound/audio | classification, regression, clustering, density estimation, retrieval, causal modeling, visualization, discovery, exploratory data analysis, data cleaning, recommendation systems
Submission Guidelines: Please see the UCI KDD Archive web site for detailed instructions.
Seth Hettich (sjh@ics.uci.edu)
librarian