Machine Learning – ML

Machine Learning (ML) is a subfield of computer science (CS) that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. In 1959, Arthur Samuel defined machine learning as a “Field of study that gives computers the ability to learn without being explicitly programmed”. Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from an example training set of input observations in order to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.

Machine learning is closely related to (and often overlaps with) computational statistics, a discipline that also focuses on prediction-making through the use of computers. It has strong ties to mathematical optimization, which delivers methods, theory and application domains to the field. Machine learning is employed in a range of computing tasks where designing and programming explicit algorithms is infeasible. Example applications include spam filtering, optical character recognition (OCR), search engines and computer vision. Machine learning is sometimes conflated with data mining, although the latter focuses more on exploratory data analysis, an approach known as unsupervised learning.

Within the field of data analytics, machine learning is a method used to devise complex models and algorithms that lend themselves to prediction – in commercial use, this is known as predictive analytics. These analytical models allow researchers, data scientists, engineers, and analysts to “produce reliable, repeatable decisions and results” and uncover “hidden insights” through learning from historical relationships and trends in the data.


Tom M. Mitchell provided a widely quoted, more formal definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.” This definition is notable for defining machine learning in fundamentally operational rather than cognitive terms, thus following Alan Turing’s proposal in his paper “Computing Machinery and Intelligence” that the question “Can machines think?” be replaced with the question “Can machines do what we (as thinking entities) can do?”

Types of Problems and Tasks

Machine learning tasks are typically classified into three broad categories, depending on the nature of the learning “signal” or “feedback” available to a learning system. These are:

  • Supervised Learning: The computer is presented with example inputs and their desired outputs, given by a “teacher”, and the goal is to learn a general rule that maps inputs to outputs.
  • Unsupervised Learning: No labels are given to the learning algorithm, leaving it on its own to find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end (feature learning).
  • Reinforcement Learning: A computer program interacts with a dynamic environment in which it must achieve a certain goal (such as driving a vehicle), without a teacher explicitly telling it whether it has come close to its goal. Another example is learning to play a game by playing against an opponent.
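
As a rough illustration of the supervised case, the sketch below uses a 1-nearest-neighbour rule as the learned "general rule". The training pairs and the function name predict are made up for this example; they do not come from any particular library or dataset.

```python
# Toy supervised learning: a 1-nearest-neighbour classifier.
# The (input, label) pairs below are invented for illustration only.

def predict(train, x):
    """Return the label of the training input closest to x."""
    nearest_input, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_label

# Example inputs with their desired outputs, as supplied by the "teacher".
train = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]

print(predict(train, 1.5))  # an unseen input close to the "small" examples
print(predict(train, 7.0))  # an unseen input close to the "large" examples
```

An unsupervised method would receive only the inputs (1.0, 2.0, 8.0, 9.0) without the labels and would have to find the two groups on its own.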

Between supervised and unsupervised learning is semi-supervised learning, where the teacher gives an incomplete training signal: a training set with some (often many) of the target outputs missing. Transduction is a special case of this principle where the entire set of problem instances is known at learning time, except that part of the targets are missing.

Among other categories of machine learning problems, learning to learn learns its own inductive bias based on previous experience. Developmental learning, elaborated for robot learning, generates its own sequences (also called curriculum) of learning situations to cumulatively acquire repertoires of novel skills through autonomous self-exploration and social interaction with human teachers, and using guidance mechanisms such as active learning, maturation, motor synergies, and imitation.

Another categorization of machine learning tasks arises when one considers the desired output of a machine-learned system:

  • In classification, inputs are divided into two or more classes, and the learner must produce a model that assigns unseen inputs to one or more (multi-label classification) of these classes. This is typically tackled in a supervised way. Spam filtering is an example of classification, where the inputs are email (or other) messages and the classes are “spam” and “not spam”.
  • In regression, also a supervised problem, the outputs are continuous rather than discrete.
  • In clustering, a set of inputs is to be divided into groups. Unlike in classification, the groups are not known beforehand, making this typically an unsupervised task.
  • Density estimation finds the distribution of inputs in some space.
  • Dimensionality reduction simplifies inputs by mapping them into a lower-dimensional space. Topic modeling is a related problem, where a program is given a list of human language documents and is tasked to find out which documents cover similar topics.
A support vector machine is a classifier that divides its input space into two regions, separated by a linear boundary. Here, it has learned to distinguish black and white circles

History and relationship to other fields

As a scientific endeavour, machine learning grew out of the quest for artificial intelligence. Already in the early days of AI as an academic discipline, some researchers were interested in having machines learn from data. They attempted to approach the problem with various symbolic methods, as well as what were then termed “neural networks”; these were mostly perceptrons and other models that were later found to be reinventions of the generalized linear models of statistics. Probabilistic reasoning was also employed, especially in automated medical diagnosis.

However, an increasing emphasis on the logical, knowledge-based approach caused a rift between AI and machine learning. Probabilistic systems were plagued by theoretical and practical problems of data acquisition and representation. By 1980, expert systems had come to dominate AI, and statistics was out of favor. Work on symbolic/knowledge-based learning did continue within AI, leading to inductive logic programming, but the more statistical line of research was now outside the field of AI proper, in pattern recognition and information retrieval. Neural networks research had been abandoned by AI and computer science around the same time. This line, too, was continued outside the AI/CS field, as “connectionism”, by researchers from other disciplines including Hopfield, Rumelhart and Hinton. Their main success came in the mid-1980s with the reinvention of backpropagation.

Machine learning, reorganized as a separate field, started to flourish in the 1990s. The field changed its goal from achieving artificial intelligence to tackling solvable problems of a practical nature. It shifted focus away from the symbolic approaches it had inherited from AI, and toward methods and models borrowed from statistics and probability theory. It also benefited from the increasing availability of digitized information, and the ability to distribute it via the Internet.

Machine learning and data mining often employ the same methods and overlap significantly.

They can be roughly distinguished as follows:

  • Machine learning focuses on prediction, based on known properties learned from the training data.
  • Data mining focuses on the discovery of (previously) unknown properties in the data. This is the analysis step of Knowledge Discovery in Databases.

The two areas overlap in many ways: data mining uses many machine learning methods, but often with a slightly different goal in mind. On the other hand, machine learning also employs data mining methods as “unsupervised learning” or as a preprocessing step to improve learner accuracy. Much of the confusion between these two research communities (which do often have separate conferences and separate journals, ECML PKDD being a major exception) comes from the basic assumptions they work with: in machine learning, performance is usually evaluated with respect to the ability to reproduce known knowledge, while in Knowledge Discovery and Data Mining (KDD) the key task is the discovery of previously unknown knowledge. Evaluated with respect to known knowledge, an uninformed (unsupervised) method will easily be outperformed by supervised methods, while in a typical KDD task, supervised methods cannot be used due to the unavailability of training data.

Machine learning also has intimate ties to optimization: many learning problems are formulated as minimization of some loss function on a training set of examples. Loss functions express the discrepancy between the predictions of the model being trained and the actual problem instances (for example, in classification, one wants to assign a label to instances, and models are trained to correctly predict the pre-assigned labels of a set of examples). The difference between the two fields arises from the goal of generalization: while optimization algorithms can minimize the loss on a training set, machine learning is concerned with minimizing the loss on unseen samples.
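
The idea of minimizing a loss on a training set and then checking it on unseen samples can be sketched as follows. This is a minimal illustration, assuming a one-parameter model y ≈ w·x, a mean squared loss, plain gradient descent, and made-up data generated by y = 3x; none of these choices come from the text above.

```python
# Minimal sketch: fit y ≈ w * x by gradient descent on the mean squared loss.
# Data, learning rate and step count are illustrative assumptions.

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]  # generated by y = 3x

def loss(w, xs, ys):
    """Mean squared discrepancy between predictions w*x and targets y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w, xs, ys):
    """Derivative of the mean squared loss with respect to w."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

w = 0.0
learning_rate = 0.01
for _ in range(200):
    w -= learning_rate * gradient(w, xs, ys)

train_loss = loss(w, xs, ys)                      # what the optimizer minimizes
unseen_loss = loss(w, [5.0, 6.0], [15.0, 18.0])   # what machine learning cares about
print(w, train_loss, unseen_loss)
```

Here the two losses agree because the unseen points follow exactly the same rule as the training points; with noisy data or an overly flexible model they can differ substantially.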

Relation to Statistics

Machine learning and statistics are closely related fields. According to Michael I. Jordan, the ideas of machine learning, from methodological principles to theoretical tools, have had a long pre-history in statistics. He also suggested the term data science as a placeholder name for the overall field.

Leo Breiman distinguished two statistical modelling paradigms: the data model and the algorithmic model, where “algorithmic model” refers more or less to machine learning algorithms such as random forests.

Some statisticians have adopted methods from machine learning, leading to a combined field that they call statistical learning.


A core objective of a learner is to generalize from its experience. Generalization in this context is the ability of a learning machine to perform accurately on new, unseen examples/tasks after having experienced a learning data set. The training examples come from some generally unknown probability distribution (considered representative of the space of occurrences) and the learner has to build a general model about this space that enables it to produce sufficiently accurate predictions in new cases.

The computational analysis of machine learning algorithms and their performance is a branch of theoretical computer science known as computational learning theory. Because training sets are finite and the future is uncertain, learning theory usually does not yield guarantees of the performance of algorithms. Instead, probabilistic bounds on the performance are quite common. The bias–variance decomposition is one way to quantify generalization error.

How well a model, trained on existing examples, predicts the output for unknown instances is called generalization. For the best generalization, the complexity of the hypothesis should match the complexity of the function underlying the data. If the hypothesis is less complex than the function, the model has underfit the data. If the complexity of the model is then increased, the training error decreases. But if the hypothesis is too complex, the model overfits: the training error keeps falling while the error on unseen instances grows. The aim is therefore the hypothesis with the lowest error on unseen data, not the one with the minimum training error.
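
The following sketch contrasts a hypothesis that is too simple, one that matches the data, and one that is too complex. Everything in it is an assumption made for illustration: the underlying function y = 2x + 1, the alternating +1/-1 "noise", and the three models (a constant, a least-squares line, and a rule that memorizes the training set).

```python
# Underfitting vs. overfitting on a toy problem (illustrative assumptions only).

def mse(model, points):
    """Mean squared error of a model over (x, y) points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

# Training data: y = 2x + 1 with alternating noise of +1 and -1.
train = [(float(i), 2.0 * i + 1.0 + (1.0 if i % 2 == 0 else -1.0)) for i in range(10)]
# Unseen data: noise-free points halfway between the training inputs.
test = [(i + 0.5, 2.0 * (i + 0.5) + 1.0) for i in range(9)]

mean_x = sum(x for x, _ in train) / len(train)
mean_y = sum(y for _, y in train) / len(train)

def constant_model(x):
    """Too simple: always predicts the mean training output."""
    return mean_y

# Matching complexity: a straight line fitted by ordinary least squares.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
         / sum((x - mean_x) ** 2 for x, _ in train))
intercept = mean_y - slope * mean_x

def linear_model(x):
    return slope * x + intercept

def memorizing_model(x):
    """Too complex: returns the output of the nearest training input, noise included."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

for name, model in [("constant", constant_model),
                    ("linear", linear_model),
                    ("memorizing", memorizing_model)]:
    print(name, "train:", round(mse(model, train), 3), "unseen:", round(mse(model, test), 3))
```

On this data the memorizing model reaches zero training error yet does worse on the unseen points than the line, while the constant model does poorly on both.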

In addition to performance bounds, computational learning theorists study the time complexity and feasibility of learning. In computational learning theory, a computation is considered feasible if it can be done in polynomial time. There are two kinds of time complexity results. Positive results show that a certain class of functions can be learned in polynomial time. Negative results show that certain classes cannot be learned in polynomial time.

There are many similarities between machine learning theory and statistical inference, although they use different terms.


Decision Tree Learning

Decision Tree Learning uses a decision tree as a predictive model, which maps observations about an item to conclusions about the item’s target value.
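
A decision tree with a single split (often called a decision stump) is enough to sketch the idea. The data and the function names fit_stump and predict below are hypothetical, and real decision tree learners such as ID3 or CART use more refined splitting criteria than the simple error count used here.

```python
# Minimal sketch: learn a one-split decision tree ("stump") on a single feature.

def fit_stump(points):
    """Pick the threshold t for the rule 'predict True if x > t' with the fewest training errors."""
    xs = sorted(x for x, _ in points)
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]  # midpoints between neighbours

    def errors(t):
        return sum((x > t) != label for x, label in points)

    return min(candidates, key=errors)

# Observations about items (one numeric feature) and their target values.
points = [(1.0, False), (2.0, False), (3.0, False),
          (6.0, True), (7.0, True), (8.0, True)]

threshold = fit_stump(points)

def predict(x):
    """Map an observation to a conclusion about its target value."""
    return x > threshold

print(threshold, predict(5.0), predict(4.0))
```

A full decision tree applies the same kind of split recursively to the subsets on each side of the threshold.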

Association Rule Learning

Association Rule Learning is a method for discovering interesting relations between variables in large databases.
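
One common way to judge whether a relation is "interesting" is by its support and confidence. These two measures are not defined in the text above and are introduced here as an assumption, along with the made-up shopping transactions.

```python
# Support and confidence of the rule {bread} -> {milk} over toy transactions.

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated probability of the consequent given the antecedent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

Algorithms such as Apriori search the space of itemsets efficiently instead of scoring one hand-picked rule as done here.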

Artificial Neural Networks

An Artificial Neural Network (ANN) learning algorithm, usually called “Neural Network” (NN), is a learning algorithm that is inspired by the structure and functional aspects of biological neural networks. Computations are structured in terms of an interconnected group of artificial neurons, processing information using a connectionist approach to computation. Modern neural networks are non-linear statistical data modeling tools. They are usually used to model complex relationships between inputs and outputs, to find patterns in data, or to capture the statistical structure in an unknown joint probability distribution between observed variables.
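
To show how interconnected artificial neurons can model a non-linear relationship, the sketch below wires three threshold neurons into a network that computes XOR. The weights are set by hand for illustration; in practice they would be learned from data, for example with backpropagation.

```python
# A tiny feed-forward network with hand-chosen weights (not learned).

def step(z):
    """Threshold activation: fire (1) if the weighted input is positive."""
    return 1 if z > 0 else 0

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs plus bias, then activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_network(x1, x2):
    h_or = neuron([x1, x2], [1.0, 1.0], -0.5)     # hidden unit behaving like OR
    h_and = neuron([x1, x2], [1.0, 1.0], -1.5)    # hidden unit behaving like AND
    return neuron([h_or, h_and], [1.0, -1.0], -0.5)  # OR and not AND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

No single neuron of this kind can compute XOR, because the two classes are not linearly separable; the hidden layer is what makes the mapping possible.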

Deep Learning

Falling hardware prices and the development of GPUs for personal use in recent years have contributed to the development of deep learning, which uses multiple hidden layers in an artificial neural network. This approach tries to model the way the human brain processes light and sound into vision and hearing. Successful applications of deep learning include computer vision and speech recognition.

Inductive Logic Programming

Inductive logic programming (ILP) is an approach to rule learning using logic programming as a uniform representation for input examples, background knowledge, and hypotheses. Given an encoding of the known background knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesized logic program that entails all positive and no negative examples. Inductive programming is a related field that considers any kind of programming languages for representing hypotheses (and not only logic programming), such as functional programs.

Support Vector Machines

Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other.


Clustering

Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that observations within the same cluster are similar according to some predesignated criterion or criteria, while observations drawn from different clusters are dissimilar. Different clustering techniques make different assumptions on the structure of the data, often defined by some similarity metric and evaluated for example by internal compactness (similarity between members of the same cluster) and separation between different clusters. Other methods are based on estimated density and graph connectivity. Clustering is a method of unsupervised learning, and a common technique for statistical data analysis.
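
As one concrete instance, the sketch below runs a bare-bones version of the k-means algorithm on one-dimensional data. The data points, the starting centers and the fixed iteration count are assumptions chosen to keep the example short and deterministic.

```python
# Minimal k-means sketch on 1-D data, using absolute distance as the similarity criterion.

def k_means_1d(points, centers, iterations=10):
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else old for c, old in zip(clusters, centers)]
    return centers, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = k_means_1d(points, centers=[1.0, 2.0])
print(centers)
print(clusters)
```

Even from a poor starting guess, the centers settle on the two compact, well-separated groups in this data. Results on less clear-cut data can depend on the initial centers.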

Bayesian Networks

A Bayesian network, belief network or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional independencies via a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases. Efficient algorithms exist that perform inference and learning.
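
The disease and symptom example can be worked through for the smallest possible network, with one disease node pointing to one symptom node. The three probabilities below are invented for illustration and carry no medical meaning.

```python
# Two-node Bayesian network: Disease -> Symptom (all numbers are made up).

p_disease = 0.01                 # prior P(disease)
p_symptom_given_disease = 0.9    # P(symptom | disease)
p_symptom_given_healthy = 0.05   # P(symptom | no disease)

# Marginal probability of observing the symptom.
p_symptom = (p_symptom_given_disease * p_disease
             + p_symptom_given_healthy * (1 - p_disease))

# Inference by Bayes' rule: P(disease | symptom).
p_disease_given_symptom = p_symptom_given_disease * p_disease / p_symptom
print(round(p_disease_given_symptom, 4))
```

With these numbers, observing the symptom raises the probability of the disease from 1% to roughly 15%. Larger networks apply the same reasoning across many variables while exploiting the conditional independencies encoded in the graph.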

Reinforcement Learning

Reinforcement learning is concerned with how an agent ought to take actions in an environment so as to maximize some notion of long-term reward. Reinforcement learning algorithms attempt to find a policy that maps states of the world to the actions the agent ought to take in those states. Reinforcement learning differs from the supervised learning problem in that correct input/output pairs are never presented, nor sub-optimal actions explicitly corrected.
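
A minimal sketch of this idea uses the tabular Q-learning update on a five-state corridor. The environment, the reward of 1 for reaching the rightmost state, and the constants are all assumptions. To keep the run deterministic, the update is applied in sweeps over every state and action rather than along sampled episodes, which is a simplification of how Q-learning is normally run.

```python
# Tabular Q-learning update on a toy corridor (states 0..4, state 4 is the goal).

N_STATES = 5
ACTIONS = {"left": -1, "right": +1}
GAMMA = 0.9   # discount factor for long-term reward
ALPHA = 0.5   # learning rate

def step(state, action):
    """Environment dynamics: move, stay inside the corridor, reward 1 on reaching the goal."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(200):
    for s in range(N_STATES - 1):      # the goal state is terminal and never updated
        for a in ACTIONS:
            next_state, reward = step(s, a)
            best_next = max(q[(next_state, b)] for b in ACTIONS)
            target = reward + GAMMA * best_next
            q[(s, a)] += ALPHA * (target - q[(s, a)])

# The policy maps each non-terminal state to the action with the highest learned value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Note that the agent is never told that "right" is the correct action in any state; it only receives the reward signal, and the preference for moving right emerges from the learned values.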

Representation Learning

Several learning algorithms, mostly unsupervised, aim to discover better representations of the inputs provided during training. Classical examples include principal components analysis and cluster analysis. Representation learning algorithms often attempt to preserve the information in their input while transforming it in a way that makes it useful, often as a pre-processing step before classification or prediction. The learned representation allows inputs coming from the unknown data-generating distribution to be reconstructed, while not necessarily being faithful for configurations that are implausible under that distribution.

Manifold learning algorithms attempt to do so under the constraint that the learned representation is low-dimensional. Sparse coding algorithms attempt to do so under the constraint that the learned representation is sparse (has many zeros). Multilinear subspace learning algorithms aim to learn low-dimensional representations directly from tensor representations for multidimensional data, without reshaping them into (high-dimensional) vectors. Deep learning algorithms discover multiple levels of representation, or a hierarchy of features, with higher-level, more abstract features defined in terms of (or generating) lower-level features. It has been argued that an intelligent machine is one that learns a representation that disentangles the underlying factors of variation that explain the observed data.

Similarity and Metric Learning

In this problem, the learning machine is given pairs of examples that are considered similar and pairs of less similar objects. It then needs to learn a similarity function (or a distance metric function) that can predict whether new objects are similar. It is sometimes used in recommendation systems.

Sparse Dictionary Learning

In this method, a datum is represented as a linear combination of basis functions, and the coefficients are assumed to be sparse. Let x be a d-dimensional datum and D be a d by n matrix, where each column of D represents a basis function. r is the coefficient vector that represents x using D. Mathematically, sparse dictionary learning means solving x ≈ Dr where r is sparse. Generally speaking, n is assumed to be larger than d to allow the freedom for a sparse representation.
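
The relation x ≈ Dr can be made concrete with a tiny hand-built dictionary. The matrix D, the datum x and both coefficient vectors below are illustrative assumptions; the sketch only checks candidate representations and does not learn the dictionary.

```python
# Two representations of the same datum over an overcomplete dictionary (d = 2, n = 3).

D = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]  # each column is one basis function

def reconstruct(D, r):
    """Compute the matrix-vector product D r."""
    return [sum(D[i][j] * r[j] for j in range(len(r))) for i in range(len(D))]

def nonzeros(r):
    """Number of non-zero coefficients, i.e. how sparse r is."""
    return sum(1 for value in r if value != 0.0)

x = [2.0, 2.0]
dense_r = [2.0, 2.0, 0.0]    # uses two basis functions
sparse_r = [0.0, 0.0, 2.0]   # uses a single basis function

print(reconstruct(D, dense_r), nonzeros(dense_r))
print(reconstruct(D, sparse_r), nonzeros(sparse_r))
```

Both coefficient vectors reproduce x exactly, but the second does so with a single non-zero entry. Having more columns than dimensions (n > d) is what leaves room for such a choice.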

Learning a dictionary along with sparse representations is strongly NP-hard and also difficult to solve approximately. A popular heuristic method for sparse dictionary learning is K-SVD.

Sparse dictionary learning has been applied in several contexts. In classification, the problem is to determine which class a previously unseen datum belongs to. Suppose a dictionary for each class has already been built. Then a new datum is associated with the class whose dictionary gives it the best sparse representation. Sparse dictionary learning has also been applied in image de-noising. The key idea is that a clean image patch can be sparsely represented by an image dictionary, but the noise cannot.

Genetic Algorithms

A genetic algorithm (GA) is a search heuristic that mimics the process of natural selection, and uses methods such as mutation and crossover to generate new genotypes in the hope of finding good solutions to a given problem. In machine learning, genetic algorithms found some uses in the 1980s and 1990s. Conversely, machine learning techniques have been used to improve the performance of genetic and evolutionary algorithms.
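
A compact sketch of selection, crossover and mutation is given below for the toy problem of maximizing the number of 1s in a bit string. The problem, the population settings, the 2% mutation rate and the function names are all assumptions made for illustration.

```python
# Minimal genetic algorithm sketch: evolve bit strings towards all ones.
import random

def fitness(genotype):
    """Number of 1 bits; higher is better."""
    return sum(genotype)

def evolve(length=20, population_size=30, generations=40, seed=0):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    initial_best = max(fitness(g) for g in population)
    for _ in range(generations):
        # Selection: the fitter half of the population may reproduce.
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        children = [population[0]]  # elitism: keep the best genotype unchanged
        while len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, length)           # single-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each gene with a small probability.
            child = [1 - gene if rng.random() < 0.02 else gene for gene in child]
            children.append(child)
        population = children
    return initial_best, max(population, key=fitness)

initial_best, best = evolve()
print(initial_best, fitness(best), best)
```

Because the best genotype is always carried over unchanged, the best fitness in the population can never decrease from one generation to the next.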


Applications for Machine Learning include but are not limited to:

  • Adaptive websites
  • Affective computing
  • Bioinformatics
  • Brain-machine interfaces
  • Cheminformatics
  • Classifying DNA sequences
  • Computational anatomy
  • Computer vision, including object recognition
  • Detecting credit card fraud
  • Game playing[28]
  • Information retrieval
  • Internet fraud detection
  • Marketing
  • Machine perception
  • Medical diagnosis
  • Natural language processing[29]
  • Optimization and metaheuristic
  • Online advertising
  • Recommender systems
  • Robot locomotion
  • Search engines
  • Sentiment analysis (or opinion mining)
  • Sequence mining
  • Software engineering
  • Speech and handwriting recognition
  • Stock market analysis
  • Structural health monitoring
  • Syntactic pattern recognition
  • Economics

In 2006, the online movie company Netflix held the first “Netflix Prize” competition to find a program to better predict user preferences and improve the accuracy of its existing Cinematch movie recommendation algorithm by at least 10%. A joint team made up of researchers from AT&T Labs-Research in collaboration with the teams Big Chaos and Pragmatic Theory built an ensemble model to win the Grand Prize in 2009 for $1 million. Shortly after the prize was awarded, Netflix realized that viewers’ ratings were not the best indicators of their viewing patterns (“everything is a recommendation”) and they changed their recommendation engine accordingly.

In 2010 The Wall Street Journal wrote about money management firm Rebellion Research’s use of machine learning to predict economic movements. The article describes Rebellion Research’s prediction of the financial crisis and economic recovery.

In 2014, it was reported that a machine learning algorithm had been applied in art history to study fine art paintings, and that it may have revealed previously unrecognized influences between artists.


Software suites containing a variety of Machine Learning – ML Algorithms include the following:

Open Source Software

  • Caffe
  • dlib
  • ELKI
  • Encog
  • GNU Octave
  • H2O
  • Mahout
  • Mallet (software project)
  • mlpy
  • MOA (Massive Online Analysis)
  • ND4J with Deeplearning4j
  • NuPIC
  • OpenAI
  • OpenCV
  • OpenNN
  • Orange
  • R
  • scikit-learn
  • scikit-image
  • Shogun
  • TensorFlow
  • Torch (machine learning)
  • Spark
  • Yooreeka
  • Weka

Commercial Software with Open Source editions

  • RapidMiner

Commercial Software

  • Angoss KnowledgeSTUDIO
  • Ayasdi
  • Databricks
  • Google Prediction API
  • IBM SPSS Modeler
  • KXEN Modeler
  • LIONsolver
  • Mathematica
  • Microsoft Azure Machine Learning
  • Neural Designer
  • NeuroSolutions
  • Oracle Data Mining
  • SAS Enterprise Miner
  • STATISTICA Data Miner

List of Machine Learning Concepts

Supervised Learning

  • AODE
  • Artificial neural network
    • Backpropagation
    • Autoencoders
    • Hopfield networks
    • Boltzmann machines
    • Restricted Boltzmann Machines
    • Spiking neural networks
  • Bayesian statistics
    • Bayesian network
    • Bayesian knowledge base
  • Case-based reasoning
  • Gaussian process regression
  • Gene expression programming
  • Group method of data handling (GMDH)
  • Inductive logic programming
  • Instance-based learning
  • Lazy learning
  • Learning Automata
  • Learning Vector Quantization
  • Logistic Model Tree
  • Minimum message length (decision trees, decision graphs, etc.)
    • Nearest Neighbor Algorithm
    • Analogical modeling
  • Probably approximately correct (PAC) learning
  • Ripple down rules, a knowledge acquisition methodology
  • Symbolic machine learning algorithms
  • Support vector machines
  • Random Forests
  • Ensembles of classifiers
    • Bootstrap aggregating (bagging)
    • Boosting (meta-algorithm)
  • Ordinal classification
  • Information fuzzy networks (IFN)
  • Conditional Random Field
  • Linear classifiers
    • Fisher’s linear discriminant
    • Linear regression
    • Logistic regression
    • Multinomial logistic regression
    • Naive Bayes classifier
    • Perceptron
    • Support vector machines
  • Quadratic classifiers
  • k-nearest neighbor
  • Boosting
  • Decision trees
    • C4.5
    • Random forests
    • ID3
    • CART
    • SLIQ
    • SPRINT
  • Bayesian networks
    • Naive Bayes
  • Hidden Markov models

Unsupervised Learning

  • Expectation-maximization algorithm
  • Vector Quantization
  • Generative topographic map
  • Information bottleneck method

Artificial Neural Network

  • Self-organizing map

Association Rule Learning

  • Apriori algorithm
  • Eclat algorithm
  • FP-growth algorithm

Hierarchical Clustering

  • Single-linkage clustering
  • Conceptual clustering

Cluster Analysis

  • K-means algorithm
  • Fuzzy clustering
  • OPTICS algorithm

Outlier Detection

  • Local Outlier Factor

Semi-Supervised Learning

  • Generative models
  • Low-density separation
  • Graph-based methods
  • Co-training

Reinforcement Learning

  • Temporal difference learning
  • Q-learning
  • Learning Automata

Deep Learning

  • Deep belief networks
  • Deep Boltzmann machines
  • Deep Convolutional neural networks
  • Deep Recurrent neural networks
  • Hierarchical temporal memory

List of Datasets for Machine Learning – ML

These datasets are used for machine learning research and have been cited in peer-reviewed academic journals and other publications. Datasets are an integral part of the field of machine learning. Major advances in this field can result from advances in learning algorithms (such as deep learning), computer hardware, and, less intuitively, the availability of high-quality training datasets. High-quality labeled training datasets for supervised and semi-supervised machine learning algorithms are usually difficult and expensive to produce because of the large amount of time needed to label the data. Although they do not need to be labeled, high-quality datasets for unsupervised learning can also be difficult and costly to produce. This list aggregates high-quality datasets, drawn from multiple data repositories, that have been shown to be of value to the machine learning research community, in order to provide greater coverage of the topic than is otherwise available.

Image data

Datasets consisting primarily of images or videos for tasks such as object detection, facial recognition, and multi-label classification.

Facial Recognition

In computer vision, facial images have been used extensively to develop facial recognition systems, face detection, and many other projects that use facial images.

Dataset Name | Brief description | Preprocessing | Instances | Format | Default Task | Created (updated) | Reference | Creator
Facial Recognition Technology (FERET) | 11338 images of 1199 individuals in different positions and at different times. | None. | 11,338 | Images | Classification, facial recognition | 2003 | [6][7] | United States Department of Defense
Pose, Illumination, and Expression (PIE) | 41,368 color images of 68 people in 13 different poses. | Images labeled with expressions. | 41,368 | Images, text | Classification, facial recognition | 2000 | [8][9] | R. Gross et al.
SCFace | Color images of faces at various angles. | Location of facial features extracted. Coordinates of features given. | 4,160 | Images, text | Classification, facial recognition | 2011 | [10][11] | M. Grgic et al.
YouTube Faces DB | Videos of 1,595 different people gathered from YouTube. Each clip is between 48 and 6,070 frames. | Identity of those appearing in videos and descriptors. | 3,425 videos | Video, text | Video classification, facial recognition | 2011 | [12][13] | L. Wolf et al.
Grammatical Facial Expressions Dataset | Grammatical Facial Expressions from Brazilian Sign Language. | Microsoft Kinect features extracted. | 27,965 | Text | Facial gesture recognition | 2014 | [14] | F. Freitas et al.
CMU Face Images Dataset | Images of faces. Each person is photographed multiple times to capture different expressions. | Labels and features. | 640 | Images, text | Facial recognition | 1999 | [15][16] | T. Mitchell
Yale Face Database | Faces of 15 individuals in 11 different expressions. | Labels of expressions. | 165 | Images | Facial recognition | 1997 | [17][18] | J. Yang et al.
Cohn-Kanade AU-Coded Expression Database | Large database of images with labels for expressions. | Tracking of certain facial features. | 500+ sequences | Images, text | Facial recognition | 2000 | [19][20] | T. Kanade et al.
FaceScrub | Images of public figures scrubbed from image searching. | Name and m/f annotation. | 107,818 | Images, text | Facial recognition | 2014 | [21][22] | H. Ng et al.
BioID Face Database | Images of faces with eye positions marked. | Manually set eye positions. | 1521 | Images, text | Facial recognition | 2001 | [23][24] | BioID
Skin Segmentation Dataset | Randomly sampled color values from face images. | B, G, R values extracted. | 245,057 | Text | Segmentation, classification | 2012 | [25][26] | R. Bhatt
Bosphorus | 3D facial image database. | 34 action units and 6 expressions labeled; 24 facial landmarks labeled. | 4652 | Images, text | Facial recognition, classification | 2008 | [27][28] | A. Savran et al.
UOY 3D-Face | Neutral face, 5 expressions: anger, happiness, sadness, eyes closed, eyebrows raised. | Labeling. | 5250 | Images, text | Facial recognition, classification | ~2004 | [29][30] | University of York
CASIA | Expressions: anger, smile, laugh, surprise, closed eyes. | None. | 4624 | Images, text | Facial recognition, classification | 2007 | [31][32] | Institute of Automation, Chinese Academy of Sciences
BU-3DFE | Neutral face, and 6 expressions: anger, happiness, sadness, surprise, disgust, fear (4 levels). 3D images extracted. | None. | 2500 | Images, text | Facial recognition, classification | 2006 | [33] | Binghamton University
Face Recognition Grand Challenge Dataset | Up to 22 samples for each subject. Expressions: anger, happiness, sadness, surprise, disgust, puffy. 3D data. | None. | 4007 | Images, text | Facial recognition, classification | 2004 | [34][35] | National Institute of Standards and Technology
Gavabdb | Up to 61 samples for each subject. Expressions: neutral face, smile, frontal accentuated laugh, frontal random gesture. 3D images. | None. | 549 | Images, text | Facial recognition, classification | 2008 | [36][37] | King Juan Carlos University
3D-RMA | Up to 100 subjects, expressions mostly neutral. Several poses as well. | None. | 9971 | Images, text | Facial recognition, classification | 2004 | [38][39] | Royal Military Academy (Belgium)

Object Detection and Recognition

Dataset Name | Brief description | Preprocessing | Instances | Format | Default Task | Created (updated) | Reference | Creator
Berkeley 3-D Object Dataset | 849 images taken in 75 different scenes. About 50 different object classes are labeled. | Object bounding boxes and labeling. | 849 | Labeled images, text | Object recognition | 2014 | [40][41] | A. Janoch et al.
Berkeley Segmentation Data Set and Benchmarks 500 (BSDS500) | 500 natural images, explicitly separated into disjoint train, validation and test subsets + benchmarking code. Based on BSDS300. | Each image segmented by five different subjects on average. | 500 | Segmented images | Contour detection and hierarchical image segmentation | 2011 | [42] | University of California, Berkeley
Microsoft Common Objects in Context (COCO) | Complex everyday scenes of common objects in their natural context. | Object highlighting, labeling, and classification into 91 object types. | 2,500,000 | Labeled images, text | Object recognition | 2015 | [43][44] | T. Lin et al.
SUN Database | Very large scene and object recognition database. | Places and objects are labeled. Objects are segmented. | 131,067 | Images, text | Object recognition, scene recognition | 2014 | [45][46] | J. Xiao et al.
ImageNet | Labeled object image database, used in the ImageNet Large Scale Visual Recognition Challenge. | Labeled objects, bounding boxes, descriptive words, SIFT features. | 14,197,122 | Images, text | Object recognition, scene recognition | 2014 | [47][48] | J. Deng et al.
TV News Channel Commercial Detection Dataset | TV commercials and news broadcasts. | Audio and video features extracted from still images. | 129,685 | Text | Clustering, classification | 2015 | [49][50] | P. Guha et al.
Statlog (Image Segmentation) Dataset | The instances were drawn randomly from a database of 7 outdoor images and hand-segmented to create a classification for every pixel. | Many features calculated. | 2310 | Text | Classification | 1990 | [51] | University of Massachusetts
Caltech 101 | Pictures of objects. | Detailed object outlines marked. | 9146 | Images | Classification, object recognition | 2003 | [52][53] | F. Li et al.
Caltech-256 | Large dataset of images for object classification. | Images categorized and hand-sorted. | 30,607 | Images, text | Classification, object detection | 2007 | [54][55] | G. Griffin et al.
SIFT10M Dataset | SIFT features of Caltech-256 dataset. | Extensive SIFT feature extraction. | 11,164,866 | Text | Classification, object detection | 2016 | [56] | X. Fu et al.
LabelMe | Annotated pictures of scenes. | Objects outlined. | 187,240 | Images, text | Classification, object detection | 2005 | [57] | MIT Computer Science and Artificial Intelligence Laboratory
Cityscapes Dataset | Stereo video sequences recorded in street scenes, with pixel-level annotations. Metadata also included. | Pixel-level segmentation and labeling. | 25,000 | Images, text | Classification, object detection | 2016 | [58] | Daimler AG et al.

Handwriting and Character Recognition

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Artificial Characters Dataset Artificially generated data describing the structure of 10 capital English letters. Coordinates of lines drawn given as integers. Various other features. 6000 Text Handwriting recognition, classification 1992 [59] H. Guvenir et al.
Letter Dataset Upper case printed letters. 17 features are extracted from all images. 20,000 Text OCR, classification 1991 [60][61] D. Slate et al.
Character Trajectories Dataset Labeled samples of pen tip trajectories for people writing simple characters. 3-dimensional pen tip velocity trajectory matrix for each sample. 2858 Text Handwriting recognition, classification 2008 [62][63] B. Williams
Chars74K Dataset Character recognition in natural images of symbols used in both English and Kannada   74,107   Character recognition, handwriting recognition, OCR, classification 2009 [64] T. de Campos
UJI Pen Characters Dataset Isolated handwritten characters. Coordinates of pen position as characters were written are given. 11,640 Text Handwriting recognition, classification 2009 [65][66] F. Prat et al.
Gisette Dataset Handwriting samples from the often-confused 4 and 9 characters. Features extracted from images, split into train/test, handwriting images size-normalized. 13,500 Images, text Handwriting recognition, classification 2003 [67] Yann LeCun et al.
MNIST Database Database of handwritten digits. Hand-labeled. 60,000 Images, text Classification 1998 [68][69] National Institute of Standards and Technology
Optical Recognition of Handwritten Digits Dataset Normalized bitmaps of handwritten data. Size normalized and mapped to bitmaps. 5620 Images, text Handwriting recognition, classification 1998 [70] E. Alpaydin et al.
Pen-Based Recognition of Handwritten Digits Dataset Handwritten digits on electronic pen-tablet. Feature vectors extracted to be uniformly spaced. 10,992 Images, text Handwriting recognition, classification 1998 [71][72] E. Alpaydin et al.
Semeion Handwritten Digit Dataset Handwritten digits from 80 people. All handwritten digits have been normalized for size and mapped to the same grid. 1593 Images, text Handwriting recognition, classification 2008 [73] T. Srl

Aerial Images

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Aerial Image Segmentation Dataset 80 high-resolution aerial images with spatial resolution ranging from 0.3 to 1.0. Images manually segmented. 80 Images Aerial Classification, object detection 2013 [74][75] J. Yuan et al.
KIT AIS Data Set Multiple labeled training and evaluation datasets of aerial images of crowds. Images manually labeled to show paths of individuals through crowds. ~ 150 Images with paths People tracking, aerial tracking 2012 [76][77] M. Butenuth et al.
Wilt Dataset Remote sensing data of diseased trees and other land cover. Various features extracted. 4899 Images Classification, aerial object detection 2014 [78][79] B. Johnson
Forest Type Mapping Dataset Satellite imagery of forests in Japan. Image wavelength bands extracted. 326 Text Classification 2015 [80][81] B. Johnson
Overhead Imagery Research Data Set Annotated overhead imagery. Images with multiple objects. Over 30 annotations and over 60 statistics that describe the target within the context of the image. 1000 Images, text Classification 2009 [82][83] F. Tanner et al.

Other Images

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
MPII Cooking Activities Dataset Videos and images of various cooking activities. Activity paths and directions, labels, fine-grained motion labeling, activity class, still image extraction and labeling. 881,755 frames Labeled video, images, text Classification 2012 [84][85] M. Rohrbach et al.
Stanford Dogs Dataset Images of 120 breeds of dogs from around the world. Train/test splits and ImageNet annotations provided. 20,580 Images, text Fine-grain classification 2011 [86][87] A. Khosla et al.
The Oxford-IIIT Pet Dataset 37 categories of pets with roughly 200 images of each. Breed labeled, tight bounding box, foreground-background segmentation. ~ 7,400 Images, text Classification, object detection 2012 [87][88] O. Parkhi et al.
Corel Image Features Data Set Database of images with features extracted. Many features including color histogram, co-occurrence texture, and color moments. 68,040 Text Classification, object detection 1999 [89][90] M. Ortega-Bindenberger et al.
Online Video Characteristics and Transcoding Time Dataset Transcoding times for various different videos and video properties. Video features given. 168,286 Text Regression 2015 [91] T. Deneke et al.
Microsoft Sequential Image Narrative Dataset (SIND) Dataset for sequential vision-to-language. Descriptive caption and storytelling given for each photo, and photos are arranged in sequences. 81,743 Images, text Visual storytelling 2016 [92] Microsoft Research
NYU Depth Dataset V2 Video sequences from indoor scenes including RGB and depth channels from the Microsoft Kinect   464 scenes Videos with RGB + depth channels Visible region segmentation 2012 [93] Silberman et al.

Text Data

Datasets consisting primarily of text for tasks such as natural language processing, sentiment analysis, translation, and cluster analysis.


Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Amazon reviews US product reviews from Amazon. None. ~ 82M Text Classification, sentiment analysis 2015 [94] McAuley et al.
OpinRank Review Dataset Reviews of cars and hotels from and TripAdvisor respectively. None. 42,230 / ~259,000 respectively Text Sentiment analysis, clustering 2011 [95][96] K. Ganesan et al.
MovieLens 22,000,000 ratings and 580,000 tags applied to 33,000 movies by 240,000 users. None. ~ 22M Text Regression, clustering, classification 2016 [97] GroupLens Research
Yahoo! Music User Ratings of Musical Artists Over 10M ratings of artists by Yahoo users. None described. ~ 10M Text Clustering, regression 2004 [98][99] Yahoo!
Car Evaluation Data Set Car properties and their overall acceptability. Six categorical features given. 1728 Text Classification 1997 [100][101] M. Bohanec
YouTube Comedy Slam Preference Dataset User vote data for pairs of videos shown on YouTube. Users voted on funnier videos. Video metadata given. 1,138,562 Text Classification 2012 [102][103] Google
Skytrax User Reviews Dataset User reviews of airlines, airports, seats, and lounges from Skytrax. Ratings are fine-grain and include many aspects of airport experience. 41396 Text Classification, regression 2015 [104] Q. Nguyen
Teaching Assistant Evaluation Dataset Teaching assistant reviews. Features of each instance such as class, class size, and instructor are given. 151 Text Classification 1997 [105][106] W. Loh et al.

News Articles

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
NYSK Dataset English news articles about the case relating to allegations of sexual assault against the former IMF director Dominique Strauss-Kahn. Filtered and presented in XML format. 10,421 XML, text Sentiment analysis, topic extraction 2013 [107] Dermouche, M. et al.
The Reuters Corpus Volume 1 Large corpus of Reuters news stories in English. Fine-grain categorization and topic codes. 810,000 Text Classification, clustering, summarization 2002 [108] Reuters
The Reuters Corpus Volume 2 Large corpus of Reuters news stories in multiple languages. Fine-grain categorization and topic codes. 487,000 Text Classification, clustering, summarization 2005 [109] Reuters
Thomson Reuters Text Research Collection Large corpus of news stories. Details not described. 1,800,370 Text Classification, clustering, summarization 2009 [110] T. Rose et al.
Saudi Newspapers Corpus 31,030 Arabic newspaper articles. Metadata extracted. 31,030 JSON Summarization, clustering 2015 [111] M. Alhagri


Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Enron Email Dataset Emails from employees at Enron organized into folders. Attachments removed, invalid email addresses converted to or ~ 500,000 Text Network analysis, sentiment analysis 2004 (2015) [112][113] Klimt, B. and Y. Yang
Ling-Spam Dataset Corpus containing both legitimate and spam emails. Four versions of the corpus involving whether or not a lemmatiser or stop-list was enabled.   Text Classification 2000 [114][115] Androutsopoulos, J. et al.
SMS Spam Collection Dataset Collected SMS spam messages. None. 5574 Text Classification 2011 [116][117] T. Almeida et al.
Twenty Newsgroups Dataset Messages from 20 different newsgroups. None. 20,000 Text Natural language processing 1999 [118] T. Mitchell et al.
Spambase Dataset Spam emails. Many text features extracted. 4601 Text Spam detection, classification 1999 [119] M. Hopkins et al.

Twitter and Tweets

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Sentiment140 Tweet data from 2009 including original text, time stamp, user and sentiment. Classified using distant supervision from presence of emoticon in tweet. 1,578,627 Tweets, comma-separated values Sentiment analysis 2009 [120][121] A. Go et al.
ASU Twitter Dataset Twitter network data, not actual tweets. Shows connections between a large number of users. None. 11,316,811 users, 85,331,846 connections Text Clustering, graph analysis 2009 [122][123] R. Zafarani et al.
SNAP Social Circles: Twitter Database Large Twitter network data. Node features, circles, and ego networks. 1,768,149 Text Clustering, graph analysis 2012 [124][125] J. McAuley et al.
Twitter Dataset for Arabic Sentiment Analysis Arabic tweets. Samples hand-labeled as positive or negative. 2000 Text Classification 2014 [126][127] N. Abdulla
Buzz in Social Media Dataset Data from Twitter and Tom’s Hardware. This dataset focuses on specific buzz topics being discussed on those sites. Data is windowed so that the user can attempt to predict the events leading up to social media buzz. 140,000 Text Regression, Classification 2013 [128][129] F. Kawala et al.
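
The Sentiment140 entry above says its tweets were classified by distant supervision based on the presence of an emoticon. A minimal sketch of that general idea follows; the emoticon lists and the function name `emoticon_label` are illustrative assumptions, not the dataset authors' actual procedure.

```python
# Hypothetical emoticon lists, chosen only for illustration.
POSITIVE = (":)", ":-)", ":D")
NEGATIVE = (":(", ":-(")


def emoticon_label(tweet):
    """Return a noisy sentiment label derived from emoticons, or None.

    A tweet with only positive emoticons is labeled "positive", one with
    only negative emoticons is labeled "negative". Tweets with no
    emoticons, or with both kinds, are left unlabeled.
    """
    has_pos = any(e in tweet for e in POSITIVE)
    has_neg = any(e in tweet for e in NEGATIVE)
    if has_pos == has_neg:
        return None
    return "positive" if has_pos else "negative"


tweets = ["great day :)", "missed the bus :(", "no emoticon here"]
labels = [emoticon_label(t) for t in tweets]
print(labels)
```

Labels produced this way are noisy by construction, so in practice the emoticons would typically be removed from the text before training a classifier on it.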

Other Text

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Legal Case Reports Federal Court of Australia cases from 2006–2009. None. 4,000 Text Summarization, citation analysis 2012 [130][131] F. Galgani et al.
Blogger Authorship Corpus Blog entries of 19,320 people from Blogger. Self-provided gender, age, industry, and astrological sign. 681,288 Text Sentiment analysis, summarization, classification 2006 [132][133] J. Schler et al.
Social Structure of Facebook Networks Large dataset of the social structure of Facebook. None. 100 colleges covered Text Network analysis, clustering 2012 [134][135] A. Traud et al.
Dataset for the Machine Comprehension of Text Stories and associated questions for testing comprehension of text. None. 660 Text Natural language processing, machine comprehension 2013 [136][137] M. Richardson et al.
The Penn Treebank Project Naturally occurring text annotated for linguistic structure. Text is parsed into semantic trees. ~ 1M words Text Natural language processing, summarization 1995 [138][139] M. Marcus et al.
DEXTER Dataset Task given is to determine, from features given, which articles are about corporate acquisitions. Features extracted include word stems. Distractor features included. 2600 Text Classification 2008 [140] Reuters
Google Books N-grams N-grams from a very large corpus of books. None. 2.2 TB of text Text Classification, clustering, regression 2011 [141][142] Google
Personae Corpus Collected for experiments in Authorship Attribution and Personality Prediction. Consists of 145 Dutch-language essays. In addition to normal texts, syntactically annotated texts are given. 145 Text Classification, regression 2008 [143][144] K. Luyckx et al.
CNAE-9 Dataset Categorization task for free text descriptions of Brazilian companies. Word frequency has been extracted. 1080 Text Classification 2012 [145][146] P. Ciarelli et al.
Sentiment Labeled Sentences Dataset 3000 sentiment labeled sentences. Sentiment of each sentence has been hand labeled as positive or negative. 3000 Text Classification, sentiment analysis 2015 [147][148] D. Kotzias
BlogFeedback Dataset Dataset to predict the number of comments a post will receive based on features of that post. Many features of each post extracted. 60,021 Text Regression 2014 [149][150] K. Buza

Sound Data

Datasets of sounds and sound features.


Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Parkinson Speech Dataset Multiple recordings of people with and without Parkinson’s Disease. Voice features extracted, disease scored by physician using the Unified Parkinson’s Disease Rating Scale. 1,040 Text Classification, regression 2013 [151][152] B. E. Sakar et al.
Spoken Arabic Digits Spoken Arabic digits from 44 male and 44 female speakers. Time-series of mel-frequency cepstrum coefficients. 8,800 Text Classification 2010 [153][154] M. Bedda et al.
ISOLET Dataset Spoken letter names. Features extracted from sounds. 7797 Text Classification 1994 [155][156] R. Cole et al.
Japanese Vowels Dataset Nine male speakers uttered two Japanese vowels successively. Applied 12-degree linear prediction analysis to it to obtain a discrete-time series with 12 cepstrum coefficients. 640 Text Classification 1999 [157][158] M. Kudo et al.
Parkinson’s Telemonitoring Dataset Multiple recordings of people with and without Parkinson’s Disease. Sound features extracted. 5875 Text Classification 2009 [159][160] A. Tsanas et al.
TIMIT Recordings of 630 speakers of eight major dialects of American English, each reading ten phonetically rich sentences. Speech is lexically and phonemically transcribed. 6300 Text Speech recognition, classification. 1986 [161][162] J. Garofolo et al.


Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Geographical Original of Music Data Set Audio features of music samples from different locations. Audio features extracted using MARSYAS software. 1,059 Text Geographical classification, clustering 2014 [163][164] F. Zhou et al.
Million Song Dataset Audio features from one million different songs. Audio features extracted. 1M Text Classification, clustering 2011 [165][166] T. Bertin-Mahieux et al.
Bach Choral Harmony Dataset Bach chorale chords. Audio features extracted. 5665 Text Classification 2014 [167][168] D. Radicioni et al.

Other Sounds

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
UrbanSound Labeled sound recordings of sounds like air conditioners, car horns and children playing. Sorted into folders by class of events as well as metadata in a JSON file and annotations in a CSV file. 1,059 Sound(WAV) Classification 2014 [169][170] J. Salamon et al.

Signal Data

Datasets containing electric signal information requiring some sort of signal processing for further analysis.


Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Witty Worm Dataset Dataset detailing the spread of the Witty worm and the infected computers. Split into a publicly available set and a restricted set containing more sensitive information like IP and UDP headers. 55,909 IP addresses Text Classification 2004 [171][172] Center for Applied Internet Data Analysis
Cuff-Less Blood Pressure Estimation Dataset Cleaned vital signals from human patients which can be used to estimate blood pressure. 125 Hz vital signs have been cleaned. 12,000 Text Classification, regression 2015 [173][174] M. Kachuee et al.
Gas Sensor Array Drift Dataset Measurements from 16 chemical sensors utilized in simulations for drift compensation. Extensive number of features given. 13,910 Text Classification 2012 [175][176] A. Vergara
Servo Dataset Data covering the nonlinear relationships observed in a servo-amplifier circuit. Levels of various components as a function of other components are given. 167 Text Regression 1993 [177][178] K. Ullrich
UJIIndoorLoc-Mag Dataset Indoor localization database to test indoor positioning systems. Data is magnetic field based. Train and test splits given. 40,000 Text Classification, regression, clustering 2015 [179][180] D. Rambla et al.
Sensorless Drive Diagnosis Dataset Electrical signals from motors with defective components. Statistical features extracted. 58,508 Text Classification 2015 [181][182] M. Bator

Motion Tracking

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Wearable Computing: Classification of Body Postures and Movements (PUC-Rio) People performing five standard actions while wearing motion trackers. None. 165,632 Text Classification 2013 [183][184] Pontifical Catholic University of Rio de Janeiro
Gesture Phase Segmentation Dataset Features extracted from video of people doing various gestures. Features extracted aim at studying gesture phase segmentation. 9900 Text Classification, clustering 2014 [185][186] R. Madeo et al.
Vicon Physical Action Data Set 10 normal and 10 aggressive physical actions that measure the human activity tracked by a 3D tracker. Many parameters recorded by 3D tracker. 3000 Text Classification 2011 [187][188] T. Theodoridis
Daily and Sports Activities Dataset Motor sensor data for 19 daily and sports activities. Many sensors given, no preprocessing done on signals. 9120 Text Classification 2013 [189][190] B. Barshan et al.
Human Activity Recognition Using Smartphones Dataset Gyroscope and accelerometer data from people wearing smartphones and performing normal actions. Actions performed are labeled, all signals preprocessed for noise. 10,299 Text Classification 2012 [191][192] J. Reyes-Ortiz et al.
Australian Sign Language Signs Australian sign language signs captured by motion-tracking gloves. None. 2565 Text Classification 2002 [193][194] M. Kadous
Weight Lifting Exercises monitored with Inertial Measurement Units Five variations of the biceps curl exercise monitored with IMUs. Some statistics calculated from raw data. 39,242 Text Classification 2013 [195][196] W. Ugulino et al.
sEMG for Basic Hand movements Dataset Two databases of surface electromyographic signals of 6 hand movements. None. 3000 Text Classification 2014 [197][198] C. Sapsanis et al.
REALDISP Activity Recognition Dataset Evaluate techniques dealing with the effects of sensor displacement in wearable activity recognition. None. 1419 Text Classification 2014 [198][199] O. Banos et al.
Heterogeneity Activity Recognition Dataset Data from multiple different smart devices for humans performing various activities. None. 43,930,257 Text Classification, clustering 2015 [200][201] A. Stisen et al.
Indoor User Movement Prediction from RSS Data Temporal wireless network data that can be used to track the movement of people in an office. None. 13,197 Text Classification 2016 [202][203] D. Bacciu
PAMAP2 Physical Activity Monitoring Dataset 18 different types of physical activities performed by 9 subjects wearing 3 IMUs. None. 3,850,505 Text Classification 2012 [204] A. Reiss
OPPORTUNITY Activity Recognition Dataset Human Activity Recognition from wearable, object, and ambient sensors is a dataset devised to benchmark human activity recognition algorithms. None. 2551 Text Classification 2012 [205][206] D. Roggen et al.

Other Signals

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Wine Dataset Chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. 13 properties of each wine are given. 178 Text Classification, regression 1991 [207][208] M. Forina et al.
Combined Cycle Power Plant Data Set Data from various sensors within a power plant running for 6 years. None. 9568 Text Regression 2014 [209][210] P. Tufekci et al.

Physical Data

Datasets from physical systems.

High Energy Physics

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
HIGGS Dataset Monte Carlo simulations of particle accelerator collisions. 28 features of each collision are given. 11M Text Classification 2014 [211][212][213] D. Whiteson
HEPMASS Dataset Monte Carlo simulations of particle accelerator collisions. Goal is to separate the signal from noise. 28 features of each collision are given. 10,500,000 Text Classification 2016 [212][213][214] D. Whiteson


Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Yacht Hydrodynamics Dataset Yacht performance based on dimensions. Six features are given for each yacht. 308 Text Regression 2013 [215][216] R. Lopez
Robot Execution Failures Dataset 5 data sets that center around robotic failure to execute common tasks. Integer valued features such as torque and other sensor measurements. 463 Text Classification 1999 [217] L. Seabra et al.
Pittsburgh Bridges Dataset Design description is given in terms of several properties of various bridges. Various bridge features are given. 108 Text Classification 1990 [218][219] Y. Reich et al.
Automobile Dataset Data about automobiles, their insurance risk, and their normalized losses. Car features extracted. 205 Text Regression 1987 [220][221] J. Schlimmer et al.
Auto MPG Dataset MPG data for cars. Eight features of each car given. 398 Text Regression 1993 [222] Carnegie Mellon University
Energy Efficiency Dataset Heating and cooling requirements given as a function of building parameters. Building parameters given. 768 Text Classification, regression 2012 [223][224] A. Xifara et al.
Airfoil Self-Noise Dataset A series of aerodynamic and acoustic tests of two and three-dimensional airfoil blade sections. Data about frequency, angle of attack, etc., are given. 1503 Text Regression 2014 [225] R. Lopez
Challenger USA Space Shuttle O-Ring Dataset Attempt to predict O-ring problems given past Challenger data. Several features of each flight, such as launch temperature, are given. 23 Text Regression 1993 [226][227] D. Draper et al.
Statlog (Shuttle) Dataset NASA space shuttle datasets. Nine features given. 58,000 Text Classification 58000 [228] NASA


Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Volcanoes on Venus – JARtool experiment Dataset Venus images returned by the Magellan spacecraft. Images are labeled by humans. not given Images Classification 1991 [229][230] M. Burl
MAGIC Gamma Telescope Dataset Monte Carlo generated high-energy gamma particle events. Numerous features extracted from the simulations. 19,020 Text Classification 2007 [230][231] R. Bock
Solar Flare Dataset Measurements of the number of certain types of solar flare events occurring in a 24-hour period. Many solar flare-specific features are given. 1389 Text Regression, classification 1989 [232] G. Bradshaw

Earth Science

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Volcanoes of the World Volcanic eruption data for all known volcanic events on earth. Details such as region, subregion, tectonic setting, dominant rock type are given. 1535 Text Regression, classification 2013 [233] E. Venzke et al.
Seismic-bumps Dataset Seismic activities from a coal mine. Seismic activity was classified as hazardous or not. 2584 Text Classification 2013 [234][235] M. Sikora et al.

Other Physical

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Concrete Compressive Strength Dataset Dataset of concrete properties and compressive strength. Nine features are given for each sample. 1030 Text Regression 2007 [236][237] I. Yeh
Concrete Slump Test Dataset Concrete slump flow given in terms of properties. Features of concrete given such as fly ash, water, etc. 103 Text Regression 2009 [238][239] I. Yeh
Musk Dataset Predict if a molecule, given the features, will be a musk or a non-musk. 168 features given for each molecule. 6598 Text Classification 1994 [240] Arris Pharmaceutical Corp.
Steel Plates Faults Dataset Steel plates of 7 different types. 27 features given for each sample. 1941 Text Classification 2010 [241] Semeion Research Center

Biological Data

Datasets from biological systems.


Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
EEG Database Study to examine EEG correlates of genetic predisposition to alcoholism. Measurements from 64 electrodes placed on the scalp sampled at 256 Hz (3.9 ms epoch) for 1 second. 122 Text Classification 1999 [242][243] H. Begleiter
P300 Interface Dataset Data from nine subjects collected using P300-based brain-computer interface for disabled subjects. Split into four sessions for each subject. MATLAB code given. 1,224 Text Classification 2008 [244][245] U. Hoffman et al.
Heart Disease Data Set Attributes of patients with and without heart disease. 75 attributes given for each patient with some missing values. 303 Text Classification 1988 [246][247] A. Janosi et al.
Breast Cancer Wisconsin (Diagnostic) Dataset Dataset of features of breast masses. Diagnosis by physician is given. 10 features for each sample are given. 569 Text Classification 1995 [248][249] W. Wolberg et al.
National Survey on Drug Use and Health Large scale survey on health and drug use in the United States. None. 55,268 Text Classification, regression 2012 [250] United States Department of Health and Human Services
Lung Cancer Dataset Lung cancer dataset without attribute definitions. 56 features are given for each case. 32 Text Classification 1992 [251][252] Z. Hong et al.
Arrhythmia Dataset Data for a group of patients, of which some have cardiac arrhythmia. 276 features for each instance. 452 Text Classification 1998 [253][254] H. Altay et al.
Diabetes 130-US hospitals for years 1999–2008 Dataset 9 years of readmission data across 130 US hospitals for patients with diabetes. Many features of each readmission are given. 100,000 Text Classification, clustering 2014 [255][256] J. Clore et al.
Diabetic Retinopathy Debrecen Dataset Features extracted from images of eyes with and without diabetic retinopathy. Features extracted and conditions diagnosed. 1151 Text Classification 2014 [257][258] B. Antal et al.
Liver Disorders Dataset Data for people with liver disorders. Seven biological features given for each patient. 345 Text Classification 1990 [259][260] Bupa Medical Research Ltd.
Thyroid Disease Dataset 10 databases of thyroid disease patient data. None. 7200 Text Classification 1987 [261][262] R. Quinlan
Mesothelioma Dataset Mesothelioma patient data. Large number of features, including asbestos exposure, are given. 324 Text Classification 2016 [263][264] A. Tanrikulu et al.
KEGG Metabolic Reaction Network (Undirected) Dataset Network of metabolic pathways. A reaction network and a relation network are given. Detailed features for each network node and pathway are given. 65,554 Text Classification, clustering, regression 2011 [265] M. Naeem et al.


Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Abalone Dataset Physical measurements of Abalone. Weather patterns and location are also given. None. 4177 Text Regression 1995 [266] Marine Research Laboratories – Taroona
Zoo Dataset Artificial dataset covering 7 classes of animals. Animals are classed into 7 categories and features are given for each. 101 Text Classification 1990 [267] R. Forsyth
Demospongiae Dataset Data about marine sponges. 503 sponges in the Demosponge class are described by various features. 503 Text Classification 2010 [268] E. Armengol et al.
Splice-junction Gene Sequences Dataset Primate splice-junction gene sequences (DNA) with associated imperfect domain theory. None. 3190 Text Classification 1992 [252] G. Towell et al.
Mice Protein Expression Dataset Expression levels of 77 proteins measured in the cerebral cortex of mice. None. 1080 Text Classification, Clustering 2015 [269][270] C. Higuera et al.


Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Forest Fires Dataset Forest fires and their properties. 13 features of each fire are extracted. 517 Text Regression 2008 [271][272] P. Cortez et al.
Iris Dataset Three types of iris plants are described by 4 different attributes. None. 150 Text Classification 1936 [273][274] R. Fisher
Plant Species Leaves Dataset Sixteen samples of leaf each of one-hundred plant species. Shape descriptor, fine-scale margin, and texture histograms are given. 1600 Text Classification 2012 [275][276] J. Cope et al.
Mushroom Dataset Mushroom attributes and classification. Many properties of each mushroom are given. 8124 Text Classification 1987 [277] J. Schlimmer
Soybean Dataset Database of diseased soybean plants. 35 features for each plant are given. Plants are classified into 19 categories. 307 Text Classification 1988 [278] R. Michalski et al.
Seeds Dataset Measurements of geometrical properties of kernels belonging to three different varieties of wheat. None. 210 Text Classification, clustering 2012 [279][280] Charytanowicz et al.
Covertype Dataset Data for predicting forest cover type strictly from cartographic variables. Many geographical features given. 581,012 Text Classification 1998 [281][282] J. Blackard et al.
Abscisic Acid Signaling Network Dataset Data for a plant signaling network. Goal is to determine set of rules that governs the network. None. 300 Text Causal-discovery 2008 [283] J. Jenkins et al.
Folio Dataset 20 photos of leaves for each of 32 species. None. 637 Images, text Classification, clustering 2015 [284][285] T. Munisami et al.


Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Ecoli Dataset Protein localization sites. Various features of the protein localizations sites are given. 336 Text Classification 1996 [286][287] K. Nakai et al.
MicroMass Dataset Identification of microorganisms from mass-spectrometry data. Various mass spectrometer features. 931 Text Classification 2013 [288][289] P. Mahe et al.
Yeast Dataset Predictions of Cellular localization sites of proteins. Eight features given per instance. 1484 Text Classification 1996 [290][291] K. Nakai et al.

Multivariate Data

Datasets consisting of rows of observations and columns of attributes characterizing those observations. Typically used for regression analysis or classification but other types of algorithms can also be used. This section includes datasets that do not fit in the above categories.
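
As a minimal sketch of the row-and-column layout described above, the following splits a small comma-separated table into a feature matrix and a target column, which is the usual starting point for regression or classification. The table contents, the column names, and the helper `split_features_and_target` are hypothetical and are not taken from any dataset listed here.

```python
import csv
import io

# Hypothetical toy table: each row is one observation,
# each column is one attribute of that observation.
RAW = """height,weight,label
1.2,3.4,a
5.6,7.8,b
9.0,1.5,a
"""


def split_features_and_target(text, target_column):
    """Parse CSV text into (features, targets).

    features is a list of rows of floats, one row per observation;
    targets holds the value of target_column for each observation.
    All non-target columns are assumed to be numeric.
    """
    reader = csv.DictReader(io.StringIO(text))
    features, targets = [], []
    for row in reader:
        targets.append(row.pop(target_column))
        features.append([float(value) for value in row.values()])
    return features, targets


X, y = split_features_and_target(RAW, "label")
print(X)
print(y)
```

With a categorical target column, as here, the pair `(X, y)` suits classification; a numeric target would suit regression instead.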


Dataset Name | Brief description | Preprocessing | Instances | Format | Default Task | Created (updated) | Reference | Creator
Dow Jones Index | Weekly data of stocks from the first and second quarters of 2011. | Calculated values such as percentage change and lags are included. | 750 | Comma separated values | Classification, regression, time series | 2014 | [292][293] | M. Brown et al.
Statlog (Australian Credit Approval) | Credit card applications either accepted or rejected and attributes about the application. | Attribute names are removed as well as identifying information. Factors have been relabeled. | 690 | Comma separated values | Classification | 1987 | [294][295] | R. Quinlan
eBay auction data | Auction data from various objects over auctions of various lengths. | Contains all bids, bidderID, bid times, and opening prices. | ~550 | Text | Regression, classification | 2012 | [296][297] | G. Shmueli et al.
Statlog (German Credit Data) | Binary credit classification into “good” or “bad” with many features. | Various financial features of each person are given. | 690 | Text | Classification | 1994 | [298] | H. Hofmann
Bank Marketing Dataset | Data from a large marketing campaign carried out by a large bank. | Many attributes of the clients contacted are given. Whether the client subscribed to the bank is also given. | 45,211 | Text | Classification | 2012 | [299][300] | S. Moro et al.
Istanbul Stock Exchange Dataset | Several stock indexes tracked for almost two years. | None. | 536 | Text | Classification, regression | 2013 | [301][302] | O. Akbilgic
Default of Credit Card Clients | Credit default data for Taiwanese creditors. | Various features about each account are given. | 30,000 | Text | Classification | 2016 | [303][304] | I. Yeh


Dataset Name | Brief description | Preprocessing | Instances | Format | Default Task | Created (updated) | Reference | Creator
Cloud DataSet | Data about 1024 different clouds. | Image features extracted. | 1024 | Text | Classification, clustering | 1989 | [305] | P. Collard
El Nino Dataset | Oceanographic and surface meteorological readings taken from a series of buoys positioned throughout the equatorial Pacific. | 12 weather attributes are measured at each buoy. | 178080 | Text | Regression | 1999 | [306] | Pacific Marine Environmental Laboratory
Greenhouse Gas Observing Network Dataset | Time-series of greenhouse gas concentrations at 2921 grid cells in California created using simulations of the weather. | None. | 2921 | Text | Regression | 2015 | [307] | D. Lucas
Atmospheric CO2 from Continuous Air Samples at Mauna Loa Observatory | Continuous air samples in Hawaii, USA. 44 years of records. | None. | 44 years | Text | Regression | 2001 | [308] | Mauna Loa Observatory
Ionosphere Dataset | Radar data from the ionosphere. Task is to classify into good and bad radar returns. | Many radar features given. | 351 | Text | Classification | 1989 | [262][309] | Johns Hopkins University
Ozone Level Detection Dataset | Two ground ozone level datasets. | Many features given, including weather conditions at time of measurement. | 2536 | Text | Classification | 2008 | [310][311] | K. Zhang et al.


Dataset Name | Brief description | Preprocessing | Instances | Format | Default Task | Created (updated) | Reference | Creator
Adult Dataset | Census data from 1994 containing demographic features of adults and their income. | Cleaned and anonymized. | 48,842 | Comma separated values | Classification | 1996 | [312] | United States Census Bureau
Census-Income (KDD) | Weighted census data from the 1994 and 1995 Current Population Surveys. | Split into training and test sets. | 299,285 | Comma separated values | Classification | 2000 | [313][314] | United States Census Bureau
IPUMS Census Database | Census data from the Los Angeles and Long Beach areas. | None. | 256,932 | Text | Classification, regression | 1999 | [315] | IPUMS
US Census Data 1990 | Partial data from 1990 US census. | Results randomized and useful attributes selected. | 2,458,285 | Text | Classification, regression | 1990 | [316] | United States Census Bureau


Dataset Name | Brief description | Preprocessing | Instances | Format | Default Task | Created (updated) | Reference | Creator
Bike Sharing Dataset | Hourly and daily count of rental bikes in a large city. | Many features, including weather, length of trip, etc., are given. | 17,389 | Text | Regression | 2013 | [317][318] | H. Fanaee-T
New York City Taxi Trip Data | Trip data for yellow and green taxis in New York City. | Gives pick-up and drop-off locations, fares, and other details of trips. | 6 years | Text | Classification, clustering | 2015 | [319] | New York City Taxi and Limousine Commission
Taxi Service Trajectory ECML PKDD | Trajectories of all taxis in a large city. | Many features given, including start and stop points. | 1,710,671 | Text | Clustering, causal-discovery | 2015 | [320][321] | M. Ferreira et al.


Dataset Name | Brief description | Preprocessing | Instances | Format | Default Task | Created (updated) | Reference | Creator
Webpages from Common Crawl 2012 | Large collection of webpages and how they are connected via hyperlinks. | None. | 3.5B | Text | Clustering, classification | 2013 | [322] | V. Granville
Internet Advertisements Dataset | Dataset for predicting if a given image is an advertisement or not. | Features encode geometry of ads and phrases occurring in the URL. | 3279 | Text | Classification | 1998 | [323][324] | N. Kushmerick
Internet Usage Dataset | General demographics of internet users. | None. | 10,104 | Text | Classification, clustering | 1999 | [325] | D. Cook
URL Dataset | 120 days of URL data from a large conference. | Many features of each URL are given. | 2,396,130 | Text | Classification | 2009 | [326][327] | J. Ma
Phishing Websites Dataset | Dataset of phishing websites. | Many features of each site are given. | 2456 | Text | Classification | 2015 | [328] | R. Mustafa et al.
Online Retail Dataset | Online transactions for a UK online retailer. | Details of each transaction given. | 541,909 | Text | Classification, clustering | 2015 | [329] | D. Chen
Freebase Simple Topic Dump | Freebase is an online effort to structure all human knowledge. | Topics from Freebase have been extracted. | large | Text | Classification, clustering | 2011 | [330][331] | Freebase
Farm Ads Dataset | The text of farm ads from websites. Binary approval or disapproval by content owners is given. | SVMlight sparse vectors of text words in ads calculated. | 4143 | Text | Classification | 2011 | [332][333] | C. Mesterharm et al.


Dataset Name | Brief description | Preprocessing | Instances | Format | Default Task | Created (updated) | Reference | Creator
Poker Hand Dataset | 5 card hands from a standard 52 card deck. | Attributes of each hand are given, including the Poker hands formed by the cards it contains. | 1,025,010 | Text | Regression, classification | 2007 | [334] | R. Cattral
Connect-4 Dataset | Contains all legal 8-ply positions in the game of Connect-4 in which neither player has won yet, and in which the next move is not forced. | None. | 67,557 | Text | Classification | 1995 | [335] | J. Tromp
Chess (King-Rook vs. King) Dataset | Endgame Database for White King and Rook against Black King. | None. | 28,056 | Text | Classification | 1994 | [336][337] | M. Bain et al.
Chess (King-Rook vs. King-Pawn) Dataset | King+Rook versus King+Pawn on a7. | None. | 3196 | Text | Classification | 1989 | [338] | R. Holte
Tic-Tac-Toe Endgame Dataset | Binary classification for win conditions in tic-tac-toe. | None. | 958 | Text | Classification | 1991 | [339] | D. Aha

Other Multivariate

Dataset Name | Brief description | Preprocessing | Instances | Format | Default Task | Created (updated) | Reference | Creator
Housing Data Set | Median home values of Boston with associated home and neighborhood attributes. | None. | 506 | Text | Regression | 1993 | [340] | D. Harrison et al.
The Getty Vocabularies | Structured terminology for art and other material culture, archival materials, visual surrogates, and bibliographic materials. | None. | large | Text | Classification | 2015 | [341] | Getty Center
Yahoo! Front Page Today Module User Click Log | User click log for news articles displayed in the Featured Tab of the Today Module on Yahoo! Front Page. | Conjoint analysis with a bilinear model. | 45,811,883 user visits | Text | Regression, clustering | 2009 | [342][343] | Chu et al.
British Oceanographic Data Centre | Biological, chemical, physical and geophysical data for oceans. 22K variables tracked. | Various. | 22K variables, many instances | Text | Regression, clustering | 2015 | [344] | British Oceanographic Data Centre
Congressional Voting Records Dataset | Voting data for all USA representatives on 16 issues. | Beyond the raw voting data, various other features are provided. | 435 | Text | Classification | 1987 | [345] | J. Schlimmer
Entree Chicago Recommendation Dataset | Record of user interactions with Entree Chicago recommendation system. | Each user's usage of the app is recorded in detail. | 50,672 | Text | Regression, recommendation | 2000 | [346] | R. Burke
Insurance Company Benchmark (COIL 2000) | Information on customers of an insurance company. | Many features of each customer and the services they use. | 9,000 | Text | Regression, classification | 2000 | [347][348] | P. van der Putten
Nursery Dataset | Data from applicants to nursery schools. | Data about applicant’s family and various other factors included. | 12,960 | Text | Classification | 1997 | [349][350] | V. Rajkovic et al.
University Dataset | Data describing attributes of a large number of universities. | None. | 285 | Text | Clustering, classification | 1988 | [351] | S. Sounders et al.
Blood Transfusion Service Center Dataset | Data from blood transfusion service center. Gives data on donors' return rate, frequency, etc. | None. | 748 | Text | Classification | 2008 | [352][353] | I. Yeh
Record Linkage Comparison Patterns Dataset | Large dataset of records. Task is to link relevant records together. | Blocking procedure applied to select only certain record pairs. | 5,749,132 | Text | Classification | 2011 | [354][355] | University of Mainz
Nomao Dataset | Nomao collects data about places from many different sources. Task is to detect items that describe the same place. | Duplicates labeled. | 34,465 | Text | Classification | 2012 | [356][357] | Nomao Labs
Movie Dataset | Data for 10,000 movies. | Several features for each movie are given. | 10,000 | Text | Clustering, classification | 1999 | [358] | G. Wiederhold
Open University Learning Analytics Dataset | Information about students and their interactions with a virtual learning environment. | None. | ~30,000 | Text | Classification, clustering, regression | 2015 | [359][360] | J. Kuzilek et al.


  1. Wissner-Gross, A. “Datasets Over Algorithms”. Retrieved 8 January 2016.
  2. Weiss, Gary M., and Foster Provost. “Learning when training data are costly: the effect of class distribution on tree induction.” Journal of Artificial Intelligence Research (2003): 315-354.
  3. Turney, Peter. “Types of cost in inductive concept learning.” (2000).
  4. Abney, Steven. Semisupervised learning for computational linguistics. CRC Press, 2007.
  5. Žliobaitė, Indrė, et al. “Active learning with evolving streaming data.” Machine Learning and Knowledge Discovery in Databases. Springer Berlin Heidelberg, 2011. 597-612.
  6. Phillips, P. Jonathon, et al. “The FERET database and evaluation procedure for face-recognition algorithms.” Image and vision computing 16.5 (1998): 295-306.
  7. Wiskott, Laurenz, et al. “Face recognition by elastic bunch graph matching.” Pattern Analysis and Machine Intelligence, IEEE Transactions on 19.7 (1997): 775-779.
  8. Sim, Terence, Simon Baker, and Maan Bsat. “The CMU pose, illumination, and expression (PIE) database.” Automatic Face and Gesture Recognition, 2002. Proceedings. Fifth IEEE International Conference on. IEEE, 2002.
  9. Schroff, Florian, et al. “Pose, illumination and expression invariant pairwise face-similarity measure via doppelgänger list comparison.” Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.
  10. Grgic, Mislav, Kresimir Delac, and Sonja Grgic. “SCface–surveillance cameras face database.” Multimedia tools and applications 51.3 (2011): 863-879.
  11. Wallace, Roy, et al. “Inter-session variability modelling and joint factor analysis for face authentication.” Biometrics (IJCB), 2011 International Joint Conference on. IEEE, 2011.
  12. Schroff, Florian, Dmitry Kalenichenko, and James Philbin. “Facenet: A unified embedding for face recognition and clustering.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
  13. Wolf, Lior, Tal Hassner, and Itay Maoz. “Face recognition in unconstrained videos with matched background similarity.” Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011.
  14. de Almeida Freitas, Fernando, et al. “Grammatical Facial Expressions Recognition with Machine Learning.” FLAIRS Conference. 2014.
  15. Mitchell, Tom M. “Machine learning. WCB.” (1997).
  16. Xiaofeng He and Partha Niyogi. Locality Preserving Projections. NIPS. 2003.
  17. Georghiades, A. “Yale face database.” Center for computational Vision and Control at Yale University, http://cvc. yale. edu/projects/yalefaces/yalefa 2 (1997).
  18. Nguyen, Duy, et al. “Real-time face detection and lip feature extraction using field-programmable gate arrays.” Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on 36.4 (2006): 902-912.
  19. Kanade, Takeo, Jeffrey F. Cohn, and Yingli Tian. “Comprehensive database for facial expression analysis.” Automatic Face and Gesture Recognition, 2000. Proceedings. Fourth IEEE International Conference on. IEEE, 2000.
  20. Zeng, Zhihong, et al. “A survey of affect recognition methods: Audio, visual, and spontaneous expressions.” Pattern Analysis and Machine Intelligence, IEEE Transactions on 31.1 (2009): 39-58.
  21. Ng, Hong-Wei, and Stefan Winkler. “A data-driven approach to cleaning large face datasets.” Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2014.
  22. RoyChowdhury, Aruni; Lin, Tsung-Yu; Maji, Subhransu; Learned-Miller, Erik (2015). “One-to-many face recognition with bilinear CNNs”. arXiv:1506.01342 [cs.CV].
  23. Jesorsky, Oliver, Klaus J. Kirchberg, and Robert W. Frischholz. “Robust face detection using the hausdorff distance.” Audio-and video-based biometric person authentication. Springer Berlin Heidelberg, 2001.
  24. Huang, Gary B., et al. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Vol. 1. No. 2. Technical Report 07-49, University of Massachusetts, Amherst, 2007.
  25. Bhatt, Rajen B., et al. “Efficient skin region segmentation using low complexity fuzzy decision tree model.” India Conference (INDICON), 2009 Annual IEEE. IEEE, 2009.
  26. Lingala, Mounika, et al. “Fuzzy logic color detection: Blue areas in melanoma dermoscopy images.” Computerized Medical Imaging and Graphics 38.5 (2014): 403-410.
  27. Maes, Chris, et al. “Feature detection on 3D face surfaces for pose normalisation and recognition.” Biometrics: Theory Applications and Systems (BTAS), 2010 Fourth IEEE International Conference on. IEEE, 2010.
  28. Savran, Arman, et al. “Bosphorus database for 3D face analysis.” Biometrics and Identity Management. Springer Berlin Heidelberg, 2008. 47-56.
  29. Heseltine, Thomas, Nick Pears, and Jim Austin. “Three-dimensional face recognition: An eigensurface approach.” Image Processing, 2004. ICIP’04. 2004 International Conference on. Vol. 2. IEEE, 2004.
  30. Ge, Yun, et al. “3D Novel Face Sample Modeling for Face Recognition.” Journal of Multimedia 6.5 (2011): 467-475.
  31. Wang, Yueming, Jianzhuang Liu, and Xiaoou Tang. “Robust 3D face recognition by local shape difference boosting.” Pattern Analysis and Machine Intelligence, IEEE Transactions on 32.10 (2010): 1858–1870.
  32. Zhong, Cheng, Zhenan Sun, and Tieniu Tan. “Robust 3D face recognition using learned visual codebook.” Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on. IEEE, 2007.
  33. Soyel, Hamit, and Hasan Demirel. “Facial expression recognition using 3D facial feature distances.” Image Analysis and Recognition. Springer Berlin Heidelberg, 2007. 831-838.
  34. Bowyer, Kevin W., Kyong Chang, and Patrick Flynn. “A survey of approaches and challenges in 3D and multi-modal 3D+ 2D face recognition.” Computer vision and image understanding 101.1 (2006): 1-15.
  35. Tan, Xiaoyang, and Bill Triggs. “Enhanced local texture feature sets for face recognition under difficult lighting conditions.” Image Processing, IEEE Transactions on 19.6 (2010): 1635–1650.
  36. Mousavi, Mir Hashem, Karim Faez, and Amin Asghari. “Three dimensional face recognition using svm classifier.” Computer and Information Science, 2008. ICIS 08. Seventh IEEE/ACIS International Conference on. IEEE, 2008.
  37. Amberg, Brian, Reinhard Knothe, and Thomas Vetter. “Expression invariant 3D face recognition with a morphable model.” Automatic Face & Gesture Recognition, 2008. FG’08. 8th IEEE International Conference on. IEEE, 2008.
  38. İrfanoğlu, M. O., Berk Gökberk, and Lale Akarun. “3D shape-based face recognition using automatically registered facial surfaces.” Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on. Vol. 4. IEEE, 2004.
  39. Beumier, Charles, and Marc Acheroy. “Face verification from 3D and grey level clues.” Pattern recognition letters 22.12 (2001): 1321–1329.
  40. Karayev, S., et al. “A category-level 3-D object dataset: putting the Kinect to work.” Proceedings of the IEEE International Conference on Computer Vision Workshops. 2011.
  41. Tighe, Joseph, and Svetlana Lazebnik. “Superparsing: scalable nonparametric image parsing with superpixels.” Computer Vision–ECCV 2010. Springer Berlin Heidelberg, 2010. 352-365.
  42. Arbelaez, P.; Maire, M; Fowlkes, C; Malik, J (May 2011). “Contour Detection and Hierarchical Image Segmentation” (PDF). IEEE TPAM 33 (5): 898–916. Retrieved 27 February 2016.
  43. Lin, Tsung-Yi, et al. “Microsoft coco: Common objects in context.” Computer Vision–ECCV 2014. Springer International Publishing, 2014. 740-755.
  44. Russakovsky, Olga, et al. “Imagenet large scale visual recognition challenge.” International Journal of Computer Vision 115.3 (2015): 211-252.
  45. Xiao, Jianxiong, et al. “Sun database: Large-scale scene recognition from abbey to zoo.” Computer vision and pattern recognition (CVPR), 2010 IEEE conference on. IEEE, 2010.
  46. Donahue, Jeff; Jia, Yangqing; Vinyals, Oriol; Hoffman, Judy; Zhang, Ning; Tzeng, Eric; Darrell, Trevor (2013). “DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition”. arXiv:1310.1531 [cs.CV].
  47. Deng, Jia, et al. “Imagenet: A large-scale hierarchical image database.” Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 2009.
  48. Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “Imagenet classification with deep convolutional neural networks.” Advances in neural information processing systems. 2012.
  49. Vyas, Apoorv, et al. “Commercial Block Detection in Broadcast News Videos.” Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing. ACM, 2014.
  50. Hauptmann, Alexander G., and Michael J. Witbrock. “Story segmentation and detection of commercials in broadcast news video.” Research and Technology Advances in Digital Libraries, 1998. ADL 98. Proceedings. IEEE International Forum on. IEEE, 1998.
  51. Tung, Anthony KH, Xin Xu, and Beng Chin Ooi. “Curler: finding and visualizing nonlinear correlation clusters.” Proceedings of the 2005 ACM SIGMOD international conference on Management of data. ACM, 2005.
  52. Jarrett, Kevin, et al. “What is the best multi-stage architecture for object recognition?.” Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009.
  53. Lazebnik, Svetlana, Cordelia Schmid, and Jean Ponce. “Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories.” Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on. Vol. 2. IEEE, 2006.
  54. Griffin, G., A. Holub, and P. Perona. Caltech-256 object category dataset. California Inst. Technol., Tech. Rep. 7694, 2007 [Online]. Available: http://authors.library.caltech.edu/7694, 2007.
  55. Baeza-Yates, Ricardo, and Berthier Ribeiro-Neto. Modern information retrieval. Vol. 463. New York: ACM press, 1999.
  56. Fu, Xiping, et al. “NOKMeans: Non-Orthogonal K-means Hashing.” Computer Vision–ACCV 2014. Springer International Publishing, 2014. 162-177.
  57. Heitz, Geremy, et al. “Shape-based object localization for descriptive classification.” International journal of computer vision 84.1 (2009): 40-62.
  58. M. Cordts, M. Omran, S. Ramos, T. Scharwächter, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, “The Cityscapes Dataset.” In CVPR Workshop on The Future of Datasets in Vision, 2015.
  59. Botta, M., A. Giordana, and L. Saitta. “Learning fuzzy concept definitions.” Fuzzy Systems, 1993., Second IEEE International Conference on. IEEE, 1993.
  60. Frey, Peter W., and David J. Slate. “Letter recognition using Holland-style adaptive classifiers.” Machine learning 6.2 (1991): 161-182.
  61. Peltonen, Jaakko, Arto Klami, and Samuel Kaski. “Improved learning of Riemannian metrics for exploratory analysis.” Neural Networks 17.8 (2004): 1087–1100.
  62. Williams, Ben H., Marc Toussaint, and Amos J. Storkey. Extracting motion primitives from natural handwriting data. Springer Berlin Heidelberg, 2006.
  63. Meier, Franziska, et al. “Movement segmentation using a primitive library.” Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on. IEEE, 2011.
  64. T. E. de Campos, B. R. Babu and M. Varma. Character recognition in natural images. In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP), Lisbon, Portugal, February 2009
  65. Llorens, David, et al. “The UJIpenchars Database: a Pen-Based Database of Isolated Handwritten Characters.” LREC. 2008.
  66. Calderara, Simone, Andrea Prati, and Rita Cucchiara. “Mixtures of von mises distributions for people trajectory shape analysis.” Circuits and Systems for Video Technology, IEEE Transactions on 21.4 (2011): 457-471.
  67. Guyon, Isabelle, et al. “Result analysis of the nips 2003 feature selection challenge.” Advances in neural information processing systems. 2004.
  68. LeCun, Yann, et al. “Gradient-based learning applied to document recognition.” Proceedings of the IEEE 86.11 (1998): 2278–2324.
  69. Kussul, Ernst, and Tatiana Baidyk. “Improved method of handwritten digit recognition tested on MNIST database.” Image and Vision Computing 22.12 (2004): 971-981.
  70. Xu, Lei, Adam Krzyżak, and Ching Y. Suen. “Methods of combining multiple classifiers and their applications to handwriting recognition.” Systems, Man and Cybernetics, IEEE Transactions on 22.3 (1992): 418-435.
  71. Alimoglu, Fevzi, et al. “Combining multiple classifiers for pen-based handwritten digit recognition.” (1996).
  72. Tang, E. Ke, et al. “Linear dimensionality reduction using relevance weighted LDA.” Pattern recognition 38.4 (2005): 485-493.
  73. Hong, Yi, et al. “Learning a mixture of sparse distance metrics for classification and dimensionality reduction.” Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.
  74. Yuan, Jiangye, Shaun S. Gleason, and Anil M. Cheriyadat. “Systematic benchmarking of aerial image segmentation.” Geoscience and Remote Sensing Letters, IEEE 10.6 (2013): 1527–1531.
  75. Vatsavai, Ranga Raju. “Object based image classification: state of the art and computational challenges.” Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data. ACM, 2013.
  76. Butenuth, Matthias, et al. “Integrating pedestrian simulation, tracking and event detection for crowd analysis.” Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on. IEEE, 2011.
  77. Fradi, Hajer, and Jean-Luc Dugelay. “Low level crowd analysis using frame-wise normalized feature for people counting.” Information Forensics and Security (WIFS), 2012 IEEE International Workshop on. IEEE, 2012.
  78. Johnson, Brian Alan, Ryutaro Tateishi, and Nguyen Thanh Hoan. “A hybrid pansharpening approach and multiscale object-based image analysis for mapping diseased pine and oak trees.” International journal of remote sensing 34.20 (2013): 6969-6982.
  79. Mohd Pozi, Muhammad Syafiq, et al. “A new classification model for a class imbalanced data set using genetic programming and support vector machines: case study for wilt disease classification.” Remote Sensing Letters 6.7 (2015): 568-577.
  80. Johnson, Brian, Ryutaro Tateishi, and Zhixiao Xie. “Using geographically weighted variables for image classification.” Remote Sensing Letters 3.6 (2012): 491-499.
  81. Chatterjee, Sankhadeep, et al. “Forest Type Classification: A Hybrid NN-GA Model Based Approach.” Information Systems Design and Intelligent Applications. Springer India, 2016. 227-236.
  82. Diegert, Carl. “A combinatorial method for tracing objects using semantics of their shape.” Applied Imagery Pattern Recognition Workshop (AIPR), 2010 IEEE 39th. IEEE, 2010.
  83. Razakarivony, Sebastien, and Frédéric Jurie. “Small target detection combining foreground and background manifolds.” IAPR International Conference on Machine Vision Applications. 2013.
  84. Rohrbach, Marcus, et al. “A database for fine grained activity detection of cooking activities.” Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.
  85. Kuehne, Hilde, Ali Arslan, and Thomas Serre. “The language of actions: Recovering the syntax and semantics of goal-directed human activities.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014.
  86. Khosla, Aditya, et al. “Novel dataset for fine-grained image categorization: Stanford dogs.” Proc. CVPR Workshop on Fine-Grained Visual Categorization (FGVC). 2011.
  87. Parkhi, Omkar M., et al. “Cats and dogs.” Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.
  88. Razavian, Ali, et al. “CNN features off-the-shelf: an astounding baseline for recognition.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2014.
  89. Ortega, Michael, et al. “Supporting ranked boolean similarity queries in MARS.” Knowledge and Data Engineering, IEEE Transactions on 10.6 (1998): 905-925.
  90. He, Xuming, Richard S. Zemel, and Miguel Á. Carreira-Perpiñán. “Multiscale conditional random fields for image labeling.” Computer vision and pattern recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE computer society conference on. Vol. 2. IEEE, 2004.
  91. Deneke, Tewodros, et al. “Video transcoding time prediction for proactive load balancing.” Multimedia and Expo (ICME), 2014 IEEE International Conference on. IEEE, 2014.
  92. Ting-Hao (Kenneth) Huang, Francis Ferraro, Nasrin Mostafazadeh, Ishan Misra, Aishwarya Agrawal, Jacob Devlin, Ross Girshick, Xiaodong He, Pushmeet Kohli, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh, Lucy Vanderwende, Michel Galley, Margaret Mitchell (13 Apr 2016). “Visual Storytelling”. arXiv:1604.03968 [cs.CL].
  93. Nathan Silberman, Derek Hoiem, Pushmeet Kohli and Rob Fergus (2012). “Indoor Segmentation and Support Inference from RGBD Images”. ECCV.
  94. McAuley, Julian, et al. “Image-based recommendations on styles and substitutes.” Proceedings of the 38th international ACM SIGIR conference on Research and development in information retrieval. ACM, 2015
  95. Ganesan, Kavita, and Chengxiang Zhai. “Opinion-based entity ranking.” Information retrieval 15.2 (2012): 116-150.
  96. Lv, Yuanhua, Dimitrios Lymberopoulos, and Qiang Wu. “An exploration of ranking heuristics in mobile local search.” Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval. ACM, 2012.
  97. Harper, F. Maxwell, and Joseph A. Konstan. “The MovieLens Datasets: History and Context.” ACM Transactions on Interactive Intelligent Systems (TiiS) 5.4 (2015): 19.
  98. Koenigstein, Noam, Gideon Dror, and Yehuda Koren. “Yahoo! music recommendations: modeling music ratings with temporal dynamics and item taxonomy.” Proceedings of the fifth ACM conference on Recommender systems. ACM, 2011.
  99. McFee, Brian, et al. “The million song dataset challenge.” Proceedings of the 21st international conference companion on World Wide Web. ACM, 2012.
  100. Bohanec, Marko, and Vladislav Rajkovic. “Knowledge acquisition and explanation for multi-attribute decision making.” 8th Intl Workshop on Expert Systems and their Applications. 1988.
  101. Tan, Peter J., and David L. Dowe. “MML inference of decision graphs with multi-way joins.” Australian Joint Conference on Artificial Intelligence. 2002.
  102. “Quantifying comedy on YouTube: why the number of o’s in your LOL matter”. Google Research Blog. Retrieved 2016-02-26.
  103. Kim, Byung Joo. “A Classifier for Big Data.” Convergence and Hybrid Information Technology. Springer Berlin Heidelberg, 2012. 505-512.
  104. Pérezgonzález, Jose D., and Andrew Gilbey. “Predicting Skytrax airport rankings from customer reviews.” Journal of Airport Management 5.4 (2011): 335-339.
  105. Loh, Wei-Yin, and Yu-Shan Shih. “Split selection methods for classification trees.” Statistica sinica (1997): 815-840.
  106. Lim, Tjen-Sien, Wei-Yin Loh, and Yu-Shan Shih. “A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms.” Machine learning 40.3 (2000): 203-228.
  107. Dermouche, Mohamed, et al. “A Joint Model for Topic-Sentiment Evolution over Time.” Data Mining (ICDM), 2014 IEEE International Conference on. IEEE, 2014.
  108. Rose, Tony, Mark Stevenson, and Miles Whitehead. “The Reuters Corpus Volume 1-from Yesterday’s News to Tomorrow’s Language Resources.” LREC. Vol. 2. 2002.
  109. Amini, Massih, Nicolas Usunier, and Cyril Goutte. “Learning from multiple partially observed views-an application to multilingual text categorization.” Advances in neural information processing systems. 2009.
  110. Liu, Ming, et al. “VRCA: a clustering algorithm for massive amount of texts.” Proceedings of the 24th International Conference on Artificial Intelligence. AAAI Press, 2015.
  111. Al-Harbi, S., Almuhareb, A., Al-Thubaity, A., Khorsheed, M. S. and Al-Rajeh, A. (2008) Automatic Arabic Text Classification. In Proceedings of The 9th International Conference on the Statistical Analysis of Textual Data, Lyon, France.
  112. Klimt, Bryan, and Yiming Yang. “Introducing the Enron Corpus.” CEAS. 2004.
  113. Kossinets, Gueorgi, Jon Kleinberg, and Duncan Watts. “The structure of information pathways in a social communication network.” Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2008.
  114. Androutsopoulos, Ion; Koutsias, John; Chandrinos, Konstantinos V.; Paliouras, George; Spyropoulos, Constantine D. (2000). “An evaluation of Naive Bayesian anti-spam filtering”. Proceedings of the workshop on Machine Learning in the New Information Age, G. Potamias, V. Moustakis and M. van Someren (eds.), the European Conference on Machine Learning, Barcelona, Spain, pp 11 (2000): 9–17. arXiv:cs/0006013.
  115. Bratko, Andrej, et al. “Spam filtering using statistical data compression models.” The Journal of Machine Learning Research 7 (2006): 2673–2698.
  116. Almeida, Tiago A., José María G. Hidalgo, and Akebo Yamakami. “Contributions to the study of SMS spam filtering: new collection and results.” Proceedings of the 11th ACM symposium on Document engineering. ACM, 2011.
  117. Delany, Sarah Jane, Mark Buckley, and Derek Greene. “SMS spam filtering: methods and data.” Expert Systems with Applications 39.10 (2012): 9899-9908.
  118. Joachims, Thorsten. A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization. No. CMU-CS-96-118. Carnegie Mellon University, Pittsburgh, PA, Department of Computer Science, 1996.
  119. Dimitrakakis, Christos, and Samy Bengio. Online Policy Adaptation for Ensemble Algorithms. No. EPFL-REPORT-82788. IDIAP, 2002.
  120. Go, Alec, Richa Bhayani, and Lei Huang. “Twitter sentiment classification using distant supervision.” CS224N Project Report, Stanford 1 (2009): 12.
  121. Chikersal, Prerna, Soujanya Poria, and Erik Cambria. “SeNTU: sentiment analysis of tweets by combining a rule-based classifier with supervised learning.” Proceedings of the International Workshop on Semantic Evaluation, SemEval. 2015.
  122. Zafarani, Reza, and Huan Liu. “Social computing data repository at ASU.” School of Computing, Informatics and Decision Systems Engineering, Arizona State University (2009).
  123. Bisgin, Halil, Nitin Agarwal, and Xiaowei Xu. “Investigating homophily in online social networks.” Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on. Vol. 1. IEEE, 2010.
  124. McAuley, Julian J., and Jure Leskovec. “Learning to Discover Social Circles in Ego Networks.” NIPS. Vol. 2012. 2012.
  125. Šubelj, Lovro, Dalibor Fiala, and Marko Bajec. “Network-based statistical comparison of citation topology of bibliographic databases.” Scientific reports 4 (2014).
  126. Abdulla, N., et al. “Arabic sentiment analysis: Corpus-based and lexicon-based.” Proceedings of The IEEE conference on Applied Electrical Engineering and Computing Technologies (AEECT). 2013.
  127. Abooraig, Raddad, et al. “On the automatic categorization of arabic articles based on their political orientation.” Third International Conference on Informatics Engineering and Information Science (ICIEIS2014). 2014.
  128. Kawala, François, et al. “Prédictions d’activité dans les réseaux sociaux en ligne.” 4ième conférence sur les modèles et l’analyse des réseaux: Approches mathématiques et informatiques. 2013.
  129. Sabharwal, Ashish; Samulowitz, Horst; Tesauro, Gerald (2015). “Selecting Near-Optimal Learners via Incremental Data Allocation”. arXiv:1601.00024 [cs.LG].
  130. Galgani, Filippo, Paul Compton, and Achim Hoffmann. “Combining different summarization techniques for legal text.” Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data. Association for Computational Linguistics, 2012.
  131. Nagwani, N. K. “Summarizing large text collection using topic modeling and clustering based on MapReduce framework.” Journal of Big Data 2.1 (2015): 1-18.
  132. Schler, Jonathan, et al. “Effects of Age and Gender on Blogging.” AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs. Vol. 6. 2006.
  133. Anand, Pranav, et al. “Believe Me-We Can Do This! Annotating Persuasive Acts in Blog Text.” Computational Models of Natural Argument. 2011.
  134. Traud, Amanda L., Peter J. Mucha, and Mason A. Porter. “Social structure of Facebook networks.” Physica A: Statistical Mechanics and its Applications 391.16 (2012): 4165-4180.
  135. Richard, Emile; Savalle, Pierre-Andre; Vayatis, Nicolas (2012). “Estimation of Simultaneously Sparse and Low Rank Matrices”. arXiv:1206.6474 [cs.DS].
  136. Richardson, Matthew, Christopher JC Burges, and Erin Renshaw. “MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text.” EMNLP. Vol. 1. 2013.
  137. Weston, Jason; Bordes, Antoine; Chopra, Sumit; Rush, Alexander M.; Bart van Merriënboer; Joulin, Armand; Mikolov, Tomas (2015). “Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks”. arXiv:1502.05698 [cs.AI].
  138. Marcus, Mitchell P., Mary Ann Marcinkiewicz, and Beatrice Santorini. “Building a large annotated corpus of English: The Penn Treebank.” Computational linguistics 19.2 (1993): 313-330.
  139. Collins, Michael. “Head-driven statistical models for natural language parsing.” Computational linguistics 29.4 (2003): 589-637.
  140. Guyon, Isabelle, et al., eds. Feature extraction: foundations and applications. Vol. 207. Springer, 2008.
  141. Lin, Yuri, et al. “Syntactic annotations for the google books ngram corpus.” Proceedings of the ACL 2012 system demonstrations. Association for Computational Linguistics, 2012.
  142. Krishnamoorthy, Niveda, et al. “Generating Natural-Language Video Descriptions Using Text-Mined Knowledge.” AAAI. Vol. 1. 2013.
  143. Luyckx, Kim, and Walter Daelemans. “Personae: a Corpus for Author and Personality Prediction from Text.” LREC. 2008.
  144. Solorio, Thamar, Ragib Hasan, and Mainul Mizan. “A case study of sockpuppet detection in wikipedia.” Workshop on Language Analysis in Social Media (LASM) at NAACL HLT. 2013.
  145. Ciarelli, Patrick Marques, and Elias Oliveira. “Agglomeration and elimination of terms for dimensionality reduction.” Intelligent Systems Design and Applications, 2009. ISDA’09. Ninth International Conference on. IEEE, 2009.
  146. Zhou, Mingyuan, Oscar Hernan Madrid Padilla, and James G. Scott. “Priors for random count matrices derived from a family of negative binomial processes.” Journal of the American Statistical Association just-accepted (2015): 00-00.
  147. Kotzias, Dimitrios, et al. “From group to individual labels using deep features.” Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2015.
  148. Ning, Yue; Muthiah, Sathappan; Rangwala, Huzefa; Ramakrishnan, Naren (2016). “Modeling Precursors for Event Forecasting via Nested Multi-Instance Learning”. arXiv:1602.08033 [cs.SI].
  149. Buza, Krisztian. “Feedback prediction for blogs.” Data analysis, machine learning and knowledge discovery. Springer International Publishing, 2014. 145-152.
  150. Soysal, Ömer M. “Association rule mining with mostly associated sequential patterns.” Expert Systems with Applications 42.5 (2015): 2582–2592.
  151. Sakar, Betul Erdogdu, et al. “Collection and analysis of a Parkinson speech dataset with multiple types of sound recordings.” Biomedical and Health Informatics, IEEE Journal of 17.4 (2013): 828-834.
  152. Zhao, Shunan, et al. “Automatic detection of expressed emotion in Parkinson’s disease.” Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014.
  153. Used in: Hammami, Nacereddine, and Mouldi Bedda. “Improved tree model for arabic speech recognition.” Computer Science and Information Technology (ICCSIT), 2010 3rd IEEE International Conference on. Vol. 5. IEEE, 2010.
  154. Maaten, Laurens. “Learning discriminative fisher kernels.” Proceedings of the 28th International Conference on Machine Learning (ICML-11). 2011.
  155. Cole, Ronald, and Mark Fanty. “Spoken letter recognition.” Proc. Third DARPA Speech and Natural Language Workshop. 1990.
  156. Chapelle, Olivier, Vikas Sindhwani, and Sathiya S. Keerthi. “Optimization techniques for semi-supervised support vector machines.” The Journal of Machine Learning Research 9 (2008): 203-233.
  157. Kudo, Mineichi, Jun Toyama, and Masaru Shimbo. “Multidimensional curve classification using passing-through regions.” Pattern Recognition Letters 20.11 (1999): 1103–1111.
  158. Jaeger, Herbert, et al. “Optimization and applications of echo state networks with leaky-integrator neurons.” Neural Networks 20.3 (2007): 335-352.
  159. Tsanas, Athanasios, et al. “Accurate telemonitoring of Parkinson’s disease progression by noninvasive speech tests.” Biomedical Engineering, IEEE Transactions on 57.4 (2010): 884-893.
  160. Clifford, Gari D., and David Clifton. “Wireless technology in disease management and medicine.” Annual review of medicine 63 (2012): 479-492.
  161. Zue, Victor, Stephanie Seneff, and James Glass. “Speech database development at MIT: TIMIT and beyond.” Speech Communication 9.4 (1990): 351-356.
  162. Kapadia, Sadik, Valtcho Valtchev, and S. J. Young. “MMI training for continuous phoneme recognition on the TIMIT database.” Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on. Vol. 2. IEEE, 1993.
  163. Zhou, Fang, Q. Claire, and Ross D. King. “Predicting the geographical origin of music.” Data Mining (ICDM), 2014 IEEE International Conference on. IEEE, 2014.
  164. Saccenti, Edoardo, and José Camacho. “On the use of the observation‐wise k‐fold operation in PCA cross‐validation.” Journal of Chemometrics 29.8 (2015): 467-478.
  165. Bertin-Mahieux, Thierry, et al. “The million song dataset.” ISMIR 2011: Proceedings of the 12th International Society for Music Information Retrieval Conference, October 24–28, 2011, Miami, Florida. University of Miami, 2011.
  166. Henaff, Mikael, et al. “Unsupervised learning of sparse features for scalable audio classification.” ISMIR. Vol. 11. 2011.
  167. Esposito, Roberto, and Daniele P. Radicioni. “Carpediem: Optimizing the viterbi algorithm and applications to supervised sequential learning.” The Journal of Machine Learning Research 10 (2009): 1851–1880.
  168. Sourati, Jamshid, et al. “Classification Active Learning Based on Mutual Information.” Entropy 18.2 (2016): 51.
  169. Salamon, Justin, Christopher Jacoby, and Juan Pablo Bello. “A dataset and taxonomy for urban sound research.” Proceedings of the ACM International Conference on Multimedia. ACM, 2014.
  170. Lagrange, Mathieu; Lafay, Grégoire; Rossignol, Mathias; Benetos, Emmanouil; Roebel, Axel (2015). “An evaluation framework for event detection using a morphological model of acoustic scenes”. arXiv:1502.00141 [stat.ML].
  171. The CAIDA UCSD Dataset on the Witty Worm – March 19–24, 2004,
  172. Chen, Zesheng, and Chuanyi Ji. “Optimal worm-scanning method using vulnerable-host distributions.” International Journal of Security and Networks 2.1-2 (2007): 71-80.
  173. Kachuee, Mohamad, et al. “Cuff-less high-accuracy calibration-free blood pressure estimation using pulse transit time.” Circuits and Systems (ISCAS), 2015 IEEE International Symposium on. IEEE, 2015.
  174. PhysioBank, PhysioToolkit. “PhysioNet: components of a new research resource for complex physiologic signals.” Circulation. v101 i23. e215-e220.
  175. Vergara, Alexander, et al. “Chemical gas sensor drift compensation using classifier ensembles.” Sensors and Actuators B: Chemical 166 (2012): 320-329.
  176. Korotcenkov, G., and B. K. Cho. “Engineering approaches to improvement of conductometric gas sensor parameters. Part 2: Decrease of dissipated (consumable) power and improvement stability and reliability.” Sensors and Actuators B: Chemical 198 (2014): 316-341.
  177. Quinlan, John R. “Learning with continuous classes.” 5th Australian joint conference on artificial intelligence. Vol. 92. 1992.
  178. Merz, Christopher J., and Michael J. Pazzani. “A principal components approach to combining regression estimates.” Machine learning 36.1-2 (1999): 9-32.
  179. Torres-Sospedra, Joaquin, et al. “UJIIndoorLoc-Mag: A new database for magnetic field-based localization problems.” Indoor Positioning and Indoor Navigation (IPIN), 2015 International Conference on. IEEE, 2015.
  180. Berkvens, Rafael, Maarten Weyn, and Herbert Peremans. “Mean Mutual Information of Probabilistic Wi-Fi Localization.” Indoor Positioning and Indoor Navigation (IPIN), 2015 International Conference on. Banff, Canada: IPIN. 2015.
  181. Paschke, Fabian, et al. “Sensorlose Zustandsüberwachung an Synchronmotoren.” Proceedings. 23. Workshop Computational Intelligence, Dortmund, 5.-6. Dezember 2013. KIT Scientific Publishing, 2013.
  182. Lessmeier, Christian, et al. “Data Acquisition and Signal Analysis from Measured Motor Currents for Defect Detection in Electromechanical Drive Systems.”
  183. Ugulino, Wallace, et al. “Wearable computing: Accelerometers’ data classification of body postures and movements.” Advances in Artificial Intelligence-SBIA 2012. Springer Berlin Heidelberg, 2012. 52-61.
  184. Schneider, Jan, et al. “Augmenting the senses: a review on sensor-based learning support.” Sensors 15.2 (2015): 4097-4133.
  185. Madeo, Renata CB, Clodoaldo AM Lima, and Sarajane M. Peres. “Gesture unit segmentation using support vector machines: segmenting gestures from rest positions.” Proceedings of the 28th Annual ACM Symposium on Applied Computing. ACM, 2013.
  186. Lun, Roanna, and Wenbing Zhao. “A survey of applications and human motion recognition with Microsoft Kinect.” International Journal of Pattern Recognition and Artificial Intelligence 29.05 (2015): 1555008.
  187. Theodoridis, Theodoros, and Huosheng Hu. “Action classification of 3d human models using dynamic ANNs for mobile robot surveillance.” Robotics and Biomimetics, 2007. ROBIO 2007. IEEE International Conference on. IEEE, 2007.
  188. Etemad, Seyed Ali, and Ali Arya. “3D human action recognition and style transformation using resilient backpropagation neural networks.” Intelligent Computing and Intelligent Systems, 2009. ICIS 2009. IEEE International Conference on. Vol. 4. IEEE, 2009.
  189. Altun, Kerem, Billur Barshan, and Orkun Tunçel. “Comparative study on classifying human activities with miniature inertial and magnetic sensors.” Pattern Recognition 43.10 (2010): 3605-3620.
  190. Nathan, Ran, et al. “Using tri-axial acceleration data to identify behavioral modes of free-ranging animals: general concepts and tools illustrated for griffon vultures.” The Journal of experimental biology 215.6 (2012): 986-996.
  191. Anguita, Davide, et al. “Human activity recognition on smartphones using a multiclass hardware-friendly support vector machine.” Ambient assisted living and home care. Springer Berlin Heidelberg, 2012. 216-223.
  192. Su, Xing, Hanghang Tong, and Ping Ji. “Activity recognition with smartphone sensors.” Tsinghua Science and Technology 19.3 (2014): 235-249.
  193. Kadous, Mohammed Waleed. Temporal classification: Extending the classification paradigm to multivariate time series. Diss. The University of New South Wales, 2002.
  194. Graves, Alex, et al. “Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks.” Proceedings of the 23rd international conference on Machine learning. ACM, 2006.
  195. Velloso, Eduardo, et al. “Qualitative activity recognition of weight lifting exercises.” Proceedings of the 4th Augmented Human International Conference. ACM, 2013.
  196. Mortazavi, Bobak Jack, et al. “Determining the single best axis for exercise repetition recognition and counting on smartwatches.” Wearable and Implantable Body Sensor Networks (BSN), 2014 11th International Conference on. IEEE, 2014.
  197. Sapsanis, Christos, et al. “Improving EMG based Classification of basic hand movements using EMD.” Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE. IEEE, 2013.
  198. Andrianesis, Konstantinos, and Anthony Tzes. “Development and control of a multifunctional prosthetic hand with shape memory alloy actuators.” Journal of Intelligent & Robotic Systems 78.2 (2015): 257-289.
  199. Banos, Oresti, et al. “Dealing with the effects of sensor displacement in wearable activity recognition.” Sensors 14.6 (2014): 9995-10023.
  200. Stisen, Allan, et al. “Smart Devices are Different: Assessing and Mitigating Mobile Sensing Heterogeneities for Activity Recognition.” Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems. ACM, 2015.
  201. Bhattacharya, Sourav, and Nicholas D. Lane. “From Smart to Deep: Robust Activity Recognition on Smartwatches using Deep Learning.”
  202. Bacciu, Davide, et al. “An experimental characterization of reservoir computing in ambient assisted living applications.” Neural Computing and Applications 24.6 (2014): 1451–1464.
  203. Palumbo, Filippo, et al. “Multisensor data fusion for activity recognition based on reservoir computing.” Evaluating AAL systems through competitive benchmarking. Springer Berlin Heidelberg, 2013. 24-35.
  204. Reiss, Attila, and Didier Stricker. “Introducing a new benchmarked dataset for activity monitoring.” Wearable Computers (ISWC), 2012 16th International Symposium on. IEEE, 2012.
  205. Roggen, Daniel, et al. “OPPORTUNITY: Towards opportunistic activity and context recognition systems.” World of Wireless, Mobile and Multimedia Networks & Workshops, 2009. WoWMoM 2009. IEEE International Symposium on a. IEEE, 2009.
  206. Kurz, Marc, et al. “Dynamic quantification of activity recognition capabilities in opportunistic systems.” Vehicular Technology Conference (VTC Spring), 2011 IEEE 73rd. IEEE, 2011.
  207. Aeberhard, S., D. Coomans, and O. De Vel. “Comparison of classifiers in high dimensional settings.” Dept. Math. Statist., James Cook Univ., North Queensland, Australia, Tech. Rep 92-02 (1992).
  208. Basu, Sugato. “Semi-supervised clustering with limited background knowledge.” AAAI. 2004.
  209. Tüfekci, Pınar. “Prediction of full load electrical power output of a base load operated combined cycle power plant using machine learning methods.” International Journal of Electrical Power & Energy Systems 60 (2014): 126-140.
  210. Kaya, Heysem, Pınar Tüfekci, and Fikret S. Gürgen. “Local and global learning methods for predicting power of a combined gas & steam turbine.” International conference on emerging trends in computer and electronics engineering (ICETCEE’2012), Dubai. 2012.
  211. Baldi, Pierre, Peter Sadowski, and Daniel Whiteson. “Searching for exotic particles in high-energy physics with deep learning.” Nature communications 5 (2014).
  212. Baldi, Pierre, Peter Sadowski, and Daniel Whiteson. “Enhanced Higgs Boson to τ+ τ− Search with Deep Learning.” Physical review letters 114.11 (2015): 111801.
  213. “The Higgs Machine Learning Challenge”.
  214. Baldi, Pierre, Kyle Cranmer, Taylor Faucett, Peter Sadowski, and Daniel Whiteson. “Parameterized Machine Learning for High-Energy Physics.” In submission.
  215. Ortigosa, I., R. Lopez, and J. Garcia. “A neural networks approach to residuary resistance of sailing yachts prediction.” Proceedings of the international conference on marine engineering MARINE. Vol. 2007. 2007.
  216. Gerritsma, J., R. Onnink, and A. Versluis. Geometry, resistance and stability of the delft systematic yacht hull series. Delft University of Technology, 1981.
  217. Liu, Huan, and Hiroshi Motoda. Feature extraction, construction and selection: A data mining perspective. Springer Science & Business Media, 1998.
  218. Reich, Yoram. Converging to Ideal Design Knowledge by Learning. [Carnegie Mellon University], Engineering Design Research Center, 1989.
  219. Todorovski, Ljupčo, and Sašo Džeroski. Experiments in meta-level learning with ILP. Springer Berlin Heidelberg, 1999.
  220. Wang, Yong. A new approach to fitting linear models in high dimensional spaces. Diss. The University of Waikato, 2000.
  221. Kibler, Dennis, David W. Aha, and Marc K. Albert. “Instance‐based prediction of real‐valued attributes.” Computational Intelligence 5.2 (1989): 51-57.
  222. Palmer, Christopher R., and Christos Faloutsos. “Electricity based external similarity of categorical attributes.” Advances in Knowledge Discovery and Data Mining. Springer Berlin Heidelberg, 2003. 486-500.
  223. Tsanas, Athanasios, and Angeliki Xifara. “Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools.” Energy and Buildings 49 (2012): 560-567.
  224. De Wilde, Pieter. “The gap between predicted and measured energy performance of buildings: A framework for investigation.” Automation in Construction 41 (2014): 40-49.
  225. Brooks, Thomas F., D. Stuart Pope, and Michael A. Marcolini. Airfoil self-noise and prediction. Vol. 1218. National Aeronautics and Space Administration, Office of Management, Scientific and Technical Information Division, 1989.
  226. Draper, David. “Assessment and propagation of model uncertainty.” Journal of the Royal Statistical Society. Series B (Methodological) (1995): 45-97.
  227. Lavine, Michael. “Problems in extrapolation illustrated with space shuttle O-ring data.” Journal of the American Statistical Association 86.416 (1991): 919-921.
  228. Wang, Jun, Bei Yu, and Les Gasser. “Concept tree based clustering visualization with shaded similarity matrices.” Data Mining, 2002. ICDM 2003. Proceedings. 2002 IEEE International Conference on. IEEE, 2002.
  229. Pettengill, Gordon H., et al. “Magellan: Radar performance and data products.” Science 252.5003 (1991): 260-265.
  230. Aharonian, F., et al. “Energy spectrum of cosmic-ray electrons at TeV energies.” Physical Review Letters 101.26 (2008): 261104.
  231. Bock, R. K., et al. “Methods for multidimensional event classification: a case study using images from a Cherenkov gamma-ray telescope.” Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 516.2 (2004): 511-528.
  232. Li, Jinyan, et al. “Deeps: A new instance-based lazy discovery and classification system.” Machine Learning 54.2 (2004): 99-124.
  233. Siebert, Lee, and Tom Simkin. “Volcanoes of the world: an illustrated catalog of Holocene volcanoes and their eruptions.” (2014).
  234. Sikora, Marek, and Łukasz Wróbel. “Application of rule induction algorithms for analysis of data collected by seismic hazard monitoring systems in coal mines.” Archives of Mining Sciences 55.1 (2010): 91-114.
  235. Sikora, Marek, and Beata Sikora. “Rough natural hazards monitoring.” Rough Sets: Selected Methods and Applications in Management and Engineering. Springer London, 2012. 163-179.
  236. Yeh, I-C. “Modeling of strength of high-performance concrete using artificial neural networks.” Cement and Concrete research 28.12 (1998): 1797–1808.
  237. Zarandi, MH Fazel, et al. “Fuzzy polynomial neural networks for approximation of the compressive strength of concrete.” Applied Soft Computing 8.1 (2008): 488-498.
  238. Yeh, I. “Modeling slump of concrete with fly ash and superplasticizer.” Computers and Concrete 5.6 (2008): 559-572.
  239. Gencel, Osman, et al. “Comparison of artificial neural networks and general linear model approaches for the analysis of abrasive wear of concrete.” Construction and building materials 25.8 (2011): 3486-3494.
  240. Dietterich, Thomas G., et al. “A comparison of dynamic reposing and tangent distance for drug activity prediction.” Advances in Neural Information Processing Systems (1994): 216-216.
  241. Buscema, Massimo, William J. Tastle, and Stefano Terzi. “Meta net: A new meta-classifier family.” Data Mining Applications Using Artificial Adaptive Systems. Springer New York, 2013. 141-182.
  242. Used in: Ingber, Lester. “Statistical mechanics of neocortical interactions: Canonical momenta indicators of electroencephalography.” Physical Review E 55.4 (1997): 4578.
  243. Ingber, Lester. “Statistical mechanics of neocortical interactions: Canonical momenta indicators of electroencephalography.” Physical Review E 55.4 (1997): 4578.
  244. Hoffmann, Ulrich, et al. “An efficient P300-based brain–computer interface for disabled subjects.” Journal of Neuroscience methods 167.1 (2008): 115-125.
  245. Donchin, Emanuel, Kevin M. Spencer, and Ranjith Wijesinghe. “The mental prosthesis: assessing the speed of a P300-based brain-computer interface.” Rehabilitation Engineering, IEEE Transactions on 8.2 (2000): 174-179.
  246. Detrano, Robert, et al. “International application of a new probability algorithm for the diagnosis of coronary artery disease.” The American journal of cardiology 64.5 (1989): 304-310.
  247. Bradley, Andrew P. “The use of the area under the ROC curve in the evaluation of machine learning algorithms.” Pattern recognition 30.7 (1997): 1145–1159.
  248. Street, W. Nick, William H. Wolberg, and Olvi L. Mangasarian. “Nuclear feature extraction for breast tumor diagnosis.” IS&T/SPIE’s Symposium on Electronic Imaging: Science and Technology. International Society for Optics and Photonics, 1993.
  249. Demir, Cigdem, and Bülent Yener. “Automated cancer diagnosis based on histopathological images: a systematic survey.” Rensselaer Polytechnic Institute, Tech. Rep (2005).
  250. Substance Abuse and Mental Health Services Administration. “Results from the 2010 National Survey on Drug Use and Health: Summary of National Findings.” NSDUH Series H-41, HHS Publication No. (SMA) 11-4658. Rockville, MD: Substance Abuse and Mental Health Services Administration, 2011.
  251. Hong, Zi-Quan, and Jing-Yu Yang. “Optimal discriminant plane for a small number of samples and design method of classifier on the plane.” Pattern recognition 24.4 (1991): 317-324.
  252. Li, Jinyan, and Limsoon Wong. “Using rules to analyse bio-medical data: a comparison between C4.5 and PCL.” Advances in Web-Age Information Management. Springer Berlin Heidelberg, 2003. 254-265.
  253. Güvenir, H. Altay, et al. “A supervised machine learning algorithm for arrhythmia analysis.” Computers in Cardiology 1997. IEEE, 1997.
  254. Lagus, Krista, et al. “Independent variable group analysis in learning compact representations for data.” Proceedings of the International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning (AKRR’05), T. Honkela, V. Könönen, M. Pöllä, and O. Simula, Eds., Espoo, Finland. 2005.
  255. Strack, Beata, et al. “Impact of HbA1c measurement on hospital readmission rates: analysis of 70,000 clinical database patient records.” BioMed research international 2014 (2014).
  256. Rubin, Daniel J. “Hospital readmission of patients with diabetes.” Current diabetes reports 15.4 (2015): 1-9.
  257. Antal, Bálint, and András Hajdu. “An ensemble-based system for automatic screening of diabetic retinopathy.” Knowledge-Based Systems 60 (2014): 20-27.
  258. Haloi, Mrinal (2015). “Improved Microaneurysm Detection using Deep Neural Networks”. arXiv:1505.04424 [cs.CV].
  259. Bagirov, A. M., et al. “Unsupervised and supervised data classification via nonsmooth and global optimization.” Top 11.1 (2003): 1-75.
  260. Fung, Glenn, et al. “A fast iterative algorithm for fisher discriminant using heterogeneous kernels.” Proceedings of the twenty-first international conference on Machine learning. ACM, 2004.
  261. Quinlan, John Ross, et al. “Inductive knowledge acquisition: a case study.” Proceedings of the Second Australian Conference on Applications of expert systems. Addison-Wesley Longman Publishing Co., Inc., 1987.
  262. Zhou, Zhi-Hua, and Yuan Jiang. “NeC4.5: neural ensemble based C4.5.” Knowledge and Data Engineering, IEEE Transactions on 16.6 (2004): 770-773.
  263. Er, Orhan, et al. “An approach based on probabilistic neural network for diagnosis of Mesothelioma’s disease.” Computers & Electrical Engineering 38.1 (2012): 75-81.
  264. Er, Orhan, A. Çetin Tanrikulu, and Abdurrahman Abakay. “Use of artificial intelligence techniques for diagnosis of malignant pleural mesothelioma.” Dicle Tıp Dergisi 42.1 (2015).
  265. Shannon, Paul, et al. “Cytoscape: a software environment for integrated models of biomolecular interaction networks.” Genome research 13.11 (2003): 2498–2504.
  266. Clark, David, Zoltan Schreter, and Anthony Adams. “A quantitative comparison of dystal and backpropagation.” Proceedings of 1996 Australian Conference on Neural Networks. 1996.
  267. Jiang, Yuan, and Zhi-Hua Zhou. “Editing training data for kNN classifiers with neural network ensemble.” Advances in Neural Networks–ISNN 2004. Springer Berlin Heidelberg, 2004. 356-361.
  268. Ontañón, Santiago, and Enric Plaza. “On similarity measures based on a refinement lattice.” Case-Based Reasoning Research and Development. Springer Berlin Heidelberg, 2009. 240-255.
  269. Higuera, Clara, Katheleen J. Gardiner, and Krzysztof J. Cios. “Self-organizing feature maps identify proteins critical to learning in a mouse model of down syndrome.” PloS one 10.6 (2015): e0129126.
  270. Ahmed, Md Mahiuddin, et al. “Protein dynamics associated with failed and rescued learning in the Ts65Dn mouse model of Down syndrome.” PloS one 10.3 (2015): e0119491.
  271. Cortez, Paulo, and Aníbal de Jesus Raimundo Morais. “A data mining approach to predict forest fires using meteorological data.” (2007).
  272. Farquad, M. A. H., V. Ravi, and S. Bapi Raju. “Support vector regression based hybrid rule extraction methods for forecasting.” Expert Systems with Applications 37.8 (2010): 5577-5589.
  273. Fisher, Ronald A. “The use of multiple measurements in taxonomic problems.” Annals of eugenics 7.2 (1936): 179-188.
  274. Ghahramani, Zoubin, and Michael I. Jordan. “Supervised learning from incomplete data via an EM approach.” Advances in neural information processing systems 6. 1994.
  275. Mallah, Charles, James Cope, and James Orwell. “Plant leaf classification using probabilistic integration of shape, texture and margin features.” Signal Processing, Pattern Recognition and Applications 5 (2013): 1.
  276. Yahiaoui, Itheri, Olfa Mzoughi, and Nozha Boujemaa. “Leaf shape descriptor for tree species identification.” Multimedia and Expo (ICME), 2012 IEEE International Conference on. IEEE, 2012.
  277. Langley, Pat. “Trading off simplicity and coverage in incremental concept learning.” Machine Learning Proceedings 1988 (2014): 73.
  278. Tan, Ming, and Larry Eshelman. “Using weighted networks to represent classification knowledge in noisy domains.” Proceedings of the Fifth International Conference on Machine Learning. 2014.
  279. Charytanowicz, Małgorzata, et al. “Complete gradient clustering algorithm for features analysis of x-ray images.” Information technologies in biomedicine. Springer Berlin Heidelberg, 2010. 15-24.
  280. Sanchez, Mauricio A., et al. “Fuzzy granular gravitational clustering algorithm for multivariate data.” Information Sciences 279 (2014): 498-511.
  281. Blackard, Jock A., and Denis J. Dean. “Comparative accuracies of artificial neural networks and discriminant analysis in predicting forest cover types from cartographic variables.” Computers and electronics in agriculture 24.3 (1999): 131-151.
  282. Fürnkranz, Johannes. “Round robin rule learning.” Proceedings of the 18th International Conference on Machine Learning (ICML-01): 146–153. 2001.
  283. Li, Song, Sarah M. Assmann, and Réka Albert. “Predicting essential components of signal transduction networks: a dynamic model of guard cell abscisic acid signaling.” PLoS Biol 4.10 (2006): e312.
  284. Munisami, Trishen, et al. “Plant Leaf Recognition Using Shape Features and Colour Histogram with K-nearest Neighbour Classifiers.” Procedia Computer Science 58 (2015): 740-747.
  285. Li, Bai. “Atomic potential matching: An evolutionary target recognition approach based on edge features.” Optik-International Journal for Light and Electron Optics 127.5 (2016): 3162-3168.
  286. Nakai, Kenta, and Minoru Kanehisa. “Expert system for predicting protein localization sites in gram‐negative bacteria.” Proteins: Structure, Function, and Bioinformatics 11.2 (1991): 95-110.
  287. Ling, Charles X., et al. “Decision trees with minimal costs.” Proceedings of the twenty-first international conference on Machine learning. ACM, 2004.
  288. Mahé, Pierre, et al. “Automatic identification of mixed bacterial species fingerprints in a MALDI-TOF mass-spectrum.” Bioinformatics (2014): btu022.
  289. Barbano, Duane, et al. “Rapid characterization of microalgae and microalgae mixtures using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS).” PloS one 10.8 (2015): e0135337.
  290. Horton, Paul, and Kenta Nakai. “A probabilistic classification system for predicting the cellular localization sites of proteins.” ISMB. Vol. 4. 1996.
  291. Allwein, Erin L., Robert E. Schapire, and Yoram Singer. “Reducing multiclass to binary: A unifying approach for margin classifiers.” The Journal of Machine Learning Research 1 (2001): 113-141.
  292. Brown, Michael Scott, Michael J. Pelosi, and Henry Dirska. “Dynamic-radius species-conserving genetic algorithm for the financial forecasting of Dow Jones index stocks.” Machine Learning and Data Mining in Pattern Recognition. Springer Berlin Heidelberg, 2013. 27-41.
  293. Shen, Kao-Yi, and Gwo-Hshiung Tzeng. “Fuzzy Inference-Enhanced VC-DRSA Model for Technical Analysis: Investment Decision Aid.” International Journal of Fuzzy Systems 17.3 (2015): 375-389.
  294. Quinlan, J. Ross. “Simplifying decision trees.” International journal of man-machine studies 27.3 (1987): 221-234.
  295. Hamers, Bart, Johan AK Suykens, and Bart De Moor. “Coupled transductive ensemble learning of kernel models.” Journal of Machine Learning Research 1 (2003): 1-48.
  296. Shmueli, Galit, Ralph P. Russo, and Wolfgang Jank. “The BARISTA: a model for bid arrivals in online auctions.” The Annals of Applied Statistics (2007): 412-441.
  297. Peng, Jie, and Hans-Georg Müller. “Distance-based clustering of sparsely observed stochastic processes, with applications to online auctions.” The Annals of Applied Statistics (2008): 1056–1077.
  298. Eggermont, Jeroen, Joost N. Kok, and Walter A. Kosters. “Genetic programming for data classification: Partitioning the search space.” Proceedings of the 2004 ACM symposium on Applied computing. ACM, 2004.
  299. Moro, Sérgio, Paulo Cortez, and Paulo Rita. “A data-driven approach to predict the success of bank telemarketing.” Decision Support Systems 62 (2014): 22-31.
  300. Payne, Richard D.; Mallick, Bani K. (2014). “Bayesian Big Data Classification: A Review with Complements”. arXiv:1411.5653 [stat.ME].
  301. Akbilgic, Oguz, Hamparsum Bozdogan, and M. Erdal Balaban. “A novel Hybrid RBF Neural Networks model as a forecaster.” Statistics and Computing 24.3 (2014): 365-375.
  302. Jabin, Suraiya. “Stock market prediction using feed-forward artificial neural network.” Int. J. Comput. Appl. (IJCA) 99.9 (2014).
  303. Yeh, I-Cheng, and Che-hui Lien. “The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients.” Expert Systems with Applications 36.2 (2009): 2473–2480.
  304. Lin, Shu Ling. “A new two-stage hybrid approach of credit risk in banking industry.” Expert Systems with Applications 36.4 (2009): 8333-8341.
  305. Pelckmans, Kristiaan, et al. “The differogram: Non-parametric noise variance estimation and its use for model selection.” Neurocomputing 69.1 (2005): 100-122.
  306. Bay, Stephen D., et al. “The UCI KDD archive of large data sets for data mining research and experimentation.” ACM SIGKDD Explorations Newsletter 2.2 (2000): 81-85.
  307. Lucas, D. D., et al. “Designing optimal greenhouse gas observing networks that consider performance and cost.” Geoscientific Instrumentation, Methods and Data Systems 4.1 (2015): 121.
  308. Pales, Jack C., and Charles D. Keeling. “The concentration of atmospheric carbon dioxide in Hawaii.” Journal of Geophysical Research 70.24 (1965): 6053-6076.
  309. Sigillito, Vincent G., et al. “Classification of radar returns from the ionosphere using neural networks.” Johns Hopkins APL Technical Digest 10.3 (1989): 262-266.
  310. Zhang, Kun, and Wei Fan. “Forecasting skewed biased stochastic ozone days: analyses, solutions and beyond.” Knowledge and Information Systems 14.3 (2008): 299-326.
  311. Reich, Brian J., Montserrat Fuentes, and David B. Dunson. “Bayesian spatial quantile regression.” Journal of the American Statistical Association (2012).
  312. Kohavi, Ron. “Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid.” KDD. Vol. 96. 1996.
  313. Oza, Nikunj C., and Stuart Russell. “Experimental comparisons of online and batch versions of bagging and boosting.” Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2001.
  314. Bay, Stephen D. “Multivariate discretization for set mining.” Knowledge and Information Systems 3.4 (2001): 491-512.
  315. Ruggles, Steven. “Sample designs and sampling errors.” Historical Methods: A Journal of Quantitative and Interdisciplinary History 28.1 (1995): 40-46.
  316. Meek, Christopher, Bo Thiesson, and David Heckerman. “The Learning Curve Method Applied to Clustering.” AISTATS. 2001.
  317. Fanaee-T, Hadi, and Joao Gama. “Event labeling combining ensemble detectors and background knowledge.” Progress in Artificial Intelligence 2.2-3 (2014): 113-127.
  318. Giot, Romain, and Raphaël Cherrier. “Predicting bikeshare system usage up to one day ahead.” Computational intelligence in vehicles and transportation systems (CIVTS), 2014 IEEE symposium on. IEEE, 2014.
  319. Zhan, Xianyuan, et al. “Urban link travel time estimation using large-scale taxi data with partial information.” Transportation Research Part C: Emerging Technologies 33 (2013): 37-49.
  320. Moreira-Matias, Luis, et al. “Predicting taxi–passenger demand using streaming data.” Intelligent Transportation Systems, IEEE Transactions on 14.3 (2013): 1393–1402.
  321. Hwang, Ren-Hung, Yu-Ling Hsueh, and Yu-Ting Chen. “An effective taxi recommender system based on a spatio-temporal factor analysis model.” Information Sciences 314 (2015): 28-40.
  322. Meusel, Robert, et al. “The Graph Structure in the Web–Analyzed on Different Aggregation Levels.” The Journal of Web Science 1.1 (2015).
  323. Kushmerick, Nicholas. “Learning to remove internet advertisements.” Proceedings of the third annual conference on Autonomous Agents. ACM, 1999.
  324. Fradkin, Dmitriy, and David Madigan. “Experiments with random projections for machine learning.” Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2003.
  325. This data was used in the American Statistical Association Statistical Graphics and Computing Sections 1999 Data Exposition.
  326. Ma, Justin, et al. “Identifying suspicious URLs: an application of large-scale online learning.” Proceedings of the 26th annual international conference on machine learning. ACM, 2009.
  327. Levchenko, Kirill, et al. “Click trajectories: End-to-end analysis of the spam value chain.” Security and Privacy (SP), 2011 IEEE Symposium on. IEEE, 2011.
  328. Mohammad, Rami M., Fadi Thabtah, and Lee McCluskey. “An assessment of features related to phishing websites using an automated technique.” Internet Technology And Secured Transactions, 2012 International Conference for. IEEE, 2012.
  329. Singh, Ashishkumar, et al. “Clustering Experiments on Big Transaction Data for Market Segmentation.” Proceedings of the 2014 International Conference on Big Data Science and Computing. ACM, 2014.
  330. Bollacker, Kurt, et al. “Freebase: a collaboratively created graph database for structuring human knowledge.” Proceedings of the 2008 ACM SIGMOD international conference on Management of data. ACM, 2008.
  331. Mintz, Mike, et al. “Distant supervision for relation extraction without labeled data.” Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2-Volume 2. Association for Computational Linguistics, 2009.
  332. Mesterharm, Chris, and Michael J. Pazzani. “Active learning using on-line algorithms.” Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2011.
  333. Wang, Shusen, and Zhihua Zhang. “Improving CUR matrix decomposition and the Nyström approximation via adaptive sampling.” The Journal of Machine Learning Research 14.1 (2013): 2729–2769.
  334. Cattral, Robert, Franz Oppacher, and Dwight Deugo. “Evolutionary data mining with automatic rule generalization.” Recent Advances in Computers, Computing and Communications (2002): 296-300.
  335. Burton, Ariel N., and Paul HJ Kelly. “Performance prediction of paging workloads using lightweight tracing.” Future Generation Computer Systems 22.7 (2006): 784-793.
  336. Bain, Michael, and Stephen Muggleton. “Learning optimal chess strategies.” Machine intelligence 13. Oxford University Press, Inc., 1994.
  337. Quinlan, J. R. “Learning efficient classification procedures and their application to chess end games.” Machine Learning: An Artificial Intelligence Approach 1 (1983).
  338. Shapiro, Alen D. Structured induction in expert systems. Addison-Wesley Longman Publishing Co., Inc., 1987.
  339. Matheus, Christopher J., and Larry A. Rendell. “Constructive Induction On Decision Trees.” IJCAI. Vol. 89. 1989.
  340. Belsley, David A., Edwin Kuh, and Roy E. Welsch. Regression diagnostics: Identifying influential data and sources of collinearity. Vol. 571. John Wiley & Sons, 2005.
  341. Ruotsalo, Tuukka, Lora Aroyo, and Guus Schreiber. “Knowledge-based linguistic annotation of digital cultural heritage collections.” IEEE Intelligent Systems 2 (2009): 64-75.
  342. Li, Lihong, et al. “Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms.” Proceedings of the fourth ACM international conference on Web search and data mining. ACM, 2011.
  343. Yeung, Kam Fung, and Yanyan Yang. “A proactive personalized mobile news recommendation system.” Developments in E-systems Engineering (DESE), 2010. IEEE, 2010.
  344. Gass, Susan E., and J. Murray Roberts. “The occurrence of the cold-water coral Lophelia pertusa (Scleractinia) on oil and gas platforms in the North Sea: colony growth, recruitment and environmental controls on distribution.” Marine Pollution Bulletin 52.5 (2006): 549-559.
  345. Gionis, Aristides, Heikki Mannila, and Panayiotis Tsaparas. “Clustering aggregation.” ACM Transactions on Knowledge Discovery from Data (TKDD) 1.1 (2007): 4.
  346. Obradovic, Zoran, and Slobodan Vucetic. Challenges in Scientific Data Mining: Heterogeneous, Biased, and Large Samples. Technical Report, Center for Information Science and Technology Temple University, 2004.
  347. Van Der Putten, Peter, and Maarten van Someren. “CoIL challenge 2000: The insurance company case.” Published by Sentient Machine Research, Amsterdam. Also a Leiden Institute of Advanced Computer Science Technical Report 9 (2000): 1-43.
  348. Mao, K. Z. “RBF neural network center selection based on Fisher ratio class separability measure.” Neural Networks, IEEE Transactions on 13.5 (2002): 1211–1217.
  349. Olave, Manuel, Vladislav Rajkovic, and Marko Bohanec. “An application for admission in public school systems.” Expert Systems in Public Administration 1 (1989): 145-160.
  350. Lizotte, Daniel J., Omid Madani, and Russell Greiner. “Budgeted learning of naive-bayes classifiers.” Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., 2002.
  351. Lebowitz, Michael. “Concept learning in a rich input domain: Generalization-based memory.” Machine learning: An artificial intelligence approach 2 (1986): 193-214.
  352. Yeh, I-Cheng, King-Jang Yang, and Tao-Ming Ting. “Knowledge discovery on RFM model using Bernoulli sequence.” Expert Systems with Applications 36.3 (2009): 5866-5871.
  353. Lee, Wen-Chen, and Bor-Wen Cheng. “An intelligent system for improving performance of blood donation.” Journal of Quality Vol 18.2 (2011): 173.
  354. Schmidtmann, Irene, et al. “Evaluation des Krebsregisters NRW Schwerpunkt Record Linkage.” Abschlußbericht vom 11 (2009).
  355. Sariyar, Murat, Andreas Borg, and Klaus Pommerening. “Controlling false match rates in record linkage using extreme value theory.” Journal of biomedical informatics 44.4 (2011): 648-654.
  356. Candillier, Laurent, and Vincent Lemaire. “Design and Analysis of the Nomao challenge Active Learning in the Real-World.” Proceedings of the ALRA: Active Learning in Real-world Applications, Workshop ECML-PKDD. 2012.
  357. Marquez, Ivan Garrido. “A Domain Adaptation Method for Text Classification based on Self-adjusted Training Approach.” (2013).
  358. Nagesh, Harsha S., Sanjay Goil, and Alok N. Choudhary. “Adaptive Grids for Clustering Massive Data Sets.” SDM. 2001.
  359. Kuzilek, Jakub, et al. “OU Analyse: analysing at-risk students at The Open University.” Learning Analytics Review (2015): 1-16.
  360. Siemens, George, et al. Open Learning Analytics: an integrated & modularized platform. Diss. Open University Press, 2011.