What Is Machine Learning? A Full Explanation

If you'd like to make sure that your model works well and can generalize to new cases, you can try it out on new cases by putting the model into the real environment and monitoring how it performs. This works, but if your model is inadequate, your users will complain. Instead, you should divide your data into two sets, one for training and one for testing, so that you can train your model on the first and test it on the second. The generalization error is the error rate you get by evaluating your model on the test set. This value tells you whether your model is good enough and whether it will work properly: if the error rate is low, the model is good and will perform properly on new cases; in contrast, if the rate is high, your model will perform badly. My advice to you is to use 80% of the data for training and 20% for testing, which makes a model very simple to test and evaluate.

In this chapter, we have covered many concepts of machine learning. The following chapters will be very practical and you'll write code, but you should answer the following questions first, just to make sure you're on the right track.

1. Define machine learning.
2. Describe the four types of machine-learning systems.
3. What is the difference between supervised and unsupervised learning?
4. Name the unsupervised tasks.
5. Why are testing and validation important?
6. In one sentence, describe what online learning is.
7. What is the difference between batch (offline) learning and online learning?
8. Which type of machine-learning system should you use to make a robot learn how to walk?

In the next chapter, you'll go deeper into classification systems and work with the MNIST data set. This is a set of 70,000 images of digits handwritten by students and employees, and each image comes with a label telling you which digit it represents. This project is the machine-learning equivalent of the "Hello, world" example of traditional programming, so every beginner in machine learning should start with it to learn about classification algorithms. Scikit-Learn has many helper functions for downloading popular data sets, including MNIST. Let's take a look at the code:

>>> from sklearn.datasets import fetch_mldata
>>> mn = fetch_mldata('MNIST original')
>>> mn
{'COL_NAMES': ['label', 'data'],
 'DESCR': 'mldata.org dataset: mnist-original',
 'data': array([[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0]], dtype=uint8),
 'target': array([0., 0., 0., ..., 9., 9., 9.])}

- The DESCR key describes the data set.
- The data key contains an array with one row per instance and one column per feature.
- The target key contains an array with the labels.

Let's work with some of the code:

>>> X, y = mn["data"], mn["target"]
>>> X.shape
(70000, 784)
>>> y.shape
(70000,)

70,000 here means that there are 70,000 images, and every image has 784 features, because each image is 28 x 28 pixels and each feature simply represents one pixel's intensity.

Measures of Performance

Evaluating a classifier is more difficult than evaluating a regressor, so let's explain how to evaluate a classifier. In this example, we'll use cross-validation to evaluate our model.
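The cross-validation code below relies on a few objects this excerpt never defines: a training set x_tr, a binary label vector y_tr_6, and a fitted SGDClassifier named sgd_clf. Here is a minimal sketch of that setup, assuming (as the name y_tr_6 suggests) that the task is detecting the digit 6, and using the conventional 60,000/10,000 MNIST split. Note that fetch_mldata was deprecated and later removed from scikit-learn; fetch_openml('mnist_784') is the modern replacement used in this sketch:

import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.linear_model import SGDClassifier

# Load MNIST; fetch_openml replaces the older fetch_mldata.
mnist = fetch_openml('mnist_784', version=1, as_frame=False)
X, y = mnist["data"], mnist["target"].astype(np.uint8)

# Conventional MNIST split: first 60,000 images for training,
# the remaining 10,000 for testing.
x_tr, x_tes = X[:60000], X[60000:]
y_tr, y_tes = y[:60000], y[60000:]

# Binary target: True for every image of the digit 6, False otherwise.
y_tr_6 = (y_tr == 6)
y_tes_6 = (y_tes == 6)

# A stochastic gradient descent classifier to evaluate.
sgd_clf = SGDClassifier(random_state=40)
sgd_clf.fit(x_tr, y_tr_6)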
from sklearn.model_selection import StratifiedKFold
from sklearn.base import clone

# Stratified folds; shuffle is required for random_state to take effect.
sf = StratifiedKFold(n_splits=2, shuffle=True, random_state=40)

for train_index, test_index in sf.split(x_tr, y_tr_6):
    cl = clone(sgd_clf)
    x_tr_fd = x_tr[train_index]
    y_tr_fd = y_tr_6[train_index]
    x_tes_fd = x_tr[test_index]
    y_tes_fd = y_tr_6[test_index]
    cl.fit(x_tr_fd, y_tr_fd)
    y_p = cl.predict(x_tes_fd)
    n_correct = sum(y_p == y_tes_fd)  # count the correct predictions
    print(n_correct / len(y_p))      # print the ratio of correct predictions

We use the StratifiedKFold class to perform stratified sampling, which produces folds that preserve the ratio of each class. Every iteration of the loop creates a clone of the classifier, fits that clone on the training folds, and makes predictions on the test fold. Finally, it counts the number of correct predictions and prints their ratio.

Now we'll use the cross_val_score function to evaluate the SGDClassifier with K-fold cross-validation. Here, K-fold cross-validation divides the training set into three folds (cv=3), then makes and evaluates predictions on each fold.

from sklearn.model_selection import cross_val_score
cross_val_score(sgd_clf, x_tr, y_tr_6, cv=3, scoring="accuracy")

You'll get the accuracy (the ratio of correct predictions) on every fold.

Ensemble Classifiers

The general goal of ensemble methods is to merge different classifiers into one meta-classifier that has better generalization performance than each individual classifier alone. As an example, assume that you collected predictions from many experts: ensemble methods would allow us to merge those experts' predictions to obtain a prediction that is more accurate and robust than the prediction of any individual expert. As you'll see later in this part, there are many different ways to create an ensemble of classifiers. Here, we'll build a basic intuition for how ensembles work and why they are typically recognized for yielding good generalization performance.

In this part, we'll work with the most popular ensemble method, which uses the majority voting principle. Majority voting simply means that we choose the label that has been predicted by the majority of classifiers, that is, the label that received more than 50 percent of the votes. Strictly speaking, the term majority vote refers to binary class settings only. However, it is easy to generalize the majority voting principle to multi-class settings; this is called plurality voting, and we choose the class label that received the most votes. The following diagram illustrates the concept of majority and plurality voting for an ensemble of 10 classifiers, where each unique symbol (triangle, square, and circle) represents a unique class label.

Using the training set, we start by training m different classifiers $C_1, C_2, \ldots, C_m$. Depending on the method, the ensemble can be built from different classification algorithms, for example, decision trees, support vector machines, logistic regression classifiers, and so on. Alternatively, you can use the same base classification algorithm and fit it to different subsets of the training set. An example of this approach is the random forest algorithm, which merges many decision tree classifiers via majority voting. To predict a class label via simple majority or plurality voting, we combine the predicted class labels of each individual classifier $C_j$ and select the class label $\hat{y}$ that received the most votes:

$\hat{y} = \mathrm{mode}\{C_1(\boldsymbol{x}), C_2(\boldsymbol{x}), \ldots, C_m(\boldsymbol{x})\}$

For example, in a binary classification task with $\mathrm{class}_1 = -1$ and $\mathrm{class}_2 = +1$, we can write the majority vote prediction as

$C(\boldsymbol{x}) = \mathrm{sign}\!\left[\sum_{j=1}^{m} C_j(\boldsymbol{x})\right]$
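To make the mode formula concrete, here is a minimal sketch of plurality voting with NumPy; the three hard-coded prediction arrays are made-up values, purely for illustration:

import numpy as np

# Each row holds one classifier's predicted labels for four samples
# (made-up predictions, purely for illustration).
predictions = np.array([
    [0, 1, 2, 1],  # classifier C1
    [0, 2, 2, 1],  # classifier C2
    [1, 2, 2, 0],  # classifier C3
])

# Plurality vote: for each sample (column), pick the most frequent label.
y_hat = np.apply_along_axis(
    lambda votes: np.bincount(votes).argmax(), axis=0, arr=predictions)
print(y_hat)  # [0 2 2 1]

Scikit-Learn packages this behavior, along with optional vote weighting and soft voting on predicted probabilities, in sklearn.ensemble.VotingClassifier.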
To illustrate why ensemble methods can work better than individual classifiers alone, let's apply some simple combinatorics. For the following example, we make the assumption that all n base classifiers for a binary classification task have an equal error rate, ε. Additionally, we assume that the classifiers are independent and that their error rates are not correlated. Under these assumptions, the ensemble makes a wrong prediction only when more than half of the n base classifiers are wrong, so we can express the error probability of the ensemble as a probability mass of a binomial distribution:

$P(\text{ensemble error}) = \sum_{k=\lceil n/2 \rceil}^{n} \binom{n}{k} \varepsilon^{k} (1-\varepsilon)^{n-k}$

(assuming n is odd, so that no ties can occur).
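A short calculation shows how strong this effect is. The following sketch evaluates the binomial sum above; the function name ensemble_error and the example values n = 11 and ε = 0.25 are illustrative choices, not from the text:

from math import comb, ceil

def ensemble_error(n_classifiers, error):
    """Probability that a majority vote of n independent base classifiers,
    each with error rate `error`, is wrong (n assumed odd)."""
    k_start = ceil(n_classifiers / 2)  # fewest wrong votes that break the majority
    return sum(comb(n_classifiers, k) * error**k * (1 - error)**(n_classifiers - k)
               for k in range(k_start, n_classifiers + 1))

print(ensemble_error(11, 0.25))  # roughly 0.034, far below the individual 0.25

As long as each base classifier does better than random guessing (ε < 0.5), the ensemble error stays below the individual error rate; if ε rises above 0.5, the ensemble actually performs worse than a single classifier.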
