Data science is a young and very promising field. Several sources indicate that demand for data scientists is growing faster than we can train them. In this article I list courses you can take to start your journey in the field.

All of them are taught by renowned professors at well-known universities and can be taken for free online. They also have strong practical content, so you will be equipped to start applying the techniques to data that interests you.

The focus of these courses is on Machine Learning, which includes many of the most commonly used tools in data science.

## Machine Learning – Coursera

This is the most popular online introduction to machine learning. There are several types of data scientists, but much of the work requires knowledge of models that learn from data. This course gives you a high-level view of how several algorithms work and how to use them to solve real problems.

The course begins by explaining what Machine Learning is, moves on to simple models such as linear regression, and builds the foundation for more complex, widely used models such as Neural Networks and SVMs. You also get to implement parts of these algorithms, which helps you understand how they work. Even though you will probably never have to implement an algorithm from scratch, knowing its characteristics and limitations will help you use it in practical situations.
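
To give a feel for what implementing one of these algorithms involves, here is a minimal sketch of linear regression fitted by batch gradient descent. It is written in Python rather than the Octave used in the course, and the function name, learning rate, iteration count, and toy data are all assumptions made for illustration, not material from the course.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, n_iters=5000):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iters):
        # Residuals of the current line on every training point.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of (1 / 2n) * sum(error ** 2) with respect to w and b.
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear_regression(xs, ys)
print(w, b)  # should land very close to the slope 2 and intercept 1
```

Writing the loop by hand exposes the limitations mentioned above: with a learning rate that is too large the updates diverge, and with one that is too small the fit takes far more iterations.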

It begins with supervised learning and ends with a brief tour of unsupervised learning and anomaly detection. There are also videos on techniques and best practices for evaluating and optimizing a model, avoiding errors such as overfitting or underfitting, and adapting these algorithms to data that does not fit in memory (the famous big data).

Professor Andrew Ng is one of the founders of Coursera, as well as a professor at Stanford University and chief scientist at Baidu.

It is a fairly practical course and requires only familiarity with linear algebra; videos within the course review the concepts that will be important. The programming assignments use Octave, and the course includes videos that teach the basics of the language, just enough to complete the assignments.

This course is offered on demand, meaning it can be taken at any time.

## Statistical Learning – StanfordX

An introduction to Machine Learning from a statistical point of view. This course, offered by two highly respected statisticians, Rob Tibshirani and Trevor Hastie, is very practical and stays closer to the statistical concepts behind each model, with less focus on the computational part.

It begins with an overview of the Statistical Learning area, explains classification and regression problems, and covers basic tools for linear modeling. Then it takes us through methods of model evaluation and techniques for tuning a model so that it generalizes well to new data. Finally, more advanced algorithms such as SVM and Random Forests are presented, along with a brief look at unsupervised learning methods.
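
One common way to evaluate how a model generalizes to new data is k-fold cross-validation. The sketch below is a plain Python illustration (the course itself uses R); the helper names, the mean-predicting baseline model, and the toy data are hypothetical and chosen only to keep the example self-contained.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_mse(xs, ys, fit, predict, k=5):
    """Average held-out mean squared error over k folds."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        # Train only on the points outside the current fold.
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit(train_x, train_y)
        # Score only on the points the model has not seen.
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in held_out]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / len(scores)


# Hypothetical baseline model: always predict the mean of the training targets.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


# Toy data from y = 2x + 1; real data should be shuffled before splitting.
xs = list(range(10))
ys = [2.0 * x + 1.0 for x in xs]
cv_mse = cross_val_mse(xs, ys, fit_mean, predict_mean, k=5)
print(cv_mse)
```

Because every point is held out exactly once, the averaged error estimates performance on unseen data rather than on the data the model was fitted to.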

This course uses the R language, which is widely used in the area of statistics. There are programming assignments, but they are geared towards the use of models, not implementation.

This course is usually offered once a year in mid-January.

## Learning from Data – EdX

This course, offered by Professor Yaser Abu-Mostafa, takes a more computational approach but, unlike Andrew Ng’s course, includes a good deal of theory, which helps you understand more fully how the models work.

In the first lectures, the professor explains the concept of machine learning and the mathematics behind the theory that guarantees an algorithm can learn a task from data. The theory is presented from two points of view: the VC dimension and the bias-variance tradeoff.
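
As a rough illustration of the second point of view, the standard bias-variance decomposition of squared error can be written as below. The notation is an assumption made here and may differ from the course's: the data follow $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, $\hat{f}$ is the model learned from a random training set, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```

A very simple model tends to have high bias and low variance, while a very flexible one tends to show the reverse, which is why model complexity has to be balanced against the amount of data available.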

After that, the course presents models such as logistic regression, neural networks and SVMs, and demonstrates techniques for tuning them so that they are useful in practical applications.

Finally, we are introduced to kernels, variable transformations that became very important mainly because of the success of SVMs, followed by an overview of areas of study in Machine Learning that can be pursued after the course.
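
To make the idea of a kernel as a variable transformation concrete, here is a small Python sketch. It checks numerically that the degree-2 polynomial kernel on two-dimensional inputs gives the same value as an inner product in an explicitly transformed feature space. The function names and the sample points are assumptions chosen for illustration, not taken from the course.

```python
import math


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel on 2-D inputs: (x . z) ** 2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def poly2_features(x):
    """Explicit feature map whose inner product reproduces poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# Two arbitrary sample points (made up for illustration).
x, z = (1.0, 2.0), (3.0, -1.0)
via_kernel = poly2_kernel(x, z)
via_features = dot(poly2_features(x), poly2_features(z))
print(via_kernel, via_features)  # the two values agree up to rounding
```

The kernel reaches the same number without ever building the transformed vectors, which is what lets SVMs work in very high-dimensional feature spaces at low cost.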

Although there is no confirmation of a next session, the course is worth mentioning because all materials (videos and assignments) are available online. It does not require a specific programming language; assignments can be completed in any language.

## Bonus – The Analytics Edge – EdX

This course, offered by MIT Professor Dimitris Bertsimas and his team through the EdX platform, focuses heavily on applying machine learning methods using R. Much of the course is spent on examples of the techniques in use, and the assignments are extensive, so that students can explore all the commands taught. Every week there is a new case study.

Besides teaching supervised and unsupervised learning methods, the course ends with optimization methods that, as well as being interesting in themselves, are the basis of many machine learning algorithms.

One feature that sets this course apart and that, at the time of writing, is not offered by the others is that one of the assignments is to take part in a Kaggle competition open only to students of the course. This is an opportunity to use the tools in a realistic case, building a solution with the knowledge acquired in the course.

The competition is the most important part of the course. In my own experience, nothing teaches more than having a dataset in front of you and having to decide on your own which direction the analysis should take.

## How to choose a course?

If you ask me which one you should do, I’ll answer: all of them. Although much of the material is the same, each gives you a different view of Machine Learning.

With Andrew Ng you get a quick, high-level, and rather practical presentation of the algorithms. Statistical Learning is also very practical but pays more attention to classical statistical concepts, such as p-values and confidence intervals.

In Learning from Data, you come to understand the theory that underlies Machine Learning: the mathematical reason why an algorithm is able to learn from data.

If you are willing to do them all, I recommend taking them in the order in which they appear in this article.