Motivation
Data Science Fundamentals is a collection of selected resources aimed at providing a solid background for aspiring and junior Data Scientists. The objective is to create a list as small and powerful as possible. There are thousands of sources out there and it is sometimes difficult to focus on what's important.
I am afraid to say that there is no shortcut to becoming a professional Data Scientist. I don't believe in one-month bootcamps and I find that most MS programs miss some key topics. They are usually too focused on technology and ML/DL algorithms, and often forget about other important things such as communicating results or providing a broad picture of the role of a data scientist in a project.
My main motivation is to create a comprehensive list of resources that will allow future Data Scientists to gain a deep knowledge of a few core competencies from which they can build up their careers. If you already have a strong background in any of these competencies, you may still find other useful material on the list. The list is alive: I want to keep it short, so I may replace a course if I find something better. The competencies themselves should remain mostly the same.
The core competencies covered are:
- Maths foundation: It is impossible to succeed in a Data Science project without a solid understanding of Probability, Statistics and Linear Algebra. Machine learning might be useful to solve certain problems, but what you will always be using is statistics and algebra as a fundamental and general framework. It is the most time-consuming competency to work on, but it will pay off.
- Communication skills: You won't be working in isolation so you will need to show results, explain processes to non-technical audiences or write project proposals. Or you might just need a colleague to help you with something and you will need to explain it as clearly as possible via email or instant message. Learn to talk. Learn to write.
- Data Science Workflow: Understand the iterative process in data science. The importance of understanding business problems. Help businesses ask the right questions to solve a problem. Identify the role of data scientists in a project.
- Tools of the trade: You will need git, shell scripting and SQL. And a programming language. Remember that these are just tools to solve problems. Nothing else. I do not recommend learning lots of languages and tools. Stick to a shell, SQL and R/Python, and once you feel comfortable with those you can move forward and learn something else.
- Business understanding: You will be working in a particular sector and you need to understand that context, as well as the different roles within a business. Try to get an idea of why the project is important for the different departments of the company, and how it will impact customers and the Profit & Loss account.
- Ethics: With great power comes great responsibility. Although your power may be limited in scope at the beginning of your career, everyone working with data should be aware of issues such as data privacy, data ownership, anonymity, fairness and bias.
Resources
There are two levels:
- Foundation: items flagged as Foundation are essential and should be mastered by every Data Scientist.
- Recommended: items flagged as Recommended are highly useful resources but may not be essential for very junior Data Scientists.
<aside>
Hover over any item and click OPEN to obtain additional information, such as links to videos, books and other materials.
</aside>