author Shuxin Yang time 2016-12-13
Be a Data Scientist

What skills do I need in order to be a data scientist?

According to Dave Holtz’s article (8 Skills You Need to Be a Data Scientist), I am best suited to reasonably sized non-data companies that are data-driven. To get such a job, I need to develop 8 skills of varying degrees of importance.

***** Very important
*** Somewhat important
* Not that important
  1. Basic Tools ***** Python/R and SQL.
  2. Software Engineering *** Handle data logging.
  3. Basic Statistics ***** Statistical tests, distributions, maximum likelihood estimators, etc. As with machine learning, one of the more important aspects of your statistics knowledge is understanding when different techniques are (or aren’t) a valid approach.
  4. Machine Learning ***** It is more important to understand the broad strokes and to know when it is appropriate to use different techniques.
  5. Multivariable Calculus and Linear Algebra *
  6. Data Munging ***** Deal with imperfections in data, including missing values, inconsistent string formatting and date formatting.
  7. Data Visualization and Communication ***** It is important to not just be familiar with the tools (like ggplot and d3.js) necessary to visualize data, but also the principles behind visually encoding data and communicating information.
  8. Thinking Like A Data Scientist ***** It’s important to think about what things are important, and what things aren’t. How should you, as the data scientist, interact with the engineers and product managers? What methods should you use? When do approximations make sense?
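
As a rough illustration of skill 6 (Data Munging), here is a minimal Python sketch that handles the three imperfections named there: missing values, inconsistent string formatting, and inconsistent date formatting. The records, field names, and list of date formats are all made up for this example, and filling a missing age with the median is just one possible choice, not a recommendation from the article.

```python
from datetime import datetime
from statistics import median

# Hypothetical raw records showing the three problems from item 6:
# inconsistent string formatting, mixed date formats, and a missing value.
RAW_ROWS = [
    {"city": " new york ", "signup": "2016-12-13", "age": "34"},
    {"city": "New York", "signup": "13/12/2016", "age": ""},
    {"city": "BOSTON", "signup": "Dec 13, 2016", "age": "28"},
]

# Assumed set of date formats that can appear in the data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%b %d, %Y")


def parse_date(text):
    """Try each known format in turn; return None if none matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def clean_rows(rows):
    """Normalize city names and dates, and fill missing ages with the median."""
    known_ages = [int(r["age"]) for r in rows if r["age"].strip()]
    fill_age = median(known_ages) if known_ages else None
    cleaned = []
    for r in rows:
        cleaned.append({
            # Collapse stray whitespace and use consistent capitalization.
            "city": " ".join(r["city"].split()).title(),
            "signup": parse_date(r["signup"]),
            "age": int(r["age"]) if r["age"].strip() else fill_age,
        })
    return cleaned


CLEANED = clean_rows(RAW_ROWS)
for row in CLEANED:
    print(row)
```

In practice a library such as pandas would usually do this work, but the plain-Python version makes each cleaning decision explicit: which date formats are accepted, how strings are normalized, and what replaces a missing value.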

Learning Materials

  1. UDACITY: Free Online Classes & Nanodegree
    Machine Learning Engineer Nanodegree – Kaggle
    Data Analyst – Facebook & MongoDB
  2. Coursera: Online Courses From Top Universities
    Machine Learning –Stanford
  3. Kaggle: Your Home for Data Science
  4. Tianchi Big Data Competition (天池大数据竞赛): Building the No. 1 platform for crowd intelligence and crowd innovation with data
  5. Pro Git: Fast Version Control