Q&A With Alan Feder // 5 Questions for a Data Scientist


I recently reached out to a good contact of mine, Alan Feder, who agreed to be the first in what I hope will be a series of interviews with leading data scientists working today. Alan is a Principal Data Scientist at Invesco and one of the nicest and smartest people I know.

Here are his thoughts on what excites him about data science and where he sees the future of the field.

***

1) Why did you want to become a Data Scientist?

I was really excited about the possibility of using statistics and data to solve business problems. Data science is a field that is developing and changing all the time, and I wanted to explore new methods and tools and take part in changing the field.

2) What advice would you give to a new graduate who wants to be a data scientist?

Focus on the "data" side of data science: learn how to gather data yourself, deal with complicated data, and turn an ugly, messy data set into a tidy one. Also, build your skills with structured/tabular data and time-series models, since many data science jobs focus much more on those data types than on text data (NLP) or image data (computer vision).
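As an editor's illustration of what "turning a messy data set into a tidy one" can look like, here is a minimal sketch in plain Python. It is not Alan's code; the store/month sales table, the `to_tidy` helper, and the rule for treating blanks and "n/a" as missing are all hypothetical assumptions made for the example.

```python
# Hypothetical "messy" table: one row per store, one column per month,
# with numbers stored as text and inconsistent missing-value markers.
messy_rows = [
    {"store": "A", "jan": "1,200", "feb": "", "mar": "1,350"},
    {"store": "B", "jan": "980", "feb": "1,010", "mar": "n/a"},
]


def to_tidy(rows, id_column, variable_name, value_name):
    """Reshape wide rows into one record per (id, variable) observation."""
    tidy = []
    for row in rows:
        for column, raw in row.items():
            if column == id_column:
                continue
            cleaned = raw.replace(",", "").strip()
            # Assumption: anything that is not a plain number counts as missing.
            is_number = cleaned.replace(".", "", 1).isdigit()
            value = float(cleaned) if is_number else None
            tidy.append(
                {id_column: row[id_column], variable_name: column, value_name: value}
            )
    return tidy


tidy_rows = to_tidy(messy_rows, "store", "month", "sales")
for record in tidy_rows:
    print(record)
```

Each output record holds exactly one observation (one store in one month), which is the shape most modeling and plotting tools expect.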

Additionally, if there is a topic you are particularly interested in or curious about, there are a ton of free and cheap online courses and tutorials, many of which are really good. I have learned a lot from these resources; it's more important to learn something than to receive a "certificate."

3) Given the recent explosive growth of, and hopes for, the data science field, do you think the excitement is warranted, or is the hype outpacing reality in some cases?

I think that the biggest contribution of the data science field could be to bring many of these methods and tools into areas that haven't traditionally used advanced methods or coding in their data analysis and visualization. While some claims may be over-hyped, I think that the data science field will continue to grow while contributing to many industries and professions.

4) What tools or methodologies are you most excited about moving forward? Which of these do you think most data scientists will be using 3-5 years from now?

I'm very excited about the increasing use of Interpretable/Explainable Machine Learning. Just calling certain methods "black box" and moving on limits the usefulness of the data scientist. Many methods, both those that explain other machine learning models (e.g., PDPs, SHAP) and those that are more inherently explainable (e.g., GAMs), can truly transform our understanding of the work we do and how the data and models relate to real life.
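To make one of the methods named above concrete, here is an editor's sketch of a partial dependence plot (PDP) computed by hand in plain Python. The `predict` function is a hypothetical stand-in for a fitted black-box model, and the three-row data set and the `partial_dependence` helper are likewise assumptions for illustration, not anything from Alan's work.

```python
# Hypothetical stand-in for a fitted "black box" model; any callable
# that maps a row (dict of features) to a number would work here.
def predict(row):
    bonus = 0.5 * row["income"] if row["age"] > 40 else 0.0
    return 2.0 * row["age"] + bonus


# Hypothetical data set the model would be explained on.
data = [
    {"age": 30, "income": 50},
    {"age": 45, "income": 80},
    {"age": 60, "income": 20},
]


def partial_dependence(predict_fn, rows, feature, grid):
    """For each grid value, set `feature` to that value in every row
    and average the model's predictions over the whole data set."""
    curve = []
    for value in grid:
        predictions = [predict_fn({**row, feature: value}) for row in rows]
        curve.append((value, sum(predictions) / len(predictions)))
    return curve


pdp = partial_dependence(predict, data, feature="age", grid=[30, 50])
for value, average in pdp:
    print(f"age={value}: average prediction {average:.1f}")
```

Plotting the resulting pairs gives the PDP curve: it shows how the average prediction moves as one feature changes while the other features keep their observed values.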

Additionally, I am a big fan of the tidyverse within R. I think that the functions there do a really good job of "making sense" and allowing users to easily translate their ideas into code. I haven't found any data visualization packages that are better than ggplot2 and its extension packages (although Altair in Python is pretty good).

5) You've interviewed plenty of data scientists. What's the best advice you can give someone interviewing for a data scientist position? What's the biggest mistake you've seen someone make during an interview?

Make sure you can explain the non-technical aspects of any project. Why did you choose to work on this? How did you choose the target? How will the model be used in the business, and in what way will it benefit others?

Additionally, make sure you can explain how a model was validated and how you ensured that you were not overfitting your model to the training data.
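For readers who want to see what that check can look like, here is an editor's sketch in plain Python of a simple hold-out validation. The noisy linear data, the 30/10 split, and the deliberately overfit nearest-neighbor "model" are all hypothetical choices made to keep the example small; they are not taken from the interview.

```python
import random

random.seed(0)  # fixed seed so the data and the split are reproducible

# Hypothetical noisy data: y is roughly 2*x plus Gaussian noise.
points = [(x, 2.0 * x + random.gauss(0, 3)) for x in range(40)]
random.shuffle(points)
train, valid = points[:30], points[30:]  # hold out 10 points for validation


def nearest_neighbor_predict(x, memory):
    """Return the y of the closest stored x. This model simply
    memorizes its training set, so it overfits by construction."""
    return min(memory, key=lambda point: abs(point[0] - x))[1]


def mean_squared_error(rows, memory):
    errors = [(y - nearest_neighbor_predict(x, memory)) ** 2 for x, y in rows]
    return sum(errors) / len(errors)


train_mse = mean_squared_error(train, train)
valid_mse = mean_squared_error(valid, train)
print(f"train MSE: {train_mse:.2f}, validation MSE: {valid_mse:.2f}")
```

The training error is exactly zero because every training point is its own nearest neighbor, while the error on the held-out points is not. A gap like that between training and validation performance is the usual warning sign of overfitting.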

Lastly, as general interview advice, know and understand everything on your resume. It's way better to "not know something" than to include methods and projects that you can't speak to and don't clearly understand.


Matt Stabile