A lot of career experts recommend creating a portfolio of relevant work when trying to land your first job in a field. In my opinion, you do not need a formal portfolio to get a data science job. However, there is real benefit in having an easily accessible list of relevant projects that you can discuss in an interview. In this post, I will explain how to build a useful “portfolio” without adding a lot of extra work to your job search.
The Github Portfolio
It is a bit foolish to invest a lot of time in building a formal portfolio with pretty pictures and three-ring binders. A better approach will save you time while also building technical skills and preparing you for interviews. You could call it the Github Portfolio.
I recommend taking a few technical projects (preferably data science projects) and pushing their code to one or more public repositories on Github. I talk about the benefits of having a Github page in my Creating a Great Data Science Resume post. Github is the most commonly used SaaS tool for version control, and simply having a Github page will set you apart. Also, using Github allows you to put the version control system git on your resume!
Here’s what you need to do:
- Create a Github username if you don’t already have one.
- Do a basic git and Github tutorial. (You don’t need to understand git to make this portfolio, but a little bit of knowledge helps.)
- Select 2 or 3 relevant projects that you are comfortable sharing. NOTE: It is not important that they be perfect or even “clean.” Just having them there is already a boost.
- Write a basic README for each project. This does not need to be anything more than “This project was about _.”
- Push the code up to Github. You can do this using command line git, using the Github desktop app, or even copying and pasting manually on the Github webpage.
Now, put the link to your Github page on the top of your resume. There, you officially have a portfolio.
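If you are adding several projects at once, the README step can be scripted. Here is a minimal sketch in Python. The function name `write_basic_readme` and its arguments are my own invention for illustration, not part of git or Github, and the Markdown layout is just one reasonable choice.

```python
from pathlib import Path


def write_basic_readme(project_dir, title, summary):
    """Create a minimal README.md in project_dir unless one already exists.

    Hypothetical helper for illustration; returns the path to the README.
    """
    readme_path = Path(project_dir) / "README.md"
    if readme_path.exists():
        # Never overwrite a README you have already written by hand.
        return readme_path
    # A title line plus a one-sentence summary is enough for a first version.
    readme_path.write_text(f"# {title}\n\n{summary}\n", encoding="utf-8")
    return readme_path
```

Calling `write_basic_readme("my-project", "Crypto Price Drops", "This project was about predicting cryptocurrency price drops.")` would create `my-project/README.md`, assuming the `my-project` folder already exists. You still push the result to Github the same way as the rest of the code.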
Selecting Projects to Include
You might be having trouble selecting which projects to include on Github.
My recommendation is not to be overly choosy about what you put there. Putting a project on Github with a README might even help you figure out how to clean it up a bit. Remember: just having a Github page is a boost for your resume. “Do you have a Github username?” is a common question people ask in internship interviews, because having a Github page shows that you are the right kind of nerdy.
If you do not have any projects to post on Github, then you need to get cracking. Putting something on Github does not have to be a major ordeal. You can even adapt some code from a tutorial like my tutorial on machine learning with R and make it your own.
Preparing for Interviews
One of the best things about having a Github portfolio is that you can reference the projects there during interviews. The key is to practice a few basic talking points about each project and anticipate common questions. For example:
Interviewer: “Can you tell us about a project you did using machine learning?”
You: “Yes. I wanted to build a model to predict cryptocurrency price drops. I collected the data from a crypto price API using Python, and then I built a random forest model with pandas and scikit-learn. You can check out the code on Github.”
Interviewer: “Oh, interesting. What made you choose a random forest?”
You: “Well, I love random forests because of their flexibility, so I tried a random forest first. I tested logistic regression and a multilayer perceptron, too, but random forest provided the best accuracy with decent performance.”
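To give that last answer honestly, your repository should contain the comparison. Here is a minimal sketch of what it might look like with scikit-learn. Everything specific in it is an assumption on my part: the data is synthetic (generated with `make_classification`) rather than real crypto prices, the function name `compare_models` is made up, and the hyperparameters are arbitrary. For time-ordered data like prices, you would want a time-aware split such as scikit-learn's `TimeSeriesSplit` instead of the default folds used here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(X, y, cv=5):
    """Return the mean cross-validated accuracy of each candidate model."""
    candidates = {
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
        # Logistic regression and the MLP train better on scaled features.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "multilayer_perceptron": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
        ),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in candidates.items()
    }


# Synthetic stand-in data; a real project would load its own features and labels.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
scores = compare_models(X, y)

# Print the models from best to worst mean accuracy.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Having a script like this in the repository means the interviewer can see that all three models were evaluated the same way, which makes the “why random forest?” answer much easier to defend.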
Doing well at this part of the interview is a huge plus, since it shows some depth and good communication skills. You will want to be ready to discuss a few key aspects of your project:
- A short description of the project: what you did and why you did it. Keep it short (about 10-30 seconds), because rambling is the single greatest interviewing sin.
- The key decisions you made. Why did you choose a specific model? How did you evaluate the performance of the model or the recommendations you made? How did you get the data? Did you have to clean the data? How would you deploy the model at large scale?
- What would you do differently next time?
- What was unexpectedly difficult about the project?
- What did you discover that was surprising?
Don’t sweat it
It is not a requirement to have a portfolio of any sort. Don’t let the lack of one keep you from pushing into the job market. I definitely recommend creating a Github page, writing a few READMEs, and practicing a few talking points. But do not let building a fancy portfolio keep you from the challenging and important parts of job hunting. Actually knowing a bit of data science, building a great resume, actively networking, and practicing your interview skills will make a bigger impact for most people.