“I haven’t heard back from any companies”
I hear a familiar story from a lot of aspiring data scientists: “I have sent out my resume to 25 companies, and I haven’t heard back from any of them! I have pretty good skills, and I think I have a pretty good resume. I don’t know what’s going on!”
Your resume probably sucks
My immediate conclusion after hearing your story: your resume probably sucks. If you are not getting any responses from any companies, and your skills are a reasonable match for the job description, then it almost certainly means that you are getting sabotaged by a bad resume.
What is the purpose of a resume?
The only real purpose of a resume is to get job interviews. That’s it. The purpose of a resume is not to:
- list all of your job experience
- list all of your technical skills
- show off your great educational background
Your resume should explicitly include only the exact items that will help you get a job interview.
What makes a good resume?
A good resume tells a story that is targeted to the job description and company. Furthermore, someone reading the resume should be able to understand that story in less than 20 seconds. If you keep these principles in mind, it is actually not too hard to write a decent resume.
Crafting your story
The first thing you need to do when creating your resume is to come up with your story. This part is a little tricky, but it is extremely important and even a little fun. Your story should be simple and compelling, and it should be a good fit for the job description. A good strategy for coming up with a great resume story is to think of two or three things that are interesting about yourself and tie them together. You should start out with a simple story in plain English (something I learned in Ramit Sethi’s excellent Dream Job course).
For example, a story that works for me is, “I am an experienced data scientist, I have a great math background, and I am good at explaining complicated stuff.” If I were applying to a more software development-focused data science job, a possible story for me could be, “I have experience building really fast and accurate machine-learning models in Python. I also understand big data technology like Hadoop.” For a more business-focused role, a story could be, “I have experience using stats and machine-learning to find useful insights in data. I also have experience presenting those insights with dashboards and automated reports, and I am good at public speaking.” When you come up with your story, don’t be afraid to try some different ones on for size. All three of the stories I just wrote are true about me. It’s all about positioning yourself the right way for the company.
What skills and technologies should I list?
People often ask me what skills and technologies they should have on their resume. There are really three main questions here.
- How proficient do I have to be before I put a skill or technology on my resume?
- Which things should I emphasize?
- Which things should I not include?
Question 1: What am I allowed to include?
My general rule of thumb is that you should not put something on your resume unless you have actually used it. Just having read about it does not count. Generally, you don’t have to have used it in a massive-scale production environment, but you should have at least used it in a personal project.
Question 2: What should I emphasize?
In order to decide what to emphasize, you have two great sources of information. One is the job description itself. If the job description is all about R, you should obviously emphasize R. Another, more subtle, source is the collection of skills that current employees list on LinkedIn. If someone is part of your network or has a public profile, you can see their LinkedIn profile (if you can’t see their profile, it might be worth getting a free trial for LinkedIn premium). If all of the team members have 30 endorsements for Hive, then they probably use Hive at work. You should definitely list Hive if you know it.
Question 3: What should I not include?
Because your resume is there to tell a targeted story in order to get an interview, you really should not have any skills or technologies listed that do not fit with that story. For example, if your story is all about being a “PhD in Computer Science with deep understanding of neural networks and the ability to explain technical topics,” you probably should not include your experience with WordPress. Including general skills like HTML and CSS is probably good, but you probably do not need to list that you are an expert in Knockout.JS and elastiCSS. This advice is doubly true for non-technical skills like “customer service” or “phone direct sales.” Including things like that actually makes the rest of your resume look worse, because it emphasizes that you have been focused on a lot of things other than data science, and — worse — that you do not really understand what the team is looking for. If you want to include something like that to add color to your resume, you should add it in the “Additional Info” section at the end of the resume, not in the “Skills and Technologies” section.
What if I have no experience?
If you have no working experience as a data scientist, then you have to figure out how to signal that you can do the job anyway. There are three main ways to do this: independent projects, education, and competence triggers.
Independent projects
If you don’t have any experience as a data scientist, then you absolutely have to do independent projects. Luckily, it is very easy to get started. The simplest way to get started is to do a Kaggle competition. Kaggle is a competition site for data science problems, and there are lots of great problems with clean datasets. I wrote a step-by-step tutorial for trying your first competition using R. I recommend working through a couple of Kaggle tutorials and posting your code on GitHub. Posting your code is extremely important. In fact, having a GitHub repository posted online is a powerful signal that you are a competent data scientist (it is a competence trigger, which we will discuss in a moment).
Kaggle is the simplest way to complete independent projects, but there are many other ways. There are three parts to completing an independent data science project:
- Coming up with an idea
- Acquiring the data
- Analyzing the data and/or building a model
Kaggle is great because the first two steps (coming up with an idea and acquiring the data) are completed for you. But a huge amount of data science is exactly those parts, so Kaggle can’t fully prepare you for a job as a data scientist. I will help you now with those two steps by giving you a list of a few ideas for independent data science projects. I encourage you to steal these.
- Use Latent Semantic Analysis to extract topics from tweets. Pull the data using the Twitter API.
- Use a bag of words model to cluster the top questions on /r/AskReddit. Pull the data using the Reddit API.
- Identify interesting traffic volume spikes for certain Wikipedia pages and correlate them to news events. Access and analyze the data by using AWS Open Datasets and Amazon Elastic MapReduce.
- Find topic networks in Wikipedia by examining its link graph. Use another of the AWS Open Datasets.
I mention a few other sample projects in Becoming a Data Hacker.
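To give a feel for how small a first version of one of these projects can be, here is a rough sketch of the bag-of-words clustering idea. It is only an illustration under several assumptions: the questions are made-up placeholders standing in for data you would pull from the Reddit API, the stopword list and the 0.3 similarity threshold are arbitrary choices, and the greedy grouping is a deliberately simple stand-in for a real clustering algorithm such as k-means. The function names are my own and use only the Python standard library.

```python
import math
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption; real projects use a fuller one).
STOPWORDS = {"a", "an", "the", "is", "of", "to", "in", "what",
             "your", "you", "do", "for", "and", "how"}


def bag_of_words(text):
    """Lowercase the text, split it into words, drop stopwords, and count the rest."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine_similarity(a, b):
    """Cosine similarity between two word-count bags (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(texts, threshold=0.3):
    """Greedy single-pass grouping: each text joins the first cluster whose
    first member is similar enough, otherwise it starts a new cluster.
    Returns a list of clusters, each a list of indices into `texts`."""
    bags = [bag_of_words(t) for t in texts]
    clusters = []
    for i, bag in enumerate(bags):
        for members in clusters:
            if cosine_similarity(bag, bags[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Made-up placeholder questions, NOT real Reddit data.
QUESTIONS = [
    "What is the best movie you have ever seen?",
    "What movie do you think is the worst ever made?",
    "What is your favorite food to cook at home?",
    "What food do you cook when you are home alone?",
]

if __name__ == "__main__":
    for members in cluster(QUESTIONS):
        print([QUESTIONS[i] for i in members])
```

With these four placeholder questions, the two movie questions end up in one group and the two cooking questions in another. A real project would replace the hard-coded list with questions fetched from the API and would likely swap the greedy grouping for a proper clustering method, but even a sketch like this, posted with a good README, is something concrete you can point to.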
Education
Another way to prove your ability is through your educational background. If you have a Master’s or a PhD in a relevant field, you should absolutely list relevant coursework and brag about your thesis. Make sure that you put your thesis work in the context of data science as much as possible. Be creative! If you really can’t think of any way that your thesis is relevant to data science, then you probably should not make a big deal out of it on your resume.
Competence triggers and social proof
Competence triggers are usually discussed in the context of interviews, but they play a particularly important role in data science resumes. Competence triggers are behaviors or attributes of a person that “trigger” others to see them as competent. In an interview, a typical competence trigger is having a strong, firm handshake or being appropriately dressed. There are a few key competence triggers that will really boost your resume:
- A GitHub page
- A Kaggle profile
- A StackExchange or Quora profile
- A technical blog
Why do these boost your resume so much? The reason is that data scientists use these tools to share their own work and find answers to questions. If you use these tools, then you are signaling to data scientists that you are one of them, even if you haven’t ever worked as a data scientist. Even better, a good reputation on sites like StackExchange or Quora gives you social proof.
Don’t worry about doing all of these at once. I absolutely think you should have a GitHub page, and you should post code from your independent projects there. If you have performed decently well in a couple of Kaggle competitions, then your Kaggle profile will be impressive, too. Answering questions on StackExchange or Quora can be a bit of a distraction from your real work, so it should not be a priority. And starting your own blog is great, but probably not necessary. As an alternative to a blog, you can focus on writing good documentation in a README in your GitHub repositories.
Resume rules of thumb
As you write your resume, there are a few basic rules of thumb to keep in mind.
- Keep it to one side of one page: Most recruiters only look at a resume for a few seconds. They should be able to see that you are a good candidate immediately, without turning the page.
- Use simple formatting: Don’t do anything too fancy. It should not be hard to parse what your resume says.
- Use appropriate industry lingo, but otherwise keep it simple: Again, this goes to readability.
- Don’t use weird file types: PDF is good, but you should probably also attach a DOCX file. You basically should not use any other file formats, because your resume is useless if people can’t open it.
Do I need to include a cover letter?
A lot of job applications say that a cover letter is optional. Typically, you should include a cover letter anyway. Make sure the cover letter is not too generic. Actually explain why you would be a good fit for the role and the company. Do a little research, and be positive. Remember the rule about resumes, though: don’t make the cover letter too long.
If you are just “casually” sending your resume to a current employee, it is okay to skip the cover letter. This is another example of how networking is critical. The best way to get an interview is to be recommended by a current employee. If you can do this, then your resume will float to the top of the pile automatically.
My annotated resume
If you want to see what my resume looks like, here’s a link to it in Google Drive. It’s not perfect, but it doesn’t have to be. Keep that in mind. Remember, your resume is there to get you an interview. It is not your magnum opus. Happy hunting!