March 13, 2020
Data science has been one of the fastest-growing fields over the past five years. Since starting Prolego I’ve reviewed more than 1,000 data scientists’ resumes, interviewed hundreds of candidates, and hired more than a dozen for my company or our clients. I’ve watched data science careers thrive … and others falter. As we enter this economic downcycle, managers will begin identifying their most valuable people: the ones they will fight to retain in the event of layoffs.
What follows is advice for making yourself invaluable to any employer.
You may have the false impression that a chronic shortage of data scientists will insulate you from layoffs. This isn’t true.
There is a shortage of talent in every technical skill. We don’t have enough talented Java programmers, Webflow designers, COBOL programmers, or data scientists. There are fewer people who can do machine learning, but there are also fewer jobs that require it. My clients have no problem hiring data scientists when they have the right recruiting strategy, because a lot of very smart people are attracted to this field.
Here’s the reality: basic machine learning skills that let you rank high in Kaggle contests are not sufficiently differentiated, because enough other people have them. Here are 3 approaches for becoming an exclusive contributor on your team.
I’ve met hundreds of data scientists who can clean data, perform EDA, and train and evaluate models in Jupyter notebooks. Many data scientists come from a research or statistics background and are quite competent using basic Python libraries in notebooks. Far fewer can take code written in notebooks and extract it into standalone Python classes and methods. Very few can build these libraries as production-quality software.
(Obviously, if you’re only comfortable with R, SAS, or MATLAB, you are at a competitive disadvantage. Stop making excuses and learn Python.)
One of the biggest challenges our clients have is putting machine learning models into production. Data scientists who can write production-quality (or close to it) software make a huge impact on any project.
Python code like ...
is fine for rapid development in exploratory notebooks. But a professional software engineer sees only a horrifying example of magic numbers, ambiguous variable names, and code that is difficult to read. A data scientist who can move these basic functions into libraries that follow PEP 8 will improve velocity for the whole team.
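To make the contrast concrete, here is a minimal sketch of what that kind of cleanup might look like. It is a hypothetical illustration, not the snippet referenced above: the sensor readings, the 99 cutoff, and every function and constant name are assumptions invented for this example. The notebook-style one-liner appears as a comment, followed by one possible refactoring into small, named, documented functions.

```python
"""Hypothetical example: a notebook one-liner refactored into reusable functions."""
from statistics import mean, stdev

# Notebook-style version (hypothetical), shown only for comparison.
# The bare 99, 2.0, and 1.0 are magic numbers, and d / d2 say nothing
# about what the data is:
#
#   d2 = [(i - 2.0) / 1.0 for i in d if i < 99]

# Assumed ceiling for a valid reading; it gives the bare 99 a name.
MAX_VALID_READING = 99.0


def remove_invalid_readings(readings, max_valid=MAX_VALID_READING):
    """Return only the readings strictly below ``max_valid``."""
    return [reading for reading in readings if reading < max_valid]


def standardize(values):
    """Scale values to zero mean and unit sample standard deviation."""
    if len(values) < 2:
        raise ValueError("need at least two values to standardize")
    center = mean(values)
    spread = stdev(values)
    if spread == 0:
        raise ValueError("cannot standardize constant values")
    return [(value - center) / spread for value in values]


def preprocess(readings):
    """Drop invalid readings, then standardize what remains."""
    return standardize(remove_invalid_readings(readings))


if __name__ == "__main__":
    print(preprocess([1.0, 2.0, 3.0, 150.0]))
```

The refactored version computes the mean and standard deviation instead of hard-coding them, so it changes behavior slightly compared with the one-liner. That is usually the point: the named steps can be imported, tested, and reused outside the notebook.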
In case you’re wondering, good Python skills are a must-have for every engineer at my company.
Some projects benefit from leveraging cutting edge research or creating novel models (usually neural networks). Few data scientists have the research skills to:
But many companies ultimately discover they have intractable problems with high potential business impact. If you are skilled at solving them, you can carve out a very valuable role for yourself.
Some data scientists—usually those with a PhD in an experimental discipline—are drawn to this type of work. These engineers have spent years developing mathematical models and building the software and data to test them. We usually see these skills in people with advanced degrees in economics, physics, and astronomy.
Keep in mind that your employer won’t have the appetite for the years of toil which often accompany academic or lab research projects. We break these types of projects into 3-month milestones and systematically identify and remove risk.
Machine learning projects are complex and touch nearly every part of the organization. Model releases have to be coordinated with IT. Legal needs to be consulted about usage rights. Users need to understand false positives and negatives. The list is endless.
Too often the data scientist is tasked with these activities. Many don’t enjoy this type of work—and usually are not very good at it.
But some data scientists love being the connector who keeps the projects running on track. They are good at simplifying data science work and explaining the business impact for executives. Does this sound like you? If so, consider doubling down and talking to your leadership about carving out a unique role in your company. We call this role an “AI Product Manager”.
You have a unique competitive advantage because most traditional project/product managers do not have your knowledge. They don’t know how to interpret what data scientists do or give them specific feedback to maximize their efficiency.
I don’t know how long the economic fallout from the COVID-19 virus will last. I don’t know if layoffs will result from this downturn as they have in the previous two.
But I do know that every large company is looking for ways to get more value out of its data science team. Start making yourself an invaluable team member and you’ll be prepared to thrive regardless of the market.