Machine learning and AI are among the most popular topics nowadays, especially within the tech space. I am fortunate enough to work and develop with these technologies every day as a machine learning engineer!
In this article, I will walk you through my journey to becoming a machine learning engineer, shedding some light and advice on how you can become one yourself!
My Background
In one of my previous articles, I extensively wrote about my journey from school to securing my first Data Science job. I recommend you check out that article, but I will summarise the key timeline here.
Pretty much everyone in my family studied some sort of STEM subject. My great-grandad was an engineer, both my grandparents studied physics, and my mum is a maths teacher.
So, my path was always paved for me.
I chose to study physics at university after watching The Big Bang Theory at age 12; it’s fair to say everyone was very proud!
At school, I wasn’t dumb by any means. I was actually relatively bright, but I didn’t fully apply myself. I got decent grades, but definitely not what I was fully capable of.
I was very arrogant and thought I would do well with zero work.
I applied to top universities like Oxford and Imperial College, but given my work ethic, I was delusional thinking I had a chance. On results day, I ended up in clearing as I missed my offers. This was probably one of the saddest days of my life.
Clearing in the UK is where universities offer places to students on certain courses where they have space. It’s mainly for students who don’t have a university offer.
I was lucky enough to be offered a chance to study physics at the University of Surrey, and I went on to earn a first-class master’s degree in physics!
There is genuinely no substitute for hard work. It is a cringy cliche, but it is true!
My original plan was to do a PhD and be a full-time researcher or professor, but during my degree, I did a research year, and I just felt a career in research was not for me. Everything moved so slowly, and it didn’t seem there was much opportunity in the space.
During this time, DeepMind released their AlphaGo — The Movie documentary on YouTube, which popped up on my home feed.
From the video, I started to understand how AI worked and learn about neural networks, reinforcement learning, and deep learning. To be honest, to this day I am still not an expert in these areas.
Naturally, I dug deeper and found that a data scientist uses AI and machine learning algorithms to solve problems. I immediately wanted in and started applying for data science graduate roles.
I spent countless hours coding, taking courses, and working on projects. I applied to 300+ jobs and eventually landed my first data science graduate scheme in September 2021.
You can hear more about my journey from a podcast.
Data Science Journey
I started my career in an insurance company, where I built various supervised learning models, mainly using gradient boosted tree packages like CatBoost, XGBoost, and generalised linear models (GLMs).
I built models to predict:
- Fraud — Did someone fraudulently make a claim to profit.
- Risk Prices — What’s the premium we should give someone.
- Number of Claims — How many claims will someone have.
- Average Cost of Claim — What’s the average claim value someone will have.
I made around six models spanning the regression and classification space. I learned so much here, especially in statistics, as I worked very closely with Actuaries, so my maths knowledge was excellent.
However, due to the company’s structure and setup, it was difficult for my models to advance past the PoC stage, so I felt I lacked the “tech” side of my toolkit and understanding of how companies use machine learning in production.
After a year, my previous employer reached out to me asking if I wanted to apply to a junior data scientist role that specialises in time series forecasting and optimisation problems. I really liked the company, and after a few interviews, I was offered the job!
I worked at this company for about 2.5 years, where I became an expert in forecasting and combinatorial optimisation problems.
I developed many algorithms and deployed my models to production through AWS using software engineering best practices, such as unit testing, lower environment, shadow system, CI/CD pipelines, and much more.
Fair to say I learned a lot.
I worked very closely with software engineers, so I picked up a lot of engineering knowledge and continued self-studying machine learning and statistics on the side.
I even earned a promotion from junior to mid-level in that time!
Transitioning To MLE
Over time, I realised the actual value of data science is using it to make live decisions. There is a good quote by Pau Labarta Bajo
ML models inside Jupyter notebooks have a business value of $0
There is no point in building a really complex and sophisticated model if it will not produce results. Seeking out that extra 0.1% accuracy by staking multiple models is often not worth it.
You are better off building something simple that you can deploy, and that will bring real financial benefit to the company.
With this in mind, I started thinking about the future of data science. In my head, there are two avenues:
- Analytics -> You work primarily to gain insight into what the business should be doing and what it should be looking into to boost its performance.
- Engineering -> You ship solutions (models, decision algorithms, etc.) that bring business value.
I feel the data scientist who analyses and builds PoC models will become extinct in the next few years because, as we said above, they don’t provide tangible value to a business.
That’s not to say they are entirely useless; you have to think of it from the business perspective of their return on investment. Ideally, the value you bring in should be more than your salary.
You want to say that you did “X that produced Y”, which the above two avenues allow you to do.
The engineering side was the most interesting and enjoyable for me. I genuinely enjoy coding and building stuff that benefits people, and that they can use, so naturally, that’s where I gravitated towards.
To move to the ML engineering side, I asked my line manager if I could deploy the algorithms and ML models I was building myself. I would get help from software engineers, but I would write all the production code, do my own system design, and set up the deployment process independently.
And that’s exactly what I did.
I basically became a Machine Learning Engineer. I was developing my algorithms and then shipping them to production.
I also took NeetCode’s data structures and algorithms course to improve my fundamentals of computer science and started blogging about software engineering concepts.
Coincidentally, my current employer contacted me around this time and asked if I wanted to apply for a machine learning engineer role that specialises in general ML and optimisation at their company!
Call it luck, but clearly, the universe was telling me something. After several interview rounds, I was offered the role, and I am now a fully fledged machine learning engineer!
Fortunately, a role kind of “fell to me,” but I created my own luck through up-skilling and documenting my learning. That is why I always tell people to show their work — you don’t know what may come from it.
My Advice
I want to share the main bits of advice that helped me transition from a machine learning engineer to a data scientist.
- Experience — A machine learning engineer is not an entry-level position in my opinion. You need to be well-versed in data science, machine learning, software engineering, etc. You don’t need to be an expert in all of them, but have good fundamentals across the board. That’s why I recommend having a couple of years of experience as either a software engineer or data scientist and self-study other areas.
- Production Code — If you are from data science, you must learn to write good, well-tested production code. You must know things like typing, linting, unit tests, formatting, mocking and CI/CD. It’s not too difficult, but it just requires some practice. I recommend asking your current company to work with software engineers to gain this knowledge, it worked for me!
- Cloud Systems — Most companies nowadays deploy many of their architecture and systems on the cloud, and machine learning models are no exception. So, it’s best to get practice with these tools and understand how they enable models to go live. I learned most of this on the job, to be honest, but there are courses you can take.
- Command Line — I am sure most of you know this already, but every tech professional should be proficient in the command line. You will use it extensively when deploying and writing production code. I have a basic guide you can checkout here.
- Data Structures & Algorithms — Understanding the fundamental algorithms in computer science are very useful for MLE roles. Mainly because you will likely be asked about it in interviews. It’s not too hard to learn compared to machine learning; it just takes time. Any course will do the trick.
- Git & GitHub — Again, most tech professionals should know Git, but as an MLE, it is essential. How to squash commits, do code reviews, and write outstanding pull requests are musts.
- Specialise — Many MLE roles I saw required you to have some specialisation in a particular area. I specialise in time series forecasting, optimisation, and general ML based on my previous experience. This helps you stand out in the market, and most companies are looking for specialists nowadays.
The main theme here is that I basically up-skilled my software engineering abilities. This makes sense as I already had all the math, stats, and machine learning knowledge from being a data scientist.
If I were a software engineer, the transition would likely be the reverse. This is why securing a machine learning engineer role can be quite challenging, as it requires proficiency across a wide range of skills.
Summary & Further Thoughts
I have a free newsletter, Dishing the Data, where I share weekly tips and advice as a practising data scientist. Plus, when you subscribe, you will get my FREE data science resume and short PDF version of my AI roadmap!