Intro
Machine Learning is booming.
More and more organizations are utilizing at least some part of the ML spectrum to improve their business.
That’s why many developers are considering exploring Machine Learning, at least to some extent.
I’m not necessarily talking about switching careers.
But in an environment where the ML ecosystem progresses so rapidly, it’s normal for the curious developer to become attracted.
The problem is that, in contrast to learning a new mini framework or some MV* pattern, stepping into Machine Learning is more than a one-off weekend exercise.
Still, with dedicated efforts and the right mindset, you can lay a very valuable foundation. This, in turn, will help you decide if/how you’d like to develop your ML skills down the road.
Overcome the Self Doubt (Imposter Syndrome)
With Machine Learning, it’s pretty common to fall victim to the Imposter Syndrome.
It’s unfortunate that many developers just quit their ML initiative in a week or two. It’s just too easy to get overwhelmed with some heavy math and get into self-doubt paralysis.
For some reason, we, as professionals, forget what we’re mostly valued for – solving real-world problems. I believe this is one of the best driving forces when acquiring a new skill.
Machine Learning is not an exception.
We have to approach learning ML with the mindset of a Problem Solver.
This is what’s working great for me, and I hope to give you a clear path to follow by the end of this post.
In This Series
In this intro article, I’ll describe why I started exploring ML, the initial learning path I took, and some (hopefully) valuable takeaways from my journey so far.
In the upcoming posts, I will give a detailed overview of the online Courses/Specializations/Nanodegrees I took at Coursera and Udacity.
I’m confident that sharing my experience will be helpful for a lot of professional developers thinking about whether or not “ML is right for them.”
Let’s get started!
My Background and Motivation
I want to make it clear that I am not claiming to be a Machine Learning expert.
At the time of this writing, I am still heavily learning ML, experimenting with my own models and side projects.
I am a reasonably experienced (10+ years) back-end developer, working on Distributed Systems in the Sports Analytics domain at Synergy Sports.
I’ve seen how ML can be utilized to extract information from a high volume of sports data. Being a sports fan, this has definitely attracted my attention.
That’s how, at the end of 2019, I decided to give Machine Learning a shot.
I knew that would be a long-term exercise, so the only commitment I made to myself was to learn as much as I could throughout the whole of 2020.
With a full-time working schedule, I’ve been spending 5-7 hours weekly.
I know this doesn’t sound like a lot of dedicated time for such a learning activity.
Still, as a Mini Habits proponent, I’ve learned that consistency is what makes the difference and brings results.
How Did I Get Started?
By the time I decided to get into ML, I had already taken 30+ courses at Coursera.
I already knew the learning experience in the platform works pretty well for me.
Also, I’ve noticed that some of the most trending courses are ML-related.
So I did my research, selected an intro course to start with, and jumped right in.
One year later, I’ve completed the following Courses/Specializations/Nanodegrees.
- [Coursera] Machine Learning, Stanford, Andrew Ng
- [Coursera] Deep Learning Specialization, DeepLearning.AI, Andrew Ng
- [Coursera] Natural Language Processing Specialization, DeepLearning.AI
- [Udacity] Intro to Machine Learning with PyTorch
The last one is a Udacity “Nanodegree” program. I will do a full review of it in a later article.
These great learning sources helped me get comfortable with ML.
Suddenly, I was able to freely read (and understand) many of the papers/tutorials and to experiment with my own models and datasets.
The Wrong Mindset
There are certain misconceptions around Machine Learning that are just blocking many developers from making any meaningful progress.
If you have thoughts similar to the ones below, you are not alone:
“I can’t just start from scratch with something so different. I can’t be in a “junior” position again building a career from scratch. I also need to keep my current income!”
“I can’t leave what I’m good at and compete with those Ph.D. guys in their field.”
“I don’t have time for getting some formal education (university/college).”
“I’d better spend my learning time on a new language/framework/architectural design.”
Again, this is mostly the Imposter Syndrome talking here.
All of those doubts are entirely irrelevant. The sooner you get rid of them, the better. All they do is jeopardize your progress without a real objective reason.
If you’re an experienced developer, you’ve learned tons of challenging stuff already. The process for getting into Machine Learning is very similar.
You already possess the right mindset. You just have to apply it yet again in your ML journey.
Again, consistency is king here, and you already know that!
The Perspective of a Problem Solver
The most remarkable thing about our profession is that we build cool stuff out of thin air.
If you come up with an exciting idea for a project, something you care about and enjoy, you can start working on it right away.
I can assure you that in your intro Machine Learning endeavors, the process will be much more important than the final result.
The thing is, the first time you try to solve some ML problem that isn’t straight out of a textbook, you pretty much don’t know what you’re doing – you still don’t have the required practice and intuition to build a good model for your use case.
For example, in my initial attempts to do something meaningful with ML, the challenges I wanted to solve weren’t even ML problems. A much better solution would have been a “standard” deterministic rule-based programming logic rather than building a predictive model.
However, during those early steps, I had to re-visit a lot of the learning materials, read a few papers, and post a bunch of (inadequate) questions on Reddit.
Step by step, I was getting more and more comfortable with a broader range of concepts.
What’s more important – I got great insights into some real and exciting ML problems that I could actually start solving.
Examples From Sports Analytics
Being part of Synergy Sports, I’ve recognized that it’s not only the sheer volume of data that’s important. Equally vital, if not more so, is how much meaning you can extract from it.
Let me give you just a few simple examples.
Information Extraction From Sports Commentaries
Have a look at the following paragraph from a football game commentary:
“Goal! Chelsea 1, Swansea City 0. Oscar (Chelsea) from a free kick with a right footed shot to the bottom right corner.”
This small piece of text brings a lot of information for a human reader.
You immediately understand that Chelsea took the lead with a free-kick goal by Oscar.
The question is – how can a machine interpret this, so it extracts some structured piece of information?
After all, to the machine, this is just some pile of characters.
For example, can we somehow obtain a structured JSON like the one below?
{ "scorerName": "Oscar", "teamName": "Chelsea", "scoringType": "FreeKick" }
You can definitely build the ultimate regex nightmare as an attempt to solve this problem.
There are better alternatives, though.
In Machine Learning, or Natural Language Processing to be precise, this is an example of a Named Entity Recognition (NER) type of problem – reviewed in the NLP Course of the Deep Learning Specialization.
With a big-enough annotated dataset, you can teach an ML model to extract the valuable information for you. (*)
(*) I don’t want to sound ignorant and present this as a silver bullet. The performance of the model depends on a lot of factors. For example, if the text for every event follows a common structure, it will be a lot easier to get good results than dealing with a completely free-text format.
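To make the NER framing a bit more concrete, here is a minimal sketch of what annotated data and the extracted entities might look like. It assumes the BIO tagging scheme (a common labeling convention) and made-up label names (TEAM, PLAYER, SCORING_TYPE). No model is trained here: the tags are hand-written and stand in for what a trained model would predict.

```python
from typing import List, Tuple

# Hand-labeled tokens from the commentary above, in the BIO scheme:
# B- starts an entity, I- continues it, O means "not part of an entity".
# The label names are hypothetical, chosen for this illustration only.
labeled: List[Tuple[str, str]] = [
    ("Goal", "O"), ("!", "O"),
    ("Chelsea", "B-TEAM"), ("1", "O"), (",", "O"),
    ("Swansea", "B-TEAM"), ("City", "I-TEAM"), ("0", "O"), (".", "O"),
    ("Oscar", "B-PLAYER"), ("(", "O"), ("Chelsea", "B-TEAM"), (")", "O"),
    ("from", "O"), ("a", "O"),
    ("free", "B-SCORING_TYPE"), ("kick", "I-SCORING_TYPE"),
]


def decode_bio(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) entities."""
    entities: List[Tuple[str, str]] = []
    label = None
    words: List[str] = []
    for token, tag in pairs:
        if tag.startswith("B-"):
            # A new entity starts; flush the one in progress, if any.
            if label is not None:
                entities.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the current entity.
            words.append(token)
        else:
            # "O" (or an inconsistent I- tag) ends the current entity.
            if label is not None:
                entities.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        entities.append((label, " ".join(words)))
    return entities


entities = decode_bio(labeled)
for entity_label, text in entities:
    print(entity_label, "->", text)
```

Turning these entities into the JSON above (for example, deciding which TEAM mention is the scoring team) would still need some extra logic on top. The sketch only covers the entity extraction step.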
Similar Players
Let’s say you have a massive dataset of football players with some searching capabilities for the end user.
Then you do a search for a specific player – for example, Robert Lewandowski.
What you want next is to compare him to a bunch of other players with similar characteristics.
It’s very hard to do that with custom programming logic.
The reason is that every player is characterized by a whole bunch of statistics – like goals, assists, passes, penalties scored, shots, crosses, tackles, and many more.
So, the similarity between players is something that’s almost impossible to define with a strict set of rules.
A Machine Learning algorithm in the similarity/clustering spectrum can be of great help in this scenario.
Such a problem is also related to the Recommendation Systems problem space, like Netflix suggesting similar movies to the ones you’ve just watched.
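As a rough illustration of the idea, here is a minimal sketch of one possible approach: a nearest-neighbour search over standardized statistics. The player names and numbers are invented for the example, and the three features are an arbitrary choice. A real system would use many more statistics and most likely a library implementation instead of hand-rolled code.

```python
import math
from typing import Dict, List

# Invented per-game numbers, for illustration only: [goals, assists, tackles]
players: Dict[str, List[float]] = {
    "Striker A": [0.9, 0.2, 0.5],
    "Striker B": [0.8, 0.3, 0.6],
    "Defender C": [0.1, 0.1, 3.5],
    "Defender D": [0.05, 0.15, 3.8],
}


def standardize(rows: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Rescale every statistic to zero mean and unit variance,
    so that no single statistic dominates the distance."""
    columns = list(zip(*rows.values()))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return {
        name: [(x - m) / s for x, m, s in zip(values, means, stds)]
        for name, values in rows.items()
    }


def most_similar(name: str, rows: Dict[str, List[float]], k: int = 1) -> List[str]:
    """Return the k players closest to `name` by Euclidean distance."""
    scaled = standardize(rows)
    target = scaled[name]
    distances = [
        (math.dist(target, vector), other)
        for other, vector in scaled.items()
        if other != name
    ]
    return [other for _, other in sorted(distances)[:k]]


print(most_similar("Striker A", players))
print(most_similar("Defender C", players))
```

Notice that there are no hand-written rules about what makes two players similar. The notion of similarity comes entirely from the distance between their statistics.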
Find Your Own Challenge
Once you build up some knowledge and experience, I’m sure you’ll have tons of ideas of how and where to apply them.
The most important thing, though, is just to get started.
Let’s discuss that next.
The Cold Start and How to Get Going
I guess the “problem-solving” mindset makes some sense for you, but if you’re an absolute beginner in ML, it doesn’t help you at all.
The reason is that in the beginning, you don’t have any intuition about what kind of problems you can solve with Machine Learning.
I mean, for sure, you’ve heard about the AI system beating the best chess grandmasters or the Dota 2 champions, but how are you supposed to relate that to your problem?
The truth is trivial – in the beginning, it’s quantity before quality.
You just need to concentrate fully on the learning part for a while. There’s no getting around it.
As I mentioned already, what worked for me as a start was taking a couple of Coursera Courses/Specializations.
In the following articles, I will give a very detailed overview of both of those sources, but for now, let me just briefly describe what they offer and how they helped with my learning journey.
[Coursera] Machine Learning, Stanford, Andrew Ng
This course will gently introduce you to a lot of foundational Machine Learning concepts.
Don’t get me wrong – some of the material will still be quite challenging, not only from a beginner’s perspective.
Andrew Ng has the talent to increase the complexity gradually after you’re already armed with the required fundamentals.
You will hear some opinions that the course is outdated and doesn’t teach a lot of practical stuff.
I guess there’s some truth in that.
For example, when you start building your own ML models, you will most likely not use Octave as a programming language.
Still, I think this course is about the fundamentals, which makes it so valuable.
You will work your way through many core concepts – starting simply by multiplying matrices, writing your own cost functions, exploring algorithms like Gradient Descent and Backpropagation.
Also, I hear that Andrew Ng might be updating the course soon, which he’s already doing for the Deep Learning Specialization. However, this is still not the case at the time of this writing (April 2021).
Here’s a brief list of topics covered in the course:
- Linear Regression
- Logistic Regression
- Regularization
- Neural Networks
- Machine Learning System Design
- Support Vector Machines
- Unsupervised Learning
- Dimensionality Reduction
- Anomaly Detection
- Recommender Systems
- Large Scale Machine Learning
- Optical Character Recognition
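To give a feel for the kind of fundamentals mentioned above (writing your own cost function and running Gradient Descent), here is a minimal sketch for linear regression with a single feature. It is written in plain Python rather than Octave and is not taken from the course materials. The toy data, learning rate, and number of steps are my own assumptions.

```python
from typing import List, Tuple


def cost(xs: List[float], ys: List[float], w: float, b: float) -> float:
    """Halved mean squared error of the line y = w * x + b."""
    m = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(
    xs: List[float], ys: List[float], lr: float = 0.05, steps: int = 2000
) -> Tuple[float, float]:
    """Fit w and b by repeatedly stepping against the cost gradient."""
    w, b = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost with respect to w and b.
        grad_w = sum(e * x for e, x in zip(errors, xs)) / m
        grad_b = sum(errors) / m
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so we know what to expect.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = gradient_descent(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}, cost = {cost(xs, ys, w, b):.6f}")
```

With these settings the fitted values should end up very close to w = 2 and b = 1. Try a much larger learning rate to see how easily the updates can diverge.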
[Coursera] Deep Learning Specialization, DeepLearning.AI, Andrew Ng
After you’ve built a lot of the necessary foundations from the previous course, this one will get you fully into the space of Deep Learning and Artificial Neural Networks.
One of the best parts is that you’ll get familiar with a huge spectrum of practical ML problems and their solutions. For me, these serve as a great reference point when I now get into a specific type of problem.
You will learn how to use various Neural Network architectures in a wide range of domains like Computer Vision and Natural Language Processing.
A few topics from the Computer Vision material:
- Object Localization
- Object Detection
- Landmark Detection
- Art Generation with Neural Style Transfer
- Face Recognition
And a few from Sequence Models and NLP:
- Speech Recognition
- Text Generation
- Jazz Music Improvisation
- Language Translation
- Trigger Word Detection
- Chatbots
The great news is that this Specialization was fully updated (April 2021) with all the samples and assignments using TensorFlow 2.
Also, the content was modernized to showcase some of the recent breakthroughs in the field.
This shows the authors’ dedication to maintaining this Specialization as one of the best online learning sources for Deep Learning.
Isn’t The Math Way Too Scary?
It can be – if the first step you take is an advanced Calculus course by Stanford.
By now, you should know this is entirely against my advice.
Now, I don’t want to neglect the importance of the math itself.
But trust me – it’s a matter of balance.
In the first stages of your learning process, you’ll definitely need to brush up on the basics in the following areas – Linear Algebra, Calculus, Statistics, Probability Theory.
But this will come naturally as you get introduced to a specific concept. You will learn the math in a lot more practical fashion than you’re probably used to.
Let The Math Sink In As You Go
One of the reasons Andrew Ng’s courses are so successful is that he presents the required math with the right level of abstraction for the audience and the topic itself.
For every new concept he presents, he would first make sure you build an overall intuition of the problem and the solution.
In many cases, this intuition is a huge step towards making the underlying math a lot more understandable.
From my experience, one of the best supplementary sources to dive deeper into a specific math area is Khan Academy.
You Are More Prepared Than You Think
You are a Software Developer. You’ve worked on tons of projects. You’ve been part of building successful and profitable systems following all sorts of best practices.
All of this doesn’t just get lost when you step into ML.
The good news is, a lot of what you’ve learned from your hard work throughout the years is still applicable.
It’s Still Software Development
Right, for ML, you will need to learn a lot of theory, but in the end, when you sit down at your laptop, you need to write some actual code!
All the coding/design principles you’ve learned are still valid – refactorings, design patterns, cohesion, modularity, (choose your fancy words for high-quality software modeling), etc…
As you’ve done plenty of times, you’ll need to learn a new library (like TensorFlow or PyTorch): its general structure, interfaces, response types, and error handling.
You’ve been there a lot of times, haven’t you?
From Experiment to Production
From my observations, building and fine-tuning the ML model itself is probably not more than 20-30% of the whole development cycle.
Usually, you start with an experiment. You collect some data, build different models, fine-tune the hyperparameters.
You might achieve some quite good performance metrics, but there’s a problem – at this point, everything is just a big pile of Python code in a Jupyter Notebook.
Then what?
What Comes After the Jupyter Notebook?
You already know the huge difference between experimenting/prototyping and integrating your solution as a cohesive part of the production system.
You need to adhere to all the well-known architectural principles – Reliability, Availability, Redundancy, Scalability, Fault Tolerance, Elasticity, Observability, etc…
The process of converting the ML models to a production-ready part of the system is referred to as MLOps.
From what I’ve seen, it’s pretty common for experienced developers moving into ML to start by dealing primarily with the Operations side of things like building the infrastructure and the processing pipelines.
This doesn’t mean they don’t do any actual modeling.
Depending on the organization, there might be some pure researching roles – people spending 100% of their time in Data Science.
In many places, though, the boundaries between the Researcher and MLOps roles are quite blurry.
This means you’ll bring all of your valuable skills as a senior developer to the table, but in parallel, you’ll be able to dig deeper into Machine Learning and expand your skillset.
Summary
In this article, I presented my thoughts about stepping into the Machine Learning field from the perspective of an experienced software developer.
Many folks out there have all sorts of doubts. I’ve experienced them myself.
I hope that this post brought a fresh perspective that helps you start your exciting learning journey!
Moving on, I’ll be giving some in-depth reviews of the best online ML courses I’ve taken.
I hope this was useful!
Stay tuned, and thanks for reading!