In this article, we will attempt to de-mystify all the jargon that is associated with artificial intelligence and data science. But first, we’ll need to understand the very basics of how an AI algorithm works to be able to better gauge what the content given below means.
Here are the most general steps of how we train an AI model (this will be true for almost all AI models):
- You have some instances of paired data (x, y).
- You, as a human, know that y is predictable from the information given in x.
- Your model M is a function that takes x as input and outputs (predicts) a value y'.
- You compare the expected output y and the actual output y', and have a way to measure the error.
- The error gives the model feedback on how wrong it is, based on which it corrects itself according to some rules that are (almost always) based on calculus.
- Repeating the above with different (x, y) pairs is called training the model.
- You keep repeating this until the model M is accurate enough on (x, y) pairs that it hasn’t yet seen during training.
In pseudo code, this is how you train an ML model (almost always):
```
M = new Model()
for all (x, y) in training_dataset:
    prediction = M(x)
    err = error(y, prediction)
    M.self_correct_according_to(err)
    if M.accuracy_on(unseen_x_y_pairs) > 0.95:
        print("We're done!")
        exit
```
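To make this concrete, here is a minimal runnable sketch of the same loop in Python. It assumes a toy one-parameter model (y = w * x) trained with gradient descent on squared error; the data, learning rate, epoch count, and stopping criterion are all illustrative choices, not a standard recipe:

```python
# A runnable (toy) version of the pseudocode above: a one-parameter
# model y = w * x trained with gradient descent on squared error.

training_dataset = [(1, 2), (2, 4), (3, 6), (4, 8)]  # pairs where y = 2 * x
unseen_pairs = [(5, 10), (6, 12)]                    # held out for evaluation

w = 0.0    # the model M is just "multiply by w"
lr = 0.01  # learning rate

for epoch in range(200):
    for x, y in training_dataset:
        prediction = w * x    # M(x)
        err = prediction - y  # signed error
        # self-correct using the gradient of (err^2)/2 with respect to w
        w -= lr * err * x

# "Accuracy" here: average absolute error on pairs the model never saw
test_error = sum(abs(w * x - y) for x, y in unseen_pairs) / len(unseen_pairs)
print(f"learned w = {w:.3f}, test error = {test_error:.4f}")
```

The loop structure is identical to the pseudocode: predict, measure error, self-correct, repeat over (x, y) pairs, and judge the result on data the model never trained on.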
So, to do any sort of data science related work, you need:
- Data: Images, audio, Excel sheets, EEG readings, signals, MRI scans, or pretty much any information that can be represented digitally as numbers.
- A Model: In code, this will be a (generally differentiable) function that takes an input, does some calculations, and returns an output.
- A way to measure how bad the model is, so that it can correct and/or train itself.
- A way to measure how good the model is, so that it knows when to stop correcting/training itself.
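As a sketch of those last two ingredients, here are hypothetical error and accuracy functions in Python. The names, the squared-error choice, and the tolerance threshold are illustrative, not a standard API:

```python
# Hypothetical sketches of the two measurements described above.

def error(y_true, y_pred):
    """How bad one prediction is: squared error gives the model a
    smooth feedback signal it can self-correct against."""
    return (y_true - y_pred) ** 2

def accuracy(model, unseen_pairs, tolerance=0.5):
    """How good the model is overall: the fraction of unseen pairs it
    predicts within some tolerance, used to decide when to stop training."""
    hits = sum(1 for x, y in unseen_pairs if abs(model(x) - y) <= tolerance)
    return hits / len(unseen_pairs)

# Example with a toy "model" that doubles its input
double = lambda x: 2 * x
print(error(10, double(4)))                       # error of one prediction
print(accuracy(double, [(1, 2), (2, 4), (3, 7)])) # fraction predicted well
```

Note the asymmetry: the "how bad" measure is computed per prediction during training, while the "how good" measure is computed over a whole set of unseen pairs to decide when to stop.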
With that out of the way, we can actually move on to understanding the differences between the different areas in AI. Each of these terms has two or more kinds of meaning:
- One is the actual, technical meaning.
- The other is what people actually mean when they say these words in everyday language. This is similar to how Xerox is a company, but people use the term interchangeably with photocopying.
If you haven’t noticed already, I’ve been very liberal with these terms throughout this article up until now. Since the meanings of all these terms are very fluid, I’ll try to be as general as possible while explaining them.
Let’s start with some definitions from Wikipedia (please remember the acronyms where applicable; it’ll make life easier):
- Artificial intelligence (AI) : sometimes called machine intelligence, is intelligence demonstrated by machines, unlike the natural intelligence displayed by humans and animals.
- Machine Learning (ML): the study of computer algorithms that improve automatically through experience
- Deep Learning (DL): (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning
- Reinforcement Learning (RL): is an area of machine learning concerned with how software agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.
- Data Science (DS): is an inter-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. Data science is related to data mining, machine learning and big data.
- Data Analytics: is the systematic computational analysis of data or statistics. It is used for the discovery, interpretation, and communication of meaningful patterns in data
- Computer vision (CV): is an interdisciplinary scientific field that deals with how computers can gain high-level understanding from digital images or videos. From the perspective of engineering, it seeks to understand and automate tasks that the human visual system can do.
- Natural Language Processing (NLP): is a subfield of linguistics, computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.
I’ll try and give some popular and commonplace examples for you to get an idea of all the places where these fields directly affect your life (if you’re not new to this, you’ll find these examples to be very well known):
- Instagram’s Explore section is curated using something called a recommender system that is developed with DL algorithms
- Google Photos recognizes your friends’ faces using CV algorithms, which are part of the DL family of algorithms.
- Google Now understands what you mean when you give it voice commands using NLP and NLU (Natural Language Understanding)
- The AlphaGo algorithm that caught the news after beating the world’s best Go players was created by DeepMind using RL algorithms.
- Banks do credit scoring using ML algorithms.
- Regular decision making in big (and some small) companies is done using data analytics by analyzing huge volumes of data to gather useful insights.
- AI and Data Science are loose terms that people regularly use to talk about all of the above.
This figure is (mostly) a technically accurate overview of where everything overlaps and what is contained in what else, and where these terms differ:
(forgive my poor illustration skills and the use of Comic Sans)
Notice that AI has a portion that extends outward from everything else. This is because there are classic AI algorithms that don’t fall into any of these categories.

Further, notice how I’ve included “Data Analytics” in almost every box. The simple reason is that all these techniques let you analyze data in some way or another. You’ll see how this becomes a problem when companies put “Data Analyst” in their job descriptions and an applicant eventually realizes that what they wanted was someone who can write SQL queries, not necessarily someone who can analyze huge amounts of glorious conversational data from Reddit using NLP. The opposite is also true.

Another thing to notice is how NLP and CV extend into different areas and even outside of everything else. This is because even applying a simple brightness filter constitutes a CV technique, and the algorithm for that doesn’t fall into any of these categories. The same logic applies to NLP.

Finally, the Data Science bubble seems to encompass pretty much everything because, well, all of these areas are just different variants of the science of data.
Here’s the problem: this is what most people actually (incorrectly) mean when they talk about these areas:
This is the categorization used in industry, and sometimes even in academia. Specifically, Machine Learning, Reinforcement Learning, and Deep Learning are mostly talked about as three very separate areas of study. This is why I said earlier that all these terms are very fluid and depend a lot on the context and organizations in which they are used. It’s best to infer what is meant in the moment!
But then, how do I choose what to do?
Here’s some advice that will work (read: has worked) for a majority of people:
If you already know what domain you want to apply your skills to, this part becomes much easier than otherwise. So, first I’ll very briefly detail what career options you could have based on what you learn. If you have something in mind already, then this should help a lot.
There are two kinds of data, structured and unstructured:
- Structured Data: these are mostly CSV/Excel files with lots of rows and columns. Here, each data entry has a very specific meaning to us humans.
- Unstructured Data: Images, audio, etc. This kind of data can be very chaotic and doesn’t follow rules during creation. (Can you lay out the formal rules for an image to contain a cat? What does a red pixel mean by itself?)
You can apply both ML and DL algorithms to both structured and unstructured data. But, in practice, you’ll use more ML algorithms for structured data and more DL algorithms for unstructured data. (Here, ML means everything outside the DL and RL bubbles but inside the ML bubble.)
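To make the distinction concrete, here is a tiny illustrative sketch: a structured row where every field has a name and a human-readable meaning, versus a toy “image” that is just a grid of numbers. All values here are made up:

```python
# Hypothetical illustration of the two kinds of data.

# Structured: each field has a specific, human-readable meaning,
# like one row of a CSV/Excel sheet used for credit scoring.
structured_row = {"age": 34, "income": 52000, "has_defaulted": False}

# Unstructured: a tiny 2x2 grayscale "image" is just numbers; no single
# pixel means anything by itself -- meaning only emerges from patterns
# across many pixels.
unstructured_image = [
    [0, 255],
    [255, 0],
]

print(structured_row["income"])  # a column lookup has a clear meaning
print(unstructured_image[0][1])  # one pixel value means nothing on its own
```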
- If you know you want to go into finance / banking / investment or other related fields where spreadsheets will be your best friends, you’ll probably want to focus first on ML and data analysis.
- If you’re looking to work in robotics, do things like face detection, identify objects in an image, tag audio samples by their genre, make intelligent voice-enabled applications, etc., then you should be looking more into DL.
- If you’re fascinated by how virtual agents in games work and how to create intelligent agents in simulated environments, then RL is probably what you’ll be focusing on.
- You should NOT try to learn everything. One would think it’s obvious, but too often people try to learn all of these things at the same time.
Now, if you’re still confused like I was (and most people are; it’s okay!), then the best way to go about this is to take a course that gives you an introduction to ML and a taste of everything else. By the time you complete it, you’ll have a much better idea of where to go next. In any case, you’ll have to learn the basics of ML before getting into DL or RL anyway. However, do not fret: it’s not more than a few hours of work if done the right way.
All of this sounds very fancy. Is it that hard?
Short answer: NO
Long answer: Probably, Yes.
At a conceptual level, almost all of the time you’ll be able to distill the math down to matrix multiplications, basic calculus, and introductory probability theory. If you’re not planning to be a researcher in academia, you won’t need to get into the conceptually hard parts of AI. Most things you’ll need to know to work in industry (not as a researcher in industry) will generally have an associated blog on Medium explaining them in simple language. So no, at the conceptual level needed for industry, ML isn’t hard to understand.
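As a minimal sketch of what “distilling down to matrix multiplications” means: one layer of a neural network is essentially a matrix-vector product followed by a simple nonlinearity. The sizes and values below are illustrative:

```python
# A toy neural-network layer: a matrix-vector multiply plus a ReLU.
# The weights and input are made-up values for illustration.

def matmul_vec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

W = [[1.0, -1.0],
     [0.5,  0.5]]  # a 2x2 weight matrix (learned during training)
x = [1.0, 3.0]     # an input vector

hidden = matmul_vec(W, x)                  # the layer's raw output
activated = [max(0.0, h) for h in hidden]  # ReLU nonlinearity
print(hidden, activated)
```

The “basic calculus” part enters during training, when the gradient of the error with respect to each entry of W is used to update the weights, exactly as in the training loop earlier in the article.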
There is a huge, ever-growing (daily) breadth (not depth) of concepts that constitute Data Science. For most people (like me), this is actually an advantage, as it keeps things interesting. For some, though, it can quickly get taxing to stay updated all the time with the latest research papers and industry practices. In ML, it’s not just the frameworks and software, but also the very foundational concepts that keep getting revisited and heavily modified as researchers discover new phenomena. Then again, it highly depends on your career whether, and to what degree, you’ll need to stay updated. For the most part, you’ll just need to be current with how the popular ML/DL frameworks change over time and with any major new techniques that show up. Following a few YouTube channels and some quality weekly newsletters should be enough (this will be covered in a later article). Just remember that things aren’t necessarily getting conceptually harder every day, but the number of concepts keeps increasing very quickly.
If you’re aiming to get an AI research position at a big tech company (FAANG), or a PhD/postdoc at a reputed university abroad, then I’ll be honest: you’re probably in for a few years (2-8 years after your undergrad, depending on your exact aims) of grueling hard work, disappointment, glory, more disappointment, and more glory. And once you have the position or job you want, you may realize you never wanted it in the first place, having consumed your 20s entirely in another rat race. I do not want to discourage anyone, but this is a very important thing to know, and it has recently been an important point of discussion among prominent people in AI, including Andrew Ng, Chris Olah, Kaggle Grandmasters, and many others. I will probably talk about this in detail later in the series.
Just give me the links to the online courses already!!
In the next article, we’ll talk about how to actually learn all the things that I said you should. I’ll also share some of what I believe to be the best resources on the internet. We’ll talk about how to choose online courses, resources, blogs, and books, and what to generally stay away from (like Udemy). We’ll also talk about choosing the right frameworks and languages (PyTorch vs TensorFlow, Sklearn vs NVIDIA RAPIDS, Python vs C++) based on your specific needs. In the same article, or in a follow-up, I’ll talk about setting up the hardware you need for ML/DL work.