As you might guess, I’ve been playing around in the AI/ML space a lot this summer, hence a lot of posts themed around this topic. We’ve covered questions to ask yourself before delving into an AI solution and the different roles you might work with in an AI community, and in this post, I want to talk about common terms you’ll hear. The goal here is for you to be able to walk into an AI effort and at least understand the language spoken, so I’m intentionally going to go as plain language as I can with these definitions.
This list of terms certainly isn’t all-inclusive, but I wanted to pick out a handful you’re likely to hear in your work setting. We’ve got a lot of ground to cover, so let’s jump on in! (Btw… there’s no inherent order to these terms.)
Okay, while there genuinely isn’t an inherent order to the rest of our terms, I am starting off with the predictive model intentionally. A predictive model is a statistical approach for making very educated inferences based on past experiences. So, for example, if I wore a blue shirt to work 49 days in a row, what color of shirt would you expect me to wear on the 50th day? Not a trick question: you’d expect me to wear (drumroll, please) another blue shirt! You would naturally infer that based on the pattern of me wearing a blue shirt on every one of the previous days.
The keyword there is infer. While it would be a very good guess that I’d wear another blue shirt, I could still throw a total curveball and walk in on that 50th day wearing a red shirt. So people over time have figured out how to emulate in computers the same logic you used to guess “50th shirt = blue shirt”, using some advanced inferential statistical methods. You’d probably be surprised how far these methods have gotten us! But at the end of the day, that’s really all it is: fancy math done in a computer.
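To make the shirt example concrete, here is a minimal sketch in Python of about the simplest "predictive model" I can think of: it counts what it has seen and guesses the most common outcome. The function name `predict_next` and the shirt data are made up for illustration, and real models use far richer statistics than a frequency count.

```python
from collections import Counter

def predict_next(history):
    """Guess the next item from past frequencies, with a rough confidence."""
    counts = Counter(history)
    guess, times_seen = counts.most_common(1)[0]
    confidence = times_seen / len(history)  # share of past days matching the guess
    return guess, confidence

# 49 days of blue shirts (illustrative data)
shirts = ["blue"] * 49
print(predict_next(shirts))

# The curveball: a red shirt on day 50 becomes part of the history
print(predict_next(shirts + ["red"]))
```

Even when the confidence comes out at 100%, it is still only an inference from the past. The red shirt on day 50 was always possible, and once it shows up, the model’s confidence in blue drops a little.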
Keeping in mind that we use past information (data) to train a predictive model, it’s possible that the data you’re training your model on isn’t a good fit for inferring future events. For example, if you wanted to create a predictive model to predict summer weather patterns in Illinois, you probably wouldn’t get good results if you only trained your model on data from the month of August. We actually had a relatively cool June this year, so a model trained only on August data would probably run hot too often. The model is overfit: it matches one very narrow scenario and doesn’t generalize beyond it.
On the other hand, if you wanted to predict summer weather patterns in Illinois and used data from EVERY month, the data from winter months may skew your predictions unfavorably. In this case, your model is still doing a poor job, but this time for underfitting reasons instead of overfitting: it is too general to capture what makes summer different.
Neither case is ideal, and it’s very easy to run into either scenario with most kinds of predictive models. The key is to tune your model to strike a proper balance so that you’re getting the best inferences out of it.
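Here is a rough Python sketch of the Illinois example. The temperatures are made-up numbers, not real weather data, and the "model" is deliberately the dumbest one possible: it always predicts the average of whatever months it was trained on. The names `train` and `bias` are my own for this illustration. For simplicity it scores each model against the same summer months; a real evaluation would use data the model has never seen.

```python
from statistics import mean

# Made-up average daily highs (°F) per month, for illustration only.
monthly_highs = {
    "Jan": 32, "Feb": 36, "Mar": 48, "Apr": 61, "May": 72, "Jun": 78,
    "Jul": 84, "Aug": 86, "Sep": 77, "Oct": 64, "Nov": 49, "Dec": 36,
}
SUMMER = ["Jun", "Jul", "Aug"]

def train(months):
    """'Train' a toy model: always predict the average of the training months."""
    return mean(monthly_highs[m] for m in months)

def bias(prediction, months=SUMMER):
    """Average of (prediction - actual): positive runs hot, negative runs cold."""
    return mean(prediction - monthly_highs[m] for m in months)

august_only = train(["Aug"])          # too narrow
all_year = train(list(monthly_highs))  # too broad
summer_only = train(SUMMER)           # matched to the question being asked

for name, model in [("August only", august_only),
                    ("All year", all_year),
                    ("Summer only", summer_only)]:
    print(f"{name}: predicts {model:.1f}, bias {bias(model):+.1f}")
```

With these numbers, the August-only model runs hot, the all-year model runs far too cold, and the summer-only model lands in between with essentially no bias.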
Machine Learning (ML)
Perhaps one of the most difficult concepts to communicate to folks is the difference between artificial intelligence (AI) and machine learning (ML) because, yes, there’s a distinction. Basically, ML is one implementation of AI, and I’d wager the reason the two terms are used so interchangeably is that ML is the most popular implementation of AI these days. Other implementations include robotics, computer vision, and natural language processing (NLP). (My friend Raj has made a great video about this whole topic. Go check it out here.)
Now… you may hear about another implementation called deep learning. Essentially, deep learning is a subset of machine learning that is much more computationally intense. A neural network with many layers is the classic example of deep learning. The key question to ask is, is deep learning really needed? Because I’d argue most business needs do NOT require a deep learning solution and could be met just fine with a more basic ML solution.
Artificial General Intelligence (AGI)
Okay, of all the terms on this list, this is probably the only one that is impractical. It’s impractical because it’s a conceptual idea that has never actually manifested in practice. Lots of people like to reference certain AI solutions passing the Turing Test (a test of whether a machine’s conversation can pass for a human’s), but to date, no AI solution has ever reached a true Artificial General Intelligence (AGI) level.
So what is AGI? It’s this idea that we’ll one day be able to program a machine to have the exact same (or better) thinking/learning process as a human being. That point in history when machines will finally “surpass” human thinking is referred to as the singularity. I only call this definition out in this post to flag that it is largely fanciful and only good for Hollywood movies. I personally am very skeptical that we’ll ever see an AGI in our lifetime, if one is ever created at all, so if you ever hear anybody legitimately talking about an AGI in a business context, know that they’re full of crap. *smiles*
If you’ve been around teenagers in the last few years, you’ve likely heard some variation of the phrase, “That’s so meta.” They usually use that when referring to a situation where something has transcended the situation itself. Like putting a meme within a meme. Or me writing a post about me writing a post. That’s meta, fam. (Hey, I’m still in my 20s for another 38 days.)
Okay, those are stupid examples, but simply put, metadata is data about (you guessed it) data! Metadata can be super helpful for a variety of reasons. It can provide context around the data, and metadata lineage can even give you an “audit trail” of sorts for data. I recently completed a project totally free of metadata, and let me tell you… it was weird. So if you’re in the early stages of data collection, do yourself a favor and capture metadata early on. It’ll save you lots of time in the long haul!
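If it helps to see it, here is a small Python sketch of data carrying its own metadata and lineage. The `Dataset` class, its fields, and the weather-station source are all hypothetical names I invented for this example; real metadata tooling is far more involved.

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    values: list
    metadata: dict = field(default_factory=dict)  # context about the data
    lineage: list = field(default_factory=list)   # the "audit trail"

    def transform(self, func, note, **metadata_updates):
        """Return a new Dataset and record what was done to get there."""
        return Dataset(
            values=[func(v) for v in self.values],
            metadata={**self.metadata, **metadata_updates},
            lineage=self.lineage + [note],
        )

# Illustrative values and a made-up source
temps_f = Dataset([78, 84, 86],
                  metadata={"units": "F", "source": "hypothetical weather station"})
temps_c = temps_f.transform(lambda f: round((f - 32) * 5 / 9, 1),
                            "converted F to C", units="C")

print(temps_c.values)
print(temps_c.metadata)
print(temps_c.lineage)
```

Without the `units` entry, nobody downstream could tell whether 30 means a warm day or a freezing one, and without the lineage, nobody could tell how the numbers got that way.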
For as smart as computers can be, they are actually fairly dumb. Dumb in the sense that they have no intuition and are wholly reliant on brittle systems programmed by us humans. (Another reason I’m skeptical about AGI happening any time soon.)
When we read the words “August” and “august”, you and I can easily recognize that those are the same word. A computer sees that capital letter “A” versus the other lowercase “a” and determines those are wholly different words. Not very helpful when we’re wanting to do analytics, right?
So data quality is simply ensuring the “hygiene” of data is appropriate for each data field. Data quality can come in all sorts of flavors, including making sure a data field has a consistent data type (e.g., integer, single character), consistent spellings, and more. Ensuring high levels of data quality saves your data scientist friends a LOT of time!
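The “August” versus “august” problem is easy to show in a few lines of Python. This is only a sketch of two basic hygiene checks (consistent spelling and consistent data type); the helper names `clean_text` and `to_int` are my own, and real data quality work covers much more.

```python
from collections import Counter

raw_months = ["August", "august", " AUGUST ", "July"]

def clean_text(value):
    """Trim stray whitespace and use one consistent casing."""
    return value.strip().lower()

def to_int(value):
    """Coerce a value to an integer, or return None if it is not a clean integer."""
    try:
        return int(str(value).strip())
    except ValueError:
        return None

# Before cleaning, the computer counts four different "months"
print(Counter(raw_months))

# After cleaning, it counts two
print(Counter(clean_text(m) for m in raw_months))

# A field that should hold integers, with one bad entry
print([to_int(v) for v in ["42", " 7 ", "forty"]])
```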
The feature store is a relatively new term but one that is growing in popularity pretty rapidly. One of the most popular implementations of a feature store can be found within Uber’s Michelangelo platform. If you can nail this concept down, you’ll really be ahead of the game.
If you think back to our definition of data quality, you’ll recall that data scientists and other advanced analytic users really appreciate high quality data, as it saves them from having to clean up the data themselves. High quality data is great, but sometimes, the data needs to be further curated for specific business needs. This process of creating data from other collected data is called feature engineering. We say “features” because columns, attributes, and features are all terms for the same thing and get used interchangeably.
(Mini rant: I hate that we use the word “feature” in this context. It doesn’t make sense to me, and it gets extra confusing when you try to talk about scrum’s definition of features. So literally, you could feature a feature about feature stores in your feature showcase. Ridiculous. Okay… rant over!)
Chances are that if you engineer a feature once, you or one of your data science friends will want to use that again, hence the concept of a feature store. Although… “Store” is a little bit of a misnomer in my mind because it’s the logic of the transformation that’s being stored, not the transformed data itself, ya know? (Except in a few outlying scenarios… you may have to “physicalize” those features.)
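To show what I mean by storing the logic rather than the data, here is a toy Python sketch. It is not how Michelangelo or any real feature store works; `FEATURE_STORE`, the `feature` decorator, and the two example features are all invented for illustration.

```python
FEATURE_STORE = {}  # maps a shared feature name to the logic that computes it

def feature(name):
    """Decorator that registers a transformation under a reusable name."""
    def register(func):
        FEATURE_STORE[name] = func
        return func
    return register

@feature("temp_range_f")
def temp_range(row):
    # Engineered feature: daily temperature swing
    return row["high_f"] - row["low_f"]

@feature("is_summer")
def is_summer(row):
    # Engineered feature: flag for summer months
    return row["month"] in {"jun", "jul", "aug"}

def build_features(row, names):
    """Look up each requested feature by name and compute it for one row."""
    return {name: FEATURE_STORE[name](row) for name in names}

row = {"month": "aug", "high_f": 86, "low_f": 66}
print(build_features(row, ["temp_range_f", "is_summer"]))
# → {'temp_range_f': 20, 'is_summer': True}
```

The point is that anyone on the team can ask for `temp_range_f` by name and get the same calculation, instead of each person rewriting it slightly differently.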
Anyway, if you want to learn more about feature stores, check out this great post.
Transfer learning is another emerging concept in the AI world, and I’ll be honest, I only know of folks using this concept in a very limited context. Still, I think it’s going to be one that will continue to grow, so I feel it’s important that y’all are at least aware of what it is.
Transfer learning is basically taking a machine learning model built for one domain and leveraging it (with maybe some tweaks) in another domain. The most popular example of transfer learning is with image classifiers. So, for example, I completed a deep learning project for Udacity not too long ago that leveraged the University of Oxford’s VGG neural network to classify different types of flowers.
And aside from deep learning image classifiers leveraging already-created models like VGG, I don’t know any other great examples of transfer learning. But I have a hunch that it will catch on… I suppose my only question is, what are the legal ramifications of using somebody else’s trained network for transfer learning if you don’t know the means by which that original network was trained? (Sorry if that got jargony as heck.)
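If the jargon is the sticking point, the core move can be sketched in plain Python: keep the borrowed part of the model frozen and train only a small new piece on top. To be clear, this is a toy and not VGG or my Udacity project. The hard-coded `PRETRAINED_WEIGHTS` stand in for a network somebody else trained, and every name and number here is made up; a real project would load an actual pretrained network from a deep learning library.

```python
# Pretend these weights were learned on some earlier task (hypothetical values).
PRETRAINED_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]

def extract_features(x):
    """Frozen 'pretrained' layer: these weights never change below."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in PRETRAINED_WEIGHTS]

def predict(x, weights, bias):
    """Run the frozen layer, then the new trainable head on top of it."""
    feats = extract_features(x)
    score = sum(w * f for w, f in zip(weights, feats)) + bias
    return 1 if score > 0 else 0

def train_head(samples, labels, epochs=20, lr=0.1):
    """Train only the new final layer (a simple perceptron) for the new task."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(x, weights, bias)
            feats = extract_features(x)
            weights = [w + lr * error * f for w, f in zip(weights, feats)]
            bias += lr * error
    return weights, bias

# New task with made-up data: label is 1 when the first number is larger
samples = [(3, 1), (1, 3), (4, 2), (2, 5)]
labels = [1, 0, 1, 0]
head_weights, head_bias = train_head(samples, labels)

print(predict((5, 1), head_weights, head_bias))
print(predict((1, 6), head_weights, head_bias))
```

Only `head_weights` and `head_bias` are learned here. The borrowed layer does the heavy lifting untouched, which is the same basic shape as reusing VGG and training a new classifier on top for flowers.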
That’s it for this post, folks! I think I may have one or two more AI posts in me before we take a break for something else. And as always, I concurrently write more philosophically oriented posts on my Medium channel. Those posts are denoted with a darker-colored title card. Catch you in the next post!