Machine Learning
Diving Deeper into Machine Learning: Exploring Updates, Analysis, and Practical Insights on Our Blog
Want to Get People Excited About Your Machine Learning Project? Tell Them a Story
In 2007, two roommates struggling to make rent in San Francisco had a bright idea. A major design conference was coming up and hotel prices were soaring. What if, the roommates thought, they rented out air mattresses on the floor of their apartment to cash-strapped attendees — a kind of designers’ (air)bed & breakfast? It […]
Data Bias and What it Means for Your Machine Learning Models
We’d all like to imagine that the machines, systems, and algorithms we create are objective and neutral, devoid of prejudice, and free from pesky human weaknesses like bias and the tendency to misinterpret a situation. However, this simply isn’t the case. Sure, an automated machine learning tool or data science platform won’t privilege certain types of […]
Stay Inside The Lines: Coloring With Artificial Intelligence
After a few months with no side projects on my plate, I was eager to create something new. I’ve made it a routine to try and create AI that competes with my nephew in games he’s playing (just like in my previous posts). So, I asked my nephew what game he’s playing at the moment […]
Clustering — When You Should Use it and Avoid It
No matter what type of research you’re doing, or what your machine learning (ML) algorithms are tasked with, somewhere along the line, you’ll be using clustering techniques quite liberally. Clustering and data preparation go hand in hand, as many times you’ll be working, at least initially, with datasets that are largely unstructured and unclassified. More […]
Cracking The Stability of Feature Selection
One of the best ways to improve your production machine learning models is to improve the data that your model is trained on. Simple, right? Not when you are trying to do it at scale while haunted by the curse of dimensionality: the various anomalies that appear in high-dimensional data but would not naturally appear in low-dimensional settings. […]
Extend Your Machine Learning Pipeline With Your Prediction Outcome
Close your eyes and imagine a machine learning production pipeline (come on, you can do it). What do you see? Oh yes, data sources: a data warehouse, a streaming queue, you know them. Now continue. What else do you see? Maybe an ETL service that is regularly cleaning, munging, and merging this conglomerate of bits […]
2020: The Year Data Dominates Data Science
Data has been a hot topic of discussion for years now. As organizations create and collect more and more data, it’s key that they use it to make actionable decisions and open up new lines of business. However, most organizations are only scratching the surface of what their data can do — typically using it […]
The Complete Guide to Decision Tree Analysis
In the world of machine learning, developers can create independent environments for projects easily. It only takes a few clicks to set and fit models in order to achieve solid results. Yet, many algorithms can be quite difficult to understand, let alone explain. Decision trees, while performing poorly in their basic form, are easy to […]
Challenges in Maintaining Real-Time ML Models Using Multiple Data Sources
The field of machine learning is currently experiencing a rapid and booming expansion that seems to have no end. It seems like every other day, the ML community publishes a new and exciting paper that shows the latest state-of-the-art algorithm or software for a use case that two weeks ago had a not-so-state-of-the-art solution. These […]
Using Data Science to Predict the Next Hit Song (Part 2)
In part one of this two-part series, we explored basic models and data enrichments for our hit song classifier. In this article, we will try to push our model a little more by attempting to improve its performance through better data exploration, enrichment and feature engineering. Before we get started, let’s recall the context. The […]
Using Data Science to Predict the Next Hit Song (Part 1)
It comes as no surprise that the music industry is tough. When you decide to produce an artist or invest in a marketing campaign for a song, there are many factors to take into account. What if data science could help with this task? What if data science could predict whether a song is going […]
Top 10 Evaluation Metrics for Classification Models
It’s important to understand that none of the following evaluation metrics for classification is an absolute measure of your machine learning model’s accuracy. However, when measured in tandem with sufficient frequency, they can help monitor and assess the situation for appropriate fine-tuning and optimization. Here are a few values that will reappear all along this […]
Demystifying Feature Selection: Filter vs Wrapper Methods
Feature selection algorithms are growing in significance. In this article, we will cover (and compare) two popular feature selection methodologies: Filter and Wrapper.
Interpretability and explainability (Part 2)
The whole idea behind interpretable and explainable ML is to avoid the black box effect.
Interpretability and explainability (Part 1)
The whole idea behind interpretable and explainable ML is to avoid the black box effect.
The ultimate machine learning model deployment checklist
There is some room for error when integrating models into production environments, but there is also a good chance that these errors will eventually lead to disaster. That is exactly why we created this model pre-deployment checklist.
Who’s the painter?
Better features, better data
The spectrum of complexity
Demystifying the old battle between transparent, explainable models and more accurate, complex models.