Another year gone, and another round of retrospectives looking back at what was and ahead at what is coming. Data science had quite the 2020, but like most tech fields, it was also full of terms, buzzwords, and acronyms. Buzzwords come and go, and sometimes they take on a life of their own. Others get used so much they lose their meaning. Some are just ridiculous, but there you go. In the spirit of the new year, let’s take a look back at all the acronyms, buzzwords, and terms that dominated data science in 2020:
TBA (The Best Acronyms)
DQ (Data Quality) – One of our personal favorites at Explorium, DQ is all about quality over quantity. As 2020 unraveled, companies stopped simply amassing as much data as they could and realized that DQ matters more than having all the data. Less is more when it gives you better results.
CFDS (Customer-Facing Data Scientist) – We might be biased here (after all, we love Explorium’s CFDS team), but more and more, companies are looking for services that give them not just data science platforms but also the experts and data scientists who can help them unlock their platforms’ true potential. CFDSs are vital to helping organizations succeed in their ML transformation.
BERT/ELMo – Our data scientists all agreed that this was a great year for models named after fictional characters. BERT (Bidirectional Encoder Representations from Transformers) and ELMo (Embeddings from Language Models) were major breakthroughs in the NLP field, providing new ways to represent text and improving results on tasks such as question answering and natural language inference.
MARGE/BART – No, we’re not talking about The Simpsons. MARGE (Multilingual Autoencoder that Retrieves and Generates) is a new pre-trained model developed by Facebook that generates text (words, sentences, and paragraphs) by retrieving related existing texts and learning to reconstruct a target text from them. BART (Bidirectional and Auto-Regressive Transformers) is a denoising autoencoder designed for pre-training sequence-to-sequence models, and it has become a go-to for sequence-to-sequence tasks.
DL (Deep Learning) – DL seems to have become 2020’s “AI”: the acronym most commonly used by non-data scientists. This year saw its popularity soar as more companies rushed to embrace its powerful approach, even though, as our data scientists put it, “in many cases using DL is like using a five-ton hammer on a small nail.”
The buzzwords that shaped 2020
The good:
- MLOps – Machine learning took a major step towards greater scalability and adoption thanks to processes and practices that automate large portions of development and deployment.
- Data engineer – As data science continues to evolve, the importance of data is becoming clearer, and 2020 saw a new emphasis on engineers who work directly with datasets to get the most out of them.
- AutoML – In 2020, machine learning (ML) was not just practiced but automated and delivered to the masses. AutoML is a major reason why the sector has gone mainstream this year.
- Explainability – Building complex models is great, but without visibility into how they work, they’re not as good as they could be. Explainability became a major goal for data scientists looking to understand their models and make them better.
- Data discovery – In a year where data became a precious resource, our collective focus was on how to find the right types of data. Data discovery is a driving trend, and finding the best tools to do it is a major edge heading into 2021.
The controversial:
- Deepfakes – One of the products of better ML algorithms was the ability to create incredibly convincing videos that superimpose one person’s likeness onto another. While the technology is impressive, it has created serious ethical concerns about how it’s used and what it could produce.
- Citizen data scientist – This one was controversial among Explorium’s experts. On the one hand, data science is more accessible than ever before, allowing complete beginners to dive in quickly. On the other, it seems like there’s no clear consensus on just what “citizen data scientists” are.
The overhyped:
- Deep learning – We get it, deep learning is the future…except when it isn’t. The rush to hop on the new technology bandwagon means a lot of people are diving into deep learning without thinking about whether they actually need it. As our data science team so eloquently put it: “in most use cases, using DL is like using a 5-kilo hammer on a small nail. It’s not just overkill, it’s simply the wrong tool.”
- AI – The truth is, AI has been a buzzword for a few years, but it seems like we still like to overhype it. AI is a large field, and it includes ML, but saying you have AI doesn’t mean you do.
- AutoML – We know, we know. We put this one up above, but hear us out. It’s true that it has been instrumental in bringing data science to the world, but not every new product that features some ML is AutoML, and calling it so makes it harder to tell the valuable tools from those that are just hopping on a hot trend.
Another year, another buzzword
2020 was definitely a roller coaster ride for data science and the world, but it was certainly not boring. The field saw major innovation, and as we re-emerge from the coronavirus pandemic, it will only get better. With new milestones and technologies coming up in the next year, we’ll see which buzzwords and acronyms make our list in 12 months!