What is Anomaly Detection and Why is it So Important?
How can companies stay competitive in an increasingly complex business and technological landscape? Heightened visibility into metrics within business operations is key to gaining valuable insights and intelligence to generate actionable decisions.
Data on its own offers little value. Its benefits can only be realized through processing, analyzing, and harnessing the information to derive meaning.
Powerful emerging technologies such as artificial intelligence and machine learning provide invaluable tools for organizations to access vital information and insight from their data, as well as identify anomalies.
This information is crucial for an organization to maintain agility, maximize productivity, enhance efficiencies and compete in today’s marketplace. Anomaly detection is applicable in any industry vertical, as well as a variety of domains and circumstances.
Any dramatic, abrupt change in behavior from the expected pattern can point to an anomaly, such as a sudden disruption in a supply chain. If alerted to this in real time, a company would know to adjust its logistics to absorb the disruption and maintain healthy, profitable business performance.
What is an Anomaly Within a Data Set?
An anomaly within a data set is an outlier that deviates significantly from expected behavior or patterns within the data and stands out for its unusual or abnormal qualities.
Anomalies can be detected as part of a time series. A time series is a sequence of data points collected over a period of time, where each data point corresponds to a specific point or interval in time.
For example, if you noticed an unexpected surge in the number of visitors to your website over a week compared with your normal weekly rate, this atypical increase would be considered an anomaly.
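One simple way to express the website traffic example in code is a z-score check, which measures how many standard deviations the current week sits from the historical average. This is only a minimal sketch: the visitor counts are made up for illustration, and the function name `is_anomalous` and the cutoff of 3 standard deviations are assumptions, not part of any particular tool.

```python
from statistics import mean, stdev

def is_anomalous(history, current, z_threshold=3.0):
    """Return True if `current` lies more than `z_threshold` standard
    deviations away from the mean of `history`."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All historical values are identical, so any difference is unusual.
        return current != mu
    return abs(current - mu) / sigma > z_threshold

# Hypothetical weekly visitor totals for the previous six weeks.
weekly_visitors = [10200, 9800, 10100, 9950, 10300, 10050]
this_week = 14500

print(is_anomalous(weekly_visitors, this_week))  # True: far above the usual range
print(is_anomalous(weekly_visitors, 10150))      # False: a normal week
```

The cutoff is a judgment call: a lower value raises more alerts, a higher value raises fewer.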
However, anomalies can also be detected without being related to a specific context of time. For instance, instead of focusing on data collected over a particular duration, you may target unusual data points that deviate from other clusters of data points.
For example, if you compared household incomes in a neighborhood where a strong majority of households earned between $90,000 and $100,000, and you noticed two households with an income of $180,000, those two households would be anomalies within the data set.
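The household income example can be sketched with a median-based check, which is less distorted by the outliers themselves than an average would be. The incomes below are invented to match the scenario. The function name `mad_outliers` is hypothetical, and the cutoff of 3.5 is a commonly used rule of thumb rather than a fixed standard.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score, based on the median
    absolute deviation (MAD), exceeds `threshold`."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half the values equal the median; flag anything that differs.
        return [v for v in values if v != med]
    # 0.6745 scales the MAD so the score is comparable to a standard z-score
    # when the data is roughly normally distributed.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical household incomes for one neighborhood.
incomes = [91000, 94000, 95000, 96000, 97000, 98000, 99000, 92000, 180000, 180000]

print(mad_outliers(incomes))  # [180000, 180000]
```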
Various circumstances can produce anomalies within data sets, such as an unexpected increase in traffic, a technical problem on a company’s website, or atypical activity due to fraud, cybersecurity threats or a cyberattack.
It is vital that organizations are alerted when anomalies are detected to ensure that they can react, adjust and adapt to any issues that arise in real time.
The Three Different Types of Anomalies
Point Anomalies aka Global Outliers: An individual data point is considered a point anomaly when its value falls far outside the range of the rest of the data set. For instance, an unusually large amount of money charged to a credit card in one day could indicate fraud on the account.
Contextual Anomalies: A contextual anomaly exists when a data point is considered an anomaly only within a specific context. For example, it’s normal for sales of sunscreen to spike in the hot summer months, but a dramatic increase in sales of sunscreen may be considered odd for other times of the year.
Collective Anomalies: A data set contains a collective anomaly when a group of similar data points is unusual or anomalous compared with the rest of the data set, even if each point looks unremarkable on its own. For example, if various related instances of atypical activity were detected on a website, this collective anomaly could suggest a data leak, cybersecurity threat or hacking incident.
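To make the contextual case concrete, the sketch below judges a sunscreen sales figure only against past sales for the same month, so the same number can be normal in July and anomalous in December. The sales history is invented for illustration, and `is_contextual_anomaly` is a hypothetical helper name.

```python
from statistics import mean, stdev

# Hypothetical units of sunscreen sold in the same month of previous years.
history_by_month = {
    "Jul": [900, 950, 1000, 920],
    "Dec": [100, 120, 110, 90],
}

def is_contextual_anomaly(month, value, history, z_threshold=3.0):
    """Compare `value` only with past values from the same context (month)."""
    past = history[month]
    mu, sigma = mean(past), stdev(past)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold

# The same sales figure is judged differently depending on its context.
print(is_contextual_anomaly("Jul", 940, history_by_month))  # False
print(is_contextual_anomaly("Dec", 940, history_by_month))  # True
```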
Here’s an example from a company that uses Shopify, a popular ecommerce platform, to sell its products online.
The image depicts the company’s total sales for the period of May 26th to June 2nd. On June 2, the company’s sales plummeted, reflecting a dramatic decrease in sales from May 31st to June 2nd. This major decline in sales would be considered an anomaly within this dataset.
This drop in sales was caused by code on the website that was accidentally placed in the wrong location. The company found this out when they realized they had no new orders the next morning.
Had the company had visibility into this information sooner, it would have been able to investigate the problem, pinpoint the reason behind the atypical activity and ultimately, prevent a huge loss in sales.
Anomalies do not always have to reveal negative or troublesome information; they can also present opportunities for organizations to leverage growth and derive increased value based upon factors that are already yielding positive effects and changes.
The above graph depicts total sales for the same company for the period of May 1st to 26th. On May 26th, the company experienced an unusually high spike in sales, skyrocketing from approximately $7,000 the day before to around $29,000, roughly a fourfold jump (an increase of more than 300%).
With this insight, the company would be able to take advantage of the opportunity to capitalize on events that may have caused this spike to continue generating further sales and profits.
What are Effective Methods of Detecting Anomalies?
To effectively detect anomalies, you first have to decide which business metrics are important to monitor and which thresholds would count as anomalous relative to your normal operations.
There are different tools and approaches that can be used to facilitate effective anomaly detection. For example, a company may be interested in monitoring conversion rates, page load speed, total users or new users, bounce rate, or average session duration on the company’s website.
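As a rough illustration of what choosing metrics and thresholds can look like in practice, the sketch below checks a day's metrics against fixed minimum and maximum values. The metric names, the threshold values and the `out_of_range` helper are all assumptions made for the example. Suitable limits depend entirely on your own business.

```python
# Hypothetical thresholds: metric name -> (minimum, maximum).
# None means that side is unbounded.
THRESHOLDS = {
    "conversion_rate": (0.01, None),    # alert if below 1%
    "page_load_seconds": (None, 3.0),   # alert if slower than 3 seconds
    "bounce_rate": (None, 0.70),        # alert if above 70%
}

def out_of_range(metrics, thresholds=THRESHOLDS):
    """Return {metric: message} for every metric outside its allowed range."""
    breaches = {}
    for name, value in metrics.items():
        if name not in thresholds:
            continue  # no rule defined for this metric
        low, high = thresholds[name]
        if low is not None and value < low:
            breaches[name] = f"{value} is below the minimum of {low}"
        elif high is not None and value > high:
            breaches[name] = f"{value} is above the maximum of {high}"
    return breaches

# Made-up metrics for one day.
today = {"conversion_rate": 0.004, "page_load_seconds": 2.1, "bounce_rate": 0.82}

for metric, message in out_of_range(today).items():
    print(f"{metric}: {message}")
```

Fixed thresholds are easy to reason about, but they ignore seasonality, which is why automated approaches are often layered on top.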
There are many different anomaly detection methods an organization can implement.
No matter which method a company ultimately uses, it is crucial that it is alerted to unexpected, atypical changes in business behavior or patterns in real time to adjust and adapt accordingly.
An organization may want to decide these thresholds itself or can use tools to help automate the process in order to maximize its ability to identify anomalies.
For example, advanced, emerging technologies such as artificial intelligence and machine learning can identify patterns within data to predict what an outlier would look like.
Google Analytics and Google Ads use machine learning algorithms to detect anomalies using a Bayesian state space-time series model, a sophisticated statistical model that uses historical data to forecast expected values and flags data points that fall outside the forecast range. Using this statistical method, Google Analytics gains insight into the trends and seasonal patterns of a business to set the parameters of a time series data set.
For instance, if your company sells holiday ornaments, by applying a machine learning model like the one mentioned above, GA Insights would have learned that an uptick in sales around the holiday season is in accordance with your normal business patterns and would not alert you to this increase. However, if your company’s sales behavior surges outside of these expected parameters, GA Insights would alert you to this change.
Detecting Anomalies in Google Ads
You can also detect anomalies using Google Ads. You can use Google Analytics to glean insights from Google Ads using the direct integration available between those two platforms, but you can only achieve a high-level overview.
To gain a more comprehensive understanding of your performance metrics, you will need to monitor Google Ads directly in the Google Ads platform. Effectively operating Google Ads is far from turn-key, and most companies would need to hire an expert to manage the account or outsource the position.
There is a long list of metrics a company would need to monitor to effectively manage their data to detect anomalies inside Google Ads. These include the following metrics:
- Conversion Rate
- Total Cost
- Average Position
- Ad placement
- Demographics (A more detailed version of demographics is available in Google Analytics)
While you can gain a strong understanding of the above metrics in Google Ads, it is not the best platform to detect anomalies. Filtering through various reports can be a complicated process and it can be challenging to compare reports from one day to another.
To successfully track the data, you will need to create your own script, a task that can be difficult and more time-consuming than you might imagine, even if you do have the coding skills to write it.
The other option is to monitor the data manually using spreadsheets, which can be a monotonous and repetitive process. Not only is it a drain on resources and time, but this method of tracking data can also be unreliable, as it depends on someone tracking the data by hand, which can result in unusable data and possibly erroneous insights. It is crucial that the process is automated to be able to derive the most value from the data.
How to Automate the Process of Anomaly Detection
Below are key components of this process:
1) First, data from Google Ads must be pulled into a report so that it can be analyzed, measured and monitored.
2) A ruleset must be defined for each item to identify outliers or atypical data.
3) Finally, if anomalies are detected, notifications for this information should be automated.
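The three steps above can be sketched as a small pipeline. This is an illustrative outline only, not a working Google Ads integration. The report rows are hard-coded stand-ins for whatever export you use. The names `fetch_report`, `Rule`, `find_anomalies` and `notify`, along with the rule values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    metric: str                            # which report column the rule checks
    is_anomalous: Callable[[float], bool]  # returns True when the value is atypical
    description: str                       # human-readable text for the alert

def fetch_report():
    """Step 1 (placeholder): return report rows. In practice these would come
    from an export of your Google Ads data; here they are made up."""
    return [
        {"campaign": "Brand", "conversion_rate": 0.031, "total_cost": 120.0},
        {"campaign": "Leggings", "conversion_rate": 0.002, "total_cost": 950.0},
    ]

# Step 2: a ruleset with one rule per monitored metric (values are assumptions).
RULES = [
    Rule("conversion_rate", lambda v: v < 0.01, "conversion rate below 1%"),
    Rule("total_cost", lambda v: v > 500.0, "daily cost above 500"),
]

def find_anomalies(rows, rules):
    """Apply every rule to every row and collect alert messages."""
    alerts = []
    for row in rows:
        for rule in rules:
            if rule.metric in row and rule.is_anomalous(row[rule.metric]):
                alerts.append(
                    f"{row['campaign']}: {rule.description} "
                    f"({rule.metric}={row[rule.metric]})"
                )
    return alerts

def notify(alerts, send=print):
    """Step 3: deliver each alert. `send` could be swapped for an email or
    chat function; printing keeps the sketch self-contained."""
    for alert in alerts:
        send(alert)

alerts = find_anomalies(fetch_report(), RULES)
notify(alerts)
```

Keeping the three steps as separate functions means the data source or the notification channel can change without touching the ruleset.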
If you are able to successfully develop a script or are able to plug one into Google Ads, it will still need to be maintained and updated as Google is always refreshing its software. Therefore, the best method is to find a turn-key solution that is easy to use and hassle-free like GA Insights, which will monitor Google Analytics, Google Ads or Google Search Console in a few clicks.
However, if you are still interested in writing your own script, here is a guide with a few examples of beginner scripts to get you started.
How to Manage Anomalies Using Data from Google Ads
Gaining insight is only the first step; to realize value from this knowledge, you must know how to apply this information to generate the strongest profits and ROI for your company.
You can set bids on various devices (desktop, tablet, mobile). If, for example, you receive an anomaly notification informing you that one of your ads for mobile video games is performing unusually well on mobile devices, you may want to reduce spend on your desktop and tablet ads and increase spend on your mobile ads.
Demographic targeting monitors metrics including gender, age and income. If a report is published explaining the benefits of your product to people aged 55 and over, resulting in an unusual surge in sales in that demographic, you would want to be notified of this anomaly so you know to increase your ad spend for that demographic.
Time Targeting (Dayparting)
Times of day and days of the week are also important factors in monitoring data to detect anomalies and achieve the strongest ROI. Using Google Ads, you can control on which days of the week and at which times your ads will show. If, for example, you have a product that experiences an unexpected spike in sales during football season, you would want to be alerted to this information so you know to increase ad spend for that time.
Alternatively, if sales for your product experience an unusual drop later in the night, you would want insight into this information so you would know to reduce spend on ads for later in the evening or not show your ads at those times at all.
Finally, location is another vital factor in effectively monitoring and tracking data to identify anomalies. You can use Google Ads to control in which locations around the world your ad will be shown. For example, imagine that you own a Shopify store that offers neon leggings, and a movie comes out in Australia featuring a character who wears neon leggings.
Normally, you would probably have no way of knowing this movie even existed. However, in Google Ads you would receive a notification alerting you to a surge in sales of neon leggings in Australia, and would therefore know to increase your ad spend in that location.
Using a Platform like GA Insights to Detect Anomalies
Detecting anomalies using various methods can be an overly complicated and heavily technical process, but when you use a platform like GA Insights, it is much easier and hassle-free.
You can set your thresholds yourself to decide what maximum or minimum a metric would have to reach for an alert to be triggered. Alternatively, you can let Artificial Intelligence set the thresholds.
Artificial intelligence and machine learning draw upon historical data sets to learn the patterns of your normal business operations and identify what anomalous activity would look like. Because these models keep learning, their accuracy improves over time, helping ensure a company is alerted to outliers in real time and can therefore react accordingly.
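A very simplified stand-in for thresholds learned from history is a rolling baseline. For each new data point, the upper and lower bounds are derived from the preceding days instead of being set by hand. This is far cruder than the machine learning models such platforms use and is not how GA Insights is implemented. The sales figures, the seven-day window and the function names are assumptions made for the sketch.

```python
from statistics import mean, stdev

def learned_bounds(history, k=3.0):
    """Derive (lower, upper) bounds from recent history: mean +/- k std devs."""
    mu, sigma = mean(history), stdev(history)
    return mu - k * sigma, mu + k * sigma

def scan(series, window=7, k=3.0):
    """Return (index, value) for each point that falls outside the bounds
    learned from the `window` points immediately before it."""
    flagged = []
    for i in range(window, len(series)):
        low, high = learned_bounds(series[i - window:i], k)
        if not low <= series[i] <= high:
            flagged.append((i, series[i]))
    return flagged

# Hypothetical daily sales with one spike.
daily_sales = [7000, 7200, 6900, 7100, 7050, 6950, 7150, 7000, 29000, 7100]

print(scan(daily_sales))  # [(8, 29000)]
```

One limitation is worth noting. Once the spike enters the window, it inflates the bounds for the days that follow, which is one reason production systems tend to use more robust models.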
Harnessing the full potential of your data and detecting anomalies are paramount to maintaining agility, maximizing profits and productivity, and realizing value and growth.
Imagine the possibilities and new heights of performance your company can reach by identifying anomalies in real time as they occur, so you can adapt and adjust to ensure consistently healthy business patterns.