Overview
In the context of DataMilk performance reporting, outliers are extreme values or exceptional transactions that can significantly skew the calculated performance uplift for metrics such as revenue, conversion rate, and average order value (AOV). This article explains why outlier removal is essential and outlines the methodology used to identify and remove these atypical transactions, ensuring a more accurate representation of DataMilk's effectiveness.
Outlier removal for DataMilk performance reporting
DataMilk measures and statistically calculates performance uplift for metrics such as revenue, conversion rate, and average order value (AOV). This calculation has at times been substantially distorted by exceptionally large orders. To mitigate this effect, DataMilk applies a standard statistical data preprocessing step: the identification and removal of outliers (extreme values). Here we discuss the why and the how.
The challenges of rare, large orders
Smart Components improve the user experience and, across the population of store visitors, lead to a measurable increase in revenue. Implicit in this is that revenue uplift is achieved on visits where the user experience plays a role in the purchasing decision.
DataMilk customers have reported various types of conversions that fall outside this scope:
- fraudulent transactions,
- purchases by resellers,
- purchases by influencers or under sponsorship agreements.
When a user comes to the store with such an intent, DataMilk Smart Components are unlikely to play a substantial role in the conversion. These users, like any others, are randomly assigned to either the Smart Traffic or the Original Traffic, with a probability set by the Smart Traffic ratio. Each user belongs to exactly one group, and influences the statistics of that group. Because assignment is random, their impact is unbiased over a long time horizon: they contribute to the Smart Traffic in proportion to the Smart Traffic ratio, and to the Original Traffic in the complementary proportion. In this respect they behave like common users. If the Smart Components do not influence their purchasing decisions, the worst they can do is dilute the measured and statistically calculated revenue uplift.
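As a sketch of why random assignment is unbiased over a long horizon, the following simulation (with an assumed 50% Smart Traffic ratio; illustrative code, not DataMilk's implementation) splits a large number of visitors between the two groups:

```python
import random

random.seed(42)

SMART_TRAFFIC_RATIO = 0.5  # assumed probability of assignment to Smart Traffic
N_VISITORS = 100_000       # a long time horizon of visits

# Each visitor independently lands in Smart or Original Traffic.
smart = sum(1 for _ in range(N_VISITORS) if random.random() < SMART_TRAFFIC_RATIO)
original = N_VISITORS - smart

# Over many visits, the Smart share converges to the Smart Traffic ratio.
print(f"Smart share: {smart / N_VISITORS:.3f}")
```

With a small number of visitors (say, 5), the same code frequently produces heavily lopsided splits, which is the problem discussed next.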
A problem arises by the confluence of two factors:
- The above transaction types are usually very large orders, 10-50 times larger than a typical order.
- A related subtype is many orders placed from a single device, which is recorded as a single visitor.
- They are rare in the sense that in the aggregation period used for reporting purposes, e.g. 90 days, there are only a handful of them.
Example:
Imagine a shoe retailer where a reseller places a purchase order for ten pairs of shoes (different sizes of one model), then a second order for another ten pairs, and perhaps more. This user is inevitably in either the Smart or the Original Traffic (they cannot be in both when using a single device), and their spending will tilt the statistics to one side or the other.
Over a long time period, additional resellers would place orders in both groups, proportionally to the Smart Traffic ratio. But the fixed 90-day aggregation period may well not be long enough to accumulate a sufficient number of resellers. In small samples, large imbalances arise frequently: for example, with a 50%:50% traffic ratio and 5 resellers making a purchase in the 90-day aggregation window, it can easily happen that 4 of them are in the Smart Traffic and 1 is in the Original Traffic. This would make DataMilk's performance look better than it really is (among common retail users). Over a long period of time, there would also be 90-day windows in which the Original Traffic receives more resellers; in those windows, DataMilk's performance would look worse than its long-term average.
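How likely is such a lopsided split? A simple binomial calculation (using the illustrative numbers from the example above, not DataMilk's code) shows it is far from rare:

```python
from math import comb

n, p = 5, 0.5  # 5 resellers in the window, 50%:50% traffic ratio

# Probability that exactly 4 of the 5 resellers land in the Smart Traffic.
p_exactly_4 = comb(n, 4) * p**4 * (1 - p)**1  # 5/32 ≈ 0.156

# Probability of a 4:1 split or worse in either direction (k <= 1 or k >= 4).
p_lopsided = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in (0, 1, 4, 5))

print(f"P(exactly 4 in Smart):  {p_exactly_4:.3f}")  # 0.156
print(f"P(4:1 split or worse):  {p_lopsided:.3f}")   # 0.375
```

In other words, with only 5 such purchasers in a window, more than a third of 90-day windows would show a 4:1 imbalance or worse purely by chance.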
Fraudulent transactions might happen, for instance, when criminals want to convert the value of a stolen credit card into tangible goods via large purchases. Such transactions might be accepted at first and transmitted to DataMilk, then reversed later without the reversal being transmitted to DataMilk.
It is worth noting that rare but large purchases as a group tend to perturb the AOV and revenue uplifts, but they do not impact the conversion rate uplift much. This is precisely because such visits are few, but are characterized by a high order value.
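A toy calculation (with assumed numbers, not real store data) illustrates this asymmetry: a single reseller-sized order moves AOV and revenue sharply, while barely moving the conversion rate:

```python
orders = [50.0] * 100   # 100 common orders of $50 each
visits = 2000           # total visits in the group

aov_before = sum(orders) / len(orders)  # average order value: 50.00
cr_before = len(orders) / visits        # conversion rate: 0.0500

# Add one reseller order, 50x the typical order value.
orders_with_outlier = orders + [2500.0]
aov_after = sum(orders_with_outlier) / len(orders_with_outlier)
cr_after = len(orders_with_outlier) / visits

print(f"AOV: {aov_before:.2f} -> {aov_after:.2f}")  # 50.00 -> 74.26
print(f"CR:  {cr_before:.4f} -> {cr_after:.4f}")    # 0.0500 -> 0.0505
```

One extra order among 2000 visits shifts AOV by almost 50% but the conversion rate by only 1%.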
DataMilk does not want to count these visits in the Smart Traffic, on which the billing is based. Nor does DataMilk want these visits to have an impact on the uplift calculations for conversion rate, revenue, and gross profit because they make DataMilk's uplift performance reporting less representative of the substantial bulk of users. These few but large conversions regularly caused seemingly random swings from one reporting period to the next in the performance metrics across DataMilk's partner stores.
How DataMilk removes outliers
DataMilk uses both attention data and conversion pixel data to identify orders which do not fit into the regular transaction flow of the specific store. We call these collectively outliers. DataMilk removes these orders, and in some cases, the corresponding visits from its performance reporting.
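DataMilk's actual detection logic, which combines attention data and conversion pixel data, is not reproduced here. As a generic illustration only, a common statistical baseline for flagging extreme order values is the interquartile-range (IQR) rule:

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR above Q3 or below Q1.

    Generic textbook rule for illustration; not DataMilk's method.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Seven typical orders plus one reseller-sized order (assumed values).
order_values = [45, 52, 49, 60, 55, 48, 51, 2500]
print(iqr_outliers(order_values))  # [2500]
```

Any method along these lines flags the $2,500 order as an outlier while leaving the regular transaction flow untouched.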
Implications
Due to this data cleaning procedure, total revenue numbers may show discrepancies between DataMilk and third-party analytics platforms such as Google Analytics. Additionally, DataMilk may count fewer sessions than other platforms.
DataMilk retains the right to change its outlier removal methodology to improve the service without notice.