Overall Evaluation Criteria (OEC)

From GM-RKB

An Overall Evaluation Criterion (OEC) is an evaluation criterion for an A/B test: a quantitative measure of the experiment's objective.

  • Context:
  • Example(s):
    • Bing: Sessions/User ...
    • Netflix: viewing hours (or content consumption) as their strongest proxy metric for retention
    • Coursera: test completion or course engagement as proxies for their key metric.
  • Counter-Example(s):
  • See: Guardrail Metric.


References

2017

  • http://exp-platform.com/Documents/2017-08%20ABTutorial/ABTutorial_5_designingMetrics.pptx
    • QUOTE: OEC vs KPIs (Key Performance Indicators).
      • KPIs are lagging metrics reported monthly/quarterly/yearly at the overall product level (DAU, MAU, Revenue, etc.)
      • OEC is a leading metric measured during the experiment (e.g. 2 weeks) at user level, which is indicative of long term increase in KPIs
    • Netflix: Because it is a subscription-based business, one of Netflix’s key metrics is retention, defined as the percentage of their customers who return month over month. Conceptually, you can imagine that someone who watches a lot of Netflix should derive a lot of value from the service and therefore be more likely to renew their subscription. In fact, the Netflix team found a very strong correlation between viewing hours and retention. So, for instance, if a user watched only one hour of Netflix per month, then they were not as likely to renew their monthly subscription as if they watched 15 hours of Netflix per month. As a result, the Netflix team used viewing hours (or content consumption) as their strongest proxy metric for retention, and many tests at Netflix had the goal of increasing the number of hours users streamed.
    • Coursera: Coursera is a credential-driven business; in other words, they make money when users pay for credentials (certifications) after completing a course. One of their key metrics might be the number of credentials sold, or the revenue generated from credential purchases. You might be skeptical about this metric, though, and for good reason: because Coursera courses are often 13-week college classes, measuring how a design changes this metric would take far too long to be practical...Course completion is predicted by module completion, which is in turn predicted by test completion and how often a user engages with a course. So Coursera measures test completion or course engagement as proxies for their key metric, allowing them to cut down the time to measure the impact of a design change dramatically.
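
The description of an OEC as a leading metric measured at user level during the experiment can be illustrated with a minimal sketch. It assumes the OEC is a per-user session count (as in the Bing example), compares the variant means, and uses a Welch-style standard error. The function name `oec_delta` and the session counts are hypothetical and are not taken from the cited sources.

```python
import math
from statistics import mean, variance


def oec_delta(control, treatment):
    """Return (treatment mean - control mean, Welch standard error of that difference)."""
    delta = mean(treatment) - mean(control)
    # Sample variances divided by group sizes give the variance of each mean.
    se = math.sqrt(variance(control) / len(control)
                   + variance(treatment) / len(treatment))
    return delta, se


# Hypothetical sessions per user over a two-week experiment (illustrative data only).
control = [3, 5, 4, 6, 2, 5, 4, 3]
treatment = [4, 6, 5, 7, 3, 6, 5, 4]

delta, se = oec_delta(control, treatment)
print(f"delta = {delta:.3f} sessions/user, standard error = {se:.3f}")
```

A real experiment would involve far more users and would turn the delta and standard error into a confidence interval or p-value before any decision is made.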

2005

  • (Ulwick, 2005) ⇒ Anthony W. Ulwick. (2005). “What Customers Want: Using Outcome-driven Innovation to Create Breakthrough Products and Services.” McGraw-Hill Companies.

2001

  • (Roy, 2001) ⇒ Ranjit K. Roy. (2001). “Design of Experiments Using the Taguchi Approach: 16 Steps to Product and Process Improvement.” John Wiley & Sons.
    • QUOTE: Overall evaluation criterion (OEC). For most industrial projects, we are after more than one objective. If the design is adjusted to maximize the result based on one objective, it may not necessarily produce favorable results for the other objectives. A compromise might have to be obtained by analysis of the overall evaluation criterion (OEC), which is obtained by combining into one number evaluations under various criteria.
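
The idea of combining evaluations under various criteria into one number can be sketched as a weighted sum of normalized readings. This is one plausible scheme, assumed here for illustration and not necessarily the exact formula Roy uses. The function `composite_oec`, the criteria names, the worst/best bounds, and the weights are all hypothetical.

```python
def composite_oec(readings, specs):
    """Combine several criteria into one score in [0, 1]; higher is better.

    Each spec gives the worst and best values for a criterion plus a weight.
    Putting worst above best handles smaller-is-better criteria.
    """
    total_weight = sum(spec["weight"] for spec in specs.values())
    score = 0.0
    for name, spec in specs.items():
        # Fraction of the way from the worst value (0.0) to the best value (1.0).
        frac = (readings[name] - spec["worst"]) / (spec["best"] - spec["worst"])
        frac = min(1.0, max(0.0, frac))  # clamp readings outside the stated range
        score += spec["weight"] * frac
    return score / total_weight


# Hypothetical criteria: strength is larger-is-better, cost is smaller-is-better.
specs = {
    "strength": {"worst": 0.0, "best": 100.0, "weight": 0.6},
    "cost": {"worst": 10.0, "best": 2.0, "weight": 0.4},
}
design_a = composite_oec({"strength": 80.0, "cost": 6.0}, specs)
design_b = composite_oec({"strength": 90.0, "cost": 9.0}, specs)
print(f"design A: {design_a:.2f}, design B: {design_b:.2f}")
```

In this sketch the weights encode the compromise between objectives: design B is stronger, but design A scores higher overall because its cost is much closer to the best value.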