
I have an application that executes my foo() function several times for each user session. There are two alternative algorithms I can implement as foo(), and my goal is to evaluate them based on execution delay.

The number of times foo() is called per user session is variable but will not exceed 10000. Say the delay values are:

Algo1: [[12, 30, 20, 40, 24, 280], [13, 14, 15, 100], [20, 40]]
Algo2: [[1, 10, 5, 4, 150, 20], [14, 10, 20], [21, 33, 41, 79]]

My question is: what's the best metric to pick the winner?

Possible options:

  1. Take the average from each session, then evaluate the CDF
  2. Take the median from each session, then evaluate the CDF
  3. Anything else?
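For concreteness, here is a minimal sketch of options 1 and 2 on the sample data above. Python with NumPy is my own choice of tooling; the question does not prescribe any language.

import numpy as np

# Per-session delay samples from the question
algo1 = [[12, 30, 20, 40, 24, 280], [13, 14, 15, 100], [20, 40]]
algo2 = [[1, 10, 5, 4, 150, 20], [14, 10, 20], [21, 33, 41, 79]]

def per_session(sessions, reducer):
    # Collapse each session's delays into a single number (mean, median, ...)
    return np.array([reducer(s) for s in sessions])

for name, algo in [("Algo1", algo1), ("Algo2", algo2)]:
    print(name,
          "session means:", per_session(algo, np.mean),
          "session medians:", per_session(algo, np.median))

The per-session numbers these produce are what you would then feed into an empirical CDF or percentile comparison.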
sbr

2 Answers


Here is a suggestion:

Standardise everything (if you omit this, then one big number like 9999 can ruin everything), then take the average value per user session. Then, optionally, multiply this number by x/10, for example, where x is the sample size in the user session (think of it as evidence: more samples add more confidence), and finally average over the number of sessions for the algorithm.
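One possible reading of this recipe, sketched in Python with NumPy. The answer does not say what to standardise against, so computing the standardisation constants from the pooled delays of both algorithms is my assumption, made so the two scores stay comparable.

import numpy as np

algo1 = [[12, 30, 20, 40, 24, 280], [13, 14, 15, 100], [20, 40]]
algo2 = [[1, 10, 5, 4, 150, 20], [14, 10, 20], [21, 33, 41, 79]]

# Standardisation constants from the pooled delays of both algorithms
# (assumption: the answer leaves this choice open).
pooled = np.concatenate([np.asarray(s, float) for s in algo1 + algo2])
mu, sigma = pooled.mean(), pooled.std()

def score(sessions):
    # Per-session mean of standardised delays, weighted by x/10 where x is
    # the session's sample size, then averaged over all sessions.
    weighted = [np.mean((np.asarray(s, float) - mu) / sigma) * (len(s) / 10)
                for s in sessions]
    return np.mean(weighted)

print("Algo1:", score(algo1), "Algo2:", score(algo2))  # lower is better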

Noah Weber

It is common to look at 90th or 99th percentile latency in computer systems.

A user won't notice a difference of a couple of milliseconds of lag, but if a function occasionally takes several seconds, that is very noticeable.
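A sketch of this approach on the question's sample data, again assuming Python with NumPy and pooling every call's delay across sessions:

import numpy as np

algo1 = [[12, 30, 20, 40, 24, 280], [13, 14, 15, 100], [20, 40]]
algo2 = [[1, 10, 5, 4, 150, 20], [14, 10, 20], [21, 33, 41, 79]]

for name, algo in [("Algo1", algo1), ("Algo2", algo2)]:
    flat = np.concatenate(algo)  # pool all per-call delays across sessions
    p50, p90, p99 = np.percentile(flat, [50, 90, 99])
    print(f"{name}: p50={p50:.1f}  p90={p90:.1f}  p99={p99:.1f}")

Comparing the p90 or p99 values directly captures the tail behaviour that per-session averages can hide.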

Brian Spiering