I am developing a system that is intended to capture the "context" of user activity within an application; it is a framework that web applications can use to tag user activity based on requests made to the system. The hope is that this data can then power ML features such as context-aware information retrieval.

I'm having trouble deciding which features to capture in addition to these user tags. My candidates so far are the URL being requested, the approximate time spent with any given resource, and an estimate of the current "activity" within the system.

I am interested to know if there are good examples of this kind of technology or any prior research on the subject - a cursory search of the ACM DL revealed some related papers but nothing really spot-on.

3 Answers

Well, this may not answer the question thoroughly, but since you're dealing with information retrieval, it may be of some use. This page maintains a set of features and their correlations with the page-ranking methods of search engines. As a disclaimer from the webpage itself:

Note that these factors are not "proof" of what search engines use to rank websites, but simply show the characteristics of web pages that tend to rank higher.

The linked list may give you some insight into which features would be worth selecting. For example, take the second most correlated feature, the number of Google +1's: if a user accesses many pages with a high +1 count, you could assign a higher probability that he/she makes use of that service, which is a way of inferring "user context". In the same way, you could try to "guess" other relations that may shed light on interesting features for your tracking app.

Rubens

The goal determines the features, so I would initially collect as many as possible, then use cross-validation to select the optimal subset.

My educated guess is that a Markov model would work. If you discretize the action space (e.g., select this menu item, press that button, etc.), you can predict the next action based on the past ones. It's a sequence or structured prediction problem.
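To make the idea concrete, here is a minimal sketch of a first-order Markov model over discretized actions. The action names and the sessions are made up for illustration, and `fit_transitions` and `predict_next` are hypothetical helpers, not part of any library. A real system would likely need smoothing and possibly a higher-order model.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count first-order transitions: how often action b directly follows action a."""
    counts = defaultdict(Counter)
    for actions in sessions:
        for prev, nxt in zip(actions, actions[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, last_action):
    """Return (most likely next action, estimated probability), or None if unseen."""
    followers = counts.get(last_action)
    if not followers:
        return None
    action, n = followers.most_common(1)[0]
    return action, n / sum(followers.values())


# Hypothetical, hand-made sessions of discretized actions.
sessions = [
    ["open_doc", "edit", "save", "print"],
    ["open_doc", "edit", "save", "close"],
    ["open_doc", "search", "edit", "save", "print"],
]

counts = fit_transitions(sessions)
print(predict_next(counts, "save"))
```

Here the prediction is just the most frequent follower of the last action; the transition counts themselves can also be used as features for a downstream model.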

For commercial offerings, search for "app analytics".

Emre

I've seen a few similar systems over the years. I remember a company called ClickTrax which, if I'm not mistaken, was bought by Google; some of their features are now part of Google Analytics.

Their purpose was marketing, but the same concept can be applied to user experience analytics. The beauty of their system was that what was tracked was defined by the webmaster - in your case the application developer.

I can imagine that, as an application developer, I would want to see statistical data on two things: task accomplishment and general feature usage.

As an example of task accomplishment, I might have 3 ways to print a page - Ctrl+P, File->Print, and a toolbar button. I would want to be able to compare usage to see if the screenspace utilized by the toolbar button was actually worth it.

As an example of general feature usage, I would want to define a set of features within my application and focus my development efforts on expanding the features used most by my end users. If a popular feature takes maybe 5 clicks to reach, I might want to provide a hotkey for it or reduce the number of clicks needed to activate it. There is also event timing. Depending on the application, I might want to know the average amount of time spent on a particular feature.
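As a rough sketch of that kind of report, assuming each tracked event records the feature name, the seconds spent, and the number of clicks needed to reach it: the event data, the field layout, and the thresholds below are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical event records: (feature name, seconds spent, clicks needed to reach it).
events = [
    ("export_pdf", 12.0, 5),
    ("spell_check", 3.5, 1),
    ("export_pdf", 9.0, 5),
    ("export_pdf", 15.0, 5),
    ("spell_check", 4.5, 1),
    ("mail_merge", 40.0, 6),
]


def summarize(events):
    """Aggregate use count, average duration and click cost per feature."""
    totals = defaultdict(lambda: {"uses": 0, "seconds": 0.0, "clicks": 0})
    for feature, seconds, clicks in events:
        t = totals[feature]
        t["uses"] += 1
        t["seconds"] += seconds
        t["clicks"] = clicks  # assumed constant per feature
    return {
        feature: {
            "uses": t["uses"],
            "avg_seconds": t["seconds"] / t["uses"],
            "clicks": t["clicks"],
        }
        for feature, t in totals.items()
    }


def hotkey_candidates(summary, min_uses=3, min_clicks=5):
    """Features that are both popular and expensive to reach."""
    return sorted(
        feature
        for feature, s in summary.items()
        if s["uses"] >= min_uses and s["clicks"] >= min_clicks
    )


summary = summarize(events)
print(hotkey_candidates(summary))
```

The thresholds are arbitrary; the point is only that usage counts, click cost and timing can come out of the same event log.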

Another thing I would want to look at is click streams. How are people getting from point A to point B in my application? What are the most popular point B's? What are the most popular starting points?
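Those three questions can be answered with simple counting once each session is stored as an ordered list of screens. A minimal sketch, with made-up screen names and a hypothetical `paths_between` helper:

```python
from collections import Counter

# Hypothetical click streams: one ordered list of screens per session.
streams = [
    ["home", "search", "item", "checkout"],
    ["home", "item", "checkout"],
    ["search", "item"],
    ["home", "search", "item", "checkout"],
]

# Most popular starting points and end points ("point B's").
starts = Counter(s[0] for s in streams if s)
ends = Counter(s[-1] for s in streams if s)


def paths_between(streams, a, b):
    """Count the routes taken from the first visit to a to the next visit to b."""
    routes = Counter()
    for s in streams:
        if a in s:
            i = s.index(a)
            if b in s[i + 1:]:
                j = s.index(b, i + 1)
                routes[tuple(s[i:j + 1])] += 1
    return routes


print(starts.most_common(1))
print(ends.most_common(1))
print(paths_between(streams, "home", "checkout").most_common())
```

Treating the last screen of a session as "point B" is a simplifying assumption; in practice the developer would probably define the goal screens explicitly.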

Steve Kallestad