Rui Lopes' blog:ground: May 2007

Wednesday, May 23, 2007

User Agent detection and segmentation

Web browsers identify themselves with a particular string, called the User Agent string. In order to better fit content to different scenarios, I tried to segment the whole spectrum (or, at least, a big chunk) of user agents, which proved to be a daunting task.

While working on a prototype of an adaptation engine for Web-based documents, I came across the need to find out which type of Web browser is requesting a document. Some efforts are being made nowadays to ease this task, such as UAProf and WURFL. However, being pragmatic, ubiquitous availability of these two technologies may still be several years away. Therefore, the simplest way of doing it today is sniffing the request's HTTP headers and looking for the User-Agent field.
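As a rough illustration of what "sniffing" means in practice, here is a minimal sketch in Python, assuming the adaptation engine sits behind a WSGI interface (that is my assumption for the example, not a description of the actual prototype). WSGI exposes the request's User-Agent header under the `HTTP_USER_AGENT` key; the names `get_user_agent` and `app` are made up for this example.

```python
def get_user_agent(environ):
    """Return the User-Agent header from a WSGI environ, or '' if absent.

    Assumption: the server follows the WSGI/CGI convention of exposing
    request headers as HTTP_* keys, so User-Agent is HTTP_USER_AGENT.
    """
    return environ.get("HTTP_USER_AGENT", "")


def app(environ, start_response):
    """Tiny WSGI application that echoes back what the browser sent."""
    ua = get_user_agent(environ)
    text = "You identified yourself as: %s\n" % (ua or "(nothing)")
    body = text.encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

To try it locally, `app` can be served with `wsgiref.simple_server.make_server("localhost", 8000, app)` from the standard library. The header is optional, hence the empty-string fallback.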

From the HTTP 1.1 specification, Web browsers and other Web user agents (e.g., crawlers) may identify themselves with a specific string on the header of each HTTP request. The production rule for this header is:

User-Agent = "User-Agent" ":" 1*( product | comment )
product = token ["/" product-version]
product-version = token

What's the meaning of this expression? Basically, it states that this header field should start with the string User-Agent:, followed by one or more items, each of which is either a product name (optionally followed by a slash and its version) or a comment. There must be at least one item, and at most... infinitely many. Hence, user agents identify themselves with almost arbitrary strings, as long as they comply with the production rules. Headache warning.
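To make the production rule above more concrete, here is a small Python sketch that splits a User-Agent value into its products and comments. It is a simplification under stated assumptions: comments are taken to be non-nested (the HTTP grammar allows nesting and quoted pairs), and the names `TOKEN`, `PART` and `parse_user_agent` are mine, not part of any library.

```python
import re

# A "token" is one or more printable ASCII characters that are not
# HTTP separators such as ( ) < > @ , ; : \ " / [ ] ? = { } or space.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"

# One item: either product ["/" product-version] or a "(comment)".
# Assumption: comments do not contain nested parentheses.
PART = re.compile(
    r"\s*(?:(?P<name>" + TOKEN + r")(?:/(?P<version>" + TOKEN + r"))?"
    r"|\((?P<comment>[^()]*)\))"
)


def parse_user_agent(value):
    """Split a User-Agent value into ('product', name, version) and
    ('comment', text) tuples; raise ValueError if it does not fit."""
    value = value.strip()
    parts = []
    pos = 0
    while pos < len(value):
        match = PART.match(value, pos)
        if match is None:
            raise ValueError("unexpected text at offset %d" % pos)
        if match.group("comment") is not None:
            parts.append(("comment", match.group("comment")))
        else:
            parts.append(("product", match.group("name"), match.group("version")))
        pos = match.end()
    if not parts:
        # The rule says 1*( product | comment ): at least one item.
        raise ValueError("empty User-Agent value")
    return parts
```

For instance, `parse_user_agent("Example/1.0 (a comment) Other")` yields a product with a version, a comment, and a product without a version. Being this strict is a design choice for illustration only; real-world strings do not always follow the grammar, so a production parser would probably need to be more forgiving.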

Despite the huge number of Web browsers available on the market, my adaptation engine should identify them according to their segment, such as desktop browsers, mobile browsers, etc. But, thanks to HTTP's loose user agent rule, putting browsers correctly in their segment is really hard (read: cumbersome, error prone, nearly impossible).

Here's a quick sample of user agent strings from miscellaneous browsers, taken from a huge list found elsewhere:

Internet Explorer 7: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; WOW64; .NET CLR 2.0.50727)

Firefox 2.0: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-GB; rv:1.8.1) Gecko/20060918 Firefox/2.0

Sony Ericsson K610i built-in Web browser: SonyEricssonK610i/R1CB Browser/NetFront/3.3 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Link/6.2.3.15.0

Pocket PC Internet Explorer: Mozilla/4.0 (compatible; MSIE 4.01; Windows CE; PPC; 240x320)

Once again, pragmatism tells me that tailoring the Web towards each single device is unfeasible. Hence, it should be possible to define different segments and associate each User Agent string with the appropriate segment, through a set of heuristics. Even when WURFL, UAProf, or the more recent work from W3C's Mobile Web Initiative Device Description Working Group becomes widespread, segmenting the Web end-points - browsers - into a manageable set of characteristics will continue to be useful.

Going back to the User Agent strings mumbo jumbo, my initial proposal is to distinguish between the mobile and desktop landscapes, and it goes something like this (beware - pseudo-code algorithm):

function user_agent_segment(string ua_str)
{
  switch (ua_str)
  {
    case /MSIE/ except /PPC|PocketPC|Windows CE/:
    case /Gecko/:
    case /KHTML/:
    case /Opera/ except /Mini|Mobile|Wii/:
      return DESKTOP;
    default:
      return MOBILE;
  }
}

The simple, yet crucial, aspect of this algorithm is that it detects desktop browsers first, since the (useful) desktop browser landscape is narrower (in comparison to the Wild West-style range of User Agents on mobile phones). Everything that doesn't match a known desktop substring is treated as mobile, so one only needs to look for a handful of specific substrings.
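For those who prefer something runnable, the pseudo-code above could be translated to Python roughly as follows. This is only a sketch: I'm assuming that "except" means "this rule does not apply, keep checking the remaining ones", and the `DESKTOP_RULES` table and the string values of the two constants are made up for the example.

```python
import re

DESKTOP = "desktop"
MOBILE = "mobile"

# Each rule is (pattern that must match, pattern that must NOT match).
# The patterns are the same substrings used in the pseudo-code.
DESKTOP_RULES = [
    (re.compile(r"MSIE"), re.compile(r"PPC|PocketPC|Windows CE")),
    (re.compile(r"Gecko"), None),
    (re.compile(r"KHTML"), None),
    (re.compile(r"Opera"), re.compile(r"Mini|Mobile|Wii")),
]


def user_agent_segment(ua_str):
    """Return DESKTOP if any desktop rule applies, MOBILE otherwise."""
    for include, exclude in DESKTOP_RULES:
        if not include.search(ua_str):
            continue
        # Assumption: a matching "except" pattern only disables this
        # rule; later rules still get a chance to match.
        if exclude is not None and exclude.search(ua_str):
            continue
        return DESKTOP
    # Anything not recognized as a desktop browser is treated as mobile.
    return MOBILE
```

Feeding it the four sample strings listed earlier puts Internet Explorer 7 and Firefox 2.0 in the desktop segment, and the Sony Ericsson browser and Pocket PC Internet Explorer in the mobile one. Keeping the rules in a table rather than in the function body should make it easier to add new substrings later.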

If you're keen on this topic, please feel free to implement, test, extend, and improve the algorithm. My (mid-term) goal is to expand it in order to detect and differentiate mobile phones, ultra mobile PCs, and desktop environments (at least). Also, it could be interesting to infer which input mechanisms (i.e., modalities) are available - e.g., if a mobile phone is detected, we may infer a numeric pad (and possibly arrow/cursor keys) as the available input modality. This way, navigation on a Web site may be tweaked in order to facilitate user interaction, thus improving the user's experience and increasing one's satisfaction.

Wednesday, May 16, 2007

Afterthoughts

The last two weeks were really, really interesting. So interesting that my initial plans of blogging while at the conference went straight to the garbage bin, especially coupled with the five hours of sleep I averaged there.

The first four days at Banff were simply beautiful. Surrounded by the Rocky Mountains at the Douglas Fir, I managed to wake up every morning and go skiing with a bunch of good friends from all around the world at the Sunshine Village ski resort. In one word: beautiful. I even managed to do some black diamond runs, something I could never have imagined myself doing... But as a matter of fact, I did :) Awesome!

Returning to reality, being a volunteer for WWW and a presenter for W4A, some things had to be done. I moved to the jaw-dropping Fairmont Banff Springs Hotel, with an insane window view towards mountains, river, golf course, snow, forest... simply beautiful. It was a good omen, I thought. And I was right. It was.

W4A started. There were several interesting presentations on Web 2.0 technologies and how to leverage them regarding accessibility. Some were more technical, others geared towards research, but all of them interesting. My presentation went fine, despite some anxiety (oh, so typical of me...), and I got a bunch of interesting questions.

On the second day, as an assigned volunteer for W4A's room (thanks John!), I got to see the rest of the conference, and think about my own research goals. That's what conferences are for.

Heading to WWW itself, I was pleased to have the chance of hearing and seeing lots of great presentations on Browsers and User Interfaces, advances on standards from W3C's Technical Track, some cute demos on the Developers' Track, and loads and loads of Information Retrieval and Semantics. The plenary keynotes were simply great. Hearing Tim talking about WSRI - Web Science Research Initiative - and viewing those simple state charts summarizing the whole research process for the emergence of the Web Science field (and other Internet gimmicks such as e-mail) was definitely insightful. And viewing the Web as a set of linked data (as Brian reported), coupled with some chats I managed to have with Brian Kelly, Peter Brusilovsky, people from the DAISY Consortium, and others, gave me several thoughts and research directions for my PhD work.

Summing it all up: it's a must-go-to conference every year, and it's a must-go-to place for a vacation when possible!

Tuesday, May 1, 2007

WWW2007 and W4A

Today (May 1st) I'm heading off to WWW2007, in Banff, Canada. The beautiful Banff Springs Hotel will host this huge conference, where I'll be part of the volunteer staff, helping out when and where necessary.

Also, on May 7th, I'll be presenting at W4A - Web4All - some progress on my research work towards getting my PhD degree. Hopefully, I'll come back with lots of interesting ideas to improve my work.

The Bonus: since the conference starts on the 7th, I have five days of skiing and hanging out with lots of great people I met last year at WWW2006 in Edinburgh, some sort of a (well deserved?) vacation period :)

Concerning this blog... I guess I'll try to post some things while at the conference (live updates maybe?). After all, this is the Web conference!
