Saturday, July 18, 2009

Experimenting with SheevaPlug: prologue

I recently received my brand new SheevaPlug Development Kit. What is this exactly? Basically, it's a plug computer that runs on a low-power ARM (I think) CPU, has 512MB RAM, 512MB flash, USB 2.0 and Gigabit Ethernet. It only lacks video/sound output (hopefully just to trigger creativity from all of us ;).

It already has a kickass community doing a lot of experimenting with it: media servers, backup servers, or even home automation stuff. I fell in love with it, with its potential, and with its price ($99 + dementia-rated customs overheads in Portugal. Tip: expect around the same amount again in extra taxes. Ouch).

I have read a lot about what can (not) be done with a plug computer, and I'm eager to start working with it. But first things first. The development environment has been created mostly for Linux, and I'm running OS X. So I've downloaded VirtualBox and I'm currently installing Ubuntu on it, just to have a proper development environment kicking.

The next post about SheevaPlug will talk about the installation process of the development kit and getting something visible to boot.

The new old school DIY

I have a need to express myself. Lately I've been interested in exploring the DIY culture. That is, "Do it yourself". With so many things explored on the Web, lots and lots of ideas have sprouted in my mind. Some are already implemented, others exist only on paper, and others are still kept in the back of my head.

So, I like computers, music, and I'm not afraid of getting my hands dirty with software/hardware (especially a mix of both). Thus, these ideas express my thinking and my eagerness to really do something myself, just for kicks. As technological progress has kept its pace, we're no longer stuck with HAM Radio stuff. Moore's law is a blessing for the 21st century DIY movement too. Smaller hardware + better software = faster prototyping. Welcome to the new old school DIY.

Enough talk. Here's what I've been up to as of now (July 18th 2009):

  • Software-based sound generation machines:

    • an interactive tweet (as in bird chirp) generator, a beep pattern creator (kind of a drum machine, but with beeps), and a pretty configurable noisebox, all implemented in Processing;

    • an iPhone accelerometer-based theremin.



  • An Arduino Mega is already waiting for some junk to be plugged in, in order to transform the sound generation machines into something more physical.

  • Just got a SheevaPlug to hack some media server things, as well as some home automation stuff.

  • Hacking a bit in JavaScript on something that could remotely resemble Demoscene things.

  • Experimenting with multi-touch + processing.js (yes, JavaScript) interactive art on the iPhone.



Since I'm lazy, none of these projects have been made publicly available, yet. Ping me if you feel like contributing to some (or all) of these ideas, even if they stay at the vaporware level :)

Stay tuned, 'cause I'll be posting about these projects (whenever I feel like blogging). And that's all for a 1:48am Saturday rant.


P.S. Sorry for not posting for so long, but you can keep up with my musings on my Twitter account.

Monday, December 15, 2008

whit.me: the Whats, the Whys, and the Hows.

Well, not so long ago, I launched a service for URL shortening, whit.me. If you don't know that much about URL shorteners, please head to Wikipedia for a crash course. The canonical (and most famous) example of this type of service is TinyURL.

Thorough reviews of URL shortening services can be found elsewhere on the Web. This raises the question: why another URL shortener, since they pop up like mushrooms? Let's start by analysing what whit.me has in common with all of them:


  • Short URLs: well, it's the main purpose of the service. whit.me URLs have a typical length of 21 characters (e.g., http://whit.me/XXXXXX), but they can be reduced to 16 characters;

  • Automatic redirection: by default, following a whit.me URL will automatically redirect to the linked Web page;

  • Manual redirection: to prevent URL obfuscation (which can lead to annoyances such as spamming or rick-rolling);

  • Custom aliases: to help make short URLs more user-friendly;

  • Bookmarklet: always useful, to lower the barrier to using URL shortening services;

  • APIs: JSON (and JSONP) APIs provide simple endpoints to help an ecosystem flourish on whit.me's shoulders.
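
The length arithmetic from the first item can be sketched in a few lines of Python. This is only an illustration, assuming that short codes are base-62 encodings of a numeric ID (a common approach, but nothing above says that whit.me actually works this way); encode_base62 and short_url are hypothetical names.

```python
import string

# Assumed alphabet: digits + lowercase + uppercase letters (62 symbols).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
PREFIX = "http://whit.me/"  # 15 characters


def encode_base62(n):
    """Encode a non-negative integer as a base-62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def short_url(n):
    """Build the full short URL for a (hypothetical) numeric ID."""
    return PREFIX + encode_base62(n)


print(len(short_url(61)))       # one-character code: 15 + 1 = 16
print(len(short_url(62 ** 5)))  # six-character code: 15 + 6 = 21
```

Under this assumption, a six-character code covers 62^6 (roughly 56.8 billion) distinct IDs, which is why 21 characters is a comfortable typical length.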



Now, why might (should?) you change from your preferred/favourite URL shortening service to whit.me? Some highlights:


  • High availability: whit.me sits on top of Google App Engine, which ensures a high quality of service for whit.me and, consequently, ensures that whit.me short URLs will not suffer from link rot, or even service unavailability;

  • Multiple URLs: few services allow linking to multiple Web pages. The lack of one-to-many links has often been pointed out as one of the key problems with the Web (in comparison with other hypertext systems);

  • Annotation: in addition to the URLs, one can add a simple text note to enrich the context of the linked Web pages;

  • Integration in Web sites: By embedding a script into any Web page, existing whit.me URLs become active, by displaying a drop-down menu in-situ with all the URLs (view example);

  • iPhone-friendly: URL redirection pages also have an iPhone friendly user interface, which can be used e.g. for the creation of start pages for Web navigation (and properly bookmarked to the Home screen).
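
To give an idea of how multiple URLs, a note, and the JSON/JSONP APIs fit together, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the LINKS store, the field names, and the render function are hypothetical and do not describe whit.me's real API.

```python
import json
import re

# Hypothetical in-memory store: one alias maps to several URLs plus a note.
LINKS = {
    "example": {
        "urls": ["http://example.com/", "http://example.org/"],
        "note": "Two placeholder pages",
    }
}

# Accept only plain JavaScript identifiers as JSONP callback names,
# so the callback cannot be used to inject arbitrary script.
CALLBACK_RE = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$]*$")


def render(alias, callback=None):
    """Return the record as JSON, or wrapped in a JSONP callback."""
    record = LINKS.get(alias)
    if record is None:
        raise KeyError(alias)
    body = json.dumps(
        {"alias": alias, "urls": record["urls"], "note": record["note"]},
        sort_keys=True,
    )
    if callback is None:
        return body
    if not CALLBACK_RE.match(callback):
        raise ValueError("invalid callback name")
    return "%s(%s);" % (callback, body)


print(render("example"))
print(render("example", callback="showLinks"))
```

The JSONP variant is what lets a script embedded in a third-party page read the data across domains: the response is just a call to a function that the page defines itself.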



Such a simple service can, of course, be further expanded with other features (some present in competing services) such as spam detection, link analytics, personal link management, etc. (in no particular order). Naturally, whit.me will evolve in the future to cover these features. I have planned several other features (not present in any URL shortening service, as far as I can tell), which will increase its value from the perspective of all users (those who create/manage URLs, and those who just click on them). If all goes as expected, whit.me will be much more than a simple URL shortener: a nice platform for information/navigation management for the Web-savvy. More on this later on (you're free, and more than welcome, to follow me on Twitter to get updates on this as soon as they come out).

Developing this type of service is really simple, since there's no special magic or voodoo required to master. This was quickly hacked with a set of technologies: Google App Engine, Python, jQuery, and, of course, all outputting and manipulating the ubiquitous HTML+CSS Web combo. Some (probably) interesting tidbits/hacks in Python and App Engine's APIs have given rise to a nice simple framework that someday might be extracted and refactored into a small stand-alone project of its own.

However, reaching the sweet spot on User Experience is a difficult task. Keeping the user interface (UI) simple and attractive is not a trivial task, since whit.me supports more features than the common URL shortener. I believe that, after some iterations, it has reached UI stability. From now on, all features added to whit.me will probably take a while to be launched, to ensure that User Experience is maintained or, ideally, improved.

Stay tuned!

Tuesday, June 3, 2008

Writing Workflow for Scientific Articles

I'm a researcher. An important part of my work concerns writing peer-reviewed scientific papers, in order to expose my work in different scientific venues, such as symposiums, conferences, journals, and books.

Having excellent research work, with excellent results and findings, is insufficient to have a paper accepted. An important part of this process relates to exposing your ideas, your results. And writing papers is a really hard task. It's a mix of sweating to find the proper words and to put them in the proper places, with a fluid sequence of ideas and explanations. It's almost an art form, despite some fairly dogmatic (common-sense?) items that must be present, such as state-of-the-art review, introduction and conclusions, etc.

To achieve an acceptable quality in the writing process (assuming that the actual content is scientifically relevant, of course), I typically perform a well-defined set of tasks: research existing (and relevant) state-of-the-art work (hand-in-hand with the development of the research work and results gathering/analysis), organise high-level ideas into concepts, drill down, cite work, read, annotate, and iterate until reaching the desired result (or, more often than not, reaching the deadline).

This fairly complex and exhausting process can be eased a bit by using the right tools at the right time, in order to shift my focus towards Getting Things Done. That is, not worrying about crashing document editors, text formatting, citation formats, or print+comment+rectify/improve cycles. Just focus on structuring my ideas and writing them down in a coherent way.

Furthermore, the sheer amount of research work that is published every year in related venues makes it increasingly difficult to find needles in haystacks. That is, to find that research article in the piles of paper sitting on the desk, unorganised or, at best, stored on shelves. Obviously, this process doesn't scale. It's an evident role for digital technologies, especially for bibliography and citation management tasks.

I think that several researchers can relate to these scenarios. Hence, all of this blabber leads to my suggestion of a workflow optimised for writing scientific articles, tailored to the best software I could find. On OS X. I'm not sure if some of the software I'll be talking about in the rest of this post has counterparts on other platforms. If so, please feel free to comment and contribute with some thoughts and links.

LaTeX



No researcher in her/his right mind writes scientific articles with other software (unless it's specifically prohibited). LaTeX is a set of extensions to the TeX typesetting system, where one focuses just on document structure (i.e., abstract, sections, etc.) and on the content itself. LaTeX files are plain text files. They are parsed and processed with LaTeX software through one of the several flavours widely available on the Web, resulting in either a PostScript document (.ps) or a universally accepted PDF.

Despite some problems of LaTeX, such as the frequent lack of WYSIWYG software (due to its compiler-like typesetting nature), the results are of high quality and WYSIWYP (What You See Is What You Print). Add that to the almost ubiquitous availability of LaTeX templates on conference/journal websites, a really good automatic bibliography formatter (BibTeX), coupled with an almost dauntingly comprehensive number of utilities, and you've got yourself a must-have typesetting software for scientific papers.

There are several LaTeX distributions at one's disposal, for every platform. My preferred choice on OS X goes to MacTeX, since it is geared toward OS X's look&feel on supportive tools, as well as correct integration with the OS (read: it just works out-of-the-box).

So, LaTeX will be the centre around which the rest of my software choices gravitate.

Papers



As explained earlier, managing state-of-the-art and other relevant sources of information can be daunting. Either at a physical level (stacks of real printed paper) or at the digital level (folders), managing and searching through all papers to find that particular one you're looking for (and with a paper submission deadline lurking around the corner) is just cumbersome.

Papers will help you with this (too obvious a name for a piece of software!). It's a really good tool to manage, organise, and usefully leverage your entire collection of PDFs lying around on the hard drive. It integrates with well-known scientific digital libraries, including ACM, IEEE Xplore, arXiv, among many many others (and it's plugin-based for repository integration).

Despite the fact that one has to pay for a license to use it (€29, not that expensive), trust me on this one, it's worth the money. With Papers I can tag (i.e., multi-category), annotate, and search through my own repository within the program, as well as through Spotlight.

One more thing. It can export papers' metadata into the BibTeX format. This way, I can manage everything related to what I have to cite in a single program. It is the right hammer for the right nail.
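
For readers who haven't seen BibTeX, the exported format is simple enough to sketch in a few lines of Python. This is just an illustration of what a BibTeX entry looks like, not how Papers generates it; to_bibtex is a hypothetical helper and the metadata below is made up.

```python
def to_bibtex(key, entry_type, fields):
    """Format one BibTeX entry from a dict of field names to values."""
    lines = ["@%s{%s," % (entry_type, key)]
    for name in sorted(fields):
        lines.append("  %s = {%s}," % (name, fields[name]))
    lines.append("}")
    return "\n".join(lines)


# Placeholder metadata, invented for illustration only.
print(to_bibtex("doe2008example", "inproceedings", {
    "author": "Jane Doe",
    "title": "An Example Paper",
    "booktitle": "Proceedings of an Example Conference",
    "year": "2008",
}))
```

Once such entries sit in a .bib file, citing one from the LaTeX source is a matter of writing \cite{doe2008example} and letting BibTeX format the reference list.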

Scrivener



At some point it's time to put thoughts, ideas, and results into words. As I previously said, it's not easy. Almost no one can write a paper top to bottom, from the first word to the last. It's an iterative process that invariably starts with organising ideas in a coherent line of thought. That's when Scrivener comes to help.

Scrivener is a tool targeted to all writers that exploits the typical workflow of drafts, loose notes, and combining them into a consistent piece. It's fairly similar to scientific writing, minus some issues that I'll describe later on. One of its killer features is the full-screen editing mode. I've written an essay before about the benefits of full-screen applications, ergo Scrivener fits perfectly into this line of thought. It hides all other apps, animations, popups, and everything that might stand in the way of the writing process. This way, one focuses just on what's supposed to be done: writing that paper.

This software also supports research tasks (lato sensu), including searching the Web, bookmarking Web pages, as well as annotating text drafts. I do not advise performing all of these tasks within Scrivener. To put it simply: use it just to organise your ideas, structure your text in different drafts, and that's it. Papers and other software listed in this essay will streamline research and annotating tasks in a better way.

Oh, and did I mention that Scrivener exports into the LaTeX file format?


TextMate, Skim, and pdfsync



After having the core texts for the paper converted to LaTeX, one has to delve into details and typeset it. Editing it in a generic or non-specialised text editor is something from the last century. In these days of syntax highlighting and IDEs, a lot of choices are available.

Furthermore, LaTeX is a command line oriented software package. And that's how it should (continue to) be. However, one typically wastes too much time opening a shell and running a set of commands to typeset LaTeX documents.
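
For reference, that set of commands usually boils down to the sequence sketched below in Python. This is a minimal sketch, assuming a document with a BibTeX bibliography and that pdflatex and bibtex are on the PATH; typeset_commands and typeset are hypothetical names.

```python
import subprocess


def typeset_commands(basename):
    """Return the usual pdflatex/bibtex sequence for basename.tex."""
    latex = ["pdflatex", "-interaction=nonstopmode", basename + ".tex"]
    # First pass collects citations, bibtex builds the bibliography,
    # and two more passes resolve citations and cross-references.
    return [list(latex), ["bibtex", basename], list(latex), list(latex)]


def typeset(basename):
    """Run each command in turn, stopping at the first failure."""
    for cmd in typeset_commands(basename):
        subprocess.check_call(cmd)


# Show what would be run for a document called paper.tex.
for cmd in typeset_commands("paper"):
    print(" ".join(cmd))
```

Calling typeset("paper") runs the four steps for real. That's four commands for every single look at the result, which is exactly the kind of repetition worth handing over to an editor.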

To complete the workflow I've described earlier, this detailing and improving process includes annotating the paper with comments, highlights, strikes, underlines, etc. Since I'm talking about an all-digital workflow, the process of annotating and editing must be as simple as possible, mimicking the print-annotate-edit traditional process.

All of this can be easily avoided with a tailored LaTeX text editor plus some useful tools.

While other choices are available, my personal belief is that the workflow is better supported and streamlined with TextMate.

TextMate is an all-purpose text editor mostly targeted at programming tasks. It was popularised by the Ruby on Rails guys as a simple, lightweight, and GTD-friendly text editor (I strongly agree with this opinion). It also provides comprehensive support for different programming languages and, as you surely have figured out, supports LaTeX out-of-the-box.

With a few keystrokes, the typesetting tasks are instantly launched, and a user-friendly window presents any errors and warnings that might occur. Citations are easily managed, and syntax highlighting provides visual cues for LaTeX keywords.

While iteratively improving the paper, going back and forth between reading, annotating, and editing can be really tiresome. To mitigate this problem, two other tools can help you get back on track with the main task: finishing the paper. These tools are pdfsync (which is already bundled in LaTeX distributions) and Skim.

pdfsync provides the core support for jumping between the typeset PDF and the LaTeX source (with fairly good granularity). Setting it up just requires adding \usepackage{pdfsync} to your LaTeX preamble. After typesetting, a marker will appear on the PDF, showing where your cursor is located in the LaTeX source.

Vice versa, Skim supports the other direction (PDF towards LaTeX), since OS X's default PDF reader does not offer this functionality. Skim is supported by TextMate, which can be set up with just two mouse clicks.

As a bonus, Skim has built-in PDF annotation tools just like Adobe products, with the added benefit of being free (as in beer) and really lightweight.

OmniGraffle



The last thing I'll be talking about in this essay concerns creating vector-based figures. One of the beauties of PDF (and PS, for that matter) is that it's a vector-based file format. It means that it's resolution independent. Consequently, it is desirable that, whenever possible, all figures embedded into the paper are vector-based as well.

My preference for creating figures is OmniGraffle. It's a lightweight and easy-to-use piece of software that provides intelligent guides to create vector-based figures that are coherently aligned, well dimensioned, and easy on the eye. Remember that a good figure can be worth one thousand words. A poor quality figure (e.g., misaligned shapes) conveys an amateurish approach to the work, which can be negatively reflected in the peer reviewing process. High quality graphics do help improve the paper's overall quality. After using it, you'll be constantly reminded that it's an excellent piece of software when you have to use Microsoft Visio or any other diagram software of lesser quality.

Add to that the fact that it supports 100% vectorised PDF exporting, which can be directly embedded into LaTeX files, and you've got a high quality research paper ready to be submitted, peer-reviewed and, hopefully, accepted!

Ending remarks


I hope this info will help you lower the burden of the logistics of writing scientific papers. While I'm not an expert on all of these topics, all of this comes from my 4 to 5 years of experience working as a researcher. Once again, this is not an exhaustive list of software and workflow. It's just my own experience being described.

I'm sure there are a lot of things that I may have missed, and better software out there. I'm still missing two significant pieces of software that could integrate seamlessly into my workflow: WYSIWYG table and equation editors, integrated into TextMate. It would be great to select a table or an equation, and edit it without having to know a bunch of macros.

Therefore, please feel free to comment, and make corrections and suggestions. I believe it's important that researchers spend their time on researching, not wasting it on avoidable pitfalls in the writing process.

And now, back to that pesky paper I'm writing...

Tuesday, May 27, 2008

On open data from research experiments

Open data. In the spirit of my instalments on opening data on social networks (part one, two, and three), I've actively promoted some data I've gathered in the context of a paper published at W4A.

The raw data concerns an accessibility assessment of nearly 8000 Web pages, and is licensed under a Creative Commons Attribution License. While this data is provided in CSV format, I'm currently working on making this type of assessment readily available as linked data, thus allowing (hopefully) more insightful discoveries of Web accessibility at large scales.

You can find the data, software, and associated publications list - side by side with some descriptive texts - in my PhD's work Web page. Use the data at your will, but do not forget to give credit where due :)

Wednesday, April 30, 2008

Back from WWW2008

It happened once again. The most important conference on the Web was hosted in Beijing, China.

Fortunately, I was able to attend it once again, but only after an entire week on vacation. I have to say that I loved China. A huge country, filled with great, kind people. I had the chance of having three buddies with me on this trip, which was great!

Shanghai. For those who know (and like) New York City, you will feel comfortable in Shanghai. A very cosmopolitan city, filled with skyscrapers. Lots of excellent street food (even for a vegetarian like me). Lots of bargaining, which I translated into a shiny new Canon EOS 40D for half the price, and a brand new Sony Cyber-shot T300. You can see their quality in my Flickr account.

Beijing. The great city in the north. A HUGE city, may I say. Had the pleasure of travelling to the Great Wall, more specifically to the Mutianyu section.

Conference days. I presented at W4A, where I received the best paper award. Great news! On the second day I presented at WebEvolve, the Web Science Workshop. The next three days were spent enjoying the main WWW conference, with excellent keynote speeches, excellent papers (some Google guys presented a PageRank for images there), and excellent food.

I strongly recommend that everyone go to China when possible. And WWW, well, it continues its excellence in research and industry.

Friday, April 11, 2008

WWW2008

THE Web conference will happen once again. This time, it'll be hosted in Beijing, China. It's the main venue for both researchers and industry that have a stake in the Web's past, present, and future.

I'll be there too, presenting my humble research work both at W4A and WSW. If you're attending, please feel free to drop by!