HUGUK HBase at Mendeley Presentation

Last Friday I gave a talk on how we use HBase at Mendeley, which goes into more detail on the work I’ve been doing over the last year. It’s a summary of how the datamining team started out using MySQL and how and why we moved to HBase for most of our data storage and processing. It also describes some of the work we do in the datamining team.

You can find more details of the talk and a video of the presentation here: http://lanyrd.com/2010/huguk7/sxbt/

The afternoon generally went very well, and I met many interesting people in London using Hadoop and HBase. There were also some people from California stopping over in London after a conference in Brussels. So it was great to meet people like Jonathan Gray, one of the HBase committers working at Facebook, and to learn how they are using it in their new messaging service; Stack, another HBase committer, from StumbleUpon; and Tom White, who started the whole Hadoop world off. It’s a great community and an interesting time to be part of it.

For more from this event you can see all the slides here: http://lanyrd.com/2010/huguk7/ and learn more about the UK Hadoop users group, which I’ll now be running: http://huguk.org/

Science Hack Day, Mendeley YQL Tables

It’s been a long time since my last post. I’ve mostly been busy down in London working at Mendeley and the time has gone very quickly. This recently led me to the Science Hack Day last weekend, where I met some great people, learnt many things and hacked on some YQL tables for Mendeley’s new API for their research data.

YQL is quite a cool piece of technology which basically lets you treat the web as a big SQL table, so you can effectively join Twitter to a Google search, then grab some usage statistics from the Mendeley API, for example.

I’ve put the Mendeley YQL tables up online to play with in the YQL console. With them you can do queries like :-

SELECT * FROM mendeley.search WHERE query = "information retrieval"

And then start joining our different api calls together like this :-

SELECT title, year, stats FROM mendeley.details WHERE id IN (SELECT id FROM mendeley.tags(50) WHERE tag = "genetics") | sort(field="year", descending="true")
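To make that second query a bit more concrete, here is a rough Python sketch of what it does conceptually: the sub-select collects ids for a tag, the outer select looks up details for each id, and the piped sort orders the rows by year, newest first. The function names and the sample records are invented purely for illustration and are not real Mendeley data or API calls; reading `(50)` as a cap on the number of ids fetched is also my assumption about how the table limit behaves.

```python
# Illustrative stand-ins only: these functions and records are hypothetical
# and do not call the real Mendeley API or YQL.

def tagged_ids(tag, limit=50):
    """Plays the role of: SELECT id FROM mendeley.tags(50) WHERE tag = ..."""
    fake_index = {"genetics": ["a1", "b2", "c3"]}  # made-up ids
    return fake_index.get(tag, [])[:limit]


def details(doc_id):
    """Plays the role of one row from mendeley.details for a given id."""
    fake_docs = {  # made-up records
        "a1": {"id": "a1", "title": "Paper A", "year": 2004, "stats": {"readers": 12}},
        "b2": {"id": "b2", "title": "Paper B", "year": 2010, "stats": {"readers": 3}},
        "c3": {"id": "c3", "title": "Paper C", "year": 2007, "stats": {"readers": 40}},
    }
    return fake_docs[doc_id]


def tagged_details(tag):
    """Sub-select, outer select of three fields, then sort by year descending."""
    rows = [details(doc_id) for doc_id in tagged_ids(tag)]
    projected = [{key: row[key] for key in ("title", "year", "stats")} for row in rows]
    return sorted(projected, key=lambda row: row["year"], reverse=True)


if __name__ == "__main__":
    for row in tagged_details("genetics"):
        print(row["year"], row["title"], row["stats"])
```

The point of YQL is that you get all of this in one line of query text, without writing the glue code yourself.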

I’m trying to get them into the community tables on GitHub soon so that they can be more easily used by everyone, and so I don’t have to keep them up on Dropbox (however amazing that is for rapid prototyping of YQL tables).

Aego 2 Speaker Innards and Repair

About 6 years ago I bought a pair of Aego 2 speakers from Acoustic Energy, who make some very good speakers; they were the best for the price at the time and still are today. After a while the left channel started cutting out intermittently, and I found that, worryingly, tweaking a jack in the front input fixed it. Eventually even this didn’t fix it reliably, so I decided to take it apart and try to fix it…

IMG_4298 IMG_4296

Whilst taking it apart I found the speakers were very well built and surprisingly simple inside, apart from the circuits.

IMG_4303 IMG_4290

I found that three of the solder joints for the jack input on the front panel had broken! This was probably from the many times it had been moved around the country as I’ve lived in various places. The broken joints would have caused this problem because the input from the back goes through the front input, so that it can be switched off when a jack is inserted in the front; with the broken joints it was cutting out more than intended. I hoped that just re-soldering them would fix the problem I was having.

IMG_4313

Once I’d re-heated the solder on each of the points and added a bit more solder they all joined properly again and the problem I was having before had gone. So after putting it all back together I’ve got a great pair of speakers working fully again.

M.Sc. thesis and job hunting

Well, it’s been a long time since I last posted! About 12 weeks, which I’ve mostly spent working on my M.Sc. thesis and finding a job for when I finish in Edinburgh at the end of August. Both are done now, so I’ve got a bit more free time again (which will mostly be spent enjoying the Edinburgh festivals for a week!).

The thesis changed slightly from influenza tracking to trying to forecast the belief of the population about the recent swine flu outbreak. This ended up looking at ways of extracting and aggregating information from Twitter and blog posts, then trying to forecast the value of a prediction market. Prediction markets and text mining are both very interesting, so I’ve learned a lot and enjoyed the whole project. I’m going to write about the work and what I found from it soon, summarising the interesting bits from 70 pages!

On the job hunting front, I have a job at Mendeley down in London starting mid September, which I’ll hopefully blog about when I start there. The work they do is very interesting, trying to bring science up to speed on the web. They describe themselves as “Last.fm for research”, which kinda gives the scope of their goals. So I’m finally leaving academia, but not quite, as the work I’ll be doing will very much be helping many people doing research around the world.

Influenza tracking project

Not long now until I start working on the project to try and track influenza through blog posts. I’ve updated the project page to include a link to my proposal for a few more details, and I’ll hopefully update that properly once my exams are over.

The recent swine flu will also bring some interesting points to the project, like the fact that I’ve just searched for it only to check the spelling, so how would Google know that? There has been a bigger buzz in the media, on blogs, and on sites like Twitter than the actual spread of the disease warrants, which has been quite slow and not that frequent compared to the size of populations. Maybe a model of media news sources will also be needed to find what proportion of the noise on the web is really a flu signal.

I also found a site called DIYcity which started a project called SickCity a few months ago, and which accelerated work on it with the outbreak of swine flu. With this they try to track the trends of a range of illnesses over cities in the world, which as they found is a very hard task! I’m going to see if I can help them with their goals a bit, as their aims overlap quite a bit with my summer project, and it would be nice to have some usable data for people at the end of it. We’ll see where things go once I start working in June, stay tuned.

The DLNA experience

Carrying on from my previous post, this blog post http://gxben.wordpress.com/2008/08/24/why-do-i-hate-dlna-protocol-so-much sums up my experience with DLNA quite well.

So basically stay away from DLNA if you want to stay sane. Just build an HTPC and use something like XBMC instead; you’ll get a far better experience from doing so.

The Samsung LE46A756R

We’ve just got this Samsung LCD TV at home and I’ve been seeing what it can do, so I’ll give it a quick review here. The picture is very clear and sharp, even from some SD sources, so it’s quite impressive. It also comes with the HTZ310 Home Theatre System, which is an up-scaling DVD player and 5.1 surround system in one. The player looks great with DVDs and can also play MPEG-4 from discs or USB sticks (but not USB drives…). The included speakers sound really good and it supports both Dolby Digital and DTS signals.

The inputs on the TV consist of 4 HDMI, 1 Component and 2 SCART, although only one of the SCARTs is RGB; the other is just composite. The TV has optical and phono outputs for sound, but the optical output only sends a stereo signal and not the Dolby Digital or DTS that can come in via the HDMI ports. This is quite annoying, as an advantage of HDMI is that it can carry both video and sound, but this TV won’t output the sound part of the signal. This means that if you have more than one source of digital sound, say Cable + Console, then you need another device to switch between them. As the TV comes with the Home Theatre System, which only has one optical input, this would need to be an optical switch, though if you got another amp that supported pass-through video you could use that to switch both the video and audio instead.

The TV also has DLNA support, so it can connect to media servers on the network and play music, pictures and videos from them. We did this before using XBMC, which meant that it had quite a lot to live up to in terms of UI and format support. Unfortunately the format support is quite limited, partly due to the DLNA protocol, so I got SD videos and pictures to work well but HD videos and music were far more complicated. As there was not much to be done to improve this, we’ve decided to create an HTPC using an old PC to play media back. I’m thinking of using an HD 2600 Pro AGP card with HDMI output to do AVC and VC-1 decoding, and the LC12B case from SilverStone. Then on top of this we can use the Windows version of XBMC, which seems to be moving very fast and still has the best UI, although Windows Media Centre 7 is also tempting. I’ll post about how that goes once it’s up and running.


About Me

I'm a student at Edinburgh University studying Artificial Intelligence. Find out more about me and my projects on my website
