Don’t tell anyone yet, but we’ve got a big project going live soon which aims to help the NHS save an awful lot of money. It turns out that by altering GPs’ prescribing behaviour for a few drugs, swapping generic for proprietary forms where appropriate, it’s possible to save hundreds of millions of pounds a […]
From the Bethnal Green Ventures Demo Day – my 5 minute version of what we do, told mostly through the medium of Muppets. Not the most technically deep talk I’ve ever done, but I’m actually pretty pleased with this as a very quick teaser on what we do and what it’s for. Now to work on the […]
(if you now have an earworm, congratulations, we are of the same generation) In October you’ll have a few chances to see the very rare and shy Team Mastodon in public. On October 2nd, Fran will be talking at O’Reilly Strata London on How to Make Big Data Massively Greener (contact us for discounted tickets for the […]
A friend asked me this week what the difference is between using Hadoop and its related ecosystem for data storage and analysis, and using a traditional Data Warehouse. You might want to skip this post if you’re already way ahead on this topic, but for everyone else, I thought I’d try and clarify… A Data Warehouse is […]
Hive is a SQL-like interface onto Map Reduce. It feels nice and familiar to analysts who are used to thinking in a SQL paradigm, but it has some nasty gotchas that can make jobs verrrrrry slow or make them fail altogether. Either way, you waste a lot of time, blood pressure, and machine hours. I […]
We added Rackspace calculations to our footprint data this week, as we’re keen to start running our Hadoop clusters over more infrastructure (in the greenest possible way) and we’re sure you are too. I was getting a bit despondent about our sustainable options for running big computation, so was super excited to learn after talking […]
We show live footprint estimates for right now on our dashboard at https://www.mastodonc.com/dashboard, but it’s also pretty interesting to visualise the different locations over time, to get a feel for the size of the fluctuations and see how the horse race between time zones and temperatures plays out. This chart shows some real ratings for May. It’s got some […]
Yet another GigaOm article with a linkbait headline but genuinely interesting content, which is well worth a read – “The controversial world of clean power and data centers”. It’s great to see this issue becoming increasingly mainstream, and in particular businesses starting to be more and more clear about the fact that being efficient is distinct […]
GigaOm published this interesting and provocative article last week, headlined ‘Why the days are numbered for Hadoop as we know it’. It’s actually a pretty good argument, despite the sensational headline. The tl;dr, as I interpret it, is that: Hadoop is an excellent solution for managing and processing massive datasets on commodity hardware – hence makes big […]
An interesting thought on waterfall, agile and long-term vision came up on the London Java Community mailing list. Richard Gomes asked, quite rightly, if agile and lean reject long-term planning and vision. People who know me will know that I don’t agree. For me, the problem with waterfall is that we end up waiting far too long […]