Showing posts from 2014

Sending out Storm metrics

There are a few posts talking about Storm's metrics mechanism, among which you can find Michael Noll's post, Jason Trost's post and the storm-metrics-statsd GitHub project, and last but not least (or is it?) Storm's documentation.

While all of the above provide a decent amount of information, and one is definitely encouraged to read them all before proceeding, it feels like in order to get the full picture one needs to combine them all, and even then a few bits and pieces are left missing. It is these missing bits I'll be rambling about in this post.

Dependency Injection - The good, the bad and the ugly

The Good
Dependency injection (DI, a.k.a. IoC - inversion of control) is a well-known technique to increase software modularity by reducing coupling between modules. To provide the benefits of DI, numerous DI frameworks have arisen (Spring, Guice, Castle Windsor, etc.), all of which essentially give you "DI capabilities" right out of the box (these frameworks tend to provide a whole lot more than just "DI capabilities", but that's not really relevant to the point I'm about to make). Now, to remove the quotes around "DI capabilities", let's define it as a DI container - a sack of objects you can manipulate using a provided API in order to wire these objects together into an object graph that makes up your application.
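To make the "sack of objects" idea a bit more concrete, here is a minimal sketch of a toy container, written in Python purely for brevity. It is not Spring's (or any real framework's) API; every name in it (`Container`, `register`, `get`, `GreetingService`, `InMemoryRepository`, `build_container`) is made up for illustration, and it assumes singleton scope and no cycle detection.

```python
class Container:
    """A toy DI container: named factories plus a cache of built objects."""

    def __init__(self):
        self._factories = {}
        self._singletons = {}

    def register(self, name, factory):
        # The factory receives the container so it can look up its own dependencies.
        self._factories[name] = factory

    def get(self, name):
        # Build each object at most once, then reuse it (singleton scope).
        if name not in self._singletons:
            self._singletons[name] = self._factories[name](self)
        return self._singletons[name]


class InMemoryRepository:
    """Hypothetical dependency with no dependencies of its own."""

    def find_greeting(self):
        return "hello"


class GreetingService:
    """Hypothetical service: its dependency is handed in, never constructed here."""

    def __init__(self, repository):
        self.repository = repository

    def greet(self, who):
        return f"{self.repository.find_greeting()}, {who}"


def build_container():
    # The wiring lives in one place; this is the object graph of the application.
    container = Container()
    container.register("repository", lambda c: InMemoryRepository())
    container.register("service", lambda c: GreetingService(c.get("repository")))
    return container


if __name__ == "__main__":
    print(build_container().get("service").greet("world"))
```

The point of the sketch is that `GreetingService` knows nothing about how its repository is created, so swapping the repository only touches the wiring in `build_container`.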

I've worked on quite a few projects employing Spring, so it will be my framework of reference throughout the rest of the post, but the principles and morals apply just the same.

Continuous Deployment - Not Without Modularity

When asked what a module is, no two people give the same answer. The interesting thing is, that doesn't keep modularity from being one of the most pronounced words when it comes to the desirable set of attributes a software system should ideally have. It's somewhere just next to scalable and robust, in no particular order.

IntelliJ - You auto complete me (2)

A short one.

I've already posted about IntelliJ in the past, and judging by my enthusiastic PR for IntelliJ IDEA someone might mistake me for a shareholder. Well, I'm not, though come to think of it, it might actually be a good IDEA (pardon the pun). I just love working with IDEs that understand developers and make things easier on them, and IntelliJ does just that. I've recently been brushing up my (automatic) refactoring skills, and IntelliJ has some really awesome stuff to offer in that department. Truth be told, there are some issues here and there, but hey, I have my bad days too.

All in all, for me it's done the unbelievable job of making programming in Java ... kinda fun, not a simple task by any means. I even managed to enjoy editing a bash script the other day, I mean, that's unprecedented! All that before I even had a chance to try out the Ultimate edition, as I'm working with their (free!) Community version.

Speaking of community, check out this Obs…

Finding a needle in a Storm-stack

(Crossposting from Outbrain's techblog)

Using Storm for real-time distributed computations has become a widely adopted approach, and today one can easily find more than a few posts on Storm's architecture, internals, and what have you (e.g., Storm wiki, Understanding the parallelism of a Storm topology, Understanding Storm internal message buffers, etc.).

So you read all these posts and got yourself a running Storm cluster. You even wrote a topology that does something you need, and managed to get it deployed. "How cool is this?", you think to yourself. "Extremely cool", you reply to yourself, sipping the morning coffee. The next step would probably be writing some sort of validation procedure, to make sure your distributed Storm computation does what you think it does, and does it well. Here at Outbrain we have these validation processes running hourly, making sure our realtime layer data is consistent with our batch layer data - which we consider to b…