The Long Tail Paradox - You don't need a tail
August 09, 2006 at 09:15 AM | categories: python, oldblog
This is something I've been itching to write for a while, and I'm not sure I've got my analogies and descriptions right yet, but the gist is more or less right (perhaps less than I'd like): the long tail sucks, viva le long tail...
The long tail - or rather Zipf's Law - has been getting a lot of
publicity recently, which is nice in some ways, but also surprising in
others. I first came across the long tail 8 years ago when I started
work at the Janet Web Cache Service in a variety of papers from the 3rd
web caching workshop (which had been held in Manchester just before I
started working there). The one thing that struck me was the fact that
Zipf's law is a fundamental aspect of human behaviour (as fundamental
as fire).
As a result, when people started putting catalogues on the internet,
whole new communities - business communities - started to see
the effects of the long tail of human behaviour on their profits.
Intuitively we all understand the long tail - we know that we want
things that suit us, that match our desires.
However the reason why "hits" and "hit culture" took off is simple
to explain - we all want quality. We don't want to choose from
20,000,000 things - this for me is the nightmare of the record store
(online or real). Ask me what music I like and I'll say "good stuff".
Don't ask me if it's garage or rap or indie, I don't know. I don't care
about that subdivision, that label. I like cool stuff. I can point at
stuff by Queen, stuff by Rob Dougan, Pictures at an Exhibition, the
theme tune from twin peaks, Ernie the Milkman, Lilly the Pink, and so
on. I don't classify my likes. As a result music stores suck for me (of
almost all kinds).
Expanding that choice by even double, let alone 10 fold or 20 fold,
leaves me cold. I'm aware that this makes it more likely that it is
_possible_ for me to find something I'll like, but my ability to find
it decreases as that choice size increases. This is Mooers' law in
action. The usefulness of the system to me decreases as the amount of
information increases.
I'm not alone. (I might be one datapoint, but I'm not arrogant
enough to believe that, on a planet with 6 billion people, I'm that
different from other people.)
That's why top tens are good. It's why, if I'm at an airport before a
long flight, I'm pleased that they have a top ten (or a variety of
top tens). It's why I'm pleased that shops tend to operate on a
principle of "survival of the fittest". If a book is good, it's likely
to stay on the shelves (through restocking). If it's not, it's likely
to disappear or get covered in dust. The smaller the shelf space, the
more important those decisions become. Too much dust and you go out of
business. Sure, the books available are less likely to be a good fit
for me, but they're also more likely to be closer to the head than the
tail (The tail being where the likelihood of it sucking for me
increases).
This for me is the real issue. Why do you see Zipf distributions?
Because by and large the values as to what is good are shared by many
people; we do tend to have similar likes, on some levels, to other
people. That is why places like Amazon are particularly good: they
don't just operate a long tail - every online bookstore does that -
they allow people to gain insight into what's going on in that long
tail. Similarly, Google News has the ability to look at what thousands
of journalists worldwide have chosen to write about and chosen to
publish. They then allow you to look at the head of that snake, by
time, date, a search or a combination of all of these.
These services make the head & body of the long tail visible,
which in essence is what we all want anyway, though personal to us.
The long tail exists because our tastes all subtly differ, and we all
want quality. What you think is terrible, I might think is great. I
still remember seeing Spawn at the cinema, how much fun I found it, and
how good a film I find it, and yet, I'm still to find another person
who agrees. The phrase "so bad it's good" is a cliche, and with good
reason. If I say "The Matrix" however, you'll find lots of people who
agree that it's a cool film.
As a result, the head of the snake is useful. The head of the snake
is a means of navigating yourself to content that lots of other people
who may share similar tastes to you think is good for some reason. If
you make the place to choose from attractive to a wide audience who
choose from the wide variety of content, then the head of that snake
will be attractive to that wide audience. And that's why hit culture
took off. As long as everyone was choosing from the same pot and the
reporting on that pot was accurate, then the top 10, top 40, top 100
was useful. That's why the top 40 in your local supermarket might be
more relevant to you than a general top 40.
The really interesting aspect of things like recommendation engines is
that they're personalising this snake. They're turning the snake into a
hydra, and each head is a real user.
However, the interesting point is this: caching makes sense. To be
effective, a cache has to identify the head of the snake. By holding
the head of the snake while still making the tail available, the cache
gives a time benefit to the user and a cost benefit to the provider of
the content. What does this mean in the context
of a long tail? It means that small stores can exist, and can stock a
wide variety of useful content, and can even use simple heuristics to
make money. This is essentially what web caching does after all.
And why does caching make sense? It identifies the head & body
of the snake, allowing you to take advantage of the fact that the head
and body have the same business or bandwidth value as the entirety of
the tail, which is a choice set many, many, many, many times larger.
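To put a rough number on that claim, here is a minimal sketch. It assumes an idealised Zipf distribution where the item at rank r is requested in proportion to 1/r (exponent 1); real request logs will not follow that exactly. The function name head_size_for_share and its parameters are made up for this illustration.

```python
from itertools import accumulate


def head_size_for_share(n_items, share=0.5, exponent=1.0):
    """Return how many of the top-ranked items account for `share` of all
    requests, assuming rank r is requested in proportion to 1 / r**exponent."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_items + 1)]
    total = sum(weights)
    # Walk down the ranking until the running total reaches the target share.
    for count, running in enumerate(accumulate(weights), start=1):
        if running >= share * total:
            return count
    return n_items  # only reached if rounding leaves the running total just short


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        k = head_size_for_share(n)
        print(f"{n:>9} items: the top {k} carry half the requests "
              f"({k / n:.3%} of the catalogue)")
```

Under that assumption, the number of items you need to hold to cover half the demand grows far more slowly than the catalogue itself, which is the effect a cache relies on.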
Now, I'm not a business person (by choice), but I'm savvy enough to
realise this: if a web cache (a fixed amount of storage, and so a fixed
amount of choice) can cope with the vagaries of an effectively infinite
choice Zipf distribution, and still turn a profit (i.e. be worth
running), surely the
same can be true of a business. You don't have to say "we'll stock
everything", merely being able to get everything, and be able to serve
the high quality stuff (as chosen by that audience) is sufficient.
Furthermore, it's entirely likely that, given a sufficiently "good"
recommendation engine, the amount you stock can be kept small.
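As a sketch of the fixed-size store idea, the following simulates a small cache serving requests drawn from a Zipf-like distribution. It is an illustration under stated assumptions, not how any real web cache is implemented: the eviction rule is plain least-recently-used (LRU), the exponent is 1, and simulate_lru with all of its parameters and defaults is invented for this example.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def simulate_lru(cache_size, n_items=10_000, n_requests=50_000,
                 exponent=1.0, seed=1):
    """Return the hit rate of an LRU cache fed Zipf-like requests."""
    rng = random.Random(seed)
    # Cumulative weights: rank r is requested in proportion to 1 / r**exponent.
    cum_weights = list(
        accumulate(1.0 / (r ** exponent) for r in range(1, n_items + 1))
    )
    requests = rng.choices(range(n_items), cum_weights=cum_weights, k=n_requests)

    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for item in requests:
        if item in cache:
            hits += 1
            cache.move_to_end(item)        # mark as most recently used
        else:
            cache[item] = True             # fetch from the "tail" and stock it
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used item
    return hits / n_requests


if __name__ == "__main__":
    for size in (100, 1_000):
        rate = simulate_lru(size)
        print(f"shelf of {size:>5} out of 10,000 items: hit rate {rate:.1%}")
```

The store never decides up front what the head is. It stocks whatever was asked for most recently and lets everything else fall off the shelf, so the shelf drifts towards whatever the audience is choosing.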
The paradox of the long tail is this: you don't need a tail to take
advantage of it, a virtual tail is sufficient - as long as you're
willing to change your body and head to match the whims and desires of
those choosing. If you can provide insight into that long tail, and
shift content into a local store - and turn that tail into a body &
head - then you increase the value of your proposition to the audience,
and they will move your store along the long tail of online stores,
further towards the head and away from the tail.
After all, if you could go into a store on the high street and say "give me something cool to listen to", and they did, and every time you went there not only did they give you something cool, but it got cheaper with time, surely you'd go back? You'd stop caring about the size of the tail, as long as you could get at it.