Making connections

The web is a failed information management system.

What is odd about that statement is not that the attempt has failed – I don’t think I have ever heard of any other fate for an information management system – but that the fact of the attempt has been so completely forgotten.

Information is everywhere, of course. The public web, or at least some parts of it, is densely populated with links. Following a chain of them long beyond the answer to any question you might have started with is the road trip of the internet age. But beyond a still-thin surface layer, many end points of links remain resolutely no through roads.

Communications network

The idea of hyperlinks long predates the web. The hypothetical Memex engine dates back to 1945 and a recent article in the Atlantic takes the story back to the nineteenth century. More recently, everybody knows that Tim Berners-Lee invented the world wide web,[1] but there is much less understanding of what it was he thought he was inventing.

Berners-Lee described the problem he was trying to solve in his famous paper proposing a new information management system for CERN:

CERN is a wonderful organisation. It involves several thousand people, many of them very creative, all working toward common goals. Although they are nominally organised into a hierarchical management structure, this does not constrain the way people will communicate, and share information, equipment and software across groups.

The actual observed working structure of the organisation is a multiply connected “web” whose interconnections evolve with time. In this environment, a new person arriving, or someone taking on a new task, is normally given a few hints as to who would be useful people to talk to. Information about what facilities exist and how to find out about them travels in the corridor gossip and occasional newsletters, and the details about what is required to be done spread in a similar way. All things considered, the result is remarkably successful, despite occasional misunderstandings and duplicated effort.

A problem, however, is the high turnover of people. When two years is a typical length of stay, information is constantly being lost. The introduction of the new people demands a fair amount of their time and that of others before they have any idea of what goes on. The technical details of past projects are sometimes lost forever, or only recovered after a detective investigation in an emergency. Often, the information has been recorded, it just cannot be found.

The solution he described combined technology and usability, recognising from the outset that people would use something which was attractive and useful:

The aim would be to allow a place to be found for any information or reference which one felt was important, and a way of finding it afterwards. The result should be sufficiently attractive to use that the information contained would grow past a critical threshold, so that the usefulness of the scheme would in turn encourage its increased use.

That’s a fine ambition which became an information revolution and it’s pretty clear that the ‘critical threshold’ was passed quite a while back. But the initial problem Berners-Lee described still sounds uncannily familiar today, and is still a long way from being solved. As I wrote a while back:

One of the purposes of this blog is to help me find things I half remember thinking five years ago. I have no equivalent tool at work for finding my thoughts, let alone anybody else’s. That’s an important reason why so much energy is devoted to the reinvention of wheels.

There has been a flurry of recent coverage for the brave study by the World Bank which shows that a third of their policy reports are never downloaded and almost 90% are never cited (though if I have understood their methodology correctly, my citing their paper on the citation of papers would not be counted as a citation, so the precise numbers should not be taken too seriously). But although the coverage has included wry comments about the fact that a report about how pdf documents are little read is itself a pdf document, I haven’t seen any recognition of a more fundamental problem. The introduction to the World Bank report is a statement of why knowledge and the sharing of knowledge matter, including (with the emphasis in the original):

Internal knowledge sharing is essential for a large and complex institution such as the Bank to provide effective policy advice. Bottlenecks to information flows create inefficiencies, either through duplication of efforts and diverting resources from knowledge creation itself.

With that thought in mind, it turns out that the report does not link to any of the published material it refers to. It has a long list of references, many of them to other papers by the World Bank itself, but in virtually all cases they are textual descriptions, not links.[2] It’s a dead end not because it’s a pdf, though that doesn’t help, but because it is constructed as an end point, not as a node in a network.

I have laboured that point a bit not because I care greatly about the information management practices of the World Bank, but because I suspect they are distinctive more in the visibility of what they do, than in the doing of it.

Most of the material I see in my working life is self-contained and very little of it makes explicit connections to other information.[3] There are two big reasons for that (as well, no doubt, as a host of smaller ones).

The first is technical. You can only link to something if you know where it is now. There is only any point in linking to anything if you can be confident that it will still be there next week and next year (and in some cases, next decade). That requires information to have a permanent, canonical location at an appropriate level of granularity and for the arrangement of information to be more durable than the arrangement of work.

The second is cultural. You will only link to something if doing so is seen as valuable (and if doing so both is and is perceived to be easy to do). Links are most likely to be seen as valuable by people who might choose to follow them. Following links is easy for somebody reading on a screen, but impossible for somebody reading on paper. Reading on a screen is easier if the material is designed to be read that way, not just in layout but in information richness.  So there is little chance that links will flourish in an environment where most information is designed for presentation on paper (even if it is actually sometimes consumed on screen).

Any solution to the information management problems of organisations needs to address both the technical and the cultural issues. The technical solution is necessary but, wholly unsurprisingly, it falls very far short of being sufficient. Even with the network in place to support a much more web-like approach, we cannot hope to consume information that way until we start producing it differently.

But if we succeed, there are prizes well worth having here, which go far beyond better information retrieval. As Tim Berners-Lee speculated a quarter of a century ago:

In providing a system for manipulating this sort of information, the hope would be to allow a pool of information to develop which could grow and evolve with the organisation and the projects it describes. For this to be possible, the method of storage must not place its own restraints on the information. This is why a “web” of notes with links (like references) between them is far more useful than a fixed hierarchical system.

The need hasn’t changed in the last twenty five years. Perhaps we should try the solution.

  1. Apart from the people who persist in believing that he invented the internet.
  2. There are precisely two clickable links, both to posts in the same blog – but bizarrely the links are to the blog’s homepage rather than the specific posts being cited, so even those don’t help as much as they should.
  3. The one big exception to that is emails which contain long chains of their predecessors, but the less said about that the better.

A bus company with a train set

Quick question: what’s the dominant form of public transport in London?

And an irresistible second quick question: what is wrong with this picture?

TfL beta site top banner

We will come back to the second question, but if your answer to the first was the tube, you can be forgiven. That’s the most distinctive, most high profile part of what Transport for London provides, as well as being where most of the money goes. But the heavy work of moving people around London is done by buses: there are about twice as many bus journeys as tube journeys and there are almost 20,000 bus stops against a mere 270 underground stations.

One reason for being misled into thinking that the tube is all that matters is that that is what TfL itself seems to think. I have written before about how the ticketing system treats buses as an afterthought and about the poor information design of bus arrival signs, but a fairly cursory look at the TfL website shows the depth of its assumption that the tube is what matters.

Contrary to how it might appear, this is not actually a TfL bashing post, it’s a complex information management post. The state of the underground is easy to communicate. The state of the bus network is considerably harder both to establish factually and to communicate clearly.

TfL - Live travel news

Let’s start with live travel news. The screen is dominated by the list showing the status of each underground line and a large map of service disruptions – even when there are no disruptions on the map (click on any of the screenshots to get a fullsize version). Other forms of transport are accessible through tabs across the top – with buses getting a tab less than half the size of the tediously named Emirates Air Line. There is a live bus arrivals link at the bottom of the page, but it’s off the bottom of every screen I have ever used, and I had never noticed it before taking the screen shot.

TfL - Live travel news - buses

Clicking the buses link takes us to a rather muddled screen. Leaving aside the tube planned works calendar and the tube improvement plan (but is there really nothing which might have been said about buses in those spaces?), there is a link to live departure boards (which is a generic page dominated by the tube despite having already established a primary interest in buses). More promisingly, you can put in a bus route and check for disruptions, though the result of doing so is more than a little strange:

There are currently no relevant disruptions or this is not a valid route

Since it doesn’t seem unreasonable to assume that TfL knows what bus routes it runs, they should presumably be able to tell which of those alternatives is correct, and be willing to share that knowledge.
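Keeping the two answers apart need not be complicated. The following is a minimal sketch in Python, purely for illustration: `route_status`, `known_routes` and the `Disruption` record are hypothetical names of my own, and nothing here reflects how TfL's systems actually work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disruption:
    """Hypothetical record: one published disruption affecting one route."""
    route: str
    summary: str


def route_status(route, known_routes, disruptions):
    """Return a message that says which of the two cases applies."""
    route = route.strip().upper()
    # Case 1: the operator does not run a route with this name.
    if route not in known_routes:
        return f"{route} is not a valid route"
    # Case 2: the route exists, so report its disruptions (or the lack of them).
    relevant = [d.summary for d in disruptions if d.route == route]
    if not relevant:
        return f"There are currently no disruptions on route {route}"
    return f"Route {route}: " + "; ".join(relevant)
```

The only design choice that matters is checking the route against the list of known routes before looking for disruptions, so that an empty result can never be mistaken for an invalid query.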

But perhaps I am being unfair.  TfL is developing a new version of its website, currently in public beta; perhaps that will provide better navigation and information. The beta certainly looks much smarter and fresher, but on what is available so far, the primacy of the tube appears to be alive and well.

TfL status update beta

TfL home page beta

The beta home page has a smart status display which focuses on lines with problems rather than giving equal prominence to lines with no problems. The status update page is just as dominated by the tube as the old site – even the row of tabs linking to other transport modes has been replaced with a drop down menu. And rather tellingly, both the url structure and the breadcrumb trail firmly position buses (and everything else) as subordinate to the tube as default.

Default status update

Status update 2

So much for the website. What about the information?

It is of course massively easier both to gather and to present information about trains than about buses. A simple status indicator for each line doesn’t take up much space and already tells you quite a lot; drilling down from that can quickly tell you all there is to know. That approach cannot possibly work for hundreds of bus routes and thousands of bus stops. When things are disrupted, trains still sit neatly on their tracks, but buses can wander around all over the place. That means it’s also generally much harder to describe what is going on instead in a way which is simple and comprehensible. A search on route 2, for example, turns up this gem:

Until December 2013, bus stopping arrangements will be changed due to major works. For southbound 2 36 185 436 N2 and N136 buses, please use stop L in Vauxhall Bridge Road. For northbound 2 16 36 52 82 and 436 buses, and westbound 148 buses please use stop H in Wilton Road or Stop Q in Grosvenor Gardens. For eastbound Route 11 211 N11 and N44 buses, use stop S in Buckingham Palace Road, to the west of Victoria Station. For northbound Route 24 buses, please use stop J in Wilton Road, and for southbound Route 24 buses, please use stop U in Vauxhall Bridge Road. For southbound Route 44 170 C1 C10 and N44 buses, please use stop R in Buckingham Palace Road, to the west of Victoria Station. For eastbound Route 148 buses, please use stop N opposite Westminster Cathedral. For northbound Route C2 buses, please use stop Q in Grosvenor Gardens. Route 507 will start from Victoria Station and operate via Rochester Row, towards Waterloo only, to Horseferry Road.

Apart from the fact that only one sentence of that is relevant to the route I searched on, it’s pretty hard to make sense of any of it without pretty detailed knowledge of roads and bus stops around Victoria. There is a link to a map of the route, but as it’s marked “(does not show disruptions)” that’s not a great deal of help.

It’s easy to carp, of course. Providing enough information to be useful but not so much as to confuse is a tricky balance to get right, and relying solely on words to do it makes it harder still. Hard is not the same as impossible though. There’s plenty of scope for improvement through better information design. But while buses are treated as a perennial afterthought, the problem may not get the focused attention it needs.

Semi-intelligent bus location map

In other areas, TfL has recognised that it is better at transport management than information design. There has been an explosion of creativity since they opened their data to third party developers, but that doesn’t include the critical information about where the buses actually are. Matthew Somerville has made a splendid attempt at interpolating location from bus stop arrivals, but it’s closer to conceptual art than a practical tool. Since TfL undoubtedly does know where its buses are, it would be far better to allow access to the information than to be reduced to inferring it inaccurately. With some really smart programming, that might even allow for emergent disruption reporting, with diversions appearing on a map because that in practice is what buses were doing rather than because the disruption had been published.
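To give a sense of what inferring location from arrival predictions involves, here is a minimal sketch in Python. It is not Matthew Somerville's method, which I have not examined; it simply assumes that a bus moves at a steady speed between two stops, and the function name and all its parameters are hypothetical.

```python
def estimate_position(prev_stop_m, next_stop_m, segment_seconds, seconds_to_next):
    """Estimate how far along the route (in metres) a bus is.

    prev_stop_m, next_stop_m: distance of each stop from the start of the route.
    segment_seconds: assumed normal running time between the two stops.
    seconds_to_next: predicted time until the bus reaches the next stop.
    """
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    # Fraction of the segment already covered, assuming constant speed.
    fraction = 1.0 - seconds_to_next / segment_seconds
    # Clamp: a prediction longer than the running time means the bus has
    # not yet passed the previous stop (or is delayed), so pin it there.
    fraction = max(0.0, min(1.0, fraction))
    return prev_stop_m + fraction * (next_stop_m - prev_stop_m)
```

For example, `estimate_position(1000.0, 1600.0, 120.0, 30.0)` places the bus three quarters of the way along the segment. The weakness is obvious: the constant-speed assumption is exactly what fails when traffic is bad or a bus is on diversion, which is why an estimate like this is no substitute for the real location data.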

And so finally to the second question: what’s wrong with the picture at the top of the post? No points at all for recognising what it is a picture of. My first thought was that it was a bodged stitch up of more than one photograph, which seemed both complicated and pointless and so a bit unlikely. But then I realised that the image had been flipped – with the most obvious clue being that the traffic is on the wrong side of the road. Maybe that’s a sign that TfL has ambitions to copy Sweden.  Or maybe not.

History, weak

It’s history week at the Cabinet Office, a series of internal events designed to remind the current generation of policy makers both that there is always something to learn from history and that their work will become history in its turn. It being Cabinet Office, there are ways of emphasising history not open to every organisation: we sat in the room from which Churchill went out onto the balcony to announce victory in Europe to the crowds below.

But it was a couple of tables at the back of the room which prompted this post. Casually strewn across them (but not so casually that white cotton gloves were not strewn around as well) was an eclectic set of historic documents. One group were records from 1984, on their way to the National Archives to be released next year under the thirty year rule. They were in files which were visually indistinguishable from those produced decades earlier and which would continue to be produced for a decade or so longer.

And I was reminded of a post I wrote three years ago about the end of the file as a unit of work organisation and the implications for our ability to know what we know. I think it still bears reading.

If progressing the work continues to diverge from creating records of what has been done, the raw material of history may be thinner in future than it has been for centuries (and history here means medium term institutional memory as much as it does the work of historians). That problem will not be solved by exhortations to do better filing: it will be solved, if at all, by tools which support what people are trying to do in the short term while quietly adding what may be needed for the longer term – which is easier said than done.

Three years on, I have seen nothing which makes me think that problem is going to go away, though I would be delighted to be told that I am wrong. Historians and policy makers will both need new skills and new tools to operate effectively in that world, with landscapes much less clearly mapped than they once were.

That’s not really the end of history, of course. As I said back then,

History will, of course, look after itself. It always has. But the future history of our time will be different from our histories of past times, and that will not be because we have an eye to the future, but because we are always relentlessly focused on the present.

The Guardian pwned my blog

Update:  Since posting this this morning, I have had two people contact me from the Guardian – one in a comment to this post and one by email.  As a result, I am reassured that what I experienced was a bug they are keen to fix rather than indifference to the context in which Guardian material might find itself.  The email response suggested that the most recent version of the plugin – 0.3 – already fixed the problem.  I am not sure that’s quite right, so continue to advise extreme caution – but the intention is clearly there to make the plugin work as I argued it should.

I am removing the Guardian wordpress plugin which I wrote about a couple of days ago. It has a couple of major flaws, and I would discourage anyone from using it until they are fixed.

The Guardian is perfectly entitled to manage the presentation of its own material. The terms and conditions for the use of its data leave no scope for doubt of their absolutely fixed intention of keeping that control (even if  the language of those terms and conditions feels slightly at odds with the concept of an open platform).  Nowhere in those extensive conditions though does it state that the Guardian claims the right to extend that control to the host blog.  But that is what the plugin does.

As I noted before, embedding a Guardian article brings with it a title for the blog post of which the article forms a part – but only a part – tags and an excerpt.  None of those were what I wanted for the post I wanted to write, so I deleted them all.  Not ideal from my point of view, but it was, I presumed, an attempt to be helpful.  Having set them to what I wanted them to be, I now discover that the Guardian plugin has taken it upon itself to change them all back again. I don’t find that acceptable.

It gets worse.  My next act was to deactivate the plugin.  That caused it to remove the Guardian article – which is fair enough. It’s not hard to identify the text which belongs to the Guardian.  It begins:

<!-- GUARDIAN WATERMARK -->

and ends:

<!-- END GUARDIAN WATERMARK -->

It could hardly be much clearer – but the plugin takes no notice of that, and instead completely deletes the entire post, including all that I had written.
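What I would have expected is something like the following minimal sketch. It is written in Python for illustration rather than in the plugin's own language, it assumes the markers are ordinary HTML comments, and `strip_embedded_article` is a hypothetical name of my own, not anything in the Guardian's code.

```python
import re

# Everything from the opening marker to the closing marker, inclusive.
# DOTALL lets "." span the line breaks inside the embedded article.
WATERMARK_BLOCK = re.compile(
    r"<!--\s*GUARDIAN WATERMARK\s*-->.*?<!--\s*END GUARDIAN WATERMARK\s*-->",
    re.DOTALL,
)


def strip_embedded_article(post_body: str) -> str:
    """Remove only the marked article, leaving the blogger's own text alone."""
    return WATERMARK_BLOCK.sub("", post_body)
```

A post with no markers in it comes back unchanged, which is the behaviour that matters most: the worst case is that nothing is removed, not that everything is.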

It’s not that the Guardian doesn’t expect bloggers to put their own context and commentary round articles: their own documentation makes clear that that is exactly what they expect.  And the use case of doing nothing more than republishing articles strikes me as an odd and unlikely one. But regardless of that, the entire text is swept away.

I hope there is nothing more here than carelessness either in design or in testing, but I am going back to the old fashioned way of quoting and linking, following the advice in one of the comments on the Guardian page about the plugin:

I really fail to see the point of this plug-in. If I want to post excerpts from Grauniad articles on my wordpress blog, I copy and paste. I can change anything I like; I don’t need an effing key; I don’t have to put up with any ‘…ads and performance tracking…’; and I decide what gets deleted, not you…

Small pieces, joined not quite loosely enough

Here’s a small cautionary tale of unintended consequences. It explains why the particularly eagle eyed will have seen a post on the blog this morning which quickly disappeared – though not quite quickly enough to stop it propagating round the web.

Over the weekend, I installed the new Guardian wordpress plugin, more out of curiosity than because I thought I had much use for it. But then I came across an article about repurposing and representing text.  The temptation to repurpose and represent it was irresistible, so I wrote a couple of introductory paragraphs and thought no more of it. Then on the bus to work this morning, I remembered that I hadn’t actually posted it, and used my phone to change its status.  So far, so good.

Then I checked on the published version of the post. There it was, on the mobile version of the site (which uses the WPtouch theme) – but although the title was right, the words were not mine – in fact I did not recognise them at all.  They referred to the Guardian article, but did not come from it. I couldn’t work out what had happened and my bus stop was approaching, so I unpublished the post and went to work. But although the post had been live for no more than a minute or two, that was time enough for the RSS feed to have been picked up by the Google Reader account which drives Public Sector Blogs, which generates a tweet which tells the world (or that rather small corner of it which takes an interest in such things).

The strange words turn out not to be quite so mysterious after all.  The version of the article on the Guardian website has an introductory sentence which does not appear in the body text – the words above the byline in the screenshot.  It turns out that the Guardian plugin uses that text to populate the ‘Excerpt’ field – and since that field is one I never use and is collapsed in my normal view of the wordpress dashboard, I had no idea it was there.  The WPtouch plugin uses that short excerpt to populate the home page view of the blog on a small mobile screen.  All perfectly sensible, no harm done, a very minor storm in a very small tea cup.

But there is – I think – something interesting which comes from all of this.  It is that my understanding of what the Guardian is trying to do with its plugin is radically different from their understanding.

From the point of view of the Guardian, I assume, they are seeing a new way of syndicating their articles.  For them, perhaps, the article and thus its metadata are what really matters.  It makes perfect sense to force extract text, tags and a title on to the blog post in which their article is embedded, because the post is essentially the article.  And it makes sense not because they are bullies, but because they are trying to be as helpful as they possibly can be.

From my point of view, I know, I am seeing a new way of illustrating my blog posts.  For me, it is my blog post which really matters – not because of any intrinsic superiority, but because if all I wanted to do was point to articles on the Guardian’s website, pointing to them is all I would do.  So the chances of the preamble to the article being the most appropriate excerpt for the post as a whole are vanishingly small, and the idea that the Guardian has the right to pre-empt my chosen title suggests that they see themselves as rather more important than I do.

The Guardian also requires their article to appear in full, with links, copyright notice, tracking codes and adverts left intact and uninterrupted – in effect to require the blog owner to cede control over the space in which their article is reproduced. I don’t have a problem with that requirement, and for anyone who does, the simple solution is of course to link to articles rather than reproducing them.

But I would like to see the same respect and lack of interference with my content from them as they expect from me.  It’s early days, the version number of the plugin has climbed from 0.1 to 0.3 over the last 48 hours, there is plenty of opportunity – and I don’t doubt plenty of willingness – to tweak and improve.

All of this in the context of being strongly sympathetic to the Guardian Open Platform, partly because it is fascinating watching a newspaper trying to reinvent itself in real time, but even more because, as I wrote last month, the approaches the Guardian is pioneering have much wider implications, not least for public service providers.  Some of these same issues about the syndication of content and the interests of the different parties involved were behind some of the discussion today at NESTA’s digital disrupters event, for example.

Normal service will now be resumed, with the post which caused all the trouble this morning appearing shortly after this one.

Information on full power

The final version of the Power of Information Taskforce report is out, with recommendations in six main areas:

  • enhancing Digital Britons’ online experience by providing expert help from the public sector online where people seek it;
  • creating a capability for the UK public sector to work with both internal and external innovators;
  • improving the way government consults with the public;
  • freeing up the UK’s mapping and address data for use in new services;
  • ensuring that public sector information is made as simple as possible for people to find and use;
  • building capacity in the UK public sector to take advantage of the opportunities offered by digital technologies.

No chance to read it yet, let alone compare it with the original draft (which is still available with all the comments on it), so I am still at the level of first impressions – which of course matter a lot, not least for all those who will never read the whole thing.  On the substance, it looks first rate:  it has a clear and coherent set of recommendations, each of which is cogently and succinctly argued.

The one apparent weakness is the executive summary.  It harks back to a distant time when a summary was exactly that, with none of this 'executive' nonsense tagged on the front:  if you read it, you have a sense of what is in the report.  But it isn't written as a hook to pull in somebody who doesn't already know why they should be interested.  There's an argument for not scaring the horses too much:  the full implementation of all the taskforce recommendations would add up to a radical change in the way government does business.  But the recommendations won't get implemented without communicating a sense of excitement and a sense of why these changes are unavoidably the right things to be doing.

Maybe that needs to be a separate and slightly different document – but I am pretty sure that it is a necessary part of the marketing drive which is needed to make all this work.   As I observed on the draft in a different context, there's a need to get the reading right as well as the writing.

Readerly texts and writerly texts

The Power of Information Taskforce has published its report.  Or rather it has published a beta version of its report, in a format which not only allows but strongly encourages comments to be made on the draft over the next couple of weeks before it is formally submitted to Cabinet Office ministers.

POIT wordle

That’s a fairly radical approach, and one which is worth a bit of reflection in its own right.  As a way of encouraging engagement it is clearly working:  the comments are building up, questioning everything from the punctuation to the fundamental principles.  More interestingly still, some of the comments are starting to build on one another, creating an engagement which is different in structure to conventional consultation responses as well as in medium.  All of that is made to work through a very finely crafted wordpress theme which makes the process painless and transparent.

All in all, this is a splendid and positive step forward, illustrating how a little bit of imagination coupled with a little bit of ingenuity can create new possibilities.  But there is always room to be better still, and I have a doubt, a reflection, and a couple of niggles.

Customer insight for writers

The writer already knows what he or she is trying to communicate. The only way to judge writing, and thereby improve it, is to learn from people who are confused by it, who draw the wrong conclusion. You don’t assume that they failed, quite the opposite, you try to learn how you failed. And then you incorporate that learning into your process.

From Dave Winer

Not just a hammer

In the spirit of paying attention to what I am paying attention to, I can’t help noticing that emails are still feeling oppressive.

Dave Pollard has the answer:

To all employees:

Beginning August 1st, you will no longer be able to send an e-mail to another employee of our organization. After some study, we have concluded that such e-mails are almost never the most efficient or effective way to obtain, provide or exchange information. In fact, we estimate that as much as 20% of our employees’ time is wasted reading, writing and answering e-mails, beyond the time that it would take to communicate the same information using more appropriate means.

Instead of email, the hapless troops are enjoined to use:

  • Instant Messaging
  • Desktop Video & Screen-Sharing
  • E-Library
  • Instant Survey
  • E-Learning

Each in its proper place, each being used for what it can do best.  But all of those come after the still simpler and more powerful idea that it is a central part of every employee’s obligations to make themselves available for helpful conversations with everybody else.

Read the whole thing, then reflect on the creative energy to be unleashed by implementing it.  The self assessment is then less about the extent to which email is used inappropriately, much more about what other possibilities the organisation affords, both technically and culturally.  As with any other change, there being another way which is clearly better is a prerequisite, but having the tools won’t create the change.  At least that’s what I would assume.  For the moment, I would settle for having some of the tools, so that sending an email doesn’t always have to be the answer, regardless of the question.