One of the key aspects of the Linking You project is to come up with the ‘right’ way of organising an institution’s URI structure, mapping a virtual collection of spaces in a way which makes sense. For this blog post I’m going to tackle the University’s corporate web site, that is the public-facing, shiny clean, ‘sell the University’ website which is only ever looked at by prospective students, parents of students, prospective staff and marketing/PR folks. A full-blown dissection of the University’s ideal subdomain structure for online services is a whole other blog post.
This is a clean-slate approach with no preconceptions of how things work or are currently arranged, just an attempt to draw up a logical way of breaking down content. In this I’m also going to be shockingly ruthless in getting rid of what I think is irrelevant to the University’s website, which means an awful lot of people are going to be upset when they realise that their 26-paragraph essay on how their department has recently invested £4,000 in new doors has been cast into the pit of internet oblivion (apologies if anybody has really put an essay on their department’s new doors on the corporate site, but it wouldn’t surprise me). Before I get going, I’m going to reference a popular online comic called XKCD. Look at this image, understand it, and all will become clear.
I considered a wide variety of ways to organise my thoughts and tackle the problem of sifting through content before finally deciding – in true computing student style – that it would be easiest to try to visualise what we were doing in an algorithmic manner. I’d put the ideal user flow into a graphing tool and see what came out, since if there were any clear groups or flow paths then they would be sensible places to start organising things. Out came Graphviz and I began to build an ideal user flow, based mostly on my own experience of trying to find information.
10 minutes in, and it doesn’t look pretty. I’d not even finished putting things in and it was already looking like a scribble of epic proportions.
It looks like we may need to abandon the notion of ‘nested’ URIs ((Uniform Resource Identifiers, otherwise known as ‘addresses’)) as much as possible, and instead opt for a simple “type/identifier” method. In line with the principles of REST ((REpresentational State Transfer, an architectural style for machine interfaces that exchange data)), these URIs should also be happy with appended actions to make life easier for those trying to get data. So, let’s get started with a few key types of information.
Academic Bits
First up, let’s focus on the academic side. The big areas are faculties, schools, courses, modules and research. I’ve done a bit of cursory hunting and it looks like these don’t nest at all in any kind of historical sense – whilst at the moment it may seem fairly clear that modules belong to courses, courses to schools and schools to faculties, with research bolted on the side, it’s actually nothing like that. Modules may exist in multiple courses, and indeed in multiple courses across multiple schools. Entire courses can shift between schools, schools can move around between faculties as they’re reorganised, and research is still out on its own, except for some which is associated with schools and some with faculties.
So, at a top URI level we now have /faculty, /school, /course, /module, and /research. There’s also the possibility (for historical accuracy and to compensate for strange naming conventions) of /unit which would map directly to /module, and /department which would map straight to /school.
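As a rough sketch of how those two aliases might behave, here is one way of rewriting an aliased path to its canonical form before redirecting. The ALIASES table and the canonical_path function are names I’ve made up for illustration, and I’m assuming that only the first segment of the path ever needs rewriting.

```python
# Hypothetical alias table: alternative type names mapped to the canonical type.
ALIASES = {"unit": "module", "department": "school"}


def canonical_path(path: str) -> str:
    """Rewrite an aliased top-level type to its canonical form.

    Only the first path segment is touched; identifiers and any
    trailing nouns or actions pass through unchanged.
    """
    segments = path.strip("/").split("/")
    segments[0] = ALIASES.get(segments[0], segments[0])
    return "/" + "/".join(segments)
```

A request for /unit/CGP2001 would then be redirected to /module/CGP2001, so there is still only one ‘real’ address for each thing.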
Identifiers within these categories need to be simple and, as far as possible, immune to minor name changes and reorganisation. So, for example, Media, Humanities and Technology would live at /faculty/mht. The Lincoln School of Performing Arts lives at /school/performingarts, a BSc in Golf Science and Development should be /course/C604 (the UCAS course code), the Computer Games Production module is /module/CGP2001 and an item of research into the migration patterns of muscovy ducks could be /research/muscovy_migration. How simple is that? Use some .type extensions to specify machine readable information and we can create magic such as /course/C604.xcri, or /faculty/mht.json.
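To show how little work a “type/identifier” address needs on the server side, here is a minimal sketch of pulling one apart. The Resource tuple and parse_resource function are my own invented names, and I’m assuming that identifiers never contain a dot, so that anything after the first dot is the requested format.

```python
from typing import NamedTuple, Optional


class Resource(NamedTuple):
    kind: str                # e.g. "course" or "faculty"
    identifier: str          # e.g. "C604" or "mht"
    fmt: Optional[str]       # e.g. "xcri" or "json"; None means the normal web page


def parse_resource(path: str) -> Resource:
    """Split a '/type/identifier[.format]' path into its parts.

    Assumes exactly two segments and an identifier with no dots in it.
    """
    kind, _, rest = path.strip("/").partition("/")
    identifier, dot, fmt = rest.partition(".")
    return Resource(kind, identifier, fmt if dot else None)
```

With this, /course/C604.xcri comes out as the course C604 in xcri format, and /faculty/mht as the faculty mht with no format given.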
Under each of these, we also need the ability to add /nouns which relate to drilling down further, or /actions for specific things we want to do regarding the identifier. So /faculty/mht/contact will provide us with contact details, /research/muscovy_migration/repository will give us a list of anything published in the repository regarding that research, and /course/C604/fees will tell you about the costs relating to that subject. We may even have (in the future) things such as /course/C604/apply.
Something such as /faculty/hlss/courses will provide a list of all the courses in that faculty, but this doesn’t mean that URIs should keep drilling down to eternity. Otherwise we could end up with /faculty/hlss/courses/C604/staff/jbloggs – this is clearly nonsense since /course/C604 is – as I explained earlier – logically distinct from the fact it currently belongs in the HLSS faculty; and Joe Bloggs the lecturer has no logical connection to the course or the faculty. Links should reflect the canonical identifier for things wherever possible.
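In code terms, a listing page would link out to each item’s canonical address rather than nesting it under the page doing the listing. A minimal sketch, in which the function names and the three-segment limit (type, identifier, then one noun or action) are my own assumptions rather than anything agreed:

```python
from typing import Iterable, List

MAX_DEPTH = 3  # assumed limit: type / identifier / noun-or-action


def canonical_link(kind: str, identifier: str) -> str:
    """Build the one canonical address for a thing."""
    return f"/{kind}/{identifier}"


def course_links(course_codes: Iterable[str]) -> List[str]:
    """Links for a listing such as /faculty/hlss/courses.

    Each course points at /course/<code>, never at a path nested
    under the faculty that happens to own it today.
    """
    return [canonical_link("course", code) for code in course_codes]


def has_sensible_depth(path: str) -> bool:
    """Reject paths that drill down further than MAX_DEPTH segments."""
    segments = [s for s in path.strip("/").split("/") if s]
    return 1 <= len(segments) <= MAX_DEPTH
```

Under that limit /course/C604/fees passes, while /faculty/hlss/courses/C604/staff/jbloggs is refused.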
At a top level we should also offer /nouns for large collections, so /faculties will (obviously) list all the faculties, /courses will offer our course finder (a glorified search engine) and so on. Some of these should allow for certain search limitations to be given through the URI as well, for example /courses/postgraduate. Some actions such as /contact will apply University wide, so they should also be available at a top level.
News
The style of URI structure employed on the academic side can extend across the entire website without any difficulty. We should therefore create a /news section, allowing us to have items such as /news/linking-you-rocks. Note, however, that the slug (the “linking-you-rocks” bit) should be locked at publish time and not be allowed to be changed. We should also, in the case of news, allow some sensible /nouns to take the place of the /identifier, so things such as /news/tag/zombies will show us all news articles tagged with “zombies”, or /news/by/nijackson will show you anything posted by me. All places where there is any kind of list of news items must have an RSS ((Really Simple Syndication, a standard for sharing news and content around the web)) feed available by appending .rss to the URI, for example /news.rss or /news/by/nijackson.rss. I’m constantly annoyed by the lack of RSS feeds regarding University services, and this is an ideal opportunity to standardise it.
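A rough sketch of the two rules above, the slug that locks at publish time and the .rss suffix on any list of news items. NewsItem, slugify and feed_uri are hypothetical names of mine, and the slug rule (lowercase, with runs of anything other than letters and digits turned into hyphens) is just one plausible convention.

```python
import re
from typing import Optional


def slugify(title: str) -> str:
    """Lowercase the title and collapse runs of non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


class NewsItem:
    def __init__(self, title: str) -> None:
        self.title = title
        self.slug: Optional[str] = None  # unset until first publish

    def publish(self) -> str:
        """Return the item's URI, fixing the slug the first time only."""
        if self.slug is None:
            self.slug = slugify(self.title)
        return "/news/" + self.slug


def feed_uri(list_uri: str) -> str:
    """Address of the RSS feed for any URI that lists news items."""
    return list_uri.rstrip("/") + ".rss"
```

Once an item has been published, editing its title leaves the address alone, so existing links keep working.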
Undergraduate, Postgraduate and International
This is a fairly simple one to solve. /undergraduate, /postgraduate and /international are top-level about pages, which then link back out to relevant bits of the site such as /courses/undergraduate. They can also have specific nested pages such as /postgraduate/apply or /international/visas to provide more information.
Other Stuff
The corporate website has a massive collection of ‘other stuff’, such as the VC’s welcome, blurbs about our charitable status, information on the executive board and governors, guides on how to use our logo and typography, and much more which I either can’t neatly pigeonhole or simply can’t figure out the reason for. For these, I propose we create a /about section to allow these arbitrary documents to be stored without trouble. Similarly to news, these can be given slugs which are relevant to the document (for example /about/governors or /about/charity).
Obviously it’s impossible for me to redesign the entire URI structure of all the existing content, since we have so much. Still, I think I’ve laid out a framework for sensible structure which helps avoid the excessive ‘tree’ structure we’re used to by scrapping it altogether, and instead opting for much higher level organisation of content. The only downside to this (if you consider it as such) is that this type of system requires a CMS ((Content Management System)) to back it up. Personally, I think that moving the web content into a bespoke CMS will dramatically help with ongoing maintenance by forcing people to consider the content over the organisational structure, but that’s a subject for another project entirely.