Other University of Lincoln services URIs

Following on from my post about the URI structure in our existing corporate website, I’m now taking a brief look at a number of other websites and web applications that we have at the University: WordPress (blogs.lincoln.ac.uk), Blackboard (blackboard.lincoln.ac.uk), SharePoint 2003 (internal – portal.lincoln.ac.uk, external – visit.lincoln.ac.uk) and Posters at Lincoln (posters.lincoln.ac.uk).

WordPress
blogs.lincoln.ac.uk

We have an active blogging community here at Lincoln with over 400 registered blogs running on a WordPress MU install. WordPress has built-in friendly URIs (permalinks in WordPress terminology), so I was expecting a good proportion of blogs to have a sensible URI structure.

I wrote a script – https://gist.github.com/890378 – which simply grabs the permalink structure for each registered blog so that I can see an aggregated view of the settings that have been set up for each blog.

Here are the results:

Permalink structure                      Number of blogs
/%year%/%monthnum%/%day%/%postname%/     404
/%postname%/                             8
no permalink structure                   3
/%year%/%postname%/                      2
/articles/%postname%/                    1
/%year%/%monthnum%/%postname%/           1
/%monthnum%/%year%/%postname%/           1
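
As a quick sanity check on the figures in the table, the counts can be tallied in a few lines of Python. This is only a sketch that hard-codes the numbers from the table; it is not the gist linked earlier, and the helper name share_of_total is made up for illustration.

```python
from collections import Counter

# Counts copied by hand from the results table.
permalink_counts = Counter({
    "/%year%/%monthnum%/%day%/%postname%/": 404,
    "/%postname%/": 8,
    "no permalink structure": 3,
    "/%year%/%postname%/": 2,
    "/articles/%postname%/": 1,
    "/%year%/%monthnum%/%postname%/": 1,
    "/%monthnum%/%year%/%postname%/": 1,
})


def share_of_total(counts):
    """Return each structure's share of all blogs as a rounded percentage."""
    total = sum(counts.values())
    return {structure: round(100 * n / total) for structure, n in counts.items()}


total_blogs = sum(permalink_counts.values())
shares = share_of_total(permalink_counts)
default_structure, default_count = permalink_counts.most_common(1)[0]

print(total_blogs, default_structure, shares[default_structure])
```

With these figures the total comes to 420 blogs, of which the most common structure accounts for 96%.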

96% of blogs are running with the default permalink structure, which basically means you have URIs that look like http://example.blogs.lincoln.ac.uk/2011/03/22/hello-world. In terms of readability this does result in URIs that can be easily understood. As for predicting a specific post’s URI, I think you’re perhaps better off doing a search from the blog’s home page. I did a Google search for “thoughts on WordPress permalinks” and there seems to be a consensus that the default permalink structure is “good enough” SEO-wise; there are also performance gains from using it over %postname%, because it reduces the number of queries WordPress has to do internally to return the correct post.
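
To make the default structure concrete, here is a minimal sketch of how a permalink structure expands into a path. It only handles the four tags that appear in the table above, the function name expand_permalink is hypothetical, and it is not how WordPress itself does the job.

```python
from datetime import date


def expand_permalink(structure, published, slug):
    """Replace the permalink tags in `structure` with values for one post."""
    tags = {
        "%year%": f"{published.year:04d}",
        "%monthnum%": f"{published.month:02d}",
        "%day%": f"{published.day:02d}",
        "%postname%": slug,
    }
    path = structure
    for tag, value in tags.items():
        path = path.replace(tag, value)
    return path


default = "/%year%/%monthnum%/%day%/%postname%/"
print(expand_permalink(default, date(2011, 3, 22), "hello-world"))
# prints /2011/03/22/hello-world/
```

Swapping in "/%postname%/" as the structure gives /hello-world/, which is shorter but carries no date information.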

Blackboard
blackboard.lincoln.ac.uk

The Blackboard interface consists of an HTML frameset page which, if you read the source code, actually explains itself:

“The Blackboard Academic Suite environment includes a header frame with images and buttons customized by the institution and tabs that navigate to different areas within Blackboard Academic Suite. Clicking on a tab will open that area in the content frame. Web pages containing specific content, features, functions, and tools are accessed from the tab areas.”

Basically this means that the home page, regardless of whether or not you’re signed into Blackboard, is always http://blackboard.lincoln.ac.uk/webapps/portal/frameset.jsp, and it also means you can’t directly link to a Blackboard resource (as I’m about to) without screwing up the interface. You could say this doesn’t matter at all from an SEO standpoint because resources aren’t externally available; however, I can’t help sympathising with someone trying to explain how to access a Blackboard resource over the phone. I’d imagine it would go something like “click on this, now that, now sign in, now click the top link in the list on the left, now select the course you want, then this, that and you should now see what you want”, as opposed to “just click on the link I’ve just sent you”.

Taking the above into consideration (i.e. that Blackboard links are essentially irrelevant to the end user) it is interesting to see what some Blackboard URIs look like:

Blackboard Module  URI
Announcements      http://blackboard.lincoln.ac.uk/bin/common/announcement.pl?action=LIST&context=mybb&scope=_all
View Grades        http://blackboard.lincoln.ac.uk/webapps/gradebook/do/student/viewCourses
Community          http://blackboard.lincoln.ac.uk/webapps/portal/tab/_3_1/index.jsp
Logout             https://blackboard.lincoln.ac.uk/webapps/login?action=logout
Help               http://blackboard.lincoln.ac.uk/webapps/portal/frameset.jsp?tab_id=_40_1
Example course     http://blackboard.lincoln.ac.uk/bin/common/course.pl?course_id=_46268_1
Course blog        http://blackboard.lincoln.ac.uk/webapps/blackboard/content/listContent.jsp?course_id=_46250_1&content_id=_444545_1&mode=reset
Supervisor wiki    http://blackboard.lincoln.ac.uk/webapps/lobj-wiki-bb_bb60/wiki_home/Handler?course_id=_32711_1&content_id=_457614_1

A few URIs seem mildly related to their content, such as the example course and the gradebook; others, like the community home page, seem almost random.
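
One way to see what these URIs actually carry is to pull them apart with Python's standard urllib.parse module. The sketch below uses the course blog URI from the table; the helper name describe_uri is an invention for this example.

```python
from urllib.parse import parse_qs, urlsplit


def describe_uri(uri):
    """Split a URI into its path and a flat dict of query parameters."""
    parts = urlsplit(uri)
    # parse_qs returns a list per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.path, params


course_blog = (
    "http://blackboard.lincoln.ac.uk/webapps/blackboard/content/listContent.jsp"
    "?course_id=_46250_1&content_id=_444545_1&mode=reset"
)
path, params = describe_uri(course_blog)
print(path)
print(params)
```

The path names a JSP handler rather than a resource, and everything that identifies the course and its content sits in opaque numeric query parameters, which is why these URIs give a reader so little to go on.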

SharePoint 2003
portal.lincoln.ac.uk (internal) visit.lincoln.ac.uk (external)

Our SharePoint 2003 installation is our institution’s internal content repository and intranet. Every department and faculty has its own “site” (read: section) and, providing you know exactly what you are looking for (searching doesn’t work), it is generally quite useful. Unlike Blackboard, however, SharePoint is not built using frames and so you can give out direct links to content, for example:

Resource             URI
ICT department       https://portal.lincoln.ac.uk/C15/CS/default.aspx
University Resource  https://portal.lincoln.ac.uk/C17/UniversityResources/default.aspx
First aiders         https://portal.lincoln.ac.uk/C11/C0/First%20Aiders/default.aspx
External news        https://portal.lincoln.ac.uk/External%20News/default.aspx
FreeCycle            https://portal.lincoln.ac.uk/C13/C18/Freecycle/default.aspx

At first glance the SharePoint URIs look as random as Blackboard’s; however, I’ve had it explained to me why this is. Basically SharePoint is made up of “sites”. In the 2003 version the first 20 sites can be named whatever you want, for example “External News” or “University Resources”; after that SharePoint insists on using folders that count up, prefixed with the letter “C”, e.g. C0, C1, C2, and so on up to C19. Inside these “sites” you can again have another 20 directories named whatever you want, and then the C directories start. This basically means that each directory is capped at 40 subdirectories, only half of which you can name yourself. Confusing? Yes. Logical? No. I’ve been informed that this is no longer the case in SharePoint 2007 onwards.
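
The naming rule as it was explained to me can be modelled in a few lines. This is a sketch of the description above only, not of SharePoint's real internals; the function name directory_name and both constants are assumptions made for the illustration.

```python
NAMED_SLOTS = 20      # assumption: first 20 directories keep their chosen name
NUMBERED_SLOTS = 20   # assumption: the next 20 become C0 .. C19


def directory_name(position, requested_name):
    """Return the directory name for the site created at `position` (0-based)."""
    if position < 0:
        raise ValueError("position must not be negative")
    if position < NAMED_SLOTS:
        return requested_name
    numbered = position - NAMED_SLOTS
    if numbered < NUMBERED_SLOTS:
        return f"C{numbered}"
    raise ValueError("all 40 directory slots are already taken")


print(directory_name(0, "External News"))   # an early site keeps its name
print(directory_name(35, "ICT"))            # a later one becomes C15
```

Under this model the name a site ends up with depends purely on when it was created, which would explain why a late arrival ends up somewhere like /C15/.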

As a result you can potentially have sites that have a nice friendly URI structure (if you discount the /default.aspx at the end), e.g. https://portal.lincoln.ac.uk/Examples/HelloWorld/default.aspx. However, for sites which are given C-directories (such as the ICT department) the URI loses all contextual relevance. Again, I think the best bet for users is to follow links through to the resource (or, if they’re feeling brave, try searching for the content).

Posters at Lincoln
posters.lincoln.ac.uk

Posters at Lincoln was one of the first sites I worked on when I started working for the Online Services Team here at the University. Its development brought about the Common Web Design and a number of other projects we’ve worked on over the last year. So forgive me if I’m a bit biased in this overview.

The site is split up into “groups”, such as ICT Department or Marketing and Communications, and “campaigns”, which are posters created for different events and public notices.

We designed the URI structure to be SEO friendly and semantically relevant with URLs like:

Resource                            URL
Home page                           http://posters.online.lincoln.ac.uk/home
About page                          http://posters.lincoln.ac.uk/about
All campaigns                       http://posters.lincoln.ac.uk/all
Marketing and Communications group  http://posters.lincoln.ac.uk/group/comms
Get Satisfaction campaign           http://posters.lincoln.ac.uk/campaign/getsatisfaction
Science Fair campaign               http://posters.lincoln.ac.uk/campaign/Science

As you can see, the URI structure is very simple and clean in contrast to some of the other examples mentioned above. This is partly down to the fact that the framework we built the website in, CodeIgniter, has sexy URI support built in, but also because it’s trivial to make Apache serve up extensionless URIs.

This brief overview has hopefully outlined some of the differences between the URI construction of some of the online services we use at Lincoln.

Hierarchical Structures: Do we really need them?

One thing that has come out of our looking at URI structures thus far is that they tend to be heavily hierarchical, based around historical organisation attempts. This means that initially we get vaguely sensible URIs such as /mht/computing, but once we start getting down into the depths of course information it falls apart. Some things exist in multiple hierarchical ‘nodes’, making for an exciting experience when you need to decide which ‘node’ is the primary one, and should therefore have the content under it. Does a module on audio technology belong in mht/computing/audio, or mht/media/audio?

When it comes to navigating the internet this isn’t really a problem since we can link to wherever we feel like; there’s nothing inherently wrong with mht/computing and mht/media both having a link to mht/media/audio. However, this does then create a strange disconnect between computing and media – anybody who looks at the address bar is going to wonder “hang on, why am I now in the land of Media?”. Similarly on printed documentation (such as course notes) you’re going to have a load of computing students wondering why they’re being told to go to a media URI.

The problem with course-content is less apparent. Initially it looks like a lot of the content can be tidily nested, such as fees/international or accommodation/moving-in. Unfortunately it doesn’t take a lot of thought to break this model either; why should fees/international not be international/fees?

My favourite solution for this is a radical one which (in effect) totally breaks the existing model of our web content. Remove all nested content, except where there’s a clear ‘type/identifier’ model which can be followed. This leads to URIs like the following:

  • faculty/mht
  • school/computing
  • school/media
  • unit/audio
  • international-fees
  • moving-in

I’m curious to know what problems people can see with adopting this model, other than the fact that it will almost certainly need to be database/CMS backed.
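
A minimal sketch of how such a flat model might be backed by a lookup table, assuming a plain in-memory dictionary stands in for the database or CMS. The page records, the "links" field and the helper names are all invented for the example.

```python
# Every page is keyed by its flat URI parts; nothing is nested under anything else.
PAGES = {
    ("faculty", "mht"): {"links": [("school", "computing"), ("school", "media")]},
    ("school", "computing"): {"links": [("unit", "audio")]},
    ("school", "media"): {"links": [("unit", "audio")]},
    ("unit", "audio"): {"links": []},
    ("international-fees",): {"links": []},
    ("moving-in",): {"links": []},
}


def uri_for(key):
    """Build the URI for a page key, e.g. ('unit', 'audio') -> 'unit/audio'."""
    return "/".join(key)


def linked_uris(key):
    """Return the URIs a page links to, looked up rather than inferred from nesting."""
    return [uri_for(target) for target in PAGES[key]["links"]]


print(linked_uris(("school", "computing")))
print(linked_uris(("school", "media")))
```

Both schools link to the same unit/audio URI, so neither has to "own" the module and nobody ends up in the land of Media by accident.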

Changes to Top Level Domains

A rehash of some quick thoughts I posted to our project mailing list. Please do read the responses. I’ll follow up with a more considered post in the near future.

David sent me this story about the changes to TLDs. Very interesting. I didn’t realise the flexibility that will be available. As far as I can see, potentially any* TLD will be up for grabs. So I could ask to register .kljhasdfkjhasdf and .josswinn and .jlwinn and .jossisasexgod and so on.

Of course, people have been trying to get around the current limitations for a while e.g. services like http://domai.nr/

I think that, combined with the recent developments in browser location bar technology, where search of page titles and URLs is now integrated with web search (even instant web search with the omnibox, encouraging a Google click rather than a direct click from the bar), you could argue that this combination of moves is further commodifying natural language expression at the level of TLDs, supported by and integrated into browser technology.

There’s now a much freer market in domain names, rather than one restricted by TLDs. The value of some existing domains can only decrease as a practically infinite number are now made available. It clearly has implications for thinking about URIs as ‘assets’.

With the most recent browsers, the following could all serve the same function quite well:

studyatlincoln.lincoln.ac.uk

lincoln.ac.uk/studyatlincoln

lincoln.ac.uk/kljhasdf (with page title ‘Study at Lincoln’)

lincoln.ac.uk/kljhasdf (with page content including ‘study at lincoln’)

kjhsdfkjhsd.lincoln.ac.uk (with page title ‘Study at Lincoln’)

kjhsdkjd.lincoln (with page title or content including ‘Study at Lincoln’)

studyatlincoln.lincoln/kjhsdkjd (with page title or content including ‘Study at Lincoln’)
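
The point of the list can be shown with a toy matcher. This is emphatically not how any real browser location bar or search engine ranks results; it is a deliberately naive sketch, with a hypothetical matches_query function, that just checks whether each word typed appears somewhere in the URL, title or content.

```python
def matches_query(query, url, title="", content=""):
    """Naive check: does every word of the query appear in the URL, title or content?"""
    haystack = " ".join([url, title, content]).lower()
    return all(word in haystack for word in query.lower().split())


query = "study at lincoln"
print(matches_query(query, "studyatlincoln.lincoln.ac.uk"))
print(matches_query(query, "lincoln.ac.uk/kljhasdf", title="Study at Lincoln"))
print(matches_query(query, "lincoln.ac.uk/kljhasdf"))
```

In this toy model the opaque URI matches as soon as the page title carries the words, and only fails when neither the URI nor the page says anything useful.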

As David said on Twitter, the perceived value of the .ac.uk domain could plummet over time as institutions develop their brand to the extent of their own personalised TLD. I guess that we need to be ready to grab .unilincoln .lincolnuni .universityoflincoln .lincolnuniversity at the very least.

I think that we need to produce a considered** blog post about the implications of all of this for SEO, as clearly search is increasingly all that matters. James is writing a nice post about browser location bar developments. I could write one, furthering my thoughts here. Alex, Nick: could one of you write a post on what this all means for SEO? Is this something worth doing? It seems that the value of ‘cool URIs’ has been decisively pushed to the technical/developer domain, where good, reliable syntax remains valued as a predictable source of data but offers few user benefits over the instant search omnibox, for example. What do you think?

* re-reading the story, it’s not quite the situation I imagined where *any* phrase can be registered as a TLD. It looks like there will remain some regulation over the use of the new TLDs, but the University of Lincoln examples above still seem valid. I’m assuming that if there could be a .mashable or .redcross, then there could be a .unilincoln or .lincoln, too. The comments in the article also suggest that the cost of the new TLDs will be $185K, which clearly has implications for this new ‘market’. I’m trying to find recent documentation on the ICANN website about all of this, but don’t see it.

** this is not that post!

In an ideal world…

One of the key aspects of the Linking You project is to come up with the ‘right’ way of organising an institution’s URI structure, mapping a virtual collection of spaces in a way which makes sense. For this blog post I’m going to tackle the University’s corporate web site, that is the public-facing, shiny clean, ‘sell the University’ website which is only ever looked at by prospective students, parents of students, prospective staff and marketing/PR folks. A full-blown dissection of the University’s ideal subdomain structure for online services is a whole other blog post.

This is a clean-slate approach with no preconceptions of how things work or are currently arranged, just an attempt to draw up a logical way of breaking down content. In this I’m also going to be shockingly ruthless in getting rid of what I think is irrelevant to the University’s website, which means an awful lot of people are going to be upset when they realise that their 26-paragraph essay on how their department has recently invested £4,000 in new doors has been cast into the pit of internet oblivion (apologies if anybody has really put an essay on their department’s new doors on the corporate site, but it wouldn’t surprise me). Before I get going, I’m going to reference a popular online comic called XKCD. Look at this image, understand it, and all will become clear.

I considered a wide variety of ways to organise my thoughts and tackle the problem of sifting through content before finally deciding – in true computing student style – that it would be easiest to try to visualise what we were doing in an algorithmic manner. I’d put the ideal user flow into a graphing tool and see what came out, since if there were any clear groups or flow paths then they would be sensible places to start organising things. Out came Graphviz and I began to build an ideal user flow, based mostly on my own experience of trying to find information.

10 minutes in, and it doesn’t look pretty. I’d not even finished putting things in and it was already looking like a scribble of epic proportions.

Just a small part of an idealised web user flow.

It looks like we may need to abandon the notion of ‘nested’ URIs ((Uniform Resource Identifiers, otherwise known as ‘addresses’)) as much as possible, and instead opt for a simple “type/identifier” method. In line with the principles of REST ((REpresentational State Transfer, a type of machine interface methodology for exchange of data)), these URIs should also be happy with appended actions to make life easier for those trying to get data. So, let’s get started with a few key types of information.
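
As a rough illustration of the “type/identifier” idea with an appended action, a path could be taken apart like this. The function name, the default action "view" and the example action "json" are all assumptions for the sketch; the post does not define what the actions would be.

```python
def parse_resource_path(path):
    """Split 'type/identifier' or 'type/identifier/action' into its parts."""
    segments = [segment for segment in path.strip("/").split("/") if segment]
    if len(segments) not in (2, 3):
        raise ValueError(f"expected type/identifier[/action], got {path!r}")
    resource_type, identifier = segments[0], segments[1]
    # Assumption: with no action appended, the resource is simply viewed.
    action = segments[2] if len(segments) == 3 else "view"
    return resource_type, identifier, action


print(parse_resource_path("school/computing"))
print(parse_resource_path("/unit/audio/json"))
```

Anything deeper than three segments is rejected outright, which is the point: there is no nesting to get lost in.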


lincoln.ac.uk, 10 years later

www.lincoln.ac.uk
University of Lincoln home page

Over the past ten years, the University of Lincoln’s home page has evolved into a monolithic repository of course descriptions, staff profiles, news items, policy statements, information for staff, students, parents, the media, and anyone else who may stumble across the site.

Using an online sitemap generator I have created an XML sitemap and a plain text list of all of the publicly accessible URIs on the www.lincoln.ac.uk site. I’ve removed anything that isn’t an HTML document (i.e. if it doesn’t have a MIME type of text/html it wasn’t counted). This amounts to some 4189 pages on the site. I’ve parsed this out further to what essentially are the top level directories:

http://www.lincoln.ac.uk/
http://www.lincoln.ac.uk/aad/
http://www.lincoln.ac.uk/about/
http://www.lincoln.ac.uk/accommodation/
http://www.lincoln.ac.uk/afas/
http://www.lincoln.ac.uk/alumni/
http://www.lincoln.ac.uk/architecture/
http://www.lincoln.ac.uk/bl/
http://www.lincoln.ac.uk/businessservices/
http://www.lincoln.ac.uk/ccawi/
http://www.lincoln.ac.uk/cerd/
http://www.lincoln.ac.uk/cjmh/
http://www.lincoln.ac.uk/clearing
http://www.lincoln.ac.uk/conferences
http://www.lincoln.ac.uk/dbs/
http://www.lincoln.ac.uk/dci/
http://www.lincoln.ac.uk/engineering/
http://www.lincoln.ac.uk/enquiries/
http://www.lincoln.ac.uk/events
http://www.lincoln.ac.uk/fabs/
http://www.lincoln.ac.uk/fashionshow
http://www.lincoln.ac.uk/forensic-erasmusmundus
http://www.lincoln.ac.uk/graduate-school/
http://www.lincoln.ac.uk/graduation/
http://www.lincoln.ac.uk/hcmd/
http://www.lincoln.ac.uk/hlss/
http://www.lincoln.ac.uk/holbeach/
http://www.lincoln.ac.uk/home/
http://www.lincoln.ac.uk/home/accommodation/
http://www.lincoln.ac.uk/home/blogs/
http://www.lincoln.ac.uk/home/calendar/
http://www.lincoln.ac.uk/home/charity/
http://www.lincoln.ac.uk/home/clearing
http://www.lincoln.ac.uk/home/clearing/
http://www.lincoln.ac.uk/home/conferences/
http://www.lincoln.ac.uk/home/contacts/
http://www.lincoln.ac.uk/home/cyclin/
http://www.lincoln.ac.uk/home/events/
http://www.lincoln.ac.uk/home/faculties/
http://www.lincoln.ac.uk/home/fees/
http://www.lincoln.ac.uk/home/finance/
http://www.lincoln.ac.uk/home/hull/
http://www.lincoln.ac.uk/home/identity/
http://www.lincoln.ac.uk/home/international/
http://www.lincoln.ac.uk/home/legal/
http://www.lincoln.ac.uk/home/lincoln/brayford/
http://www.lincoln.ac.uk/home/lincoln/cathedral/
http://www.lincoln.ac.uk/home/lincoln/riseholme/
http://www.lincoln.ac.uk/home/lincoln/sports-centre/
http://www.lincoln.ac.uk/home/lincolnacademy/
http://www.lincoln.ac.uk/home/locations/
http://www.lincoln.ac.uk/home/lr/
http://www.lincoln.ac.uk/home/maps/
http://www.lincoln.ac.uk/home/opendays/
http://www.lincoln.ac.uk/home/publications/
http://www.lincoln.ac.uk/home/research/
http://www.lincoln.ac.uk/home/staff_students/
http://www.lincoln.ac.uk/home/studentservices/
http://www.lincoln.ac.uk/home/supportdepartments/
http://www.lincoln.ac.uk/home/undergraduate/
http://www.lincoln.ac.uk/home/vacancies/
http://www.lincoln.ac.uk/home/vc/
http://www.lincoln.ac.uk/home/webteam/
http://www.lincoln.ac.uk/hshsc/
http://www.lincoln.ac.uk/humanities/
http://www.lincoln.ac.uk/international
http://www.lincoln.ac.uk/isc/
http://www.lincoln.ac.uk/journalism/
http://www.lincoln.ac.uk/law/
http://www.lincoln.ac.uk/Law/cdrc/
http://www.lincoln.ac.uk/lbs/
http://www.lincoln.ac.uk/lincoln/
http://www.lincoln.ac.uk/lishpa/
http://www.lincoln.ac.uk/LLMC
http://www.lincoln.ac.uk/lr/
http://www.lincoln.ac.uk/lsa/
http://www.lincoln.ac.uk/lsad/
http://www.lincoln.ac.uk/lspa/
http://www.lincoln.ac.uk/luac/
http://www.lincoln.ac.uk/media/
http://www.lincoln.ac.uk/mh/
http://www.lincoln.ac.uk/mht/
http://www.lincoln.ac.uk/news/
http://www.lincoln.ac.uk/opendays
http://www.lincoln.ac.uk/parentguide/
http://www.lincoln.ac.uk/policystudies/
http://www.lincoln.ac.uk/psychology/
http://www.lincoln.ac.uk/riseholmecampus
http://www.lincoln.ac.uk/riseholmecollege
http://www.lincoln.ac.uk/schoolsliaison/
http://www.lincoln.ac.uk/shsc/
http://www.lincoln.ac.uk/socialsciences/
http://www.lincoln.ac.uk/socs/
http://www.lincoln.ac.uk/sport/
http://www.lincoln.ac.uk/student_work/
http://www.lincoln.ac.uk/surveys/
http://www.lincoln.ac.uk/tempus/
http://www.lincoln.ac.uk/undergraduate/
http://www.lincoln.ac.uk/webteam/
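
For anyone wanting to repeat the exercise, the reduction from a sitemap to its top-level directories can be sketched with the Python standard library. The sample sitemap below is a tiny hand-made stand-in using URIs that appear in this post, not the real 4189-page file, and taking only the first path segment is a simplifying assumption (the list above also breaks out the directories under /home/).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hand-made sample in the standard sitemap format.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.lincoln.ac.uk/</loc></url>
  <url><loc>http://www.lincoln.ac.uk/cerd/Staff/staff_l_bell.htm</loc></url>
  <url><loc>http://www.lincoln.ac.uk/cerd/_courses/postgraduate_list.asp</loc></url>
  <url><loc>http://www.lincoln.ac.uk/lishpa/staff/1916.asp</loc></url>
</urlset>"""


def top_level_directory(uri):
    """Return the first path segment of a URI as '/segment/', or '/' for the root."""
    segments = [s for s in urlsplit(uri).path.split("/") if s]
    return f"/{segments[0]}/" if segments else "/"


def top_level_directories(sitemap_xml):
    """Collect the distinct top-level directories from a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    locations = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]
    return sorted({top_level_directory(uri) for uri in locations})


print(top_level_directories(SAMPLE_SITEMAP))
```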

My immediate impression is that there are a lot of directories – roughly 100! Also, what on earth do all of these acronyms mean?

Some URIs are obvious and you’d find them on most sites – /home, /webteam, /contact. However, the library section is under /lr instead of /library (LR, according to the page title, means Learning Resources, but the pages talk about Library and Learning Resources, so should it not be /llr?). I was interested to discover that /lsad is The Lincoln School of Art, /luac stands for Lincoln University Assessment Centre (aren’t we technically The University of Lincoln? Lincoln University is in New Zealand, or also in three places in the USA), /shsc is Lincoln School of Health and Social Care (again, why not /lshsc?) and finally /socs is the Lincoln School of Computing Science (*cough* not /lsocs; student societies are also sometimes referred to as “socs”, so there’s even more confusion here). There seems to be an awful lot of inconsistency between the acronym used for the directory and the actual acronym we use internally. However, the main problem is that an outsider doesn’t understand our internal acronyms – if I were a potential arts student I’d have thought an all-encompassing /arts would be better understood than /lsad.

There is also inconsistency in the directory hierarchy. Some information is in a subdirectory of /home whereas everything else is in the root directory /. It could be that URIs that start /home/ are less important than others, but then you could subjectively say that /home/legal is more important than /surveys. Likewise, why is /opendays not under /events/opendays?

There is also a lot of apparent repetition. Campuses are represented under /home/lincoln/brayford/, /home/lincoln/cathedral/, /home/lincoln/riseholme/ and /home/hull/, but Holbeach is on the root at /holbeach, and then there is also /riseholmecampus and /hull. Should all of these pages not be under /campus/ or /locations/?

Every school or faculty page (what’s the difference between a school, a faculty and a department if you’re a potential student? Is one better than the other? Do I need to apply to the school or faculty? Does a school represent the academic side and a faculty represent the administrative side, and if so, what is a department?) has a section containing staff profiles, e.g. http://www.lincoln.ac.uk/cjmh/profiles/sara_moore.htm (by the way, /cjmh stands for Criminal Justice and Mental Health, which apparently is a research group within the Law school). However, some departments/faculties/schools/research groups have the member of staff’s name in the URI (as above) whereas this member of staff’s page is just a number – http://www.lincoln.ac.uk/lishpa/staff/1916.asp (again with inconsistent acronyms – LISHPA somehow stands for Lincoln School of Humanities (surely LSH?)). Note how the first staff page is a .htm whereas the second is .asp (is there a joke here about one being more dynamic than the other?). Over in CERD (Centre for Educational Research and Development), one member of staff can be found at http://www.lincoln.ac.uk/cerd/Staff/staff_l_bell.htm – why does this URI contain the word “staff” twice (likewise all the other profiles for CERD, except one, contain “staff” twice too)?

Whilst we’re on the subject of strange URI features, what’s with the funky underscores for course pages? For example: http://www.lincoln.ac.uk/shsc/_courses/nursingAHP/_courselist.asp, http://www.lincoln.ac.uk/psychology/_courses/undergraduate/psychology/default.asp and http://www.lincoln.ac.uk/cerd/_courses/postgraduate_list.asp. Some pages also don’t replace spaces in the file name with underscores or hyphens, e.g. http://www.lincoln.ac.uk/riseholmecollege/Non%20Course%20Pages/facilities/index.htm.
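
The oddities picked out above (mixed case, underscores, encoded spaces, file extensions) are easy to check for mechanically. The following is a minimal sketch of such a check; the function name uri_issues and the choice of what counts as an "issue" are my own assumptions, not an agreed standard.

```python
import re
from urllib.parse import unquote, urlsplit


def uri_issues(uri):
    """List the stylistic oddities found in the path of a URI."""
    path = unquote(urlsplit(uri).path)  # turn %20 back into a space
    issues = []
    if " " in path:
        issues.append("contains spaces")
    if path != path.lower():
        issues.append("mixed case")
    if "_" in path:
        issues.append("underscores")
    extension = re.search(r"\.([A-Za-z0-9]+)$", path)
    if extension:
        issues.append(f"file extension .{extension.group(1)}")
    return issues


for uri in (
    "http://www.lincoln.ac.uk/riseholmecollege/Non%20Course%20Pages/facilities/index.htm",
    "http://www.lincoln.ac.uk/cerd/Staff/staff_l_bell.htm",
    "http://posters.lincoln.ac.uk/group/comms",
):
    print(uri, uri_issues(uri))
```

Run over the full plain text list of URIs, a check like this would give a rough count of how widespread each habit is.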

To conclude, I’ve highlighted a number of big inconsistencies and problems with the current URI structure for the corporate site in this post. My opinion of the URI structure that is currently in place is that the website has been influenced by corporate policy and politics, and that a lack of understanding by some departments of how they represent themselves on the web has resulted in a messy collection of pages. This isn’t one person’s fault; it’s just the organic development of a site which has lost its message. I believe the Linking You project is an excellent opportunity to explore the reasons why this institution has a website in the first place and, through the technical and blue sky consultations which we plan on having with different internal and external stakeholders, we can develop a plan for a new website which is consistent, obvious and relevant.

Following this post will be a post by Nick that describes a hypothetical corporate website that was developed from scratch with no preconception of how the current website works. Coming up, we’ll also be writing about the URI structures for some of the web based software we use at Lincoln such as SharePoint, WordPress and Blackboard. We’re also going to write a presentation to present at our first technical consultation that we plan on holding in March.

N.B. In this post all staff names have been redacted and all links have attributes of rel=”nofollow”. Also I realise that department names have changed over the years and the website hasn’t updated in some instances for legacy or SEO reasons, but an outsider or a search engine has no knowledge of this.