Archiving sites

When we rebuilt the Openly Local map we knew that there were hundreds of hyperlocal sites, Facebook pages and Twitter accounts out there that weren’t on the map. We also knew that the map database contained hundreds of sites that had ceased publishing since 2009, natural wastage in a fast-moving sector where people are testing a myriad of business models. Based on our recent work we judge that there are between 1,500 and 2,500 active hyperlocal sites in the UK, of which we have 511 mapped.

Through some desk-based exercises in the last couple of months (‘Finding Scottish Hyperlocals’, ‘Devon & Somerset – putting about 100 hyperlocal sites on the map’ and the Wakefield experiment) we have added 209 sites to the map. These are not ‘new’ sites, just sites that had been around for ages but hadn’t got on the map. Our exercises covered barely 10% of the population, so we are cautiously confident that there are at least 1,000 other sites out there waiting to be added.
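As a rough sanity check on that estimate, here is a back-of-envelope calculation in Python. It assumes, purely for illustration, that unmapped sites are spread evenly across the population, which is unlikely to hold exactly. The function and variable names are made up for this sketch, and the 10% figure is the approximate coverage quoted above.

```python
# Back-of-envelope extrapolation. Assumes unmapped sites are spread evenly
# across the population, which is an illustrative simplification.
SITES_FOUND = 209        # sites added in the desk-based exercises
COVERAGE_PERCENT = 10    # rough share of the population those exercises covered


def extrapolate_unmapped(sites_found: int, coverage_percent: int) -> int:
    """Scale the sites found up to the whole population (integer arithmetic)."""
    return sites_found * 100 // coverage_percent


naive_total = extrapolate_unmapped(SITES_FOUND, COVERAGE_PERCENT)
still_to_find = naive_total - SITES_FOUND

print(naive_total)      # 2090
print(still_to_find)    # 1881
```

On that simplistic assumption the naive figure lands well above 1,000, so ‘at least 1,000’ reads as a cautious floor rather than a forecast.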

We have known for a long time that the Openly Local map had lots of dead wood in it: sites that started and then shut up shop for one reason or another since 2009. For instance, the Daily Mail’s ‘Local People’ sites ‘restructured’ in 2013, removing 122 sites from the database. So I’ve spent the past week reviewing all the sites on Local Web List and archiving the ones that are no longer active.

When we took the map over from Openly Local we imported 716 sites. We added 209 in our desk-based exercises, giving a grand total of 925 sites.

Every one of the 925 sites has been checked to see if it still exists and is at least occasionally updated, i.e. is active in some way. After this process the current state of the database is:

  • 925 sites at start of exercise
  • 372 sites no longer active (Archived but findable through search)
  • 42 sites that seem to have become spam or malware factories as their domains have been taken over – we have unpublished these to protect the unwary
  • 511 active hyperlocals displayed on map
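The tally above can be checked with a few lines of Python. The figures come straight from this post; the dictionary keys are just illustrative labels, not the status names used in the database.

```python
# Figures taken from the post; the dictionary keys are illustrative labels.
imported_from_openly_local = 716
added_in_desk_exercises = 209
start_total = imported_from_openly_local + added_in_desk_exercises

status_counts = {
    "archived": 372,     # no longer active, still findable through search
    "unpublished": 42,   # domains taken over by spam or malware
    "active": 511,       # displayed on the map
}

# The three outcomes should account for every site we started with.
assert start_total == 925
assert sum(status_counts.values()) == start_total

archived_share = round(100 * status_counts["archived"] / start_total)
print(f"{archived_share}% of sites archived")   # 40% of sites archived
```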

This leaves 511 active hyperlocal sites on the map in the UK and ROI (that we know of). These changes reflect five years of natural wastage that we were overdue in catching up with.

Will and I have decided not to delete any sites in the list that are no longer active, but to move them into Archive status. This means that while they don’t appear on the map or in lists of sites, they are still searchable. If they become active again in the future we can just change the status of the site back to live.

The criteria I used when reviewing the sites were, very roughly, these:

[Image: site checking process]

Dead sites – Does the site exist, yes or no? If the site returns a 404 or the server doesn’t respond then I archive it.

Changed sites – If the site exists then I check whether it is the original site. If it isn’t then I unpublish it.

Spam/Link Bait – I have found that some domains have expired and been hoovered up, and these sites are now something completely different. Some are legitimate sites for new things but the vast majority are spam or link bait sites. Will and I will need to discuss the best way forward with these. I’d hate to just delete them, so maybe we just need to remove the URL so people can’t click through. If you have a view on what we should do, then let us know.

Active sites – If it is the original site, I check to see if it has been updated in roughly the last 3 months. I take a varying view on this depending on the type of site:

News-y sites – If it is a site that looks news led, i.e. with the latest stories on the front page, or one that I know used to publish every day or week, I look to see when the last post was published. If it has been updated recently I leave it; if it hasn’t then I move it to archive status.

Community information – If the site is more a community information site than a news site and hasn’t been updated recently then I still leave it.
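For anyone who prefers to read the checks above as a decision procedure, here is a minimal Python sketch. The review itself was done by hand and involved judgement calls, so this is only an approximation: the field names, the 90-day cut-off standing in for ‘roughly the last 3 months’, and the handling of a news-led site with no findable post date are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Roughly three months; the exact cut-off is an assumption for this sketch.
STALE_AFTER = timedelta(days=90)


@dataclass
class SiteCheck:
    """Observations recorded by hand for one site (hypothetical field names)."""
    responds: bool                 # False for a 404 or no server response
    is_original: bool              # False if the domain now hosts something else
    is_news_led: bool              # False for community information sites
    last_updated: Optional[date]   # date of the most recent post, if found


def review(site: SiteCheck, today: date) -> str:
    """Return 'archive', 'unpublish' or 'live', applying the checks in order."""
    if not site.responds:
        return "archive"           # dead site
    if not site.is_original:
        return "unpublish"         # changed site, including spam or link bait
    if not site.is_news_led:
        return "live"              # community information sites are left alone
    if site.last_updated is None:
        return "archive"           # no visible recent post (assumption)
    stale = today - site.last_updated > STALE_AFTER
    return "archive" if stale else "live"


# A news-led site last updated seven months before the review date.
example = SiteCheck(responds=True, is_original=True, is_news_led=True,
                    last_updated=date(2014, 1, 1))
print(review(example, today=date(2014, 8, 1)))   # archive
```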

One thing that emerged during this exercise is that custom sites with their own domains seemed to stop publishing and die off more often than sites that used the available free platforms like Blogger & WordPress. Maybe the hassle of having to keep the site updated technically as well as with content makes it less fun? More analysis would be needed to draw any real conclusion from this.

None of this is foolproof, so if I have got your site wrong or you think we should be doing it differently then get in touch with us.

Finding Scottish hyperlocals

Living and working in rural Aberdeenshire, where I run my own hyperlocal site, I am the official Scottish correspondent for Talk About Local. Following on from Will’s post about filling in the white space around Wakefield on the Local Web List map I thought I’d share my experiences adding in sites from Scotland.

Way back in the history of Talk About Local, Will found a great site that listed lots of community websites across Scotland, the fantastically named http://scotlandinter.net. Sadly now the site seems to be more of a business directory and the list of local sites is nowhere to be seen.

Fortunately the Wayback Machine has a copy of the local sites page: https://web.archive.org/web/20120309011705/http://scotlandinter.net/communities.html

I am using this list as a starting point to find local sites to add to the map. The sites listed on it will, if they are still live, have a decent history and archive of content. I’ve added around 50 sites using this method. When we took the data from Openly Local there were fewer than 20 sites listed for Scotland; I now have 64.

The task isn’t an easy one and can be mind-numbingly boring at times. Working through the list, you have to click the link, which takes you to the archive page, and then go to the current site if it still exists. A great number don’t.
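One small part of that clicking could in principle be scripted, although I did it all by hand. Wayback Machine snapshot links embed the original address after a timestamp, so a short Python sketch like the one below can pull out the live URL to visit. The function name is made up for illustration, and the pattern only handles plain snapshot links of the kind shown in this post, not ones with extra modifiers after the timestamp.

```python
import re
from typing import Optional

# Matches Wayback Machine snapshot URLs of the form
# https://web.archive.org/web/<timestamp>/<original URL>
SNAPSHOT_RE = re.compile(r"^https?://web\.archive\.org/web/(\d+)/(.+)$")


def original_url(snapshot_url: str) -> Optional[str]:
    """Return the original URL embedded in a snapshot URL, or None if no match."""
    match = SNAPSHOT_RE.match(snapshot_url)
    return match.group(2) if match else None


snapshot = ("https://web.archive.org/web/20120309011705/"
            "http://scotlandinter.net/communities.html")
print(original_url(snapshot))   # http://scotlandinter.net/communities.html
```

Whether the live address still serves the original site is exactly the judgement that has to be made by eye.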

You come across sites which look like they have great potential, such as https://web.archive.org/web/20120225131348/http://www.angusglens.co.uk/ but when you go to the live URL http://angusglens.co.uk/ it is ‘reserved for future use’. Then you find a gem like http://www.theshoppie.com/arbroath/

Once you have found the live site you need to review it. I use some basic criteria:

  • Is the site still in the spirit of a hyperlocal site? A lot of early hyperlocal sites have changed into glossy corporate sites or affiliate link directory sites. I discount these unless I can find some hyperlocal information on them.
  • Is the site dead? If it is dead but has not been hoovered up by someone and still has the original content on it, then I add it for posterity.

There are no real short cuts for adding sites to the list. We have to fill in the form for each site we find, which ensures that the data is accurate and valuable to the public and our academic colleagues. We do have to make some assumptions about the radius the site covers and the description of the area covered.

Sometimes when you are looking at these sites you get drawn into them. There is something on the site that is just too interesting not to read, then you end up clicking links and before you know it you have lost an hour. I lost more time than I should have looking around one of my all-time favourite sites, http://www.foggieloan.co.uk/ and its virtual museum.

I also found this link on caithness.org, which may be of interest to anyone doing research into rural networks: http://www.caithness.org/laurathompson/index.htm

I am now focused on just reviewing each site and not getting drawn into the links. Once I have checked each site listed on the scotlandinter.net archive page I can go back and revisit the sites I have added and look for other local sites that are linked from them to add to the list.

I’ve not yet found a hyperlocal site for Dunfermline, home of the funders of Local Web List, Carnegie UK Trust. If anyone can help with this or any other Scottish hyperlocal sites that are not yet on the map, feel free to jump in and add them for us. You shouldn’t be able to add any duplicates.

Updated Search

We’ve updated the search on the site so it includes metadata. WordPress doesn’t search this data (tags & categories) out of the box, so we’ve added it in.

So if you want to search for sites built on a certain platform, type the platform into the search box and hit enter. The same goes for networks.
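To illustrate what searching metadata means in practice, here is a small Python sketch. It is not the WordPress code running on the site: the listing fields and the sample entries are made up, and it simply shows a query being matched against platform and network terms as well as the title and description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Listing:
    """A made-up site entry; the field names are illustrative, not the real schema."""
    title: str
    description: str
    platforms: List[str] = field(default_factory=list)
    networks: List[str] = field(default_factory=list)


def search(listings: List[Listing], query: str) -> List[Listing]:
    """Case-insensitive match against title, description and metadata terms."""
    needle = query.strip().lower()
    if not needle:
        return []
    results = []
    for item in listings:
        haystack = [item.title, item.description, *item.platforms, *item.networks]
        if any(needle in text.lower() for text in haystack):
            results.append(item)
    return results


listings = [
    Listing("Example Village News", "News for a made-up village", ["WordPress"]),
    Listing("Sample Town Blog", "Community notes", ["Blogger"], ["Example Network"]),
]
print([item.title for item in search(listings, "wordpress")])
# ['Example Village News']
```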

We have removed the links to Networks & Platforms from the sidebar as a result of this update. If you have any problems with the search then please let us know.