The National Bibliographic Knowledgebase

We’re pleased to announce the development of a National Bibliographic Knowledgebase (NBK). This will be a three-year development that builds on the long-term success of the Copac service. The NBK will provide a new platform for expanding the database to include all UK Higher Education libraries that wish to participate, as well as retaining and increasing the range of non-academic research libraries. This greater inclusiveness of HE (and other) libraries has been the most frequent enhancement request from Copac users and we will now be working towards that goal. Jisc has commissioned OCLC to create the NBK and we will be working with the Higher Education library community to bring on board many more HE libraries, as well as continuing to expand the range of specialist research libraries that contribute their catalogues.

In the short term the NBK will be developed in parallel with the continuing development of Copac and we aim to move all current contributor data onto the new platform. As the NBK becomes established it is anticipated that Copac services, including Copac Collection Management tools (CCM tools), will become integrated into the NBK, to offer functionality that utilises the expanded data set that the NBK will provide. We will be looking to enhance existing services in resource discovery and collection management, as well as developing new services to support libraries in the management of their print and digital resources.

Full details of the NBK are available in the press release on the Jisc National Monograph Solutions (NMS) blog.

We have also added information about the NBK to the Copac FAQ pages.

These are very early days for the project. The Copac team will be working with current Copac contributors over the coming months as we begin to develop the new NBK. We will also be talking with library consortia, as well as individual institutions, as we look at widening the range of contributing institutions.

Interface update: New Sort & Direct Link options

We’ve been making some changes to the Copac interface and adding new facilities. The main developments are:

  • Search results now have an estimated number of records, so for larger results you have a better idea of the number of records involved.
  • The Sort facility can now be used for a result set of up to 2000 records.
  • Where your search includes a title the Sort will include a Title Rank option to bring exactly matching titles to the top of the list.
  • The Full record display now includes a ‘Direct Link’ option. You can copy the direct link and include it in your own documents. This lets you link directly to a specific Copac record without having to search.

In addition the online Help has been updated and expanded to provide more information about managing your search results. There is a ‘Help’ button towards the top right of each screen.

These developments are in response to feedback from people using Copac, so if there are changes or additions you would like to see please get in touch. We are currently working on the deduplication procedures, in particular for pre-1800 materials, and we will be introducing enhancements to this process in due course.

If you have any comments or questions please get in touch with the Copac helpdesk: help.copac@jisc.ac.uk.

Copac cloud platform and new Web address

We have now moved the Copac service onto our new cloud platform. This is the final stage of a project to transfer the service onto a more responsive platform with greater flexibility to support future development. We will be continuing to check all aspects of the service now the move is complete – if you notice any problems please let us know via the Copac helpdesk: copac@mimas.ac.uk

You will also see there is a new Copac URL: http://copac.jisc.ac.uk
This reflects the move of Mimas services into Jisc last year. The old Copac Web address will continue to work for the foreseeable future.

If you use the option to log in to Copac it is possible that the Web address change may take a while to be picked up locally. So whilst we expect that most people will see the change immediately, in some cases it could take up to 24 hours before the login works for you.

As part of the work on Copac during the platform move we have removed one little-used feature. Previously, where a university library didn’t have its catalogue on Copac, the library could provide information to let us set up a local catalogue search, so a member of that university who was logged into Copac could search both Copac and their local catalogue together. This facility has been little used, so has been removed for the time being, with apologies to those who have been making use of this option. There are discussions underway about the future scope of the Copac service and once this becomes clearer we will look again at whether there is still a need for this type of facility and, if so, the best way of providing it.

Announcing the new Copac interface and design…

A tremendous amount has been going on behind the scenes of Copac for quite some time now. Like everyone across the sector we’re working at what feels like full tilt, tackling multiple projects and figuring out as a team how to juggle and prioritise it all. We’re undertaking quite a few JISC innovations projects, including developing a shared service prototype for a recommender API based on aggregated circulation data. A considerable amount of effort is being invested in the Copac Collections Management project. We’ve been collaborating with our colleagues across the office on Linked Data research and development and working closely with the Discovery initiative. And our developers (namely Ashley Sanders) have just about cracked the new database design and algorithms that will address some of the major duplication issues we currently face as a national aggregator of bibliographic records.

In order to understand and meet the needs of our current user-base (800,000 search sessions per month, and counting) we’ve also been conducting market research in the form of surveys, focus groups and interviews with our users and stakeholders. We’ve amassed a lot of knowledge about how Copac is used, its benefit to academics and librarians, the features most valued in the interface, and what we could be doing better (deduplication! Ebook records and access!). We still have a way to go to meet all these needs, and as a service with a ‘perpetual beta ethos,’ committed to innovation, we know we’ll never be ‘done’ with this work.

But the launch of the new interface and design today is a very significant milestone, and one we want to mark. These changes are the product of a great deal of work committed to the principles of market research and user-centred design. Thanks to the efforts of Mimas web developers Leigh Morris and Shiraz Anwar, the new application interface positively reflects the real-world user journeys of Copac users, and has been rigorously tested to ensure it’s in line with those needs. The new graphic design has been developed to communicate the value proposition of Copac as a JISC service representing Research Libraries, and also as a tool for Research Libraries. Mimas’ new graphic designer has done an excellent job of transforming a site that was out of date (‘lacked depth’ and ‘cold’ are, I believe, the words that were used) into something more engaging, reflecting the breadth and richness of the libraries that make up Copac. Certainly, beyond providing an excellent resource discovery experience for end users (and this is why the simplicity and ease of use of the search and personalisation tools are our primary focus), it is important for us to communicate on behalf of JISC that Copac is a community-driven initiative, made possible by its contributors and representative bodies like RLUK. We hope that the new elements of the website represent this community feel, giving Copac a bit more of an engaging voice than perhaps we’ve previously had.

A big vote of thanks to my fantastic Copac and Mimas colleagues, and particularly those who have worked quite a few late nights and weekends lately: Shirley Cousins, Ashley Sanders, Leigh Morris, Lisa Jeskins, and Beth Ruddock. Thanks to Shiraz Anwar for his work earlier in this project in ensuring every detail of the interface design reflected user needs. Thanks also to Janine Rigby and Lisa Charnock from the Mimas Marketing team for the market research work, and working with us to identify the value proposition and identity of Copac, and to Ben Perry for translating that so swiftly into a design we all instantly agreed on.

Copac Beta Interface

We’ve just released the beta test version of a new Copac interface and I thought I’d write a few notes about it and how we’ve created it.

Some of the more significant changes to the search result page (or “brief display” as we call it) are:

  • There are now links to the library holdings information pages directly from the brief display. You no longer have to go via the “full record” page to get to the holdings information.
  • You can see a more complete view of a record by clicking on the magnifying glass icon at the end of the title. This enables you to quickly view a more detailed record without having to leave the brief display.
  • You can quickly edit your query terms using the search forms at the top of the page.
  • To further refine your search you can add keywords to the query by typing them into the “Search within results”  box.
  • You can change the number of records displayed in the result page.

The pages have been designed using Responsive Web Design techniques — which is jargon that means that the HTML5 and CSS have been designed in such a way that the web page rearranges itself depending on the size of your screen. The new interface should work whether you are using a desktop with a cinema display, a tablet computer or a mobile phone. Users of those three display types will see a different arrangement of screen elements and some may be missing altogether on the smaller displays. If you use a tablet computer or smartphone, then please give beta a try on them and let us know what you think.

The CGI script that creates the web pages is a C++ application which outputs some fairly simple, custom, XML. The XML is fed through an XSLT stylesheet to produce the HTML (and also the various record export formats.) Opinion on the web seems divided on whether or not this is a good idea; the most valid complaints seem to be that it is slow. It seems fast enough to us and the beta way of doing things is actually an improvement as there is now just one XSLT used in creating the display, whereas our old way of doing things used multiple XSLT stylesheets run multiple times for each web page. Which probably just goes to show that the most significant eater of time is the searching of the database rather than the creation of the HTML.

Copac deduplication

Over 60 institutions contribute records to the Copac database. We try to de-duplicate those contributions so that records from multiple contributors for the same item are “consolidated” together into a single Copac record. Our de-duplication efforts have reduced over 75 million records down to 40 million.

Our contributors send us updates on a regular basis which results in a large amount of database “churn.” Approximately one million records a month are altered as part of the updating process.

Updating a consolidated record

Updating a database like Copac is not as immediately intuitive as you may think. A contributor sending us a new record may result in us deleting a Copac record. A contributor who deletes a record may result in a Copac record being created. A diagram may help explain this.

A Copac consolidated record created from 5 contributed records. Lines show how contributed records match with one another.

The above graph represents a single Copac record consolidated from five contributed records: a1, a2, a3, b1 & b2. A line between two records indicates that our record matching algorithm thinks the records are for the same bibliographic item. Hence, records a1, a2 & a3 match with one another; b1 & b2 match with each other and a1 matches with b1.

Should record b1 be deleted from the database, then as b2 does not match with any of a1, a2 or a3 we are left with two clumps of records. Records a1, a2 & a3 would form one consolidated record and b2 would constitute a Copac record in its own right as it matches with no other record. Hence the deletion of a contributed record turns one Copac record into two Copac records.

I hope it is clear that the inverse can happen — that a new contributed record can bring together multiple Copac records into a single Copac record.
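The splitting described above can be sketched in a few lines of Python. This is only an illustration, not the Copac code: it assumes the match log is available as a list of record-identifier pairs, treats a consolidated record as a connected group in that graph, and the function name `consolidate` is made up for the example.

```python
from collections import defaultdict


def consolidate(records, matches):
    """Group records into consolidated sets (connected groups of the match graph)."""
    neighbours = defaultdict(set)
    for left, right in matches:
        # Ignore matches that involve a record no longer in the database.
        if left in records and right in records:
            neighbours[left].add(right)
            neighbours[right].add(left)

    seen = set()
    groups = []
    for record in sorted(records):
        if record in seen:
            continue
        # Walk outwards from this record, collecting everything reachable.
        group = set()
        stack = [record]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# The five contributed records and the matches drawn in the diagram.
records = {"a1", "a2", "a3", "b1", "b2"}
matches = [("a1", "a2"), ("a1", "a3"), ("a2", "a3"), ("b1", "b2"), ("a1", "b1")]

before = consolidate(records, matches)
after = consolidate(records - {"b1"}, matches)
print(before)  # [['a1', 'a2', 'a3', 'b1', 'b2']]
print(after)   # [['a1', 'a2', 'a3'], ['b2']]
```

Running the same function after adding b1 back shows the inverse case: the two groups merge into one again.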

The above is what would happen in an ideal world. Unfortunately the current Copac database does not save a log of the record matches it has made and neither does it attempt to re-match the remaining records of a consolidated set when a record is deleted. The result is that when record b1 is deleted, record b2 will stay attached to records a1, a2 & a3. Coupled with the high amount of database churn this can sometimes result in seemingly mis-consolidated records.

Smarter updates

As part of our forthcoming improvements to Copac we are keeping a log of records that match. This makes it easier for the Copac update procedures to correctly disentangle a consolidated record and should result in fewer mis-consolidations.

We are also trying to make the update procedures smarter and have them do less. For historical reasons the current Copac database is really two databases: a database of the contributors’ records and a database of consolidated records. The contributors’ database is updated first and a set of deletions and additions/updates is passed on to the consolidated database. The consolidated database doesn’t know if an updated record has changed in a trivial way or now represents another item completely. It therefore has no choice but to re-consolidate the record and that means deleting it from the database and then adding it back in (there is no update functionality). This is highly inefficient.

The new scheme of things tries to be a bit more intelligent. An updated record from a contributor is compared with the old version of itself and categorised as follows:

  • The main bibliographic details are unchanged and only the holdings information is different.
  • The bibliographic record has changed, but not in a way that would affect the way it has matched with other records.
  • The bibliographic record has changed significantly.

Only in the last case does the updated record need to be re-consolidated (and in future that will be done without having to delete the record first!) In the first two cases we would only need to refresh the record that we use to create our displays.
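As a rough sketch of that three-way categorisation, and not the actual Copac implementation, the comparison might look like the Python below. The record layout (a dict with `bib` and `holdings` parts), the `MATCH_FIELDS` tuple and the function names are all assumptions made for illustration; the real matching algorithm will look at different and more subtle things.

```python
from enum import Enum


class Change(Enum):
    HOLDINGS_ONLY = "bibliographic details unchanged"
    BIB_MINOR = "bibliographic change that does not affect matching"
    BIB_SIGNIFICANT = "bibliographic change that may affect matching"


# Hypothetical: the fields assumed to influence record matching.
MATCH_FIELDS = ("title", "author", "date", "isbn")


def classify_update(old, new):
    """Compare an updated record with its previous version."""
    if old["bib"] == new["bib"]:
        # Covers the holdings-only case (and a completely unchanged record).
        return Change.HOLDINGS_ONLY
    if all(old["bib"].get(f) == new["bib"].get(f) for f in MATCH_FIELDS):
        return Change.BIB_MINOR
    return Change.BIB_SIGNIFICANT


def needs_reconsolidation(old, new):
    """Only significantly changed records go back through consolidation."""
    return classify_update(old, new) is Change.BIB_SIGNIFICANT


old = {"bib": {"title": "Bleak House", "author": "Dickens", "note": "1st ed."},
       "holdings": ["Main Library"]}
moved = {"bib": dict(old["bib"]), "holdings": ["Store"]}
annotated = {"bib": {**old["bib"], "note": "First edition"}, "holdings": ["Main Library"]}
retitled = {"bib": {**old["bib"], "title": "Hard Times"}, "holdings": ["Main Library"]}

print(classify_update(old, moved).name)      # HOLDINGS_ONLY
print(classify_update(old, annotated).name)  # BIB_MINOR
print(classify_update(old, retitled).name)   # BIB_SIGNIFICANT
```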

 

An analysis of an update from one of our contributors showed that it contained 3818 updated records; 954 had unchanged bibliographic details and only 155 had changed significantly and needed reconsolidating. The remaining 2709 fall into the middle category. The saving there is quite big. In the current Copac database we have to re-consolidate all 3818 records. In the new version of Copac we only need to re-consolidate 155, about 4% of the update. This will reduce database churn significantly, result in updates being applied faster and allow us to have more contributors.

Example Consolidations

Just for interest and because I like the graphs, I’ve included a couple of graphs of consolidated records from our test database. The first graph shows a larger set of records. There are two records in this set either of which, if deleted, would result in the set being broken up into two smaller sets.

The graph below shows a smaller set of records where each record matches with every other record.
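Records whose deletion would break a set apart can be found mechanically. The sketch below is an illustration only: it uses a simple brute-force check (remove each record in turn and count the resulting groups) rather than anything Copac necessarily does, the function names are invented, and the data is the five-record example from earlier in this post plus a made-up four-record set in which every record matches every other.

```python
from itertools import combinations


def count_groups(records, matches):
    """Count connected groups of records using a small union-find."""
    parent = {r: r for r in records}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path halving
            r = parent[r]
        return r

    for left, right in matches:
        if left in parent and right in parent:
            parent[find(left)] = find(right)
    return len({find(r) for r in records})


def splitting_records(records, matches):
    """Records whose removal leaves more groups than there were before."""
    base = count_groups(records, matches)
    return sorted(r for r in records
                  if count_groups(records - {r}, matches) > base)


five = {"a1", "a2", "a3", "b1", "b2"}
five_matches = [("a1", "a2"), ("a1", "a3"), ("a2", "a3"), ("b1", "b2"), ("a1", "b1")]
print(splitting_records(five, five_matches))  # ['a1', 'b1']

# A set where each record matches every other has no such weak point.
clique = {"c1", "c2", "c3", "c4"}
clique_matches = list(combinations(sorted(clique), 2))
print(splitting_records(clique, clique_matches))  # []
```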

Performance improvements

The run up to Christmas (or Autumn term if you prefer) is always our busiest time of year as measured by the number of searches performed by our users. Last year the search response times were not what we would have liked and we have been investigating the causes of the poor performance and ways of improving it. Our IT people determined that at our busiest times the disk drives in our SAN were being pushed to their maximum performance and just couldn’t deliver data any faster. So, over the summer we have installed an array of Solid State Disks to act as a fast cache for our file-systems (for the more technical I believe it is actually configured as a ZFS Level 2 Cache.)

The SSD cache was turned on during our brief downtime on Thursday morning and so far the results look promising. I’m told the cache is still “warming up” and that performance may improve still further. The best performance indicator I can provide is the graph below. We run a “standard” query against the database every 30 minutes and record the time taken to run the query. The graph below plots the time (in seconds) to run the query since midnight on the 23rd August 2011. I think it is pretty obvious from looking at the graph exactly when the SSD cache was configured in.
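For anyone wanting to collect a similar measurement, timing a query is straightforward. The Python below is a minimal sketch and not our monitoring script: `standard_query` is a hypothetical stand-in for the real database query, and the scheduling (for example a cron job every 30 minutes) is assumed to happen outside the code.

```python
import time


def time_query(run_query):
    """Return the elapsed wall-clock time, in seconds, of one call to run_query."""
    start = time.perf_counter()
    run_query()
    return time.perf_counter() - start


def standard_query():
    # Hypothetical placeholder: the real check would run a fixed search
    # against the database here.
    sum(range(100_000))


# Each scheduled run appends one (timestamp, seconds) sample for plotting later.
samples = []
samples.append((time.strftime("%Y-%m-%d %H:%M"), time_query(standard_query)))
print(samples[-1])
```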

It all looks very promising so far and I think we can look forward to the Autumn with less trepidation and hopefully some happier users.

Copac trial interface: feedback

Many thanks to those of you who gave us feedback on the recent trial of the new Copac user interface. We really appreciate the time you put into testing and responding to us through the feedback form, email, and Twitter.

I’ve summarised the feedback below:

  • In general you gave an enthusiastic response to the new interface design, including positive comments on the layout and workflow. Those who tried it on mobile devices were pleased with how it came out.
  • There were also positive comments about the range of features, with the availability of the holding library list on the initial search result listing being particularly popular.
  • The grey ‘colour scheme’ generated a number of comments. Some people liked it but others definitely didn’t! The lack of colour on the site was to try and avoid getting too much comment on the graphics as opposed to the functionality of the new interface, so it won’t be staying monochrome.
  • There were individual comments about wording, screen elements, or requests for additional features, which are all valuable in helping us refine the presentation and facilities.
  • Amongst those who didn’t like the new interface the major concern was the lack of the ‘Main search’ screen with its range of detailed search options. Whilst the initial test was working just with the Quick search, we can reassure you that we always intended to reintroduce the other search screens once we had feedback on the overall design. This obviously wasn’t as clear as we’d hoped.

We are continuing to work on the interface, reassured that we are moving in the right direction for most of you. In the next stage we’ll be incorporating colour as well as adding the missing search screens. We will also be making changes in response to comments or requests relating to individual features, as well as ensuring that it works well for as wide a range of browsers and devices as possible.

You’ll be able to try out the new interface again in a few months’ time and provide input into the final version before the work is completed.

Copac trial interface – have your say on the future of Copac!

We are developing a new style of Copac interface with greater search flexibility, new functionality, and clearer displays. Following initial user testing we’re now opening up the trial interface for further comment. We’re making the early draft interface available for a week from 12.00 noon 23rd May to 12.00 noon 30th May. This is your opportunity to try out the new interface, and let us know what you think!

Access the Copac Alpha trial interface.

Please note: The interface is very pared down, and there is no colour scheme. Some elements are just placeholders for planned options. The interface is designed to work in the latest browsers – you might experience issues with display/functionality in older browsers, such as IE 6 and 7.

We’d really appreciate your input into this work. There are feedback options on the screens and all comments will feed into the ongoing development process. You can also email copac@mimas.ac.uk with your feedback.

There will be further opportunities to comment on the interface redevelopment as the work continues. This is part of the complete redevelopment of the Copac service and additional interface facilities will become available at later stages of the work.

Getting Excited about Collection Management

The Copac Collections Management Tools Project is a collaboration between Mimas, RLUK, and the White Rose Consortium.

A number of partners have been working through and with us here at Mimas on a JISC-funded Collection Management project, which is part of the broader Resource Discovery Taskforce activity.

Since we have all been working on this slightly under the radar, and recognising the need to share more about this project and what’s going on, we’re planning a series of blog posts to update the community on the progress and lessons learned through the partnership. The following update is from Julia Chruszcz, who is project managing this piece of work:

Just two months into the JISC-funded Copac Collection Management Project the progress has been significant. At a meeting of the project partners on the 6th May each of the representatives from the White Rose Consortium (WRC) universities (Leeds, York and Sheffield) articulated the potential significance of this tool for their decision-making processes around monograph retention and disposal and collection development. This included notions of collaborative collection development and how such a Collection Management Tool could facilitate regional and national approaches, each influencing local decisions for libraries.

The WRC has undertaken the early testing of the web-based tool in an approach that the project has adopted to inform development and iteratively assess the tool. The idea is to build up a full specification over the life of the project of what will be required to take such a tool forward and introduce it into library workflows. The next stage, between now and the beginning of July, will be to further develop the batch and web technical interfaces based upon the WRC feedback and for this development to undergo further critical testing. The project is due to provide an interim report at the end of June with a full report to the JISC at the end of July.

The enthusiasm from all the project partners, JISC, Mimas, RLUK and WRC, stems from the realisation that we have the potential to produce a tool that will make a real difference to helping libraries make informed decisions particularly at a time of financial constraint, and assist in furthering the possibility of a national monographs collection, protecting access for researchers at the same time as facilitating local decisions that will save money and resource longer term. And all this by intelligent re-use and application of an existing extensive database, a resource invested in by RLUK and the JISC over many years, the Copac database.

If this is something you are interested in we’d really like to hear your viewpoint and perspective.