Hackers: Pat Redmond (pat.redmond@uon.edu.au) and Daniel McNamara (dpmcna@gmail.com)
The Problem
The National Archives of Australia currently have more than 300,000 images viewable online, and potentially millions more to come. Making these easily accessible to the public is a key challenge.
The user experience currently has significant drawbacks. The results are ordered by image number (default), title or date. However, they are not ranked for relevance, which means that it’s hard for users to find the items they need. Further, the presentation could be greatly improved, and there is little chance for user interaction.
Our Project
We bring ranked search to PhotoSearch, so that even when a query returns many results, users can find what they need. Autofill suggests possible queries to users and shows the number of records matching each of those queries. When users access a particular record, other related records are recommended to them.
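As a rough illustration of the autofill idea, the sketch below suggests terms that start with what the user has typed and reports how many records contain each term. It is not the prototype's code: the function name `suggest`, the `limit` parameter, and the sample titles are all assumptions made for this example, and it works on an in-memory list of titles rather than a search index.

```python
from collections import Counter


def suggest(prefix, titles, limit=5):
    """Return (term, record_count) pairs for terms starting with prefix.

    Hypothetical helper for illustration only; a real deployment would
    query the search index instead of scanning titles in memory.
    """
    prefix = prefix.lower()
    counts = Counter()
    for title in titles:
        # Use a set so each term is counted at most once per record.
        for term in set(title.lower().split()):
            if term.startswith(prefix):
                counts[term] += 1
    # Most frequent terms first; ties are broken alphabetically.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:limit]


# Made-up sample titles, not taken from the collection.
sample_titles = [
    "Sydney Harbour Bridge",
    "Sydney Opera House",
    "Swan River Perth",
]
print(suggest("s", sample_titles))
```

Showing the count next to each suggestion lets a user see before searching whether a query is likely to be too broad or too narrow.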
We have improved the visualisation of the results returned to users. In particular, displaying lots of pictures without requiring the user to scroll down allows them to more rapidly assess which are relevant. Clicking on the image flips it over, displaying information about the image on the back.
We have also enabled photo tagging, so that users can tag people and places in the photo. This allows users to contribute back to the collection. Accessing the archive becomes, for the first time, a truly two-way process.
Presented at GovHack is a prototype of the new search functionality. Preliminary screenshots are below:
Clients and Stakeholders
Improving search capacities would help staff of the National Archives of Australia to access documents relevant to particular needs. An improved user interface would help the organisation to enhance its public profile.
The users of the National Archives of Australia would find it easier to navigate the online database. Specifically, academic researchers, family historians, and those with a more general interest would benefit. Every government department relies on the Archives to find information, so they also have a stake in improved discovery.
Future Prospects
Over the weekend, we built a prototype using Drupal and a small sample of the archive’s image files to demonstrate proof of concept. We used the open-source Solr search engine, whose relevance ranking is based on TF-IDF (term frequency-inverse document frequency) plus other features.
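To give a feel for what TF-IDF ranking does, here is a minimal sketch in plain Python. It is a simplified textbook variant, not Solr's actual scoring formula, and the function names `tokenize` and `rank` and the sample records are assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def rank(query, docs):
    """Return (doc_id, score) pairs sorted by descending TF-IDF score.

    docs maps a record id to its text. Records that match no query
    term are left out of the result.
    """
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many records does each term appear?
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    query_terms = tokenize(query)
    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term in tf:
                # Rare terms get a larger weight than common ones.
                idf = math.log(n_docs / df[term]) + 1.0
                score += (tf[term] / len(tokens)) * idf
        if score > 0:
            scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample records, not taken from the collection.
records = {
    "a": "sydney harbour bridge construction",
    "b": "sydney opera house",
    "c": "melbourne tram",
}
for doc_id, score in rank("sydney bridge", records):
    print(doc_id, round(score, 3))
```

In this sample, record "a" ranks above "b" because it matches both query terms and "bridge" is rarer than "sydney" across the records.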
With a few more days or weeks of work, the prototype could be developed considerably further.
Over time the search results could also be improved using click-through data indicating which documents past users have found most relevant. Users could keep their own personal collection of favourite documents, which could also improve search. Geocoding could be added to allow people to view results on a map. Improved faceted search could allow users to refine their search more using categories.
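One possible way to fold click-through data into the ranking is sketched below. It multiplies each result's relevance score by a boost that grows slowly with the number of past clicks. The function name `rerank_with_clicks`, the `weight` parameter, and the logarithmic boost are all assumptions chosen for illustration, not a description of how the prototype or Solr handles this.

```python
import math


def rerank_with_clicks(results, clicks, weight=0.5):
    """Boost search scores using past click counts.

    results: list of (doc_id, score) pairs from the search engine.
    clicks:  dict mapping doc_id to the number of past clicks.
    weight:  hypothetical tuning knob; 0 disables the boost.
    """
    boosted = []
    for doc_id, score in results:
        # log1p dampens the effect so heavily clicked records
        # do not completely swamp textual relevance.
        boost = 1.0 + weight * math.log1p(clicks.get(doc_id, 0))
        boosted.append((doc_id, score * boost))
    return sorted(boosted, key=lambda kv: kv[1], reverse=True)


# Made-up scores and click counts for illustration.
search_results = [("a", 1.0), ("b", 0.9)]
click_counts = {"b": 10}
print(rerank_with_clicks(search_results, click_counts))
```

Here record "b" overtakes "a" because users have clicked it in the past, even though its text score is slightly lower.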
Search which is more relevant, visually appealing, and interactive would benefit any digital collection management system. Our prototype could easily be applied to other datasets within the National Archives and within government more generally. It could also have its search functionality enhanced through integration with the Australian Pictorial Thesaurus.
Everything has been built using open source tools. There would be additional costs for further development of search functionality. We expect the system to be fairly low maintenance once implemented.

Peter Dallimore
Hey guys,
Fantastic work over the weekend. Just to let you know access to the Ninefold services provisioned for you during Govhack will be yours to continue working on your project for the next 6 months. Please keep us up to date with your projects so we can keep the buzz going.
Regards,
Pete
Ninefold
Daniel
Thanks Pete. For anyone interested the prototype is up at http://202.2.94.227/ – you can log in with username: test, password: test to use search