Two weeks in the Lab

This is a guest post by Erika Taylor, Curator of Collections and Programs, Tweed Regional Museum.

It’s an incredible thing to watch an idea come to life, from your brain to reality. After two weeks working with the DX Lab, we are excited to present Main Street. Main Street is the result of the DX Lab’s first ‘digital drop-in’ program. It explores how the collection at the State Library of NSW can be used in conjunction with a NSW regional collection to provide a beautiful digital experience that compares both collections.

Main Street uses a selection of images of “Main Streets” from the Tweed Regional Museum Collection (running along the top of the page) and compares them to a set of images of “Sydney Main Streets” from the SLNSW Collection (bottom of the page). The data sets are organised in chronological order, ranging from the 1870s to the 1950s. The middle of the page shows common words from newspapers of that year, drawn from the Sydney Morning Herald and the Tweed Daily. The project leaves scope for the future addition of “Main Streets” from other regional collections with very little effort.

Initial brainstorming for the project began a few weeks before arriving at The Library. Erika pitched a few initial ideas to Paula Bray and Richard Neville, Mitchell Librarian, via Skype. The favoured idea was one that used the State Library of NSW collection to give context to a regional collection. On arrival at The Library, the DX Lab and Erika workshopped the idea, evolving the project to look specifically at images of “Main Streets” and make a visual comparison over time.

Initial ideas were to compare George Street in Sydney to a regional town main street, but Library Senior Curator Louise Denoon reminded us of the bias that might bring to the project. Including some of Sydney’s earliest suburbs would provide a deeper look across a wider spread of demographics and area. Census figures from 1911 reveal that more than a third of people living in the metropolis still resided in the City of Sydney and its adjoining suburbs within walking distance: Glebe, Newtown, Redfern, Paddington, Erskineville and Waterloo. Those suburbs and their main streets were therefore included in the comparison as “Sydney”. Likewise, “Main Street” in the TRM collection was intended to cover a large geographical area of several towns and main streets. During the brainstorming phase, several data sets and APIs were suggested for use: the eHive API, the State Library of NSW API, the National Library of Australia’s TROVE API and the Australian Bureau of Statistics API.

The Tweed Regional Museum collection data
The TRM uses the online cataloguing platform eHive to publish its collection online. eHive provides an API which can be used to access data. Images of “Main Streets” were tagged as such in the TRM’s collection, and the API was then used to bring these into the project. The resulting data set was curated down to 100 images.

The State Library of NSW collection data
A list of 100 images was curated by Erika Taylor from the SLNSW collection. This was organised in Excel and provided to the web developer to be used in the project. The images were chosen from several of Sydney City’s main streets and surrounding suburbs, and show transport, people and activities.

TROVE
Headlines from the Sydney Morning Herald and The Tweed Daily were used in the project. These were gathered from TROVE using the API.
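
As a rough illustration of how common words might be pulled out of a set of newspaper headlines, here is a minimal Python sketch. It is not the code used in Main Street: the headlines are made up rather than taken from TROVE, and the short stop-word list and the function name common_words are assumptions for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept very short for illustration.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "for", "at", "on"}


def common_words(headlines, top_n=3):
    """Return the top_n most frequent words across headlines, ignoring stop words."""
    counts = Counter()
    for headline in headlines:
        words = re.findall(r"[a-z']+", headline.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return counts.most_common(top_n)


# Made-up headlines standing in for one year of newspaper data.
sample = [
    "Flood on the Tweed",
    "Tweed flood relief fund",
    "Flood damage in Murwillumbah",
]
print(common_words(sample, top_n=2))
```

In practice the word list for each year would be built from that year’s headlines from each newspaper, and a longer stop-word list would be needed to keep filler words out of the results.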

Look and Feel
UX developer Ruth presented an initial look and feel using vertically scrolling images that compared city and country main streets, with TROVE data integrated. This idea was refined further with inspiration from the “On Broadway” data visualisation interactive installed in the New York Public Library in 2013. The idea of working along the horizontal axis was adopted, which brought a visual “main street” into the project.


  • A full day was spent cleaning up the TRM data in eHive, which included tidying up descriptions, dates and date ranges. The SLNSW online collection data proved too dirty and too large to enable the use of the API in the project. It was decided that a better result, for such a small project, was to hand curate a data set of 100 “Main Street” images from the SLNSW collection. While this was a little time consuming, the end result is much better for it. Date formats! So many, so little time.
  • The prolific use of “circa” and wide date ranges in both sets of collection data made it difficult to use or match exact dates. Instead, a more fluid approach was taken to dates, and images were placed in chronological order. It was also important to realise that while many images of Sydney exist from the late 1800s, it would be rare to find many in regional collections. As a result, a lower date cut-off of around 1870 and an upper cut-off of 1950 were used.
  • The project team would have liked to integrate a few bits of statistical data from the ABS, such as migration rates and population figures; however, time constraints prevented the investigation of this.
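
To give a feel for the kind of “fluid” date handling described above, here is a minimal Python sketch that reduces messy catalogue dates to a single year, drops anything outside 1870 to 1950 and sorts what is left. It is an illustration only, not the project’s code: the record layout, the function names and the choice to use the midpoint of a date range are all assumptions.

```python
import re

# Matches four-digit years from 1800 to 1999, including decades such as "1920s".
YEAR = re.compile(r"(?<!\d)(1[89]\d{2})(?!\d)")


def sort_year(date_text):
    """Return one representative year for a catalogue date string, or None."""
    years = [int(year) for year in YEAR.findall(date_text)]
    if not years:
        return None
    # Assumption: a range such as "1890-1910" is represented by its midpoint.
    return (min(years) + max(years)) // 2


def in_chronological_order(records, earliest=1870, latest=1950):
    """Keep records dated within the cut-offs and return them oldest first."""
    dated = [(sort_year(record["date"]), record) for record in records]
    kept = [(year, record) for year, record in dated
            if year is not None and earliest <= year <= latest]
    return [record for year, record in sorted(kept, key=lambda pair: pair[0])]


# Hypothetical records showing typical date formats.
records = [
    {"title": "A", "date": "c. 1905"},
    {"title": "B", "date": "1890-1910"},
    {"title": "C", "date": "1865"},
    {"title": "D", "date": "undated"},
    {"title": "E", "date": "12 March 1935"},
]
print([record["title"] for record in in_chronological_order(records)])
```

Records with no recognisable year are simply left out here; in a hand-curated set they would more likely be fixed in the source data.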

Main Street of the future
More regional collection data could be imported into Main Street quite easily. eHive has been a fabulous tool for the TRM to publish its collection online: it is user friendly and cost effective. However, future data sets of 100 images from regional collections could also be provided in a simple Excel spreadsheet with JPEGs.

Tips for regional museums and libraries to curate a data set

  1. Use the example Excel document on GitHub.
  2. Fill it in for your collection. You will need 100 entries.
  3. If you don’t have your collection online, prepare a folder of images on your computer and paste the name of each file into the “Image URL” column so The Library can match the two. Use good-quality JPEGs; the preferred size is 1280 pixels on the longest side @ 72 dpi.
  4. Organise your spreadsheet by year. Just use the year in the date column. No specific dates are used in the project, only a range, so we only need to arrange your data by year.
  5. Try to include aerial views of main streets, shops, parades, things happening in main streets and a spread of dates from 1880 to 1950. Make sure the images you use are high quality and cropped neatly. Don’t choose images that are panoramic or in extreme portrait format. The images from the existing collections are mostly in landscape format.
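
For anyone who would rather check their spreadsheet with a script before sending it in, the tips above can be sketched in a few lines of Python. This is an assumption-laden example, not part of the Main Street code: it presumes the spreadsheet has been saved as CSV and has columns named “Date” and “Image URL” (only “Image URL” is named in the tips; “Date” and “Title” are hypothetical).

```python
import csv
import io


def check_data_set(csv_text, expected_rows=100):
    """Return (rows sorted by year, list of problems found)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    problems = []
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(rows)}")
    for number, row in enumerate(rows, start=2):  # row 1 is the header
        date = (row.get("Date") or "").strip()
        if not (date.isdigit() and len(date) == 4):
            problems.append(f"row {number}: Date should be a four-digit year")
        if not (row.get("Image URL") or "").strip():
            problems.append(f"row {number}: Image URL is empty")
    # Four-digit years sort correctly as text; invalid dates end up last.
    rows.sort(key=lambda row: (row.get("Date") or "").strip())
    return rows, problems


# A tiny made-up spreadsheet with one deliberately bad row.
sample = (
    "Title,Date,Image URL\n"
    "Parade,1925,parade.jpg\n"
    "Shops,c. 1900,\n"
    "Aerial view,1890,aerial.jpg\n"
)
rows, problems = check_data_set(sample, expected_rows=3)
print([row["Title"] for row in rows])
print(problems)
```

With a real file, the text could be read with open(path, newline="", encoding="utf-8").read() and checked against the default of 100 rows.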

Lessons learnt
One of the foundation ideas of this project was the use of collection APIs (eHive, SLNSW and TROVE) to manipulate data. All good in theory. In reality, using the data from both the TRM and The Library proved to be tricky. For the result to be both usable and presentable, I had to curate a data set of 100 images from each collection, fixing spelling mistakes, date formats and general errors, and making sure to choose images that showed a range of main street activities and were of good resolution.

It reminded me of cultural heritage technologist Mia Ridge encountering similar problems when playing with the Cooper Hewitt Design Museum collection data back in 2012. What Mia found then, “The quality of collections data has a profound impact of the value of visualisations and mashups. The collections records would be more usable in future visualisations if they were tidied in the source database”, is sadly still true three years later. It’s obviously not an easy problem to solve, and it is one that most collecting institutions face, as discussed by Seb Chan in 2012.

Big data is not for every project, and ultimately I think Main Street benefited from hand-curated data sets, which gave it a better look and feel and a more in-depth contextual experience. I also think the project highlights how important the role of the curator is in digital and web projects. This is not to say APIs don’t have their place as powerful tools for big data projects (see http://dxlab.sl.nsw.gov.au/making-loom/), just that they may always need a curatorial hand to produce meaningful storytelling. I was heartened in my two weeks at the Library to witness the curatorial, cataloguing and even conservation staff participating in brainstorming workshops for the DX Lab, and in general being deeply engaged in what was happening in the digital spaces the Library inhabits.

The other thing I did in my two weeks (most of the project work lay at the feet of the brilliant DX developers) was to complete a short and basic Codecademy course in PHP. I may never be called on to code anything, but I think it’s now almost mandatory for curators to understand how coding works. I also think it is timely to remember that the next generation of curators will hit the ground running, bringing coding skills to the table as a baseline skill set.

The code and documentation for Main Street are available on GitHub. If you use them, we would love to hear about it.

This project is supported by Arts NSW’s Mentorship, Fellowship and Volunteer Placement Program; a devolved funding program administered by Museums and Galleries NSW on behalf of the NSW Government.