Editor’s Note: Over the past two years we have been fortunate to have had the assistance of twelve students working on various aspects of the project. They have come not only from the UK but also from Switzerland, America, the Netherlands, Hong Kong, Korea and China. They have all made invaluable contributions and enabled us to do far more than we originally thought we could achieve. Their work appears in various forms on the Reconstructing Sloane website, especially in the bibliography, here in the blog, and also in various aids you will find in the part of the site that contains the searchable transcriptions of Sloane’s catalogues. This blog is about one student’s experience creating an aid that enables researchers to reconstruct the contents of Sloane’s cabinets.
Hello! I am Xinyun Liu, a postgraduate student doing MA Digital Humanities at UCL. It was a great honour for me to work as a research assistant in the “Enlightenment Architectures: Sir Hans Sloane’s Catalogues of His Collections” project in the British Museum in the spring of 2019.
On my placement, my primary responsibility was to design a common XSL stylesheet for the TEI-XML documents of Sir Hans Sloane’s 300-year-old manuscript catalogues of his collections, under the supervision of the Principal Investigator, Dr Kim Sloan, and with the assistance of Dr Victoria Pickering, a post-doctoral researcher on the project. The emphasis of the work was on extracting the so-called “cabinet numbers”, which were written in pencil in front of the catalogue number of each entry. The purpose of my work was to support the construction of the website by enabling the technical team to offer site visitors and research scholars a new way to look up the collections by cabinet number.
What motivated me most was the significance of the research. By taking advantage of digital humanities methods, it helps us to understand the intellectual structures of the catalogues and to learn how Sir Hans Sloane and his secretaries arranged the curiosities in the eighteenth century, including their varied modes of organising the treasures. Sloane had a wealth of collections including books, manuscripts, prints and paintings, natural history specimens, antiques, costumes and other rare artefacts. As he collected them, Sloane labelled and described these items in more than 40 volumes of manuscript catalogues. The manuscripts also contain related information about the objects: not only descriptions but also the names of related persons, place names, location codes, notes, references and numbers. I was delighted to put what I had learned in the XML module of my MA course into practice and to see the potential of the lists as historical information useful for recreating ideas from the past. Kim then introduced her ideal blueprint for the website, which would enable interactive search by cabinet number, and we had several discussions to clarify what was and was not feasible.
Designing a common XSL file for the markup documents was not easy: although the XML documents of the manuscript catalogues shared the same structure, they differed in the position and logic of certain elements and their values. I therefore had to consider every case that could arise when extracting values from the documents, and to find the best way to sort the cabinet numbers, deleted numbers, catalogue numbers and their descriptions so that researchers could clearly spot a specific cabinet number and what it contained.
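As a rough illustration of the sorting and grouping described above, the logic might be sketched in Python as follows. This is not the project's stylesheet, which was written in XSL. The field names, the sample rows and the helper `group_by_cabinet` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows, as they might look after extraction from one catalogue.
ENTRIES = [
    {"cabinet": 12, "deleted": None, "catalogue": "101", "description": "A brass seal"},
    {"cabinet": 7, "deleted": "3", "catalogue": "103", "description": "A small shell"},
    {"cabinet": 12, "deleted": None, "catalogue": "102", "description": "An ivory comb"},
    {"cabinet": None, "deleted": None, "catalogue": "104", "description": "A medal"},
]


def group_by_cabinet(entries):
    """Return (cabinet, entries) pairs, numbered cabinets first in numeric order."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["cabinet"]].append(entry)
    # Entries with no cabinet number (None) are placed after all numbered cabinets.
    ordered = sorted(groups, key=lambda k: (k is None, k or 0))
    # Within a cabinet, order the entries by catalogue number.
    return [
        (cabinet, sorted(groups[cabinet], key=lambda e: int(e["catalogue"])))
        for cabinet in ordered
    ]


if __name__ == "__main__":
    for cabinet, rows in group_by_cabinet(ENTRIES):
        print(cabinet, [row["catalogue"] for row in rows])
```

Keeping entries without a cabinet number in their own group at the end, instead of dropping them, means nothing from the catalogue silently disappears from the resulting table.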
The TEI guidelines define a semantic format for exchanging text, and the XML documents I worked with followed the TEI guidelines strictly. However, when I tried to get the text in the <catnum> element I could not capture it because the element was in a defined namespace. To solve the problem, Victoria, one of the research assistants, suggested that we replace <catnum> and its namespace with a new element called <ab>. In addition, some of the added handwritten content had prefixes and suffixes such as letters and symbols. This produced a large number of confusing strings that scattered results sharing the same numeric value, making the content difficult to sort and search. We added <num> elements inside <add rend="pencil"> to specify which number or string we wanted, using regular expressions. Though there were still some stray results, the final lists were good material for research.
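The two problems described above, a namespaced element that an unqualified path cannot find and pencil additions carrying stray letters and symbols, can be shown in a small Python sketch. This is an assumption-laden illustration, not the approach the project took (which renamed the element and added <num> markup). The `sl` namespace URI, the `<list>`/`<item>` entry layout and the sample values are all hypothetical; only the TEI namespace URI is the real one.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: the "sl" namespace and the entry layout are assumptions.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:sl="http://example.org/ns/sloane">
  <text><body><list>
    <item><add rend="pencil">C.12*</add><sl:catnum>101</sl:catnum> A brass seal</item>
    <item><add rend="pencil">12</add><sl:catnum>102</sl:catnum> An ivory comb</item>
    <item><add rend="pencil">b7.</add><sl:catnum>103</sl:catnum> A small shell</item>
  </list></body></text>
</TEI>"""

NS = {"tei": "http://www.tei-c.org/ns/1.0", "sl": "http://example.org/ns/sloane"}
NUMERIC = re.compile(r"\d+")  # first run of digits, ignoring prefixes and suffixes


def extract_entries(xml_text):
    """Return one dict per entry with a normalised numeric cabinet number."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iterfind(".//tei:item", NS):
        pencil = item.find("tei:add[@rend='pencil']", NS)
        catnum = item.find("sl:catnum", NS)
        raw = pencil.text if pencil is not None and pencil.text else ""
        match = NUMERIC.search(raw)
        rows.append({
            "cabinet": int(match.group()) if match else None,
            "cabinet_raw": raw,  # keep the original string for checking
            "catalogue": catnum.text if catnum is not None else None,
            # In this layout the description is the text following <sl:catnum>.
            "description": (catnum.tail or "").strip() if catnum is not None else "",
        })
    return rows


if __name__ == "__main__":
    # An unqualified path finds nothing, because every element is namespaced.
    print(ET.fromstring(SAMPLE).findall(".//catnum"))
    for row in extract_entries(SAMPLE):
        print(row)
```

Keeping the raw pencil string alongside the normalised number makes it possible to check any unexpected match against the manuscript image.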
At first, I designed an XSL file for the markup document of the Miscellanea catalogue, which listed the cabinet number, deleted number, catalogue number and description of each entry. After a number of revisions and improvements, I began to deal with the other catalogues such as Antiquities, Seals, Pictures, Mathematical Instruments, Fossils and so on. The heavy workload required patience and rigour, and I devoted myself to modifying the XSL file and XML copies and comparing them with high-resolution digital reproductions of the original manuscript pages to ensure that no detail was missing.
The work placement benefited me a great deal. First, it was a test of my all-round ability. To complete the work well, I needed practical skills and a positive attitude when failures and errors occurred. It also strengthened my sense of responsibility and my perseverance. It was my first internship in a British organisation, and as a research assistant I needed to adjust to the work environment and express myself clearly so that people could understand me and we could solve the problems we encountered together. Secondly, I realised the importance of accumulating knowledge. We read materials provided by our supervisors at the beginning of the placement because the basic knowledge we had learnt in the XML module was not enough: the practical work required a deeper familiarity with the TEI guidelines and stronger XPath skills. I also referred to many tutorials and answers on websites such as Stack Overflow. Finally, I improved my initiative. When the work was almost finished, I combined the separate modified XML files for each category, which I had copied from the original manuscript XML file, into a single document. Moreover, for the convenience of investigators and scholars who may not be familiar with TEI-XML and XSL, I offered to generate the results for the different catalogues as HTML pages via Oxygen, so that they can share and browse the tables directly without downloading and reading the less accessible XML and XSL documents. The results were also converted into Excel sheets for future research, to take advantage of Excel’s powerful filter functions. Recognising the importance of what the curators, research professionals and our assistants have done makes me feel quite proud. I will strive to improve my performance, continue to keep an eye on ‘Enlightenment Architectures’ and make further contributions to the project.