History

The inception of Apatosaurus goes back to the first year of my PhD (2019). I was committed to using the new computer tools that had been developed specifically for transcribing, collating, and analyzing manuscripts. These tools include:

- The Online Transcription Editor
- The Collation Editor
- The CBGM

I soon discovered, however, that these tools were not natively interoperable with each other despite their integration in platforms managed by INTF and ITSEE. The output of the Online Transcription Editor (TEI XML), for example, is in an entirely different format and structure than what the Collation Editor requires as input (JSON). I wanted to use these as standalone tools, but their maintainers intend them to be used in an integrated environment. Still, I am deeply appreciative that the tools are open source.

The only reliable way for me to use these tools, I learned, was to learn enough programming to create "glue" applications myself. Thus began my self-taught software development path. Since then, I have created many little tools that convert the output of one program into the input required by another. I have collected the most helpful of them into a single desktop Python application, Criticus. Most of the modules included with Criticus help move users from transcription to collation. Apatosaurus helps move users from collation to analysis, especially analysis involving the CBGM. But Apatosaurus has become more than "glue." It is a virtual research environment in its own right, and additional features are being actively developed.

The earliest iterations of what would become Apatosaurus were called the "Apparatus Explorer." One of my first GUI applications was a Python and Tkinter program, which I later converted to a slightly more visually appealing Python and Qt version. These desktop versions were mainly for visualizing a TEI apparatus and adding edges to local stemmata. As any developer will recognize, they were very difficult to distribute, and almost as difficult to style correctly on both Windows and Mac. I now know how to make desktop apps play nicely cross-platform, but I decided that this app would work better and be more accessible as a web application instead.

The first web application version of Apparatus Explorer was a bad idea. At the time, I had become proficient in Python, but to build a web app I needed to learn JavaScript. Or did I? I learned about a project called Brython, a Python interpreter written in JavaScript. Yes, you can see where this is going. I ended up writing a SPA entirely in Python and tacked it onto my personal website. This worked for a while, but it often failed to load on iOS. I also had a growing unease about the large (3 MB) JavaScript library that had to be downloaded for Apparatus Explorer to work. I think Brython is pretty cool, and its creator is keeping it up to date with the newest Python syntax. But after learning JavaScript, I was ready to use best practices for a public-facing web app.

Version 2.0 of Apparatus Explorer is still live (and will be until all of its users have migrated to Apatosaurus). In this version I kept the Python code on the backend, and only ran JavaScript, CSS, and HTML on the front end. Each version of Apparatus Explorer has had a fatal flaw. The fatal flaw in this version is that the user collation data was stored in TEI XML files. That is, users uploaded a file, and it was stored on AWS S3. The data was safe, but to edit a collation, Apparatus Explorer has to retrieve the entire XML file from S3, parse it, find the element to edit, serialize the entire file, and then upload the new version to S3. This has actually worked surprisingly well, but it will struggle with very large collation files because it loads the entire file into memory. I could optimize this. But I decided the better idea was not to use XML files on disk (or in object storage) as the data store.
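To make the cost of that round trip concrete, here is a minimal sketch of the download, parse, edit, serialize, upload cycle. It is an illustration only, not the actual Apparatus Explorer code: a plain dictionary stands in for the S3 bucket, the function name `edit_reading` is hypothetical, the sample collation is made up, and identifying a variation unit by the `n` attribute on `<app>` is an assumption.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace

# Made-up sample collation: one verse, one variation unit, two readings.
SAMPLE = b"""<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<ab n="verse-1">
<app n="u2"><rdg n="a" wit="01 03">first reading</rdg><rdg n="b" wit="02">second reading</rdg></app>
</ab></body></text></TEI>"""


def edit_reading(bucket, key, app_n, witness, new_text):
    """Change the text of one reading by round-tripping the whole file.

    `bucket` is a dict standing in for object storage (hypothetical).
    Returns True if a matching reading was found and updated.
    """
    raw = bucket[key]                            # 1. download the entire file
    root = ET.fromstring(raw)                    # 2. parse all of it into memory
    for app in root.iter(f"{{{TEI_NS}}}app"):    # 3. scan for the element to edit
        if app.get("n") != app_n:
            continue
        for rdg in app.iter(f"{{{TEI_NS}}}rdg"):
            if witness in rdg.get("wit", "").split():
                rdg.text = new_text
                # 4. serialize the entire tree, 5. upload the new version
                bucket[key] = ET.tostring(root, encoding="utf-8")
                return True
    return False


# Usage: a one-word change still rewrites the whole stored object.
fake_bucket = {"collation.xml": SAMPLE}
edit_reading(fake_bucket, "collation.xml", "u2", "02", "corrected reading")
```

Every step touches the whole file, so both memory use and transfer time grow with the size of the collation rather than with the size of the edit.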

Here we come to Apatosaurus. The reason for the total rewrite was to store collation data in a more useful and performant way. Users still upload TEI collation files, but instead of storing and editing the actual file, Apatosaurus parses the file, creates database objects for each of the key elements, and relates them accordingly. And since I was rewriting the entire application, I decided to make every part of the collation editable. Once every part was editable, it became possible to create a collation without uploading a TEI file in the first place. Finally, another motivation was my desire to provide a friendly user interface for the open-cbgm, which is otherwise available only as a command-line application.
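The general idea of parsing an uploaded file into related database records can be sketched as follows. This is a simplified illustration, not the Apatosaurus data model: the table and column names, the `import_collation` function, and the sample collation are all hypothetical, and an in-memory SQLite database stands in for whatever database the application really uses.

```python
import sqlite3
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Made-up sample collation: one verse, one variation unit, two readings.
SAMPLE = b"""<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<ab n="verse-1">
<app n="u2"><rdg n="a" wit="01 03">first reading</rdg><rdg n="b" wit="02">second reading</rdg></app>
</ab></body></text></TEI>"""

# Hypothetical schema: verse -> variation unit -> reading -> witnesses.
SCHEMA = """
CREATE TABLE ab (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE app (id INTEGER PRIMARY KEY, ab_id INTEGER REFERENCES ab(id), unit TEXT);
CREATE TABLE rdg (id INTEGER PRIMARY KEY, app_id INTEGER REFERENCES app(id),
                  label TEXT, text TEXT);
CREATE TABLE rdg_wit (rdg_id INTEGER REFERENCES rdg(id), siglum TEXT);
"""


def import_collation(conn, xml_bytes):
    """Parse a TEI collation once and store each key element as a row."""
    root = ET.fromstring(xml_bytes)
    cur = conn.cursor()
    for ab in root.iter(TEI + "ab"):
        cur.execute("INSERT INTO ab (name) VALUES (?)", (ab.get("n"),))
        ab_id = cur.lastrowid
        for app in ab.iter(TEI + "app"):
            cur.execute("INSERT INTO app (ab_id, unit) VALUES (?, ?)",
                        (ab_id, app.get("n")))
            app_id = cur.lastrowid
            for rdg in app.findall(TEI + "rdg"):
                cur.execute(
                    "INSERT INTO rdg (app_id, label, text) VALUES (?, ?, ?)",
                    (app_id, rdg.get("n"), rdg.text or ""))
                rdg_id = cur.lastrowid
                # One row per witness siglum listed in the wit attribute.
                cur.executemany(
                    "INSERT INTO rdg_wit (rdg_id, siglum) VALUES (?, ?)",
                    [(rdg_id, w) for w in rdg.get("wit", "").split()])
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
import_collation(conn, SAMPLE)

# Editing one reading is now a single-row update, with no file round trip.
conn.execute(
    "UPDATE rdg SET text = ? WHERE label = ? "
    "AND app_id = (SELECT id FROM app WHERE unit = ?)",
    ("corrected reading", "b", "u2"))
conn.commit()
```

With the data stored this way, the file is parsed only once at upload time, and each later edit touches just the rows involved. It also means rows can be created directly, which is what makes a collation without any uploaded file possible.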