Archive for July, 2009

Software development in Digital Humanities

One of the topics that greatly interested me from THATCamp 2009 (which really wasn’t addressed at Digital Humanities 2009) was software development/process of digital humanities projects. I’m interested in questions of workflow and task distribution—what does the team look like, and what does it actually do on a day to day basis? I had a lot of really great conversations with people that helped me clarify my thinking, and a lot of great book recommendations (most of which I can’t seem to find on the shelves, even though they say they are checked in. sigh).

Two sessions in particular at THATCamp were of great interest to me: “Picking a platform(s)” and “Software Development.” I also attended an interesting session on Drupal which, though a bit over my head, intrigued me.

Picking a Platform

A few points jumped out at me in this session. One is that it is possible to overanalyze when picking a software solution, and often the best approach is to pick two or three likely candidates (the good ones tend to jump out/get recommended a lot) and work out a rough prototype in each of them. What strikes me about this is that we never work this part of the development into the time estimates for a project (when we actually manage to estimate time on a project, which is rare). Another point, rephrased many different times, is to avoid building something new whenever possible, and to look for stable, well-supported projects to build on.

Another part of the discussion involved frameworks. If no CMS like WordPress or Drupal fits your project, a framework is the next best choice. Omeka is built on the Zend framework, and CHNM also tried CakePHP and CodeIgniter (and maybe something else, I was having a hard time keeping up at this point). At the CDRH, we recently used CodeIgniter for a project, which I liked because it was lightweight and the documentation seemed friendlier for non-programmers (i.e. me) who have to use the framework. It worked pretty well for the project we used it on, and we will likely try out other frameworks in the future. We’re also looking into Drupal and WordPress to build certain sites, but I am still unclear on how to integrate the large number of TEI documents we have into these products (if anyone has any advice, I’d love to hear it). Many of our past projects have been built using Cocoon, and we will likely keep using this for the straightforward TEI sites.

Near the end of the session, I asked what people use to do the kinds of things I do most frequently—i.e. transform TEI documents into a website. DEAD SILENCE. This was a little surprising, but not completely so, since most of the projects talked about were community-building apps, not content-driven sites. It was recommended I look into code4lib as well as the eXist database. This has led me to question the difference between digital library/etext stuff and digital humanities: Is it the focus on content? Is it the tools used? Is there even a reason for a distinction? Many of our projects at the Center could be classified more as digital libraries, and I am starting to wonder if clarifying the type of project in this manner might help us with our workflow a bit.
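For readers who have not done it, here is a minimal sketch of what "transforming TEI documents into a website" involves at its simplest. It is only an illustration, not how the CDRH or anyone at the session does it: Cocoon-based sites like the ones mentioned above normally do this work with XSLT stylesheets, and Python is swapped in here purely for readability. The sample document, the `tei_to_html` function name, and the choice to handle only the title and plain `<p>` elements are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# The TEI P5 namespace, written in ElementTree's "{uri}tag" form.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# A tiny made-up TEI document, used only for illustration.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>A Sample Letter</title></titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>First paragraph &amp; greeting.</p>
      <p>Second paragraph.</p>
    </body>
  </text>
</TEI>"""


def tei_to_html(tei_xml: str) -> str:
    """Turn a very simple TEI document into a bare-bones HTML page.

    Hypothetical helper: it reads only the title and the body paragraphs,
    and flattens any inline markup inside a paragraph to plain text.
    """
    root = ET.fromstring(tei_xml)

    # Pull the title from the header, with a fallback if it is missing.
    title_el = root.find(f".//{TEI_NS}titleStmt/{TEI_NS}title")
    title = "Untitled"
    if title_el is not None:
        title = " ".join("".join(title_el.itertext()).split())

    # Collect paragraphs from the body only, so header prose is skipped.
    paragraphs = []
    body = root.find(f".//{TEI_NS}body")
    if body is not None:
        for p in body.iter(f"{TEI_NS}p"):
            text = " ".join("".join(p.itertext()).split())
            paragraphs.append(f"<p>{escape(text)}</p>")

    lines = [
        f"<html><head><title>{escape(title)}</title></head><body>",
        f"<h1>{escape(title)}</h1>",
        *paragraphs,
        "</body></html>",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(tei_to_html(SAMPLE_TEI))
```

A real project has far more to deal with (notes, page breaks, names, figures, search), which is exactly why tools built for the job, such as XSLT or an XML database like eXist, tend to get recommended over hand-rolled scripts.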

CHNM Creative Lead Jeremy Boggs remarked how the hiring of a graphic designer and his own studies in graphic design have changed CHNM’s approach to design dramatically. I found this interesting in light of other conversations I had at THATCamp regarding design—I’ll be talking about this a bit more in future blog posts.

A final point from this session is the sentiment, expressed frequently, that it would be nice to have some central place where we can share this kind of information and ask questions.

Software Development

I was really interested in this session, because at CDRH we are still trying to figure out how to… well, create software. We have very few processes in place, so it’s up to the development “team” (mostly consisting of me, a programmer, and a text encoder) to determine tools, put them into use, and train anyone else that needs to be trained. On the one hand, this is kind of nice—we get to choose what works for us. On the other hand, it is a little overwhelming, especially since none of us really have software development experience to speak of. As a result, I have become really interested in the software development process, especially as it relates to digital humanities.

Several concepts were brought up as essential to any project team of any size: bug tracking software, version control software, and, maybe, communication software. Right now we have a wiki and subversion, and are working on the bug tracking and other communication helpers. Sure, we can yell over the cubicle walls at each other, but at some point, we need a way to track all the broken stuff (especially as we keep finding more each day).

The group also talked about making more of our code open source, even the code we don’t think is very good, because others may be able to improve on it, or at the very least, learn from it. We need to get away from the “We’ll release it when it is finished” mentality and just get it out there. Another idea was to stop making single-use tools and to focus on broader things that can be reused. I like this idea in the abstract, but when I really start to think about how it applies to us, I see so many exceptions—little projects that will only happen once, special cases having to do with an esoteric area of inquiry from a scholar. I think part of what makes digital humanities interesting is that we take on the stuff that doesn’t have a broader appeal, and therefore in any other context just wouldn’t get done. However, many of these one-time projects might be of use to someone else, so the code should still be available.

What I’d like to see is a balance. Every digital humanities center or project is likely to have some code specific to their own project, but there’s also likely to be something which can be given back to enrich the community.  Both are important.

A common complaint in general at THATCamp, and especially in the Software Development session, was that documentation was always lacking. This seems to be a universal truth in software development, not just DH. One idea someone brought up was to pitch the documentation to a technical writing class for a class project. Another was to have a “documentation sprint” in the spirit of the code sprint.

One thing that was not brought up, but that I have thought a lot about since reading Joel on Software (in book form), is the idea of the spec. We have never written a spec for any of our projects (that I know of), but I think it would be a useful exercise. You can read Joel’s series on specs here in 4 parts: I II III IV. I especially like the idea that a detailed spec, done right, can serve as a basis both for testing the end product and as a start to the documentation. What we have in place of a spec is a mess of meeting minutes—sometimes a year’s worth of biweekly meetings—which would be nearly impossible to go through to nail down all the decisions that have been made. Instead, the idea of a website is in one person’s, or several people’s, heads, which makes it pretty hard to sit down and build.

To be continued

I’ve just scratched the surface on the topic of software development. To an experienced software developer, most of these ideas will be old hat, but to me, most of it is completely new. I think this is one of the sometimes frustrating things about working in a digital humanities center—we don’t hire with the thought of creating software projects, even though that is much of what we do. And much of the apparatus for website development I am used to from working in an ad agency (such as an art director) isn’t there either. It’s really no one’s job to tell us how to work as a team effectively.

On the other hand, this is what I absolutely love about my job. I get to be the art director, designer, coder, tester and researcher, which I find much more interesting than just design or just coding. For me, it’s really ideal, and I would not change a thing.

Digital Humanities and THATCamp 2009

So I have an overdue post from DH09 and THATCamp09. Maybe I should first explain what those are.

Digital Humanities 2009 Conference

Digital Humanities is the web conference for those involved in (wait for it…) digital humanities. This was the first academic conference I’d been to, adhering mostly to 1.5 hour sessions with three papers each, with people generally reading a paper with bullet point slides in the background. This is in contrast to the library conference I’d been to, which generally had less paper reading (though the bullet points, unfortunately, seem to be a staple everywhere).

Many of the sessions I attended had to do with data visualization and tool discussions. In both of these types of presentations, I found myself wishing that the presenters would start with the demo and then go on to talk about it, especially as many weren’t available on the web. Some of the groupings didn’t really make sense – that is, two of the talks would have a lot to do with each other and the other didn’t really.

The first two days I was at DH I felt very out of place, more than I felt at my first ALA, but in that case I was staying with a close friend who is also a librarian, so that helped. I felt acutely my lack of an overarching “research interest.” Also, I didn’t know anyone in person, and though I knew a few people from Twitter, I always saw them while they were talking to someone else, so I didn’t introduce myself. In hindsight, this was probably my biggest mistake.

By day three, I felt more at ease, and this was also the most interesting day of presentations for me, so that helped. I was finally starting to get the hang of things when DH ended, but I did introduce myself to several people the last day.

I don’t know for sure if I will attend future DH conferences. My position does not get any travel funding (even if I were to present, and I’m not at all sure what I would present on anyway) and this was the last year I could get student pricing.

THATCamp (The Humanities and Technology Camp)

THATCamp Twitter Word Cloud (photo by ghbrett)

At the opposite end of the cost spectrum was THATCamp, which is donation only. I attended THATCamp last year, so I knew a little more what to expect, and I felt more at ease right away. I’m not sure exactly why that is, but I think it has to do with the informality and the mix of people. It’s not that I don’t like hanging out with academics, but my job just doesn’t allow the time to think about the academic-y research questions, and THATCamp addressed some of the more, shall we say, down to earth aspects of digital humanities such as: how can we make this all work? While DH seemed attended by the scholars who told someone what to do, THATCamp seemed to be attended by more of the techies (I don’t consider that term a put down, BTW) themselves, and scholars who took more active roles in the development of their projects. Again, maybe my impressions are completely off, because they have everything to do with the sessions I attended.

THATCamp, for those who don’t know, is (mostly) an unconference, though with a little more structure than many unconferences. The structure comes in the form of a blog, where people can post their ideas ahead of time and others can comment. The first day of camp we signed up for sessions, and the organizers grouped these logically. I like this idea, but in practice a bunch of people posted on the blog the last day, when I didn’t have time to read them all before the camp started, and stuff was grouped together that maybe should have been separate. I think either a suggested deadline for blog posts 24 hours before camp might be good, or some of Daniel Chudnov’s great suggestions for improving on THATCamp.

Oh, that's me on the left (photo by ghbrett)

Last year when I attended THATCamp, I did not know how to program, was still getting my Master’s degree in library science, and was a lowly assistant at the CDRH. So while I found everything very interesting, I had a hard time putting things in context, and I really couldn’t implement anything I learned, as my job was centered around setting meetings and taking minutes. Still, I felt like a part of the crowd and accepted even then, and this was even more true (if possible) this year. Now, I am a graduate with a degree in library science, and working as one of the developers (visual resources designer) at the CDRH. I have also started down the programming road thanks to Steve Ramsay’s class, which I took last school year. This meant that this year’s THATCamp filled me with ideas I could actually implement, which is a Very Good Thing. I hope more of my co-workers can attend in the future.

THATCamp was notable for being my first conference where it seemed like almost everyone was on Twitter. Tweets came fast and furious, and at some point I couldn’t keep up anymore. But it was a great way to make connections between sessions. Since THATCamp ended, the “#thatcamp” tag has been used to continue discussions started at camp, and to start discussions about regional THATCamps, an idea that just may be the best thing to come out of THATCamp. Travel money is unlikely to get any looser in the next few years, and regional unconferences are a great way to get together at a minimal cost and share ideas. Hopefully more centers and universities will be able to send their staff to regional get-togethers if nothing else. (More on that, probably, in a later post.)

Both DH and THATCamp were enormously beneficial, and I am glad I went. I am a little sad to miss out on ALA this year, but buying a new house (before selling the old one) has limited my funds somewhat. Maybe next year. In the meantime, both conferences have given me a lot to think about and (hopefully) blog about.