Day in the life post 1

Syntax Error -  Folded Up Beyond All Recognition

Photo by Simon Pow

I have been quiet on my blogs lately. I think I just needed a bit of a break, not to mention that school and my practicum are taking more time than I thought they would. I am back now to participate in the “A Day in the Life of a Library” meme, started by Bobbi L. Newman at Librarian by Day. You can see a list of other “Day in the Life” posts on the wiki (or sign up to participate yourself!).

Today was a little unusual, because I took a vacation day to go to my practicum at the Nebraska Library Commission. For those who don’t know, a practicum is a school requirement where I get to pay tuition to work for 90 hours. Exciting, huh? Usually, a practicum is used to get a sampling of what it might be like to work various jobs in a library. In my case, I was approached to help build a new website for the Nebraska Library Commission to supplement their existing site, and it was too good an experience to pass up. I’m running a little bit behind on hours for the practicum (it ends August 1st), which is why I am taking vacation days to finish it up.

It was actually really, really wonderful to be able to devote a full 8 hours to a project like this. My usual day (as you’ll see later this week) is a little bit of everything, with a LOT of interruptions. While it makes the day go quickly, it can also make it hard to get things done. Today, on the other hand, I had the luxury of working almost nonstop on one project. So today’s post will be somewhat short. Except I have rambled quite a bit already. Oops.

I got to the Commission a little before 8. My first order of business was to talk to Michael Sauers. I had a favor to ask him having to do with the Other Job, and I wanted to touch base about the presentation we’ll be giving together this fall.

After that, I picked up where I left off last night on my website design. I have finished working out most of the design part, and am down to the nitty gritty of the CSS- not exactly my favorite thing to do. This is where having a lot of time to work on something comes in handy, because I can’t squash all those CSS cross browser bugs very quickly. Around 9:30 I checked the email account for the Other Job, responded to a couple of things, checked my personal email, checked Twitter, and started work on the CSS again.

I made a lot of progress today, which felt great. My next step in my practicum is to show my design proposals to the web committee, and hope they like them. I also have been working on a few suggestions for them. I am trying to leave them with well commented code and some Photoshop files they can use to make their own custom attractive graphics.

So that’s it for today. There will probably be some more variety in tomorrow’s post, but this being summer I can’t guarantee anything. :)

Posted in Uncategorized | 1 Comment

ALA Annual update

Colors of San Pedro by my hovercraft is full of eels

The last few weeks have been a bit of a blur. Various house issues, preparing for vacation and ALA Annual, work, school, and life have been keeping me very busy. All my poor blogs are neglected. :(

I’m not going to post an ALA schedule yet, because I learned last year that it will just change anyway as ALA draws closer. I will probably post a few tentative plans next week, and will hopefully blog some sessions. Of course I will go to the sessions Cory Doctorow is at.

As for social activities, I will go to the Scholarship Bash Saturday night, and then some of us are trekking to San Pedro for the Rocky Horror Picture Show. I will most likely go to the Blog Salon and the NMRT Social (I’m sad they’re not in the same hotel this year).

I’m heading for vacation in California before Annual- so if you are there beforehand too and want to do something, email me (karin@nirak.net.) If you want my cell phone # to contact me during Annual, just email me.

I will likely be posting vacation related stuff to my blog at os-agnostic, so check there if you want to read any of that. I’m also going to bring a painting to LA so I can give it away during my trip – hopefully to someone at the conference. It worked well at THATCamp.

I think that’s it. If you’re going to Annual, I’ll see you there, and if not, I hope I don’t annoy you too much with my conference postings and tweets. :)

Posted in Conferences, Library | 1 Comment

THAT Camp, Day 1

Finally back for good in my hotel after day 1 of THAT Camp. I am exhausted and energized at the same time. The organizers have brought together an absolutely amazing group of people, and I am humbled by the sheer brilliance present. I’m going to do a quick overview, but many of the topics discussed will show up in my blog for weeks to come.

First, though: the DC area is becoming a favorite destination of mine, even though I have only been here twice now. I spent 5 hours yesterday in the National Gallery of Art, and was, of course, awed the entire time. (The only annoying part was listening to people say ‘why is that art? I could do that!’ over and over.) Fairfax is lovely, despite the occasional disappearing sidewalks (it seems people don’t walk long distances here very often?).

THAT Camp began with a great breakfast and a whole group meeting where we planned out the schedule for the day. Participants posted their presenting ideas to the blog for a couple of weeks leading up to the unconference, so the task was a bit easier.

Session 1 – Art

The first session was on art, specifically digital art. There were only two others besides me: David Rieder and Susan Harum. We had a great discussion of what digital art might look like and how it might be supported. David and Susan had many, many great links to share, and it was great to hear how other campuses are dealing with the emergence of digital art. I’d love to see more about this topic.

Lunch!

A fantastic lunch was accompanied by Dork Shorts- brief talks on technology topics. Presenters had 5 minutes to show off their site or idea. More link goodness, although some of the sites were in production and not yet available to the public.

Session 2 – Alternative search

I started the session with a brief slide show that addressed some of the points I’ve made in my recent alternative search postings.

After that, I left it up to the group to talk about what we could do to make search better. I was thrilled that the group contained a number of people with much more experience with search than me, and we talked about technologies, what the users want, and how to make search better. Josh Greenburg brought up the excellent point that some of what we think of as search problems are really user interface problems, so I am looking forward to attending the interface design session tomorrow.

One of the developers of Blacklight, an open source OPAC enhancement, was there (Bess Sadler), and the work they have done is absolutely amazing. I particularly liked her ideas for allowing departments to customize search for different disciplines through an easy-to-use GUI. There were a lot of other great links mentioned, which, unfortunately, I lost because of an errant keystroke.

Session 3 – Making things

Bill Turkel led two sessions on the Arduino; I attended the second. I managed to make a light blink and alter a few programs, but what I am really excited about is getting an Arduino. I have never done anything with physical computing or electronics before, so it was a steep learning curve for me. I am the proud new owner of an Arduino, though, and I have several ideas for projects I can’t wait to get started on.

Session 4 – Creative Commons/Copyright

I sort of led this session, too, though I felt a bit like an impostor because I am by no means an expert on copyright. I started with a discussion on Creative Commons, talked about why I use it, and what some of the advantages and disadvantages are. The group talked about some of the copyright issues they have had, and we tried to brainstorm some ways to get around them. I wish I had more answers for the frustrating issue of copyright. I believe in intellectual property, but also share the belief of many that the copyright system as it stands is as much of a hindrance as a help.

One of the frustrations the group expressed was the tendency of institutions to hold back higher resolution images from the web, opting instead to only allow very low resolution images to try and make money by selling higher resolution images. One solid idea we came up with is to try and collect studies that analyze the cost vs benefits of doing this and compile a list of advantages of making higher resolution images available and free to use. I’m going to work on this – I’m wondering if I can make it into an independent study project for school.

Andrea Ferguson talked a little bit about her experiences getting her MFA at the University of South Florida, and I came away much more optimistic about Fine Art in Academia. I have been afraid that digital art was stifled in many places, but many conversations have now led me to believe that that just isn’t so. Makes me want to go for an MFA even more.

Recap and dinner

At the end, the group met again and Josh Greenburg made a few final remarks. Then many of us went to dinner at Minerva, a fantastic Indian restaurant here in Fairfax. The dinner and the conversation were excellent.

I look forward to another great day tomorrow, though my brain feels about full already. I have a beautiful morning walk to CHNM to look forward to, during which I can clear my thoughts.

Posted in Conferences, Library, Work | 1 Comment

Alternative search, part 2: Analyzing a document

Analyzing the metadata in a document is a fairly straightforward process. However, analyzing the document itself is a little messier.

Document analysis is nothing new- people have been programming computers to refine full text analysis since the days when full text first started to appear. As hard drive space became cheaper and computers became more powerful, new kinds of documents began to be stored and new ways to analyze them were developed. In Information Storage and Retrieval, Korfhage mentions several of these methods, but things have evolved quite a bit since then. Besides the methods mentioned below, specialized retrieval systems such as face recognition used in law enforcement have been developed, but I will focus on a few technologies available to the general public.

Text Analysis

Text analysis in documents has come a long way since the inclusion of the first full text documents in databases. Search engines have become quite good at parsing the full text of web pages, as well as using hypertext and other measures to determine what a page is about. With the advent of more and more full electronic text, scholars have started to study ways to use text analysis on literary works. One such project is the Mellon-funded MONK project. Sites are starting to work text analysis into their search and browsing features as well.

The Willa Cather Archive offers a feature to perform in depth text analysis on all of Cather’s books using a program called TokenX. This process is different than simply searching for a term because you can do new things, such as compare the use of words across books and contextualize the words for the user. These kinds of analyses allow scholars new ways to analyze literature.

Cather Archive text analysis powered by TokenX, search results view.

Cather Archive text analysis powered by TokenX, words in context view.
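
To make the two ideas above concrete (comparing a word's use across books, and showing words in context), here is a minimal sketch in Python. It is not how TokenX works; the function names and the sample sentences are made up for illustration, and a real tool would handle punctuation, spelling variants, and much larger texts more carefully.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def keyword_in_context(text, keyword, window=3):
    """Return each occurrence of keyword with `window` words on either side."""
    words = tokenize(text)
    hits = []
    for i, word in enumerate(words):
        if word == keyword:
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            hits.append((left, word, right))
    return hits

def compare_usage(books, keyword):
    """Count how often keyword appears in each book (title -> count)."""
    return {title: Counter(tokenize(text))[keyword] for title, text in books.items()}

# Made-up sample sentences, not quotations from Cather's books.
books = {
    "Book A": "The prairie was wide. The prairie grass moved in the wind.",
    "Book B": "She left the city and never saw the prairie again.",
}

print(compare_usage(books, "prairie"))
for left, word, right in keyword_in_context(books["Book A"], "prairie", window=2):
    print(f"{left} [{word}] {right}")
```

The counts answer "how often does each book use this word?", while the context lines let a reader judge how the word is being used, which a plain search hit does not show.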

Another now-common way to analyze documents is to create a word cloud of common words. Word clouds are commonly made up of user entered metadata such as tags, and are less commonly used with entire documents. The reason is fairly obvious when one sees such a cloud- words like “a,” “an,” “the,” and “that” end up being the largest words in the cloud because they are the most common. However, a word cloud can be a useful way to browse even full text documents. This can be achieved by carefully filtering out words that do not add meaning to the cloud. The website “The Mountain Meadows Massacre in public discourse” does this in one of its visualizations, offering a view of common words used in articles about the Mountain Meadows Massacre. Another site that uses this technique is a search engine called Quintura (Fig. 15). Quintura analyzes the results from a web search and creates a word cloud of corresponding terms. Users can click on words to add or subtract them from a search. This may be more intuitive for users who don’t know how to use an advanced search.

“The Mountain Meadows Massacre in public discourse” word cloud. Quintura search engine.
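
The filtering step described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the stop list is tiny (real ones are much longer), the font-size scaling is one arbitrary choice among many, and the sample text is invented rather than taken from either site.

```python
import re
from collections import Counter

# A tiny illustrative stop list; real stop lists are much longer.
STOP_WORDS = {"a", "an", "the", "that", "and", "of", "in", "to", "was", "is"}

def cloud_weights(text, top_n=5, min_size=10, max_size=40):
    """Map the top_n most frequent non-stop words to font sizes."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    counts = Counter(words).most_common(top_n)
    if not counts:
        return {}
    highest = counts[0][1]  # count of the most frequent remaining word
    return {
        word: min_size + (max_size - min_size) * count / highest
        for word, count in counts
    }

# Invented sample text for the demo.
sample = (
    "The wagon train crossed the meadow. The meadow was quiet, "
    "and the train moved on to the next meadow."
)
print(cloud_weights(sample, top_n=2))
```

Without the stop list, “the” would be the largest word in this cloud; with it, the words that actually describe the text rise to the top.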

Multimedia Document Analysis

Although text analysis has been around for a while, it is only recently that computers have been able to analyze image and sound documents. It is not that such a search is impossible. In fact, Korfhage reported work was already beginning on such analysis in 1997, but it is extremely computer intensive and complex. As Korfhage notes, the transformations something might go through are enormous- a picture of a bridge can be from above, below, from the side, or on the bridge. It might be a sketch or a photograph. Also, there are hundreds of types of bridges (p. 249). Asking a computer to identify a bridge in an image is still a long way off and may never happen. However, other kinds of image analysis are possible and even easy using computers.

Color is one thing that is easy enough to analyze using a computer. The computer can select areas of a picture, average the colors, and match those colors up to a user provided hue. This allows for some interesting image analysis that aids both browsing and finding. One such site is called Flickr Colr Pickr. The navigation in this site is simple: choose a color from the color wheel, and the engine returns results that match the color. Another search that uses the Flickr API is called Retrievr, which allows for an even more complex query: it lets the user draw a picture to return pictures that resemble the drawing. This may work well when looking for photos of a sunset or the ocean, and less well for images of a dog. Retrievr is based on research by Chuck Jacobs, Adam Finkelstein and David Salesin, who created an algorithm which is “simple, requires very little storage overhead for the database of signatures, and is fast” (Jacobs, Finkelstein, & Salesin, 1995, p. 277).

Flickr Color Fields allows searching Flickr photos by color. Retrievr matches photos to a drawing.
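
The "average the colors and match them" idea can be sketched roughly as follows in Python. This is an assumption-laden toy, not the method either site uses: images are represented as short lists of (r, g, b) tuples instead of real photos, closeness is plain distance in RGB rather than a comparison of hues, and all names are hypothetical.

```python
def average_color(pixels):
    """Average a list of (r, g, b) tuples channel by channel."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def color_distance(c1, c2):
    """Euclidean distance between two RGB colors."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5

def match_by_color(images, target, limit=2):
    """Return image names ordered by how close their average color is to target."""
    ranked = sorted(
        images,
        key=lambda name: color_distance(average_color(images[name]), target),
    )
    return ranked[:limit]

# Hypothetical "images", each reduced to a couple of pixels for the demo.
images = {
    "sunset": [(250, 120, 30), (240, 100, 20)],
    "ocean": [(10, 80, 200), (20, 100, 220)],
    "forest": [(20, 120, 40), (30, 140, 50)],
}
print(match_by_color(images, target=(0, 0, 255), limit=1))
```

Averaging throws away shape entirely, which is one way to see why this kind of search suits sunsets and oceans better than dogs.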

The above means of finding photos work well for browsing, but not as well for finding. One application that could prove very useful for finding is demonstrated by Dave Pattern (based on earlier experiments by Tim Hodson) (clarified thanks to Tim’s comment below) in an experimental site which lets you search for a book by color. Tim Hodson explained the usefulness of such a feature in a blog post. Imagine a patron asking “I heard about a book three months ago. I can’t remember who wrote it or what it was called, but it was blue” (Hodson, 2008, para. 3). Pattern goes on to describe the process for searching book covers in the same way Retrievr searches Flickr images: “The search works by comparing the hex colours of the 8×8 version of the search image with the corresponding pixels of the book covers. Each book cover then gets ranked by how well it matches the search image” (Pattern, 2007, para. 8). Etsy has yet another fun way to search for products with its Colors search. Pick a color and Etsy will show you photos of products whose colors match your request. Although this isn’t a perfect method, it is an innovative way to search products.

Dave Pattern’s demo of a book search by color. Etsy Colors.
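
Pattern's description (compare an 8×8 version of the search image with the corresponding pixels of each cover, then rank) can be sketched like this in Python. The quote does not say how the per-pixel comparison is scored, so the sum of absolute channel differences used here is my assumption, as are the function names and the solid-color "covers" standing in for real thumbnails.

```python
GRID = 8  # covers are assumed to be pre-shrunk to 8x8 thumbnails

def solid(color):
    """Build an 8x8 thumbnail filled with one RGB color (for the demo only)."""
    return [[color] * GRID for _ in range(GRID)]

def mismatch(search, cover):
    """Sum the per-pixel RGB differences between two 8x8 thumbnails."""
    total = 0
    for row_s, row_c in zip(search, cover):
        for (r1, g1, b1), (r2, g2, b2) in zip(row_s, row_c):
            total += abs(r1 - r2) + abs(g1 - g2) + abs(b1 - b2)
    return total

def rank_covers(search, covers):
    """Return cover titles ordered from best match (lowest mismatch) to worst."""
    return sorted(covers, key=lambda title: mismatch(search, covers[title]))

# Hypothetical covers; a real system would shrink actual cover scans.
covers = {
    "Red Book": solid((200, 30, 30)),
    "Blue Book": solid((30, 40, 210)),
    "Navy Book": solid((10, 20, 90)),
}
print(rank_covers(solid((0, 0, 255)), covers))
```

Because pixels are compared position by position, this approach also rewards covers whose layout matches the search image, not just their overall color.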

One website, called like.com, uses several methods to help the user find a good result. Like.com might be one of the first applications of research performed by Wei-Ying Ma and B. S. Manjunath in 1999, promising the ability to “retrieve all images that contain regions that have the color of object A, texture of object B, shape of object C, and lie in the upper of the image” (p. 184). It not only uses existing metadata as mentioned above, it also uses image analysis to find similar products. In the example picture, a small box is drawn around part of the product, and the engine finds products similar in style or color. The user can then refine by style, color, and other options. This kind of innovative searching is likely to become more and more common.

Like.com lets you search by drawing a box around the part of the item you like.

Though full text document analysis is exciting, things really start to get interesting when sites allow for user added metadata and use that data to provide ever better search results. That’ll be the next (and last) part in the series.

Bibliography:

Hodson, T. (2008, March 6). Colourphon: cooking up something interesting. Information Takes Over. Retrieved April 28, 2008, from http://informationtakesover.co.uk/archives/2008/03/06/colourphon-cooking-up-something-interesting/.

Jacobs, C. E., Finkelstein, A., & Salesin, D. H. (1995). Fast multiresolution image querying. Proceedings of the 22nd annual conference on Computer graphics and interactive techniques, 277-286.

Korfhage, R. (1997). Information storage and retrieval. New York: Wiley Computer Pub.

Ma, W. Y., & Manjunath, B. S. (1999). NeTra: A toolbox for navigating large image databases. Multimedia Systems, 7, 184-198.

Pattern, D. (2007, February 1). Michael Stephens = Norman Bates?!? Self-plagiarism is style. Retrieved April 28, 2008, from http://www.daveyp.com/blog/index.php/archives/172/.

Posted in Uncategorized | 2 Comments

Alternative search methods

The field of information storage and retrieval concentrates heavily on mathematical formulas for ideal retrieval, and while this is really fascinating (and way over my head) I am also interested in new methods that have been developed for information retrieval in the last five or so years. That’s not to say math is not involved in the new methods- it’s still there, but there are new methods of collecting and using metadata and analyzing materials that are surprisingly useful.

Types of Alternative Search

Alternative search technologies can be divided into a few distinct categories. There are more than I have listed, I’m sure, but these are the ones I am primarily interested in.

  • Many sites use existing human or computer supplied metadata to find and display information, but some sites are taking this approach above and beyond the traditional ways to create new and novel ways of finding information.
  • Some searches analyze a document’s contents (“document” is used in the loosest sense here, and is meant to include everything from text to sound, images, and video) to return a result. Text is traditionally used for this, but some aspects of images are very easily analyzed in this way. For instance, it is fairly easy to analyze a picture for an average color and search by colors nearby in the color spectrum.
  • A final method of search and retrieval is to rely on user added metadata. This form of search is becoming increasingly popular, and sites are inventing new ways to encourage users to supply their own metadata.

Two further distinctions in retrieval systems can be made: finding systems and browsing systems. Finding systems assist the user in finding a specific item, for instance, a picture of a cat. A finding system may also help answer a specific question. A browsing system helps the user find something, even if they are not exactly sure what they want. Browsing may also help the user make connections in a collection of documents, an especially useful attribute in online exhibits; in this way, browsing helps the user formulate a question rather than find an answer. A system that doesn’t work as a finding system may work wonderfully as a browsing system. One final note is that more and more systems use a combination of search techniques to find a relevant match.

Over the next few days I’ll examine a few sites that use existing metadata, the document’s content, or user supplied metadata to facilitate finding and browsing.

Posted in Information Literacy, Library | Leave a comment

A few final words on Digital Humanities and Art History before I move on

Thanks to everyone who commented on my previous two posts. I’m still working these things out in my head, and am speaking from a very limited (and naive) perspective of only a handful of institutions and projects that I have seen.

One of the things I left out is that digital humanities centers are by no means the only entity that could help with digital projects or publications of art materials. This could also be accomplished through collaborations with other departments on campus (such as Computer Science)  or through a university press. I imagine that we’ll probably start to see a number of these collaborations at the same time.

The part of this that is stuck in my brain, and which I don’t have an answer for, is what one of these projects would look like? I now have an idea of what a history or literature project looks like, but not much of what an art history or especially a fine art project would look like. I have seen a few examples of art history sites, and just presenting the images as one would in a book is somehow a bit of a letdown. But I don’t know what it is that I expect to be different. As for fine art- I have seen several fine art projects on the internet, and again, I always think something is somehow missing. I’m going to ponder this and research more and come up with some links and ideas.

Posted in Art, Work | Leave a comment

More Thoughts on Digital Humanities and Fine Arts

After more thought about the previous post, I think my question is:

Should digital humanities centers take it upon themselves to encourage fine art and art history faculty to create digital projects?

That would probably involve searching for funding from different venues and changing some assumptions, but I certainly think it is possible. It might mean specifically reaching out to fine art and art history faculty and demonstrating what a digital humanities center can do for them. More than just getting images on the web, it would mean a new kind of exploration for art history and fine art. Imagine an art history digital project illustrated with beautiful, high resolution zoomable (and downloadable) images that explain a concept better than static text ever could. Or a faculty artist’s web page which explores the meaning of the work in depth with (again) high resolution images interwoven with text and multimedia that brings the work alive. Better yet, imagine at least some of that content released under a license so others can reuse it, at least for educational purposes.

Ben noted in the comments of the last post that very few images that come up in a Google image search for an artist come from .edu domains. That does not surprise me—many artists and curators, especially in the academic realm, are nervous about posting images online and are stingy with high resolution images. However, what is considered high resolution has changed. I think of high resolution as above 1200×900—but many images on museum websites are around 300 pixels. Some museums sell high quality copies, but they could provide a nice big resolution and still sell the REALLY high resolution photo. Museum websites often are also stingy about letting you download images for your own use.

Ben also commented that some projects might be squashed by university lawyers. I think that is absolutely true, but that has been true for digital humanities in general. One of the great things about these centers is that they are constantly looking for materials to publish online, and will push for access for all. This is important because if we (as a society) don’t push for fair use from copyright holders, the copyright holders will take advantage and achieve ever more restrictions on use. This is true for books as well as paintings, but books, of course, are easier to deal with, because there are multiple copies. So we can go ahead and digitize that book that is clear of copyright, because it can be bought for a decent price, or our library already has a copy. With paintings, however, it’s trickier. Many museums disallow photography in all galleries, even if some galleries contain out of copyright works. This is all the more reason, I think, for digital humanities centers to step in, especially on campuses that hold works of art.

Ira Greenburg also left a great comment, saying:

Where I teach, “digital” seems to get inserted into every conversation these days – ranging in tone from vitriolic to sacrosanct. As a painter turned programmer (I still consider myself an artist), I find the debate tiresome and primarily fueled by ignorance on both sides.

I totally agree with this. I sometimes question whether digital humanities centers will continue past the next 10 or 20 years because I hope, eventually, that the facilities to create digital works, projects, and research, will be prevalent in every department on campus. Right now, though, a faculty member who wants to attempt a digital project has little support on many campuses. If they want to write a book, there’s a fairly straightforward process to follow, but a digital project requires expertise many don’t have.

Digital humanities centers are uniquely placed to reach out to fine art and art history faculty and create some unique and very exciting projects. Funding might be tough at first- but then, it was for digital humanities projects too in the beginning. I have a feeling that quite a few individual art faculty would really appreciate the help- some want to move online, but don’t know how or what the web can do for them. And if my suspicions are correct, they probably won’t get a lot of help from within their own department. (Again, depending on the institution.)

At this point I still have more questions than answers. I’ll end with a fantastic quote from Ira’s comment:

Working at the level of code, established disciplinary boundaries dissolve (and eventually the temples that house them will as well.)

Posted in Art, Work | 4 Comments

Digital Humanities and Fine Arts

With THAT Camp quickly approaching, I have been thinking about digital humanities quite a bit. For those who don’t know, digital humanities is a cross disciplinary field that explores the humanities through digital tools and methods. That might mean anything from an online history exhibit to in depth text analysis of literary works. Across the country, digital humanities centers are springing up to support new kinds of digital research. The reach of these centers varies widely- some are mostly history based, and in fact it seems like a great deal of digital humanities research focuses on history. Others are broader, and include projects in many humanities disciplines: Art History, Literature, Language, Classics, etc. A big part of the discussion in the Digital Humanities is about new models for publishing: what does it mean to publish online? What does peer review look like for online projects? How should promotion and tenure change to account for digital work? (Some places won’t even accept digital scholarship as part of a tenure portfolio.)

One humanities discipline that I rarely see addressed in digital humanities, though, is Fine Art, and the question of why has been on my mind a lot. One obvious reason I come up with is that funding agencies for arts and other humanities are different- there’s the National Endowment for the Humanities (NEH), and the National Endowment for the Arts (NEA). The NEH has many initiatives to support digital work (see the new Office for Digital Humanities), while the NEA- well, I don’t think it has much in the way of digital initiatives (please correct me if I am wrong). Which is a shame, really- the NEA could go very far towards “bringing the arts to all Americans” (one of the goals stated on their “about us” page) by supporting digital work, especially if they also supported work released under a Creative Commons or similar license. The separation of funding agencies is one explanation for the divide, but are there others?

To be sure, Fine Art is different from other humanities disciplines. The measure for success is different, for one thing- it’s nice to publish a book, or have a book written about you, of course, but more weight is placed on exhibitions- where do you exhibit? Is it a solo or a group show? The important thing, of course, is the professor’s work, but it is not enough to make work and never exhibit it. I’m sure the academic fine art world must have discussions similar to those in other disciplines, such as: Are there other models for tenure? What should count? What about an online exhibition?

I wonder if digital humanities in general has room for fine art. Where I work, we offer research faculty fellowships once a year to help faculty with digital projects. I don’t think any fine art faculty have applied, but I wonder what would happen if they did. Our Center is not really set up for a fine art project, and, to be honest, I’m not even sure what one would look like. But I would be interested to find out.

Posted in Uncategorized | 6 Comments

So you want to learn to program

I have had “learn to program” on my list of stuff to do for years. It’s always “after I do this…”

But! There is a great new resource created by the fabulous William J. Turkel & Alan MacEachern called the Programming Historian, which is also great for librarians and any scholar who wants a way to make programs that are actually useful to their work.

You can find it at the Programming Historian Wiki.

I’ll be working through it over the next week or two, and then I hope to move to some of the other programming resources I never seem to get around to.

Posted in Library | 2 Comments