The perfect blend of good food, good books, and whatever else I toss in.

Thursday, March 14, 2013

Know Your Vocabulary

I find myself in the middle of an interesting project. When I first began this internship and my site supervisor and I were discussing the tasks I should work on or start to think about, one of the things mentioned was creating a controlled vocabulary for the items on two of the pages on the staff intranet. For the first seven or so weeks, it was something I kept in the back of my mind, a task my site supervisor said was low priority. Well, now that the pages have been updated and the ownership of the items is posted, this project of creating a controlled vocabulary has moved to the top of my list. And while it's a bit tedious to record and organize the terminology currently in place, I'm finding myself really enjoying this particular project.

Controlled vocabulary is not a new concept for me. In fact, it was one I came to know and love during my second semester when I was taking a class on databases (one of the core classes for the program). Controlled vocabulary can be seen as metadata--data (information) about data--that has a number of practical uses, particularly as an aid for creating more successful searches. It's called "controlled" vocabulary because it's a set of predefined terms that are adopted as the "correct" entries for the type of information considered. I know that may not make much sense, so here is an example. Think of a field where you are asked to enter a date. There are quite a lot of ways that you can do that: 2013-03-14; March 14, 2013; Mar 14, 2013; 3-14-13; 3-14-2013; 2013, March 14; and so on. All of them are correct, but perhaps the system reading the information can only understand one type of entry. The field may have a line of text under it, something along the lines of "Enter date as MM-DD-YYYY," so you know the required method of entry. A controlled vocabulary works in a similar way; there is a predetermined "right" way of stating a term or phrase, and you must use the controlled vocabulary when working within its boundaries.
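For the curious, here is a minimal sketch of the date example in Python. It is only an illustration, not how any particular system works: the list of accepted formats and the function name to_controlled_date are my own assumptions, and it assumes English month names. It takes any of the date styles listed above and rewrites it in the one "controlled" form, MM-DD-YYYY.

```python
from datetime import datetime

# Assumed list of input styles we are willing to accept (hypothetical;
# a real system would choose its own). Month names are assumed English.
ACCEPTED_FORMATS = [
    "%Y-%m-%d",   # 2013-03-14
    "%B %d, %Y",  # March 14, 2013
    "%b %d, %Y",  # Mar 14, 2013
    "%m-%d-%y",   # 3-14-13
    "%m-%d-%Y",   # 3-14-2013
    "%Y, %B %d",  # 2013, March 14
]


def to_controlled_date(text):
    """Return the date rewritten in the one controlled form, MM-DD-YYYY."""
    for fmt in ACCEPTED_FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # this style did not match, so try the next one
        return parsed.strftime("%m-%d-%Y")
    raise ValueError("Unrecognized date: " + repr(text))


for entry in ["2013-03-14", "March 14, 2013", "3-14-13", "2013, March 14"]:
    print(entry, "->", to_controlled_date(entry))
```

Every accepted entry comes out the same way, which is the whole point: many ways in, one way stored.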

Librarians, especially those who do cataloging, are very familiar with the Library of Congress Subject Headings (LCSH), a type of controlled vocabulary used to enter the subjects a library item relates to or addresses. The LCSH is quite extensive and covers nearly any subject you can think of. If you want to see examples of these subject headings, go to your library's website and do a search of your choice. Click on one of the books in the results to look at its details, and you'll see entries for the item's "Subject." Tamora Pierce's book Mastiff shows subject entries for "Kings and rulers -- Juvenile fiction," "Police -- Fiction," and "Fantasy," among others. Each of these terms could be rewritten in different ways with different words and still carry the same or a similar meaning. Having a standard set of terms--a controlled vocabulary--means that the terms also become useful for searching, since all items on the same subject have the same subject entry and can be found in one search. You don't have to worry about different people using different terminology because catalogers use the same subject heading rules.

Okay, I went off on a tangent a bit there. Back to my project. I had to approach it a bit differently than the projects I did in my database class and my cataloging class that used the idea of controlled vocabulary. For those, I was starting from scratch and coming up with controlled vocabulary terms from a blank slate (although in the cataloging class, I did have to use the LCSH rules). With this project, there are terms already entered for each item on the two pages I'm updating, terms with no sense of organization or logic. So I first went through and recorded all the terms and the frequency of their use. Then I organized them into sub-classes based on the general topic they relate to (I'm still working on this piece of it). My next step will be to see if there are terms that can be combined into more general terms. The trick with controlled vocabulary is to get just specific enough--not too broad, and not too focused, but just right. One must be Goldilocks, except with words rather than porridge and beds. Lastly, I will designate a controlled term or phrase for each of the old ones (or simply note them as needing to be deleted). This will be my controlled vocabulary.
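If I were doing the record-and-map steps in code instead of by hand, it might look something like the rough Python sketch below. To be clear, the terms, the mapping, and the function names are all made up for illustration; they are not the actual terms from the intranet pages. The sketch counts how often each existing term is used, then looks up the controlled term chosen for it, with None marking a term that should be deleted.

```python
from collections import Counter

# Made-up examples of terms already entered on the pages (hypothetical data).
existing_terms = [
    "HR forms", "hr form", "Human Resources",
    "Payroll", "payroll ", "Time sheets", "misc",
]


def normalize(term):
    """Lowercase and collapse whitespace so trivial variants count together."""
    return " ".join(term.lower().split())


# Step 1: record every term and how often it is used.
frequency = Counter(normalize(term) for term in existing_terms)

# Last step: designate a controlled term for each old one.
# This mapping is hypothetical; None means "delete this term".
controlled_vocabulary = {
    "hr forms": "Human Resources",
    "hr form": "Human Resources",
    "human resources": "Human Resources",
    "payroll": "Payroll",
    "time sheets": "Payroll",
    "misc": None,
}


def to_controlled_term(term):
    """Return the controlled term, or None if the old term is to be deleted.

    Raises KeyError for a term that has not been reviewed yet.
    """
    return controlled_vocabulary[normalize(term)]


for term, count in frequency.most_common():
    print(count, term, "->", to_controlled_term(term))
```

The grouping into sub-classes and the Goldilocks decisions about how broad each term should be still have to be made by a person; the code only keeps the bookkeeping straight.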

I will have to post another recipe tomorrow. I made a quick and easy dinner last night and it was good--something I haven't had since my grandmother was alive. Stay tuned...

2 comments:

  1. It is good to know you are using and adapting what you learned in class to your internship position. I enjoyed reading about your approach, and I'll be checking your recipe as well.

    1. I'm beginning to think that I have a much more analytical mind than I thought. Perhaps that is partly why I've enjoyed this project as much as I have (and maybe why I like coding web pages too!).

      It's funny how many of the concepts I've learned come up in real-world experiences, but not necessarily where you would expect them. Controlled vocabulary is so heavily used in cataloging (something I found myself distinctly not liking--no offense to catalogers!) and I haven't really seen it anywhere else--but it's definitely playing a role in this internship project. It just goes to show that you don't always know when those concepts will come into play!
