That’s All, Folks.

Aaaaaand, I’m done.

Ok, not really. Our team still wants to present our work to Joe's, and of course we also need to present to our classmates. But the documentation of processes and the detailed recommendations for staffing and infrastructure have all been made official. The final reflections I included in the paper probably sum things up best, so I have included them below:

I was really pleased to have been assigned this project. A lot of my background in multimedia and photography proved useful at several points. But more than anything I was appreciative of the enthusiasm Joe's staff exhibited for the work we were doing. Sierra was the most immediate example of this, but several other members of Joe's staff seemed receptive and helpful as well. That fact makes me optimistic that many of the points and suggestions we make in this paper will eventually be implemented in some form.

If I had to point to a specific disappointment I would probably say that it would have been nice to have had enough time to help them push all the way to the digital museum implementation stage. That is something I haven’t done before and would have been a good learning experience for me.

With regard to the team aspect of the project, I would say that this is one of the first times in the program that I've felt like a team was really needed for a project. Not necessarily for the division of tasks, but for the multiple points of view on some pretty complex problems. I greatly appreciated my teammates' ideas, and I think Joe's has a stronger set of solutions because of the multiple viewpoints.

Containers and Codecs

I spent a fair chunk of time this week considering what kinds of file types Creative Works currently produces, and what they should consider using for various purposes. Most of that time was spent examining the many options for video, none of which are terribly attractive for long-term archiving. I was quickly reminded how messy and chaotic the evolution of digital video has been, and how even the most popular and widely used formats today will be considered quaint in less than five years.

I pored through lengthy comparisons of the archival merits of exotic formats, and checked various institutions for their minimum standards for video submissions. Ultimately I came to believe that NARA's requirements were the most down-to-earth and realistic for Joe's, given that their editors are all beginners yet their reuse patterns most resemble those of a production house, not a cultural heritage institution. NARA's guidelines make provisions for such business needs.

In fact, the business needs and reuse patterns of Joe's had me making recommendations for both video and still images that I'm slightly scared would be career-limiting without explanation. In short, I believe that cultural heritage institutions have the luxury of not considering the expediency of workflow. They can save in whatever format preserves maximum data without fear of how that format affects their ability to use it. In most other businesses access is of far greater value, and they are willing to make compromises to achieve it. Having lived in that world for a long time now, and having personally felt the effects of inaccessible assets, I find that I'd rather help the decision makers in that system make informed choices, but not at the expense of their ability to work. Sometimes that means allowing them to save in a lossy format. The horror!

Options for the “Digital Museum”

This week I concentrated on documenting several of the activities and processes our team has performed for Joe's Movement Emporium. In particular, I have focused on providing a step-by-step process for activities that Creative Works staff will have to maintain on their own, such as updating their keyword taxonomy. We have also provided a hard copy of the taxonomy itself, and the folder structure outline that we are suggesting they use for both their in-progress files and their long-term storage.

The inventory analysis continues to take shape, and I’m optimistic that we may already have some data that will illuminate the necessity for Joe’s to put more resources into asset management. I’m still hoping that we can find a way to provide yearly file size totals, but even the file counts that we have already are compelling.
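
Just to make that concrete for myself, here is the sort of throwaway script I have in mind for the yearly totals. It assumes everything sits under one root folder and that a file's modification year is a good-enough stand-in for the year the work belongs to; both of those are my own assumptions, not something our inventory tool gives us.

```python
# Rough sketch: per-year file counts and total sizes under one root folder.
# Assumes the modification year is a reasonable stand-in for the semester year.
import os
from collections import defaultdict
from datetime import datetime

ROOT = r"E:\Joes_Assets"  # hypothetical path to the consolidated drive

counts = defaultdict(int)
sizes = defaultdict(int)

for dirpath, _dirnames, filenames in os.walk(ROOT):
    for name in filenames:
        try:
            info = os.stat(os.path.join(dirpath, name))
        except OSError:
            continue  # skip unreadable files rather than halt the whole run
        year = datetime.fromtimestamp(info.st_mtime).year
        counts[year] += 1
        sizes[year] += info.st_size

for year in sorted(counts):
    print(f"{year}: {counts[year]:>7} files, {sizes[year] / 1e9:8.1f} GB")
```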

Lastly, I have begun doing a bit of high-level research on possible tools that Creative Works could potentially look at for the "digital museum" goal that they want to work towards. I know this is outside the scope of our project for the semester, but I also see the value in having some idea of what the possible end state might look like. It is a given that we will recommend looking at Omeka. From a cost-benefit standpoint there are few (if any) competitors that offer the same features, ease of use, and flexibility of application. Once again, the sticking point for me has been what a more achievable short-term alternative might look like for them. Initially I started exploring what capabilities are built into platforms they already own, or could add on to for minimal extra cost, such as OneDrive or Adobe Creative Cloud. In short, neither seems to be a good option. OneDrive doesn't appear to offer much in the way of a publicly available, searchable, presentation-quality interface; it is primarily a workflow collaboration tool. Adobe does offer some cloud capabilities for sharing images, such as Adobe Portfolio, which comes with Creative Cloud, but that does not appear to offer searchability. Adobe also has something called Adobe Experience Manager that appears to be an end-to-end workflow and digital asset management system. It just happens to be the most expensive one on the market, with a total implementation cost around $2 million. That's clearly meant for large enterprises. Not Joe's.

In the end I suspect my teammate Lauren is correct that the best short-term solution will be to encourage Joe's to look at how they might use readily available social media outlets in a smarter, more organized way. In fact, if Creative Works staff adhere to the metadata practices we have developed for them, those practices will set the stage for more accurate search results on platforms like YouTube and Flickr.

Beginning to Wrap Up

This week Zach and I went in to Joe’s to confer with Sierra and finalize the keyword taxonomy our team has been developing for them. We then imported it into Bridge on each of their workstations, so that they are all working with an identical set of keywords. Our hope is that this will function as a bit of a controlled vocabulary for recurring categories of metadata. Of course keywords are a rather fluid area of metadata and do not retain any hierarchical structure, but if this helps them maintain consistency then Creative Works will be a step ahead when they finally get to the stage of implementing the “digital museum” that is their ultimate goal.

We then addressed the topic of folder structure for in-progress work files. This discussion – and its related tangents – was quite fruitful. We were able to dive deeper into many points that our group previously had questions about, including their current hardware environment, the challenges it presented, how it affected workflow and file organization, and what an ideal hardware environment would look like for them. We also briefly touched on the concept of establishing a schedule of archiving tasks that maps to the timeline of each semester. For example, this schedule could:

  • Carve out time the week prior to the start of classes for preparing workstations with a fresh template of benchmark folders for students to populate (a small sketch of what that could look like follows this list).
  • Establish regular intervals during the semester for identifying portfolio quality work.
  • Set aside time at the end of each semester for applying any final metadata and migrating student assets to a more permanent repository.
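
As a rough illustration of that first item, stamping out the template could be as small as the following, where the folder and student names are placeholders rather than the structure we actually delivered:

```python
# Sketch: stamp out a fresh semester template of benchmark folders per student.
# Folder and student names below are illustrative placeholders only.
from pathlib import Path

TEMPLATE = ["01_Raw_Footage", "02_Project_Files", "03_Exports", "04_Portfolio_Picks"]

def make_semester_template(base, semester, students):
    """Create one folder per student, each seeded with the benchmark subfolders."""
    for student in students:
        for sub in TEMPLATE:
            (Path(base) / semester / student / sub).mkdir(parents=True, exist_ok=True)

make_semester_template(r"D:\ClassWork", "Fall_Semester", ["Student_A", "Student_B"])
```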

I feel like we’re getting close to the end of what we can physically do for Joe’s. We have provided them with the basic tools, but they are ultimately the ones that have to identify their content and make decisions about it. Most of the retroactive curation challenge for them now rests in knowing the “aboutness” of their files, and as a group we just don’t know what we’re looking at, so we’re ill-suited to apply metadata or make “keep vs toss” decisions…  well, at least decisions that aren’t obvious, like duplicates. I think the remainder of the value we provide to Joe’s will be in documenting many of the processes and standards we have established so that they have easy reference material for replicating these practices on their own for all future work. I hope to turn my attention to that this coming week.

GAH! Why JPEGs shouldn’t be immediately dismissed if you encounter them!

OK, I know we’re only required to do blog posts on our projects, so I guess I’m doing two posts this week. But I gotta vent.

I just finished the reading for this week, and while I fully appreciated the main point that Python skills can be extremely helpful at automating repetitive processing tasks, I found myself horrified that the digital archivist immediately decided not to retain any of the JPEGs, opting instead for the RAW files.

I sincerely hope that she conferred with the photography department prior to making that decision. And that they confirmed that there would be no difference between the corresponding image sets. Because as a professional photographer for the last 17 years I can tell you that the RAW file is not always the final image. In fact, it seldom is. There is often some retouching that happens in Photoshop after the image has been exported to JPEG or TIFF. It’s not possible to save that information in the .NEF or .DNG. So throwing those files in the trash could potentially mean throwing out the best visual quality version (i.e. no dust, no lens flare, color corrected, nose hairs removed, tie straightened, 20 lbs lighter, etc.) in favor of the best bit depth version that includes none of those improvements. Even if you get someone to retouch the RAW file again it may or may not look like the version that appeared in the original publication your customer is hoping to find.

Additionally, while Adobe Camera Raw has been storing RAW adjustments in sidecar XMP files for a long time, I would be very nervous about assuming that XMP files created in CS2 back in 2005 map to the same adjustments in today's Creative Cloud. It was a remarkably different tool back then.

Yes, JPEG is a “lossy” format. However, this really only matters if the user intends to go back and reprocess it trying to pull more highlight, shadow, or color detail out of it. If you’re pretty sure your target user group only ever needs to grab-and-go then a high quality, high resolution JPEG is usually more than enough.

Yes, TIFF would technically be a higher quality format for saving final retouched images. But here's the thing… the size of TIFFs became unworkable for event photographers once camera resolutions rose above about 6 megapixels. For example, I use a 24-megapixel camera at work (this is the average today), and a single TIFF produced by it is in excess of 200 MB. One image. We shoot hundreds, sometimes thousands of images from a single event. My workflow would grind to a near halt if I had to wait for Lightroom to process that much extra data, or for Photoshop to open a batch. By comparison, a JPEG of the same image is around 12 MB, it snaps open in both Lightroom and Photoshop, and as long as the designer doesn't want to twist the colors or exposure drastically, no one can visually tell the difference… even when blown up to larger than life size.

All this is to say that when you encounter them: 1) Don’t assume the JPEG folder has identical content to the RAW folder. Inquire about the workflow habits of the creator before you make any decisions… and 2) don’t assume that JPEGs are worthless just because they’re JPEGs. Examine the quality before dismissing them, but also ask yourself how likely your target user group is to need to pull additional detail out of shadows and highlights, and how likely they are to actually make high quality prints that require maximum color depth. Some may need that. Many will not.
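
If I were handed a pair of folders like that, my first move would be a quick sanity check along these lines. It only compares base filenames, so it assumes the photographer kept matching names across formats, and it says nothing about retouching; it just flags content that exists in one set but not the other. The folder names are hypothetical.

```python
# Sketch: compare a JPEG folder against a RAW folder by base filename.
# A mismatch means one set contains content the other does not.
from pathlib import Path

RAW_EXTS = {".nef", ".dng", ".cr2", ".arw"}
JPEG_EXTS = {".jpg", ".jpeg"}

def stems(folder, exts):
    """Return lowercase base filenames in a folder for the given extensions."""
    return {p.stem.lower() for p in Path(folder).iterdir()
            if p.is_file() and p.suffix.lower() in exts}

def compare(jpeg_dir, raw_dir):
    jpegs, raws = stems(jpeg_dir, JPEG_EXTS), stems(raw_dir, RAW_EXTS)
    print("JPEGs with no matching RAW:", sorted(jpegs - raws))
    print("RAWs with no matching JPEG:", sorted(raws - jpegs))

compare("event_2015/jpeg", "event_2015/raw")  # hypothetical folder names
```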

/rant

The Hardware Dilemma

Our group has successfully migrated all of Joe's reachable digital assets (one drive is currently down for the count… Zach is contacting MITH about possible remedies) to the largest external drive they currently have, a 4TB Western Digital MyBook. Lauren plans to run our inventory software on it this Thursday before class to create an Excel file detailing every file and folder that exists. Hopefully after that we can find some clever ways of re-sorting the many fields to start tapping some actionable metrics out of the inventory, such as how much disk space they fill up in a semester, and what types of work predominantly account for it. These are both pieces of information that will be vital to the recommendations we plan to leave with Joe's regarding how they should plan and budget for the basic maintenance of their storage environment, as well as the file format choices they make to maximize the longevity of their project files.
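
Once Lauren's export exists, I suspect the re-sorting could be scripted rather than done by hand in Excel. Here is a rough sketch, assuming the inventory gets saved out as a CSV with columns along the lines of Path, Size, and Modified; those column names and the date format are my guesses, not what the tool actually emits.

```python
# Sketch: roll an inventory export up into disk usage per semester and file type.
# The CSV column names (Path, Size, Modified) and the date format are assumptions.
import csv
from collections import defaultdict
from datetime import datetime
from pathlib import PurePath

def semester(dt):
    """Crude semester label: January-June is Spring, July-December is Fall."""
    return f"{dt.year} {'Spring' if dt.month <= 6 else 'Fall'}"

usage = defaultdict(int)  # (semester, file extension) -> total bytes

with open("inventory.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        modified = datetime.strptime(row["Modified"], "%Y-%m-%d")
        ext = PurePath(row["Path"]).suffix.lower() or "(none)"
        usage[(semester(modified), ext)] += int(row["Size"])

for (sem, ext), total in sorted(usage.items()):
    print(f"{sem:12} {ext:8} {total / 1e9:7.2f} GB")
```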

In truth, the file format of finished student work is actually much less challenging than Joe’s commitment to keeping all student project files for at least two years. Updates in software occur so rapidly now that old project files for After Effects, Premiere, Audition, iMovie, Final Cut, and others sometimes can’t be opened in the new version. Usually the new version allows a conversion to take place, but not all filters render the same way, and occasionally some are deleted. I am not aware of a solution to this issue. These are highly proprietary formats and I suspect some risk may simply be inherent in the promise to retain these project files.

This past week I took a first crack at establishing an underlying folder structure for Joe's to use going forward. I'm pretty confident the Events documentation portions will suit their needs. I modeled them on the organizational system we use at MITRE for housing our event photography, which has worked well for us in a multi-shooter environment. The keyword taxonomy we've worked up for use in Adobe Bridge roughly mirrors this system as well, though Joe's is a bit simpler than the one we use at MITRE. The student work areas of the folder structure, on the other hand, bear a great deal more group discussion, as well as collaboration with Sierra, who, after all, will be the one who either maintains the structure or abandons it.

Probably the biggest thing I'm struggling with at the moment is what to recommend to Joe's for hardware changes. Our whole group agrees that working off of a centralized server would greatly simplify their workflow and preservation efforts, and I have no problem with offering that as one possible goal to work towards. However, the kind of server space and high-speed network that would be needed to support video editing will undoubtedly make that a long-term goal at best. I would like to offer them a "minimum effort" option that is achievable in the short term and offers at least some measure of increased security and/or performance. So far I'm undecided on what the easiest path forward would be.

You Have Died of Dysentery

Having just come from our second meeting with Joe’s staff I believe I can confidently say we have come face to face with Dallas’ “wild frontier.”

Our second meeting introduced us to some new figures at Joe's who work with the files in a much more hands-on capacity than the administrators we spoke with two weeks ago. We had a brief discussion of their views on many of the same needs and organizational challenges, but the bulk of our time was spent delving into the loose network of external hard drives and individual workstations where their files are stored. We spent more than an hour doing this, but it honestly felt like just a glimpse of the larger picture.

From what I saw, there appeared to be three approaches to organizational structure.

  1. Working folders: These were wildly inconsistent in their filenames, file locations, and hierarchical folder structure. In a few cases there was evidence that instructors had grouped certain types of work into folders by student name, but this practice was sporadic at best. These working areas were scattered in many places, both on the individual workstation hard drives and on the external drives that we were told were for backup. We were told that any organization of files in these areas was shaped by a combination of the instructor's and the student's organizational habits.
  2. Backup folders: These literally had names like "TTP 2015 PHOTO/VIDEO DUMP". I got the impression that these were more or less one-for-one backups made without any attempt to examine the contents.
  3. Curated folders: These folders had a consistent naming structure and were found on several different drives. Their apparent purpose is to retrospectively gather high-value content from the current semester's classes in order to make it easier to find.

My first thought after this meeting is that policy is going to be a big part of the solution for them. They currently have no guidelines that they give instructors or staff, and this simple measure would go a long way towards establishing consistent practices. To aid in that capacity they could probably take advantage of many OS-level features that would help them tighten control over certain areas. For example, they could give students read/write permission to only a single folder with their name on it in order to ensure all of a student's work is contained in a logical location. Even if a student's own organizational habits were lax, those habits would not contaminate any other areas.
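
To make that idea concrete: the folder side is trivial to script, and the permissions side would be handled afterwards with the operating system's own tools. Here is a sketch with made-up account names; the Windows icacls grant it prints is only a starting point and would need testing on Joe's own machines.

```python
# Sketch: create one working folder per student and print the permission
# command a technician could review and run. Account names are made up, and
# the exact icacls grant would need testing in Joe's own Windows environment.
from pathlib import Path

STUDENTS = ["astudent1", "bstudent2", "cstudent3"]  # hypothetical account names
BASE = Path(r"D:\StudentWork")                      # hypothetical location

for account in STUDENTS:
    folder = BASE / account
    folder.mkdir(parents=True, exist_ok=True)
    # Suggested grant: read/write (Modify) on the student's own folder only,
    # inherited by its subfolders and files.
    print(f'icacls "{folder}" /grant "{account}:(OI)(CI)M"')
```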

The Epiphany of Pragmatism

After reviewing both the Yakel and Dallas articles I honestly can't say that I feel there are inconsistencies between the two, despite Dallas' statements otherwise. Yes, Yakel's article is broader in scope, and it tends to be referenced often by those in the library, archives, and museum fields. Thus, their application of the principles outlined in her article does tend to fit a more traditional and scholarly approach for large, well-established institutions. However, many of the applications that Dallas discusses strike me as faithful applications of the same principles Yakel outlines, applied to complex real-world situations that are often significantly smaller in scale but which have the potential to be optimized to increase the value and accessibility of the collection. These are collections that ordinarily would not fall under the purview of large established institutions, and as such they require a unique approach.

I had a similar positive reaction to the Dallas article as I did to the "Sheer curation" section of the Digital curation Wikipedia article. Both describe an effort to meet users as well as producers of digital assets "where they are" rather than shoehorn them into an existing archival model. This introduces the opportunity to enhance their workflows with an eye towards minimizing speed bumps, and to apply digital curation practices that actually improve workflows at the same time that we add value to the collection. Likewise, it seemed to me that both articles recognized digital asset producers as a primary user group of their own collections.

Much of the Dallas article struck me as a fruitful approach for dealing with the curatorial challenges I face at my office, and that my group faces on the Joe's Movement Emporium project. I firmly believe that the more we can help solve workflow issues for Joe's, the more likely any new curatorial processes we introduce will be accepted and become entrenched in their regular workflow.

Blog Post 2

In considering the coverage of topics such as digital curation, digital preservation, content management, and digital asset management, I find that digital preservation seems well represented; however, the same level of effort does not seem to have carried over to some of the other topics, which are often used interchangeably with digital preservation. This narrow focus on the digital preservation article alone puzzles me a bit, because I think there would be tremendous value to the layperson in describing how digital preservation differs from and relates to the other concepts. I think of digital curation in particular as a more encompassing activity that leverages digital preservation expertise, but also takes a wider view of the entire asset lifecycle and infuses the process with a mandate to improve the asset, not just preserve it. That distinction is either underplayed or missing in both articles.

I think the biggest content gap in the digital curation article is some description of the types of activities that go into digital curation. Yakel's core concepts/activities list would be a good addition for the article to fill that need. Admittedly, someone did list the activities outlined in the DCC's lifecycle model; however, they originally seemed to be using them as a supporting statement in a section on digitization of analog materials. That did not make sense to me, so I separated the two topics and moved the lifecycle model to a new section tentatively called Methodology. I hope to revise this later with a broader reorganization effort.

If I had to guess at why the digital curation article is less developed than the digital preservation article, I would guess that because digital preservation is a more narrowly focused activity, there may be more professionals who specialize in it specifically, whereas digital curation spans multiple disciplines and may involve partnerships between multiple employees and departments. So perhaps there are fewer professionals who feel they have the requisite expertise to speak on the subject holistically.

Blog Post 1

My first reaction was puzzlement over the quality classification of some of the articles. For example, the Digital Preservation article seemed to have few content gaps. There are certainly some sections that could expand upon their particular topics, but I felt like the page overall had a pretty complete outline to adequately explain to the layperson what digital preservation is about. Yet the article was labeled as Start-class, the same quality label that was applied to the Digital Curation article, which I found very wanting in detail and completeness. I think the Digital Curation article could benefit from an explanation of the core concepts/activities of digital curation, as well as an explanation of how its definition differs from that of digital preservation. I also feel like the Approaches section of the article is awkwardly implemented and seemingly random in what it includes.

The Digital Preservation WikiProject seems like a wonderful way to coordinate the efforts of multiple volunteers. And personally, I think that effort appears to have paid off specifically for the Digital Preservation article, though it doesn't look to me like there has been a great deal of recent activity on the project's stated goal of addressing shortcomings in related topic articles, as evidenced by the quality of the Digital Curation and Digital Asset Management articles. I think the Open Tasks section of the Digital Preservation WikiProject could be turned into a more specific list of articles for volunteers to address. That might help refocus efforts on filling the knowledge gaps that still exist in related topic articles.

On an academic level, the process of evaluating an article seems relatively straightforward. On a personal level I must admit I’m less fond of the old-school web forum style discussion areas. Users seem to regularly get snippy and pompous. Several years ago I stopped patronizing such forums in my personal life for that very reason, so this part of Wikipedia is an exercise in patience.