05 – File and folder naming

Naming Folders

Following a logical naming schema for folders is just as important as the file naming schema when scanning.

We recommend sticking to the physical structure of collections in a hierarchy such as:

  • Institution name folder
    • Manuscript, Series, or Collection name folder
      • Box name folder
        • Folder name folder
          • [files]
          • Document name folder (optional). Some documents, such as books, can be associated with hundreds of files. It is usually easier to sift through a collection when high-volume documents are separated into their own folders rather than all being in one folder.
            • [files for document]
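The hierarchy above can be sketched in a few lines of Python. This is only an illustration, not part of our workflow: the function name and the example folder names ("scans", "Clemson", "Folder01") are hypothetical, and the optional document level is added only when one is given.

```python
from pathlib import Path


def collection_folder(root, institution, collection, box, folder, document=None):
    """Return the nested folder path that mirrors the physical structure.

    All names are supplied by the caller; nothing here is a fixed convention.
    """
    parts = [institution, collection, box, folder]
    if document is not None:
        # Optional level for high-volume documents such as books.
        parts.append(document)
    return Path(root).joinpath(*parts)


if __name__ == "__main__":
    path = collection_folder("scans", "Clemson", "Mss2", "Box001", "Folder01")
    # Create the folder and any missing parent folders in one step.
    path.mkdir(parents=True, exist_ok=True)
    print(path)
```

Calling mkdir with parents=True creates every missing level at once, so the collection name folder always exists before any file is saved.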

Regardless of whether one is scanning for five minutes or an hour, it’s important to always have a folder name with the collection title in it for longevity and to avoid user error. For the purposes of quality control and preservation, folder naming is of the utmost importance.

Batch Folder Creation in Windows

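If you have a list of folders to create inside another folder, add the names to a blank text document, one name per line. Add “mkdir” (without the quotes) and a space in front of each folder name. If a folder name contains spaces, put straight double quotes around the name; otherwise mkdir treats each word as a separate folder.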

e.g.

mkdir Mss2_Box001

mkdir Mss2_Box002

Save the text file with the extension “.bat” in place of “.txt”. Place the .bat file in the directory where you want the folders created, then double-click the file to run it.
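For a long run of boxes, the lines of the .bat file can be generated rather than typed. The sketch below is one possible way to do that in Python; the function name and the output file name "make_folders.bat" are hypothetical, and the folder names follow the Mss2_Box001 example above.

```python
def mkdir_batch(names):
    """Return the text of a .bat file with one mkdir command per folder name."""
    # Quoting each name keeps names that contain spaces in one piece.
    lines = [f'mkdir "{name.strip()}"' for name in names if name.strip()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Mss2_Box001 through Mss2_Box010, zero-padded to three digits.
    boxes = [f"Mss2_Box{n:03d}" for n in range(1, 11)]
    with open("make_folders.bat", "w") as handle:
        handle.write(mkdir_batch(boxes))
```

The resulting file is used exactly as described above: place it in the target directory and double-click it.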

Naming Files

Your supervisor will always assign the file naming scheme or convention for you to follow during digitization.

Multiple files to one image/book and front and back

Some items will need to be scanned in parts or pages. Whether the item is a book or an oversize item represented by several images, the first image is always _001 and the following files are numbered in sequence (_002, _003, and so on).

For the two sides of an item, we use “_front” and “_back” rather than a different alphanumeric naming schema, to eliminate any misunderstanding.
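As a small illustration of both rules, the following Python sketch generates the suffixed names for a multi-part item and for a two-sided item. The function names and the base name "pre_Box01_01" are hypothetical; the three-digit padding follows the _001 pattern described above.

```python
def page_names(base, count):
    """Names for an item scanned in several parts: base_001, base_002, ..."""
    return [f"{base}_{n:03d}" for n in range(1, count + 1)]


def front_back_names(base):
    """Names for the two sides of a single item."""
    return [f"{base}_front", f"{base}_back"]


if __name__ == "__main__":
    print(page_names("pre_Box01_01", 3))
    print(front_back_names("pre_Box01_01"))
```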

Special Collections file naming schema

Clemson University Libraries Special Collections mainly organizes content into two categories: University Archives (ua) and Manuscripts (Mss).

  • University Archives Series
    • ua#_“Box”#_Folder#_Item#_Page# – e.g. “ua81_Box01_01_002_003”
    • Oversize: Series#_OS_Folder#_Item#_Page#
  • Manuscript Archives
    • Manuscript without series: Mss#_“Box”#_Folder#_Item#_Page# – e.g. “Mss71_Box01_01_001_001”
    • Manuscript with series: Mss#_Series#_“Box”#_Folder#_Item#_Page# – e.g. “Mss71_01_Box01_01_001_001”
    • Manuscript Oversize: Mss#_Series#_OS_Folder#_Item#_Page#
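The patterns above can be assembled programmatically, which helps keep zero-padding consistent. The following Python sketch is an illustration under assumptions: the function name is hypothetical, and the padding widths (two digits for series, box, and folder; three digits for item and page) are inferred from the examples above rather than stated as a rule.

```python
def special_collections_name(prefix, number, *, folder, item, page,
                             box=None, series=None, oversize=False):
    """Build a name such as ua81_Box01_01_002_003 or Mss71_01_Box01_01_001_001.

    Padding widths are an assumption taken from the examples in this section.
    """
    parts = [f"{prefix}{number}"]
    if series is not None:
        parts.append(f"{series:02d}")
    if oversize:
        # Oversize items use OS in place of the box segment.
        parts.append("OS")
    elif box is not None:
        parts.append(f"Box{box:02d}")
    else:
        raise ValueError("give a box number, or set oversize=True")
    parts.extend([f"{folder:02d}", f"{item:03d}", f"{page:03d}"])
    return "_".join(parts)


if __name__ == "__main__":
    print(special_collections_name("ua", 81, box=1, folder=1, item=2, page=3))
    print(special_collections_name("Mss", 71, series=1, box=1, folder=1, item=1, page=1))
```

The sketch follows the manuscript patterns. Check the University Archives oversize pattern with your supervisor before generating those names this way.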

National Park Service file naming schema

For the NPS, we use their park code, usually the first two letters of each of the first two words in the park’s name (e.g. Fort Sumter becomes FOSU), followed by the given catalog/id number for an item. Sometimes an accession number is also used if catalog numbers repeat over multiple collections. If no catalog number has been assigned, then we use a schema similar to the Special Collections one listed above.
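The usual park code rule can be written as a short Python function. This is a sketch of the "first two letters of the first two words" pattern only; the function name is hypothetical, and because the rule holds only "usually", the result should always be checked against the code the park actually uses.

```python
def park_code(park_name):
    """Guess a park code from the first two letters of the first two words.

    This is a heuristic: confirm the real code with the park before using it.
    """
    words = park_name.split()
    if len(words) < 2:
        raise ValueError("this rule needs a park name with at least two words")
    return (words[0][:2] + words[1][:2]).upper()


if __name__ == "__main__":
    print(park_code("Fort Sumter"))
```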

Other file naming conventions

Much like the Special Collections file naming schema, it’s ideal to name a file based on its physical location within a collection. Most collections will be assigned a short, three- or four-letter identifier. If scanning materials from a partner institution, their initials are used: Presbyterian becomes “pre” and Southern Wesleyan becomes “swu”. After the identifier, refer to the box, folder, multi-page document, and finally page number.

Boxes and folders may not have numbers, in which case shortened names can be used. Even document titles can be used for naming a file, which most institutions prefer. What matters for our team is naming a file so that, from the file name alone, we can identify where it belongs within a collection. Otherwise, it’s important to embed technical metadata within a file’s XMP or IPTC fields. Here one can record who scanned the file, where and when it was scanned, and what collection the file belongs to. Most photo editing software allows the embedding of such metadata.

Avoid adding symbols to file names. Generally “-”, “_”, and “#” are OK to use, but ampersands (&) and periods, other than the one before the file extension, should never be used.
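A quick check for problem characters can be run before files are delivered. The Python sketch below is one possible approach, with its assumptions stated plainly: the function name is hypothetical, only letters, digits, “-”, “_”, and “#” are treated as allowed, and spaces are flagged as well even though this section does not mention them.

```python
import string
from pathlib import PurePath

# Assumption: letters, digits, and the three symbols named above are allowed.
ALLOWED = set(string.ascii_letters + string.digits + "-_#")


def disallowed_characters(file_name):
    """Return the sorted list of characters in the name that should be avoided.

    Only the part before the final extension is checked, so the period in
    ".tif" is fine, but any other period is reported.
    """
    stem = PurePath(file_name).stem
    return sorted({ch for ch in stem if ch not in ALLOWED})


if __name__ == "__main__":
    for name in ["Mss71_Box01_01_001_001.tif", "Smith&Sons.v2.tif"]:
        print(name, disallowed_characters(name))
```

An empty list means the name passes; anything else lists the characters to remove or replace.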