Load File Ingestion

Import's Load File Ingestion provides access to:

New content and data overlays are handled in the same manner. Pre-processed data may come in a variety of forms. For the purposes of loading into Nebula, incoming content is grouped into the following four categories:

  • Load files: At a minimum, load files should contain document identifiers and family identifiers. Most load files contain document metadata and pointers to text and native files. Load files are typically stored in a data folder within a volume folder on a Nebula share.
  • Images: Only single-page TIFF or JPEG files accompanied by an image load file (such as Opticon, LFP, or DII) may be loaded into Nebula as images. Image files are typically stored in an images folder within a volume folder on a Nebula share.
  • Native files: Native files are files delivered in their native format (such as Microsoft Excel spreadsheets, Word documents, MSG email messages, or PDF files). Native files are typically stored in a natives folder within a volume folder on a Nebula share.
  • Text: Text files are typically stored in a text folder within a volume folder on a Nebula share.

Note: Pre-processed content may consist of one or more of these categories.
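
The folder layout described above can be checked before loading. The following is a minimal Python sketch, not part of Nebula: the function name present_categories is hypothetical, and the folder names (data, images, natives, text) are assumed from the typical layout described above, so adjust them if a delivery differs.

```python
from pathlib import Path

# Assumed mapping of content category -> typical folder name inside a volume folder.
CATEGORY_FOLDERS = {
    "load files": "data",
    "images": "images",
    "native files": "natives",
    "text": "text",
}


def present_categories(volume_dir):
    """Return the categories whose folder exists and holds at least one file."""
    volume = Path(volume_dir)
    found = []
    for category, folder in CATEGORY_FOLDERS.items():
        path = volume / folder
        # rglob("*") also looks inside nested subfolders of the category folder.
        if path.is_dir() and any(p.is_file() for p in path.rglob("*")):
            found.append(category)
    return found
```

Calling present_categories on a volume folder returns only the categories actually delivered, which reflects the point that a delivery may contain one or more of the four.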

Ideally, pre-processed content will be delivered in a format mirroring this structure. However, in the event it takes another form, please consult with the KLD project manager or Nebula TechQ to explore the options available for reorganization or for conversion to an importable format. Should pre-processed content arrive without a metadata load file, content may still be loaded, with the following caveats:

  • Single-page images require an image load file in order to be loaded.
  • Native and/or text file names, minus the file extension, will determine the document ID of the resulting record in Nebula. (For example, doc0001234.xlsx would be imported as a Nebula record with the document ID doc0001234.)
  • Document records will not contain any metadata.
  • Family relationships, such as parents and attachments, will not be present.
  • Records loaded without associated text will not be searchable, except by document ID.
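
The file-name rule in the caveats above can be illustrated with a short Python sketch. It shows only the naming logic, not how Nebula performs the import: derive_records and ids_without_text are hypothetical helper names, and the sketch assumes that only the final extension is dropped (so a name such as a.b.txt would yield the ID a.b).

```python
from pathlib import PurePath


def derive_records(native_names, text_names):
    """Group native and text file names by the document ID implied by the name."""
    records = {}
    for kind, names in (("native", native_names), ("text", text_names)):
        for name in names:
            # The document ID is the file name minus its (final) extension.
            doc_id = PurePath(name).stem
            record = records.setdefault(doc_id, {"native": None, "text": None})
            record[kind] = name
    return records


def ids_without_text(records):
    """List document IDs with no text file; these would be findable by ID only."""
    return sorted(doc_id for doc_id, rec in records.items() if rec["text"] is None)
```

For example, passing the native names doc0001234.xlsx and doc0001235.msg together with the text name doc0001234.txt yields two document IDs, and ids_without_text reports doc0001235 as the record lacking text.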

Note: Pre-processed content must always be staged in an output share location.