Import Details

Every collection or import in a repository has a details page. The details shown depend on the type of import.

To view details of an import collection

  1. Click the Action icon for the collection you want to view, then click Details.
  2. View the following information:
    • Status
    • Start Time
    • Stop Time
    • Elapsed

    Collections - Processed Content contain the following information:

    Details

    An overview of the documents located during processing and the number that were culled into Review.

    1. Discovered: Number of files in the collection input.
    2. Exceptions: Number of exceptions raised during processing into Cull or a matter. (A report is available for these documents.)
       Note: A single document can have more than one processing exception.
    3. Input Size: Size in GB of the original (compressed) data.
    4. Exploded: Number of documents present after compressed files, such as ZIPs and RARs, are extracted, including email attachments.
    5. NIST/Excluded: Number of deNISTed documents/number of excluded files (for example, JPGs under 5 KB, signature line images, and so on).
    6. Output Size: Size in GB of the uncompressed data.
    7. Processed: Total number of files processed, including containers, extracted documents, embedded documents, and attachments.
    8. Need OCR: Number of documents flagged as having image layers without text.
    9. Duplicates: Number of family duplicates within the collection.
    10. Containers: Number of PSTs, ZIPs, RARs, and other container files found in the data set.
    11. OCRed: Number of files successfully OCRed, whether within Nebula or imported from an external application.
    12. Exported Duplicates: Number of family duplicates of data within the collection that have been exported.
    13. Extracted: Total number of documents extracted and searchable.
    14. Promoted to Review: Final number of documents promoted to the review matter.

    Charts

    Bar graphs of the documents imported into Nebula, summarized by custodian and by document type within the collection.

    • Custodian Summary
    • Doc Types Application Summary

    Exceptions Summary

    If any exceptions occurred, the Exceptions Summary lists the counts and types of exceptions recorded while the collection was processed. Click CSV Report to generate and export an exceptions report.

    Exceptions can be caused by improper paths in the load file, ID conflicts with existing documents, incomplete load files, or improper import settings.

    The Nebula TechQ team can provide guidance regarding exception causes and recommend resolutions.
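
    Because a single document can raise several exceptions, it can help to tally an exported exceptions report both by exception type and by distinct document. The following is a minimal Python sketch of that tally, not a description of the actual report layout: the column names "Document ID" and "Exception Type" are hypothetical, and the sample rows are made up for illustration. Check the header row of your own CSV export and adjust the two constants accordingly.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; replace them with the headers in your exported report.
DOC_ID_COLUMN = "Document ID"
EXCEPTION_COLUMN = "Exception Type"


def summarize_exceptions(csv_file):
    """Return (exception count per type, number of distinct documents affected)."""
    per_type = Counter()
    documents = set()
    for row in csv.DictReader(csv_file):
        per_type[row[EXCEPTION_COLUMN]] += 1
        documents.add(row[DOC_ID_COLUMN])
    return per_type, len(documents)


# Made-up sample rows standing in for an exported exceptions report.
SAMPLE_REPORT = """Document ID,Exception Type
DOC-001,ID conflict
DOC-001,Missing path
DOC-002,ID conflict
"""

per_type, affected_documents = summarize_exceptions(io.StringIO(SAMPLE_REPORT))
print(sum(per_type.values()), "exceptions across", affected_documents, "documents")
for exception_type, count in per_type.most_common():
    print(f"{exception_type}: {count}")
```

    To run the same tally on a real export, pass an open file instead of the sample text, for example open("exceptions.csv", newline="") where the file name is whatever you saved the report as.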

    Document Sources

    This shows the content source path, the custodian assigned to each path, and the output path for the data in the collection.

    • Path
    • Custodian
    • Output path
    • Export Metadata

    Operations

    This shows the status of post-processing operations performed on a collection.

    1. Export Metadata
    2. HTML Applicable Files in the Collection
    3. Detect Language on the Collection
    4. Reindex Applicable Files in the Collection
    5. OCR Applicable Files in the Collection
    6. Export files that need OCR in the Collection
    7. Import files into Collection as OCR
    8. Named Entity Detection on the Collection
    9. Sentiment Analysis on the Collection

    Processing Options

    This shows the options that were selected for the processing of the collection.

    1. Text Email Headers
    2. Max Spreadsheet Size
    3. Exclude Attached Images
    4. Use System Date
    5. De-NIST
    6. Ignore System Dates after Collection Date
    7. Explode Embedded

    Third-party Imports contain the following information:

    Processing

    This shows counts such as documents staged and imported, pages imported, text imported, natives imported, and documents with images imported, along with any import errors that occurred.

    Configuration

    This shows details regarding the configuration of the import.

    Operations

    This shows the status of post-import operations performed on imported content.

    Overlay and Production Imports contain the following information:

    Processing

    This shows counts such as documents staged and imported, pages imported, text imported, natives imported, and documents with images imported, along with any import errors that occurred.

    Configuration

    This shows details regarding the configuration of the overlay.

To reprocess an import collection

  1. On the Processing Details page, click Reprocess.
  2. On the Upload Files page, click Restart.

To generate a report for an import collection

  1. On the Processing Details page, click Reports.
  2. In the Create Report dialog box, select the Report Type you want to generate:
    • Exclusion Report: Summary of documents that were not imported, usually because they were duplicates or did not meet minimum requirements.
    • Exception Report: Summary of documents that did not process due to an issue with the file.
    • OCR Report: Summary of documents that were OCRed because of the RUN option.
  3. Click CSV.