Dashboard (Import)

The Import Dashboard displays a summary of every collection of ingested files (imports) in Nebula, whether completed, ongoing, or queued. Each import has a unique name and displays a state (success, warning, or fail), a start date and time, an end date and time, and a summary of the documents imported. The Dashboard's History page lets you review the import history back to the creation of the repository. You can also use the History page to run post-processing tasks, including language identification and NLP processing.

Types of Imports

Processing imports are labeled Collections on the dashboard.

Pre-processed data (or produced content) imports are labeled ThirdParty.

Content overlays are labeled Data Overlay.

Production overlays are labeled Production.

ThirdParty, Data Overlay, and Production imports also indicate their stage. An import that has been staged displays STAGING; staging is the step in which Nebula checks the potential import for exceptions, errors, or issues. An import that has completed displays IMPORT.
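The labeling rules above can be summarized in a short sketch. This is illustrative only and is not part of Nebula: the names ImportKind, STAGED_KINDS, and dashboard_label are hypothetical, and the sketch assumes an import of a staged type is either staged or completed.

```python
from enum import Enum
from typing import Optional, Tuple


class ImportKind(Enum):
    """Import types, each mapped to the label the dashboard shows for it."""
    PROCESSING = "Collections"
    PRE_PROCESSED = "ThirdParty"
    CONTENT_OVERLAY = "Data Overlay"
    PRODUCTION_OVERLAY = "Production"


# Per the text above, only these three types also show STAGING or IMPORT.
STAGED_KINDS = {
    ImportKind.PRE_PROCESSED,
    ImportKind.CONTENT_OVERLAY,
    ImportKind.PRODUCTION_OVERLAY,
}


def dashboard_label(kind: ImportKind, completed: bool) -> Tuple[str, Optional[str]]:
    """Return (type label, stage indicator); the indicator is None for processing imports."""
    if kind not in STAGED_KINDS:
        return (kind.value, None)
    # Assumption: a staged-type import that has not completed is still in staging.
    return (kind.value, "IMPORT" if completed else "STAGING")


print(dashboard_label(ImportKind.PROCESSING, completed=True))
print(dashboard_label(ImportKind.CONTENT_OVERLAY, completed=False))
```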

To view the History page

  1. Click Import > Dashboard.

Import History Actions

Clicking the Action icon for an import collection enables you to perform the following tasks:

  • View additional details of each entry

Every collection or import in a repository has a details page. The details available will differ depending on the type of import.

To view details of an import collection

  1. In the History list, locate the import collection you want to view in detail.
  2. Click its Action icon and select Details.
  3. View the import collection's details.
  • Delete the import collection

To delete an import collection

  1. In the History list, locate the import collection you want to delete.
  2. Click its Action icon and select Delete.
  3. On the Delete Item dialog box, click Delete.
  • Detect language

Language ID is a process that attempts to identify the language or languages in documents. Documents with very little text, or with mostly numbers (such as spreadsheets), tend to be poor candidates for language identification. The Language ID tool can be configured to detect only the predominant language or to try to detect multiple languages within a document.

To detect language

  1. In the History list, locate the import collection you want to work with.
  2. Click its Action icon and select Language ID.
  3. On the Language ID dialog box, enter the following information on the Language Options tab:
    • Max Text Snippet Size: 10 KB to 20 MB.
    • Language Probability: 0.05 to 0.99.
    • Detection Mode: Single or Multiple.
    • Detect OCRed Documents: Select to include OCRed documents. Clear to exclude them.
    • Detect Languages in: Only undetected documents, or all documents.
    • Min File Size: 10 bytes to 300 bytes.
    • Short to Normal Threshold: 50 bytes to 1000 bytes.
    • Ignore Spreadsheets: Select to ignore spreadsheets. Clear to include them.
  4. On the Distribution tab, select one of the following:
    • Default Strategy (Process on any available Workers): Select to distribute documents amongst available workers.
    • Select Workers (Process with specific Workers): Select the workers to receive the distributed documents.
  5. Click Save.
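
The numeric ranges accepted on the Language Options tab can be checked before you open the dialog box, for example when planning settings in a script or spreadsheet. The sketch below is not a Nebula API: LanguageIdOptions and validate are hypothetical names, and it assumes 1 KB is 1024 bytes, which the product may define differently.

```python
from dataclasses import dataclass
from typing import List

KB = 1024        # assumption: binary units; the product may use 1000
MB = 1024 * KB


@dataclass
class LanguageIdOptions:
    """Hypothetical container for the Language Options fields; sizes are in bytes."""
    max_text_snippet_size: int
    language_probability: float
    detection_mode: str
    min_file_size: int
    short_to_normal_threshold: int


def validate(opts: LanguageIdOptions) -> List[str]:
    """Return one message per value outside the ranges listed in the procedure."""
    problems = []
    if not 10 * KB <= opts.max_text_snippet_size <= 20 * MB:
        problems.append("Max Text Snippet Size must be 10 KB to 20 MB")
    if not 0.05 <= opts.language_probability <= 0.99:
        problems.append("Language Probability must be 0.05 to 0.99")
    if opts.detection_mode not in ("Single", "Multiple"):
        problems.append("Detection Mode must be Single or Multiple")
    if not 10 <= opts.min_file_size <= 300:
        problems.append("Min File Size must be 10 to 300 bytes")
    if not 50 <= opts.short_to_normal_threshold <= 1000:
        problems.append("Short to Normal Threshold must be 50 to 1000 bytes")
    return problems


# Example values chosen for illustration; they are not product defaults.
planned = LanguageIdOptions(64 * KB, 0.5, "Single", 50, 200)
print(validate(planned))
```

An empty list means every value falls inside the documented range.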
  • Apply NLP Processing

Natural language processing (NLP) tools analyze document text in authored content.

Sentiment analysis works to determine whether a communication is positive, neutral, or negative. It is typically run on communication-type documents, such as emails.

Named entity detection identifies entities within a document, such as names of people, places, or organizations, and classifies them into a set of categories.

You can configure the NLP tool to perform sentiment analysis, named entity detection, or both.

To apply NLP Processing

  1. In the History list, locate the import collection you want to work with.
  2. Click the Action icon and select NLP Processing.
  3. On the Natural Language Processing dialog box, select the type of NLP process you want to apply:
    • If you choose Sentiment Analysis, select the Document types.
    • If you choose Named Entity Detection, select the Document types and Entity categories.
  4. Select the documents in the collection you want to process: All or Only Outdated (documents added to the collection since the last time it was processed).
  5. Click Save.
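
The difference between All and Only Outdated in step 4 can be pictured as a filter on when each document was added. The sketch below is an illustration and does not describe how Nebula implements the option: select_documents, the "added" field, and the rule that a never-processed collection treats every document as outdated are all assumptions.

```python
from datetime import datetime
from typing import Dict, List, Optional


def select_documents(documents: List[Dict], scope: str,
                     last_processed: Optional[datetime]) -> List[Dict]:
    """Pick the documents an NLP run would cover for the chosen scope."""
    if scope not in ("All", "Only Outdated"):
        raise ValueError(f"unknown scope: {scope!r}")
    # Assumption: with no earlier run, every document counts as outdated.
    if scope == "All" or last_processed is None:
        return list(documents)
    # Only Outdated: documents added since the collection was last processed.
    return [doc for doc in documents if doc["added"] > last_processed]


docs = [
    {"id": 1, "added": datetime(2024, 1, 5)},
    {"id": 2, "added": datetime(2024, 3, 9)},
]
last_run = datetime(2024, 2, 1)
print([d["id"] for d in select_documents(docs, "Only Outdated", last_run)])
```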
  • Reindex Collection

Run reindexing if the Nebula search index becomes out of sync with the text in a collection. The reindex process brings the collection index back into sync in the database.

Note: This is not a common occurrence, and reindexing a collection is not recommended without guidance from the Nebula TechQ support team.

To reindex a collection

  1. In the History list, locate the import collection you want to work with.
  2. Click the Action icon and select Reindex Collection.
  3. On the Reindex Collection dialog box, click Reindex.
  • Create an Optical Character Recognition (OCR) collection

OCR generates searchable text from image-type documents, such as scanned documents that aren't otherwise searchable. Any documents that Nebula is unable to OCR successfully can be exported for OCR with external tools, as described under Export records needing OCR.

To create an OCR collection

  1. In the History list, locate the import collection you want to work with.
  2. Click the Action icon and select OCR Collection.
  3. On the OCR Collection dialog box, select the following:
    • Items
    • Priority
    • Language profile
  4. Click Save.
  • Export records needing OCR

If Nebula is unable to perform OCR successfully on documents in a collection, this process enables you to export the affected documents so that OCR can be run with separate external tools, and then re-import the results into Nebula.

To export records needing OCR

  1. In the History list, locate the import collection you want to work with.
  2. Click its Action icon and select Export Needs OCR.
  3. On the Export Needs OCR dialog box, select the Items.
  4. Click the Folder icon to select Output.
  5. Click Save.
  • Import OCR

To import OCR

  1. In the History list, locate the import collection you want to work with.
  2. Click its Action icon and select Import OCR.
  3. On the Import OCR dialog box, click the folder icon to Select input.
  4. Click OK.
  • Process the import collection

To process the import collection

  1. In the History list, locate the import collection you want to process.
  2. Click its Action icon and select Processing options.
  3. Complete the Upload Files page.