Managing Your Knowledge Source

Learn how to upload and manage different types of data sources for your ChatNexus chatbot.

Your chatbot's effectiveness depends heavily on the quality and organization of its knowledge base. This guide covers how to upload, configure, and manage different types of data sources to enhance your chatbot's capabilities.

Accessing Data Source Settings

Datasource Navigation
  1. Navigate to the Configuration section in your dashboard
  2. Select "Datasource Settings" from the menu
  3. You'll see the data source management interface with various upload options

Supported Data Sources

File Upload

Supported Formats

  • PDF Documents (*.pdf)
  • Microsoft Word (*.doc, *.docx)
  • Text Files (*.txt)
  • Markdown (*.md)
  • CSV/Excel Files (*.csv, *.xlsx)
  • JSON Documents (*.json)
  • HTML Files (*.html)
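Before uploading a batch of files, it can help to check locally which ones match the formats listed above. The following is a minimal sketch: the extension set is copied from the list, while the function names `is_supported` and `partition_uploads` are hypothetical helpers and not part of ChatNexus.

```python
from pathlib import Path

# Extensions copied from the Supported Formats list above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".txt", ".md",
    ".csv", ".xlsx", ".json", ".html",
}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is in the list (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def partition_uploads(filenames):
    """Split filenames into (supported, unsupported) lists, preserving order."""
    supported = [name for name in filenames if is_supported(name)]
    unsupported = [name for name in filenames if not is_supported(name)]
    return supported, unsupported


if __name__ == "__main__":
    ok, rejected = partition_uploads(["faq.md", "Pricing.PDF", "logo.png"])
    print("upload:", ok)
    print("skip:", rejected)
```

This only checks the file name. It does not verify that the file contents are well formed, which the best practices below still require.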

Best Practices for File Upload

  • Ensure files are properly formatted
  • Remove any unnecessary formatting
  • Keep file sizes under recommended limits
  • Use descriptive file names
  • Organize documents by topic or category

Website Import

Add knowledge from web pages:

  • Enter complete URLs
  • Support for both single pages and entire domains
  • Automatic content extraction and cleaning
  • Handles dynamic content
  • Respects robots.txt and site policies
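To see ahead of time which URLs a robots.txt-respecting importer is likely to skip, you can evaluate the rules yourself. This sketch uses Python's standard `urllib.robotparser`; the helper name `allowed_urls` and the example URLs are assumptions for illustration, and ChatNexus's own crawler may apply additional site policies.

```python
from urllib.robotparser import RobotFileParser


def allowed_urls(robots_txt: str, urls, user_agent: str = "*"):
    """Return the URLs that the given robots.txt text permits for user_agent."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    candidates = [
        "https://example.com/docs/faq",
        "https://example.com/private/notes",
    ]
    print(allowed_urls(rules, candidates))
```

In practice you would download the site's `/robots.txt` first and pass its text to the helper.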

Tips for Website Import

  • Verify URL accessibility
  • Check content relevance
  • Consider update frequency
  • Monitor crawling depth
  • Respect website terms of service

YouTube Import

Import knowledge from video content:

  • Automatic transcript extraction
  • Support for timestamps
  • Handles video descriptions
  • Processes closed captions
  • Multiple language support

Optimizing YouTube Imports

  • Use high-quality videos
  • Verify transcript accuracy
  • Consider video length
  • Check caption availability
  • Select relevant segments

Plain Text Input

Direct text entry for:

  • Quick knowledge additions
  • Custom instructions
  • Specific rules or guidelines
  • Immediate updates
  • Testing and validation

Advanced Configuration

Chunk Settings

Control how your content is processed:

Chunk Size

  • Default: 1000 characters
  • Recommended range: 500-2000 characters
  • Factors to consider:
    • Content complexity
    • Response detail needed
    • Processing efficiency
    • Token limits
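To get a feel for how chunk size affects the number of chunks, here is a minimal sketch of fixed-size character chunking. It assumes the default of 1000 characters from above; the `overlap` parameter and the function name `chunk_text` are illustrative assumptions and do not describe how ChatNexus chunks content internally.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 0) -> list[str]:
    """Cut text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, which helps keep
    context that would otherwise be cut at a chunk boundary.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


if __name__ == "__main__":
    document = "x" * 2500
    for size in (500, 1000, 2000):
        print(size, "->", len(chunk_text(document, chunk_size=size)), "chunks")
```

Smaller chunks produce more, narrower pieces; larger chunks carry more context per piece but consume more of a model's token limit.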

Text Splitter Options

  1. Recursive Character Text Splitter
     • Best for general purpose use
     • Maintains context across splits
     • Configurable overlap
     • Preserves semantic meaning
  2. Token Text Splitter
     • Optimal for API token limits
     • Precise control over chunk sizes
     • Efficient processing
     • Ideal for technical content
  3. Markdown Text Splitter
     • Preserves markdown structure
     • Maintains formatting
     • Respects document hierarchy
     • Better for documentation
  4. HTML Text Splitter
     • Preserves HTML structure
     • Handles web content
     • Maintains tag hierarchy
     • Better for web pages
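The general idea behind recursive character splitting can be sketched as follows: try the coarsest separator first (paragraph breaks), and fall back to finer ones (lines, then words, then a hard cut) only for pieces that are still too long. This is a simplified illustration under stated assumptions, not ChatNexus's implementation; the function name `recursive_split` and the separator order are hypothetical, and overlap is omitted for brevity.

```python
def recursive_split(text, chunk_size=1000, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters,
    preferring to break on the coarsest separator that works."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed character positions.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging neighbouring parts
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(part) <= chunk_size:
            current = part
        else:
            # The part alone is too long: retry with finer separators.
            chunks.extend(recursive_split(part, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "alpha beta\n\ngamma delta epsilon\n\nzeta"
    for chunk in recursive_split(sample, chunk_size=12):
        print(repr(chunk))
```

Because paragraph and line boundaries are tried before word boundaries, related sentences tend to stay in the same chunk, which is why this style of splitter is a reasonable general-purpose default.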

After changing these settings, monitor your chatbot's performance in your dashboard to ensure optimal results.