Managing Your Knowledge Source
Learn how to upload and manage different types of data sources for your ChatNexus chatbot
Your chatbot's effectiveness depends heavily on the quality and organization of its knowledge base. This guide covers how to upload, configure, and manage different types of data sources to enhance your chatbot's capabilities.
Accessing Data Source Settings

- Navigate to the Configuration section in your dashboard
- Select "Datasource Settings" from the menu
- You'll see the data source management interface with various upload options
Supported Data Sources
File Upload

Supported Formats
- PDF Documents (*.pdf)
- Microsoft Word (*.doc, *.docx)
- Text Files (*.txt)
- Markdown (*.md)
- CSV/Excel Files (*.csv, *.xlsx)
- JSON Documents (*.json)
- HTML Files (*.html)
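Before uploading a batch of documents, it can help to check file extensions locally against the list above. The following is a minimal sketch of such a check; the helper names `is_supported_file` and `unsupported_files` are hypothetical and not part of ChatNexus, and the check looks only at the extension, not at the file's contents.

```python
from pathlib import Path

# Extensions copied from the "Supported Formats" list above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".txt", ".md",
    ".csv", ".xlsx", ".json", ".html",
}


def is_supported_file(filename: str) -> bool:
    """Return True if the file extension is in the supported list (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def unsupported_files(filenames: list[str]) -> list[str]:
    """Return the names that would need converting before upload."""
    return [name for name in filenames if not is_supported_file(name)]


if __name__ == "__main__":
    batch = ["pricing-guide.pdf", "faq.MD", "archive.zip"]
    print(unsupported_files(batch))
```

Running the check first makes it easier to spot files that need converting to a supported format.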
Best Practices for File Upload
- Ensure files are properly formatted
- Remove any unnecessary formatting
- Keep file sizes under recommended limits
- Use descriptive file names
- Organize documents by topic or category
Website Links

Add knowledge from web pages:
- Enter complete URLs
- Supports both single pages and entire domains
- Extracts and cleans content automatically
- Handles dynamic content
- Respects robots.txt and site policies
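To pre-check a link before adding it, the sketch below tests whether a URL is complete (has a scheme and a host) and whether a site's robots.txt rules would allow fetching it. It uses only Python's standard library. The function names are hypothetical, the robots.txt text is passed in as a string so the example runs offline, and this is not a description of how ChatNexus performs these checks internally.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def is_complete_url(url: str) -> bool:
    """Return True if the URL has an http(s) scheme and a host name."""
    parts = urlparse(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Check a URL against robots.txt rules supplied as plain text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Example rules only; a real check would read the site's own robots.txt.
    rules = "User-agent: *\nDisallow: /private/\n"
    for link in ("https://example.com/docs", "https://example.com/private/page"):
        print(link, is_complete_url(link), allowed_by_robots(rules, link))
```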
Tips for Website Import
- Verify URL accessibility
- Check content relevance
- Consider update frequency
- Monitor crawling depth
- Respect website terms of service
YouTube Links

Import knowledge from video content:
- Extracts transcripts automatically
- Supports timestamps
- Handles video descriptions
- Processes closed captions
- Supports multiple languages
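When preparing a list of video links, it can be useful to confirm that each one points at a specific video and to convert segment offsets into readable timestamps. The sketch below assumes the two common link shapes (`youtube.com/watch?v=...` and `youtu.be/...`). Both helper names are hypothetical, and other link forms (playlists, embeds, shorts) are deliberately not handled.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a watch or short link, or None if not recognised."""
    parts = urlparse(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        return parts.path.lstrip("/") or None
    if host in ("youtube.com", "m.youtube.com") and parts.path == "/watch":
        ids = parse_qs(parts.query).get("v")
        return ids[0] if ids else None
    return None


def format_timestamp(seconds: float) -> str:
    """Format an offset in seconds as M:SS, or H:MM:SS for an hour or more."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"
```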
Optimizing YouTube Imports
- Use high-quality videos
- Verify transcript accuracy
- Consider video length
- Check caption availability
- Select relevant segments
Plain Text Input

Direct text entry for:
- Quick knowledge additions
- Custom instructions
- Specific rules or guidelines
- Immediate updates
- Testing and validation
Advanced Configuration
Chunk Settings

Control how your content is processed:
Chunk Size
- Default: 1000 characters
- Recommended range: 500-2000 characters
- Factors to consider:
  - Content complexity
  - Response detail needed
  - Processing efficiency
  - Token limits
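To get a feel for how chunk size affects the number and shape of chunks, the following sketch splits text into fixed-size character chunks with an optional overlap. It illustrates the general idea only. The function name `chunk_text` and the `overlap` parameter are assumptions for this example, and ChatNexus's own processing may differ.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already reaches the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    sample = "x" * 2500
    for size in (500, 1000, 2000):
        print(size, [len(c) for c in chunk_text(sample, chunk_size=size)])
```

Smaller chunks produce more, narrower pieces of context; larger chunks keep more surrounding detail together but consume more of the token budget per retrieved chunk.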
Text Splitter Options
- Recursive Character Text Splitter
  - Best for general purpose use
  - Maintains context across splits
  - Configurable overlap
  - Preserves semantic meaning
- Token Text Splitter
  - Optimal for API token limits
  - Precise control over chunk sizes
  - Efficient processing
  - Ideal for technical content
- Markdown Text Splitter
  - Preserves markdown structure
  - Maintains formatting
  - Respects document hierarchy
  - Better for documentation
- HTML Text Splitter
  - Preserves HTML structure
  - Handles web content
  - Maintains tag hierarchy
  - Better for web pages
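The idea behind recursive character splitting can be sketched as follows: try to split on the largest natural boundary first (paragraphs), and fall back to smaller ones (lines, words, then single characters) only for pieces that are still too long. This is a simplified illustration, not ChatNexus's implementation. The function name and separator list are assumptions, and overlap is omitted to keep the example short.

```python
def recursive_split(text, chunk_size=1000, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters,
    preferring paragraph, then line, then word boundaries."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # No natural boundary left: cut at fixed positions.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces together
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # Piece is still too long: retry with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "alpha beta\n\ngamma delta epsilon"
    print(recursive_split(sample, chunk_size=12))
```

Because paragraphs and words are kept intact wherever they fit, each chunk tends to hold a coherent unit of meaning, which is why this strategy suits general-purpose content.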
After adjusting these settings, monitor your chatbot's metrics in your dashboard to ensure optimal performance.