Add Knowledge

Learn how to create and configure knowledge bases in Vectense Platform to provide contextual information for your AI workflows.

Overview

Adding knowledge to your workspace involves:

Creating a Knowledge Base: Set up the knowledge container
Configuring Sources: Connect to your data sources
Processing Content: Index and prepare content for AI use
Testing Retrieval: Validate that knowledge works correctly

Prerequisites

Before creating a knowledge base:

Active Workspace: Access to a workspace with knowledge creation permissions
AI Model: At least one configured model for embedding generation
Content Sources: Documents, files, or data sources to index
Understanding: Clear idea of what knowledge you want to capture

Creating Your First Knowledge Base

Step 1: Navigate to Knowledge Creation

Go to Knowledge: Click "Knowledge" in the main navigation
Create New: Click the "Create a new knowledge" button
Start Configuration: Begin the knowledge setup wizard

Step 2: Configure Basic Information

Knowledge Name

Choose a descriptive name that reflects the content
Examples: "Product Documentation", "Company Policies", "Customer Support KB"
Can be changed later if needed

Description

Optional but recommended description
Helps team members understand the knowledge purpose
Include information about content scope and intended use

AI Model Selection

Choose which model will generate embeddings for this knowledge
The model affects search quality and language understanding
Use the same model you plan to use in workflows for best results

Step 3: Choose Knowledge Source Type

Select the type of data source for your knowledge base:

File Bucket (Recommended for Beginners)

What it is: Direct file upload through the web interface

Best for:

Company documents and policies
Product manuals and documentation
Training materials and guides
Small to medium document collections

Supported Formats:

PDF: Portable Document Format files
Word: Microsoft Word documents (.docx, .doc)
Excel: Spreadsheets and data files (.xlsx, .xls, .csv)
Text: Plain text files (.txt)
RTF: Rich Text Format documents
Markdown: Markdown formatted files (.md)

Configuration:

No additional configuration required
Files are uploaded after knowledge creation
Automatic format detection and processing
Simple drag-and-drop interface

How to Use:

Select "File Bucket" (file upload) as source type
Complete knowledge creation
Upload files using drag-and-drop interface
Wait for automatic processing and indexing
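
Before uploading a large batch, it can help to check locally which files fall outside the supported formats. The following is a minimal sketch, not part of the platform: the function name split_by_support is hypothetical, and the ".pdf" and ".rtf" extensions are assumed from convention because the list above does not spell them out.

```python
from pathlib import Path

# Extensions from the Supported Formats list above. ".pdf" and ".rtf" are the
# conventional extensions for PDF and RTF (an assumption, not stated above).
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".doc", ".xlsx", ".xls", ".csv", ".txt", ".rtf", ".md",
}


def split_by_support(filenames):
    """Return (supported, unsupported) lists, preserving input order."""
    supported, unsupported = [], []
    for name in filenames:
        ext = Path(name).suffix.lower()  # case-insensitive comparison
        if ext in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported


ok, skipped = split_by_support(["Handbook.PDF", "notes.md", "logo.png", "README"])
print(ok)       # ['Handbook.PDF', 'notes.md']
print(skipped)  # ['logo.png', 'README']
```

An extension check only filters by file name. A corrupted or password-protected file can still fail during processing.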

Local Filesystem

What it is: Connect to local directories and network drives

Best for:

Large document repositories
Shared network drives
Version-controlled documentation
Automatically updated content

Configuration Options:

Source Path

Local Directory: /path/to/documents
Network Share: //server/share/documents
Mounted Drive: /mnt/shared/knowledge

File Pattern (Glob)

All Files: **/* (everything recursively)
PDF Only: **/*.pdf
Documentation: **/*.{md,txt,pdf}
Exclude Folders: **/*.pdf,!**/archive/**

Pattern Examples:

**/*.pdf             # All PDF files recursively
docs/**/*.md         # Markdown files in docs folder
*.{txt,md}           # Text and markdown in root only
**/*,!**/temp/**     # Everything except temp folders

How to Configure:

Select "Local Filesystem" as source type
Enter the source path to monitor
Set file pattern to filter files
Configure update frequency (if supported)
Test connection and file access
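
To reason about which files a pattern would select before pointing it at a large directory, the pattern syntax above can be approximated locally. The sketch below is an illustration under assumptions: it treats "**/" as "zero or more directories", "*" as "anything within one path segment", "{a,b}" as alternatives (no nesting), and a leading "!" as an exclusion. The platform's own matcher may differ in edge cases, and the function names are hypothetical.

```python
import re


def split_patterns(spec):
    """Split a comma-separated spec, ignoring commas inside {...}."""
    parts, depth, current = [], 0, ""
    for ch in spec:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)
    return [p.strip() for p in parts if p.strip()]


def glob_to_regex(pattern):
    """Translate one glob pattern into an anchored regular expression."""
    out, i = "", 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out += "(?:.*/)?"   # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out += ".*"         # anything, including "/"
            i += 2
        elif pattern[i] == "*":
            out += "[^/]*"      # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out += "[^/]"       # one character within a segment
            i += 1
        elif pattern[i] == "{":
            end = pattern.index("}", i)
            options = pattern[i + 1:end].split(",")
            out += "(?:" + "|".join(re.escape(o) for o in options) + ")"
            i = end + 1
        else:
            out += re.escape(pattern[i])
            i += 1
    return re.compile(out + r"\Z")


def matches(path, spec):
    """True if path matches an include pattern and no '!' exclude pattern."""
    includes, excludes = [], []
    for p in split_patterns(spec):
        target = excludes if p.startswith("!") else includes
        target.append(glob_to_regex(p.lstrip("!")))
    included = any(r.match(path) for r in includes)
    excluded = any(r.match(path) for r in excludes)
    return included and not excluded


print(matches("docs/api/guide.md", "docs/**/*.md"))           # True
print(matches("docs/guide.md", "*.{txt,md}"))                 # False (root only)
print(matches("archive/old.pdf", "**/*.pdf,!**/archive/**"))  # False (excluded)
```

Paths are written relative to the source path with forward slashes. Testing a handful of representative paths this way is a quick sanity check before running a full scan.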

Web Content

What it is: Crawl and index web pages and documentation

Best for:

Product documentation websites
Internal wikis and knowledge bases
News and blog content
Public information sources

Configuration Options:

Start URL

Documentation Site: https://docs.yourcompany.com
Wiki: https://wiki.internal.com/products
Blog Section: https://blog.company.com/category/product

Crawl Depth

1: Only the starting page
2: Starting page + directly linked pages
3+: Multiple levels of links (be careful with large sites)

Max Pages

Limit total pages crawled
Prevent excessive resource usage
Typical values: 50-500 pages

Advanced Options:

URL Patterns: Restrict crawling to specific URL patterns
Exclude Patterns: Skip certain pages or sections
Update Schedule: How often to refresh content

How to Configure:

Select "Web Content" as source type
Enter the starting URL
Set crawl depth (usually 2-3 levels)
Set maximum pages to crawl
Configure any URL restrictions
Test the crawl with a small depth first
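
The interaction between crawl depth and max pages is easier to see on a small example. The sketch below simulates a breadth-first crawl over an in-memory link graph instead of fetching real pages. The graph, the function name crawl, and the breadth-first order are assumptions made for illustration, and the platform's crawler may visit pages in a different order.

```python
from collections import deque


def crawl(links, start_url, max_depth, max_pages):
    """Breadth-first crawl over an in-memory link graph.

    Depth 1 is the start page only; depth 2 adds directly linked pages.
    Assumes max_pages >= 1, so the start page is always included.
    """
    visited = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 1)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(url, []):
            if len(visited) >= max_pages:
                return visited  # page budget exhausted
            if target not in seen:
                seen.add(target)
                visited.append(target)
                queue.append((target, depth + 1))
    return visited


# Hypothetical site structure: page -> pages it links to.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a/1"],
    "/b": ["/"],
}

print(crawl(SITE, "/", max_depth=1, max_pages=50))  # ['/']
print(crawl(SITE, "/", max_depth=2, max_pages=50))  # ['/', '/a', '/b']
print(crawl(SITE, "/", max_depth=3, max_pages=2))   # ['/', '/a']
```

The last call shows why both limits matter: the depth limit bounds how far the crawl reaches, while max pages caps the total even when the depth limit has not been hit.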

Step 4: Create and Process

Create Knowledge Base

Review all configuration settings
Click "Create" to create the knowledge base
Wait for initial setup to complete
Monitor processing status

Content Processing

File Upload: Upload files through the interface
Filesystem: Automatic scanning and indexing begins
Web Crawl: Crawling starts immediately
Progress Monitoring: Track processing in the Jobs section

Content Management

File Upload Management

Upload Files

Navigate to your knowledge base
Go to the "Edit" tab
Use drag-and-drop or "Select Files" button
Wait for upload and processing completion

Supported Upload Methods:

Drag and Drop: Drag files directly to the upload area
File Browser: Click "Select Files" to browse
Bulk Upload: Select multiple files at once

File Management:

View Uploaded Files: See all files in the knowledge base
Delete Files: Remove individual files
Replace Files: Upload newer versions
Monitor Processing: Track indexing progress

Filesystem Management

Monitoring

Files are automatically detected and processed
New files are indexed when added
Modified files are re-processed
Deleted files are removed from the index
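
The three behaviors above (index new files, re-process modified files, remove deleted files) amount to comparing two snapshots of the source directory. The sketch below shows one way such a comparison could work, using content hashes. It is an illustration only: the function names are hypothetical, and the platform may detect changes differently, for example through timestamps or filesystem events.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Content hash used to tell whether a file changed."""
    return hashlib.sha256(data).hexdigest()


def diff_snapshots(previous, current):
    """Compare two {path: content_hash} snapshots and plan index actions."""
    added = sorted(p for p in current if p not in previous)
    deleted = sorted(p for p in previous if p not in current)
    modified = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return {"index": added, "reprocess": modified, "remove": deleted}


previous = {
    "docs/a.md": fingerprint(b"v1"),
    "docs/b.md": fingerprint(b"v1"),
    "docs/old.md": fingerprint(b"x"),
}
current = {
    "docs/a.md": fingerprint(b"v1"),   # unchanged
    "docs/b.md": fingerprint(b"v2"),   # content changed
    "docs/new.md": fingerprint(b"y"),  # newly added
}

print(diff_snapshots(previous, current))
# {'index': ['docs/new.md'], 'reprocess': ['docs/b.md'], 'remove': ['docs/old.md']}
```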

File Filters

Use glob patterns to control which files are included
Exclude temporary or system files
Focus on specific file types or directories

Update Frequency

Changes are detected in real time or near real time
Large directories may have some processing delay
Monitor the Jobs section for processing status

Web Content Management

Content Updates

Web content can be refreshed manually or automatically
Set up refresh schedules for regularly updated sites
Monitor for broken links or access issues

Content Quality

Review extracted content for quality
Some web pages may not extract cleanly
Adjust crawl settings if needed

Content Processing Details

Text Extraction Process

Document Processing

Format Detection: Automatic detection of file type
Text Extraction: Pull text content from documents
Structure Preservation: Maintain headings and organization
Metadata Extraction: Capture file properties and information

Content Chunking

Size Optimization: Break large documents into manageable pieces
Context Preservation: Maintain document structure and relationships
Overlap Strategy: Ensure continuity between chunks
Quality Assurance: Validate chunk quality and content
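
To make the overlap strategy concrete, here is a minimal sketch of word-based chunking in which consecutive chunks share a few words. The chunk size, the overlap, and splitting on whitespace are all assumptions chosen for illustration; the platform's actual chunking may work on tokens, sentences, or document structure.

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into word-based chunks that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_text(sample, chunk_size=4, overlap=1):
    print(chunk)
# w0 w1 w2 w3
# w3 w4 w5 w6
# w6 w7 w8 w9
```

Each chunk begins with the last word of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk when the overlap is large enough.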

Vector Generation

Embedding Creation: Convert text to numerical representations
Semantic Indexing: Enable meaning-based search
Optimization: Optimize for search performance
Storage: Store embeddings in vector database
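
The idea behind embedding-based search can be sketched with a toy example: text is turned into vectors, and the query is compared with each chunk by cosine similarity. The embed function below is a deliberately simple stand-in (word counts) and not a real embedding model. A real model captures meaning beyond shared words, and a vector database replaces the linear scan. All names here are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


CHUNKS = [
    "reset your password from the login page",
    "the return policy allows refunds within 30 days",
    "configure ssl certificates in the admin console",
]

print(search("how do i reset my password", CHUNKS, top_k=1))
# ['reset your password from the login page']
```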

Quality Assurance

Content Validation

Verify that all uploaded files are processed successfully
Check for any processing errors or warnings
Review extracted text quality
Validate that content is searchable

Common Issues and Solutions:

Files Not Processing

Check file format is supported
Verify file is not corrupted
Ensure file permissions allow reading
Check system resources and processing queue

Poor Text Extraction

Try different file formats (e.g., export PDF to Word)
Check for password-protected files
Verify file encoding for text files
Review extracted content in processing logs

Missing Content

Verify file patterns include desired files
Check directory permissions for filesystem sources
Validate web URLs are accessible
Review crawl logs for errors

Testing Your Knowledge Base

After content processing is complete:

Manual Testing

Navigate to Knowledge: Go to your knowledge base
Test Tab: Click on "Test" or "Manual Run"
Enter Query: Type a question related to your content
Review Results: Check that relevant content is returned
Refine if Needed: Adjust configuration based on results

Example Test Queries

Product Info: "What are the system requirements?"
Process Questions: "How do I reset my password?"
Policy Queries: "What is the return policy?"
Technical Questions: "How do I configure SSL?"

Quality Evaluation

Relevance: Results should be relevant to the query
Completeness: Important information should be findable
Accuracy: Retrieved content should be correct
Coverage: Test various types of queries
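
When the set of test queries grows, relevance and coverage can be tracked with a simple score. The sketch below computes the fraction of queries whose expected phrase appears in the top results. The retrieve argument is a hypothetical stand-in for whatever returns results from your knowledge base; this page does not describe a programmatic query interface, so the example uses canned results.

```python
def hit_rate(test_cases, retrieve, top_k=3):
    """Fraction of queries whose expected phrase appears in the top_k results."""
    if not test_cases:
        return 0.0
    hits = 0
    for query, expected in test_cases:
        results = retrieve(query)[:top_k]
        if any(expected.lower() in r.lower() for r in results):
            hits += 1
    return hits / len(test_cases)


# Canned results standing in for a real knowledge base (illustrative only).
CANNED = {
    "What is the return policy?": ["Returns are accepted within 30 days."],
    "How do I reset my password?": ["Use the Reset Password link on the login page."],
    "How do I configure SSL?": ["Billing is handled monthly."],
}


def fake_retrieve(query):
    return CANNED.get(query, [])


cases = [
    ("What is the return policy?", "30 days"),
    ("How do I reset my password?", "reset password"),
    ("How do I configure SSL?", "certificate"),
]

print(round(hit_rate(cases, fake_retrieve), 2))  # 0.67
```

A score like this only checks whether an expected phrase was retrieved. Accuracy of the retrieved content still needs a manual review.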

Integration with Workflows

Using Knowledge in Workflows

Context Injection

Knowledge is automatically available to AI steps
AI models can query knowledge during processing
Relevant context is provided based on workflow needs

Manual Knowledge Queries

Use Knowledge Retrieval action in workflows
Explicitly search for specific information
Control what context is provided to AI models

Best Practices

Design knowledge structure to match workflow needs
Test knowledge retrieval with actual workflow scenarios
Monitor knowledge usage and performance
Keep content updated and relevant

Monitoring and Maintenance

Usage Monitoring

Query Volume: Track how often knowledge is accessed
Popular Content: Identify most-accessed information
Performance: Monitor search response times
Quality Metrics: Track search result relevance

Content Maintenance

Regular Updates: Keep content current and accurate
Quality Reviews: Periodically review content quality
Cleanup: Remove outdated or irrelevant content
Optimization: Optimize search performance

Performance Optimization

Index Maintenance: Regular optimization of search indexes
Content Curation: Focus on high-quality, relevant content
Resource Monitoring: Track processing and storage usage
Cost Management: Monitor embedding generation costs

Troubleshooting

Common Issues

Knowledge Creation Fails

Check user permissions for knowledge creation
Verify workspace license supports knowledge bases
Ensure AI model is configured and accessible
Review error messages for specific issues

Content Not Processing

Verify source accessibility (filesystem permissions, web URLs)
Check file formats are supported
Monitor processing jobs for errors
Review system resources and capacity

Poor Search Results

Review content quality and organization
Test with different query phrasings
Check if content was processed correctly
Consider adjusting content chunking settings

Performance Issues

Monitor system resources during processing
Consider processing content in smaller batches
Review concurrent processing limits
Optimize file organization and structure

Getting Help

Documentation: Reference specific source type guides
Job Logs: Review processing logs for detailed error information
Community: Ask questions in user forums
Support: Contact technical support for complex issues

Best Practices

Content Organization

Clear Structure: Organize content logically and consistently
Quality Focus: Prioritize high-quality, relevant content
Regular Updates: Keep content current and accurate
Documentation: Document content sources and organization

Performance Optimization

Batch Processing: Process large amounts of content in batches
Incremental Updates: Only process changed content when possible
Resource Management: Monitor and optimize resource usage
Index Maintenance: Regularly optimize search indexes

Security and Privacy

Access Control: Limit access to sensitive knowledge bases
Content Review: Review content for sensitive information
Audit Logging: Track knowledge access and usage
Compliance: Ensure knowledge handling meets regulatory requirements

Your knowledge base is now ready to provide intelligent context to your AI workflows. Continue to Test Knowledge to validate retrieval quality, then Refresh Knowledge to learn about maintenance and updates.
