Introduction to AI Models
AI Models in the Vectense Platform are large language models (LLMs) that provide intelligent processing capabilities for your workflows. They enable natural language understanding, content generation, and complex reasoning tasks.
What are AI Models?
AI Models are sophisticated neural networks trained on vast amounts of text data that can:
- Understand natural language input in context
- Generate human-like text responses
- Reason through complex problems and scenarios
- Transform data between different formats
- Extract specific information from unstructured content
How Models Work in Vectense
Model Integration
Models integrate seamlessly into workflows:
- Workflow Step: Add an AI model as a discrete step in any workflow
- Input Processing: Receive data from previous workflow steps
- Context Application: Use knowledge bases for additional context
- Intelligent Processing: Apply AI capabilities to the input
- Output Generation: Produce results for subsequent workflow steps
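The step sequence above can be sketched as plain functions. Everything here is illustrative: `call_model` is a stub standing in for a real model client, and the step names are hypothetical, not a Vectense API.

```python
# Illustrative sketch of a model acting as one step in a pipeline.
# `call_model` is a stub standing in for a real LLM client.
def call_model(prompt: str) -> str:
    return f"[model output for: {prompt[:40]}]"

def extract_step(document: str) -> str:
    """Previous workflow step: pull raw text out of a document."""
    return document.strip()

def summarize_step(text: str, context: str = "") -> str:
    """Model step: combine input with knowledge-base context, call the model."""
    prompt = f"Context:\n{context}\n\nSummarize:\n{text}"
    return call_model(prompt)

def notify_step(summary: str) -> str:
    """Subsequent step: consume the model's output."""
    return f"Notification sent: {summary}"

result = notify_step(summarize_step(extract_step("  Quarterly report...  ")))
```

The point is the shape, not the stubs: a model step receives the previous step's output, enriches it with context, and hands its result to the next step.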
Persona-Driven AI
Vectense uses a persona system to control AI behavior:
- Role Definition: Specify what role the AI should play
- Goal Setting: Define what the AI should accomplish
- Skill Assignment: List specific capabilities the AI should use
- Communication Style: Control how the AI expresses responses
- Guardrails: Set boundaries for AI behavior
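A persona can be pictured as structured fields rendered into a system prompt. The field names and rendering below are hypothetical, not the actual Vectense schema.

```python
# Hypothetical persona definition mirroring the fields above;
# field names are illustrative, not the Vectense schema.
persona = {
    "role": "Customer-support analyst",
    "goal": "Resolve billing questions accurately",
    "skills": ["summarization", "sentiment analysis"],
    "style": "Concise and friendly",
    "guardrails": ["Never share account numbers", "Escalate refund requests"],
}

def to_system_prompt(p: dict) -> str:
    """Render a persona into a system prompt sent with every model call."""
    return (
        f"You are a {p['role']}. Your goal: {p['goal']}.\n"
        f"Skills: {', '.join(p['skills'])}.\n"
        f"Style: {p['style']}.\n"
        f"Rules: {'; '.join(p['guardrails'])}"
    )

prompt = to_system_prompt(persona)
```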
Model Types
Cloud-Based Models
Commercial AI services accessed via API:
Advantages:
- Latest AI technology and capabilities
- No infrastructure management required
- Scales with demand, with no capacity planning on your side
- Regular updates and improvements
Considerations:
- Data sent to external providers
- Usage-based pricing (per token)
- Internet connectivity required
- Subject to provider terms and availability
Self-Hosted Models
AI models running on your own infrastructure:
Advantages:
- Complete data privacy and control
- No external dependencies
- Predictable costs (no per-use fees)
- Customizable and fine-tunable
Considerations:
- Requires infrastructure management
- Initial setup and maintenance complexity
- May have lower performance than latest cloud models
- Hardware requirements for optimal performance
Supported Providers
OpenAI
Available Models:
- Latest GPT models: Most capable models for complex reasoning and tasks
- Optimized variants: Improved speed and efficiency options
- Cost-effective options: Fast models for simpler tasks
Best For:
- Complex analysis and reasoning
- Code generation and debugging
- Creative writing and content generation
- General-purpose AI tasks
Anthropic
Available Models:
- Claude model family: Various capability and performance tiers
- High-capability models: Complex reasoning and analysis
- Efficient variants: Balanced performance options
Best For:
- Document analysis and comprehension
- Ethical and safe AI applications
- Long-form content processing
- Research and analytical tasks
Mistral AI
Available Models:
- Mistral model family: Range from small to large variants
- High-performance options: Complex task handling
- Efficient models: Cost-effective processing
Best For:
- Multilingual text processing
- Cost-effective AI processing
- European data governance requirements
- Efficient workflow automation
Ollama (Self-Hosted)
Available Models:
- Llama 2 & 3: Meta's openly licensed models
- Mistral: Self-hosted version of Mistral models
- CodeLlama: Specialized for code generation
- Many other open-source models
Best For:
- Sensitive data processing
- Air-gapped environments
- Cost-predictable deployment
- Custom model fine-tuning
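Ollama exposes a simple local HTTP API (by default on port 11434). The sketch below builds a request payload for its `/api/generate` endpoint; the HTTP call itself is shown commented out so the example stands alone without a running server.

```python
import json
import urllib.request

def build_ollama_request(model: str, prompt: str) -> dict:
    """Payload for Ollama's /api/generate endpoint (streaming disabled)."""
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_ollama_request("llama3", "Classify this ticket: printer offline")

# With a local Ollama server running (default port 11434), the call looks like:
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# reply = json.loads(urllib.request.urlopen(req).read())["response"]
```

Because the endpoint is local, no data leaves your machine, which is what makes this pattern suitable for the sensitive-data and air-gapped cases listed above.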
Model Capabilities
Natural Language Understanding
- Text Comprehension: Understand complex documents and instructions
- Intent Recognition: Identify what users want to accomplish
- Context Awareness: Maintain understanding across long conversations
- Language Detection: Automatically identify input language
Content Generation
- Text Creation: Generate articles, emails, summaries, and reports
- Format Conversion: Transform content between different formats
- Style Adaptation: Adjust writing style for different audiences
- Template Filling: Complete forms and templates with appropriate content
Data Processing
- Information Extraction: Pull specific data from unstructured text
- Classification: Categorize content into predefined groups
- Summarization: Create concise summaries of long documents
- Translation: Convert text between different languages
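Information extraction, for example, typically amounts to prompting the model for structured output. A minimal sketch, with field names and wording that are purely illustrative:

```python
# Illustrative extraction prompt; the fields are placeholders,
# not a Vectense API or schema.
FIELDS = ["invoice_number", "due_date", "total_amount"]

def extraction_prompt(text: str, fields: list[str]) -> str:
    """Ask the model to return only the requested fields as JSON."""
    return (
        "Extract the following fields from the text below and reply "
        f"with JSON only, keys: {', '.join(fields)}.\n\nText:\n{text}"
    )

prompt = extraction_prompt("Invoice INV-1042 due 2024-07-01, total $318.50", FIELDS)
```

Constraining the reply to JSON keeps the model's output machine-readable for the next workflow step.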
Reasoning and Analysis
- Logical Inference: Draw conclusions from available information
- Problem Solving: Break down complex problems into steps
- Comparison: Analyze similarities and differences
- Recommendation: Suggest actions based on criteria
Model Selection Criteria
Task Complexity
Simple Tasks (Classification, basic extraction)
- Smaller, efficient models from any provider
- Faster response times, lower costs
- Suitable for high-volume processing
Complex Tasks (Analysis, reasoning, creative work)
- Larger, more capable models from providers
- Higher quality outputs, better reasoning
- Worth the additional cost for critical tasks
Data Sensitivity
Public/Non-sensitive Data
- Cloud models from external providers acceptable
- Take advantage of latest AI capabilities
- Simpler setup and management
Sensitive/Regulated Data
- Self-hosted models recommended
- Complete data privacy and control
- Compliance with data governance requirements
Volume and Cost
Low to Medium Volume
- Cloud models with pay-per-use pricing
- No infrastructure investment required
- Costs scale directly with usage
High Volume
- Consider self-hosted models
- Infrastructure costs vs. per-token costs
- Break-even analysis for cost optimization
Performance Requirements
Real-time Applications
- Faster, smaller models from any provider
- Consider response time requirements
- May need to trade quality for speed
Batch Processing
- Can use more capable models
- Quality more important than speed
- Optimize for best results
Model Configuration
Basic Settings
Model Selection
- Choose appropriate model for your use case
- Consider performance vs. cost trade-offs
- Test different models for quality comparison
API Configuration
- Set API endpoints and authentication
- Configure timeout and retry settings
- Set up monitoring and alerting
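Timeout and retry handling might look like the sketch below. The parameter names and defaults are illustrative, not Vectense configuration keys.

```python
import time

def call_with_retries(request_fn, attempts: int = 3, timeout: float = 30.0,
                      backoff: float = 2.0):
    """Retry a model API call with exponential backoff.

    `request_fn(timeout)` is any callable that performs the HTTP request;
    names and defaults here are illustrative, not actual Vectense settings.
    """
    for attempt in range(attempts):
        try:
            return request_fn(timeout)
        except (TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the error for alerting
            time.sleep(backoff * 2 ** attempt)  # 2s, 4s, ... with defaults

# Simulate a flaky endpoint that fails twice, then succeeds.
calls = {"n": 0}
def flaky(timeout):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient upstream error")
    return "ok"

result = call_with_retries(flaky, backoff=0.01)
```

Raising after the final attempt (rather than swallowing the error) is what lets monitoring and alerting actually see persistent failures.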
Advanced Parameters
Temperature (typically 0.0 - 2.0; the exact range varies by provider)
- Controls randomness in responses
- Lower values: more deterministic
- Higher values: more creative/random
Max Tokens
- Limits response length
- Helps control costs
- Prevents overly long responses
Top P (0.0 - 1.0)
- Alternative to temperature (providers generally recommend adjusting one, not both)
- Controls diversity of word choices
- Fine-tunes response variation
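Put together, a request might carry these parameters as follows. The payload shape follows the common chat-completion convention, and the model name is a placeholder.

```python
# Illustrative chat-completion payload showing the three parameters above;
# "example-model" is a placeholder for whichever provider model you use.
def build_request(prompt: str, deterministic: bool = False) -> dict:
    return {
        "model": "example-model",
        "messages": [{"role": "user", "content": prompt}],
        # Low temperature -> repeatable answers; high -> more varied.
        "temperature": 0.0 if deterministic else 0.7,
        # Cap output length to bound cost and response size.
        "max_tokens": 256,
        # Nucleus sampling; usually adjust this OR temperature, not both.
        "top_p": 1.0,
    }

payload = build_request("Summarize this contract.", deterministic=True)
```

A deterministic flag like this is a common pattern for classification or extraction steps, where you want the same input to yield the same output.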
Cost Considerations
Token-Based Pricing
Understanding Tokens
- One token is roughly 0.75 words (about four characters of English text)
- Both input and output count toward costs
- Different models have different per-token rates
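Those rules of thumb make back-of-envelope cost estimates straightforward. The per-token rates below are hypothetical, purely for illustration; check your provider's current pricing.

```python
import math

def estimate_tokens(text: str) -> int:
    """Rough token count from the ~0.75-words-per-token rule of thumb."""
    return math.ceil(len(text.split()) / 0.75)

def estimate_cost(input_text: str, output_tokens: int,
                  in_rate: float, out_rate: float) -> float:
    """Dollar estimate; rates are $ per token and differ per model."""
    return estimate_tokens(input_text) * in_rate + output_tokens * out_rate

# Hypothetical rates, for illustration only.
cost = estimate_cost("Summarize the attached report", 500,
                     in_rate=0.000005, out_rate=0.000015)
```

Note that output tokens often cost several times more than input tokens, so capping response length (via max tokens) matters as much as trimming prompts.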
Cost Optimization
- Use efficient prompts
- Choose appropriate model size
- Consider self-hosted for high volume
Self-Hosted Costs
Infrastructure Costs
- Hardware purchase or rental
- Electricity and cooling
- Maintenance and support
Break-Even Analysis
- Compare infrastructure costs to token costs
- Consider growth projections
- Factor in management overhead
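The break-even point itself is a one-line calculation; the figures below are hypothetical.

```python
def break_even_tokens(monthly_infra_cost: float,
                      cloud_rate_per_token: float) -> float:
    """Monthly token volume at which self-hosting matches cloud spend."""
    return monthly_infra_cost / cloud_rate_per_token

# Hypothetical: $800/month of hardware and ops vs. $0.00001 per cloud token.
volume = break_even_tokens(800.0, 0.00001)  # tokens per month
```

If your projected monthly volume sits well above this number, self-hosting starts to pay off; remember to fold management overhead into the infrastructure figure.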
Security and Privacy
Data Handling
Cloud Models
- Data sent to external providers
- Review privacy policies and terms
- Understand data retention practices
Self-Hosted Models
- Complete data control
- No external data transmission
- Suitable for regulated industries
Access Control
- Role-based access to models
- API key management and rotation
- Usage monitoring and auditing
- Network security and encryption
Getting Started
First Steps
- Assess Requirements: Understand your AI processing needs
- Choose Provider: Select based on data sensitivity and requirements
- Configure Model: Set up authentication and basic parameters
- Test Functionality: Validate model performance with sample data
- Integrate Workflows: Use models in your automation processes
Best Practices
- Start with well-known models and standard settings
- Test thoroughly with representative data
- Monitor usage and costs regularly
- Keep API keys secure and rotate regularly
- Document successful configurations for reuse
Ready to configure your first model? Continue to Connect a Model for detailed setup instructions.