Skip to main content

Models

AI Models are the intelligent engines that power your workflows in Vectense Platform. They provide language understanding, reasoning, and generation capabilities that enable sophisticated automation and content processing.

Overview

Models in Vectense Platform enable:

  • Natural Language Processing: Understand and generate human-like text
  • Document Analysis: Extract information from various document types
  • Decision Making: Make intelligent choices based on context and criteria
  • Content Generation: Create summaries, responses, and new content
  • Data Transformation: Convert unstructured data into structured formats

Quick Navigation

Supported Model Providers

Commercial AI Providers

OpenAI

  • Models: Latest GPT models and variations
  • Strengths: Excellent general performance, function calling, reasoning
  • Use Cases: Complex analysis, content generation, conversation
  • Requirements: OpenAI API key

Anthropic

  • Models: Claude model family (various capability tiers)
  • Strengths: Strong reasoning, safety, long context handling
  • Use Cases: Document analysis, ethical AI, research tasks
  • Requirements: Anthropic API key

Mistral AI

  • Models: Mistral model family (small to large variants)
  • Strengths: Efficient performance, multilingual support, cost-effective
  • Use Cases: General automation, content processing, analysis
  • Requirements: Mistral API key

Other Providers

  • Google: Gemini and PaLM model families
  • Azure OpenAI: Enterprise-grade OpenAI models
  • Custom Providers: Any OpenAI-compatible API endpoint

Self-Hosted Solutions

Ollama

  • Models: Wide variety of open-source models (Llama, Mistral, CodeLlama, etc.)
  • Strengths: Complete privacy, no external dependencies, cost control
  • Use Cases: Sensitive data processing, air-gapped environments, custom models
  • Requirements: Ollama server installation

Other Self-Hosted Options

  • OpenAI-Compatible Servers: vLLM, TGI, LocalAI, etc.
  • Custom Deployments: Any API-compatible model server

Model Capabilities

Language Understanding

  • Text Comprehension: Understand complex documents and instructions
  • Context Awareness: Maintain context across conversation turns
  • Intent Recognition: Identify user goals and requirements
  • Entity Extraction: Pull specific information from unstructured text

Content Generation

  • Text Generation: Create articles, summaries, and responses
  • Format Conversion: Transform content between different formats
  • Translation: Convert text between languages
  • Style Adaptation: Adjust tone and style for different audiences

Reasoning and Analysis

  • Logical Reasoning: Make inferences and draw conclusions
  • Data Analysis: Analyze patterns and trends in data
  • Problem Solving: Break down complex problems into manageable steps
  • Decision Support: Provide recommendations based on criteria

Specialized Tasks

  • Code Generation: Write and explain code in various languages
  • Mathematical Operations: Perform calculations and solve equations
  • Creative Writing: Generate creative content and ideas
  • Summarization: Create concise summaries of long documents

Model Selection Guide

Choosing the Right Model

For General Purpose Tasks

  • Large Models: Best overall performance for complex reasoning and analysis
  • Flagship Models: Latest versions from major providers
  • Balanced Options: Mid-tier models offering good performance/cost ratio

For High-Volume Processing

  • Efficient Models: Smaller, faster models for simple tasks
  • Cost-Optimized: Models with lower per-token pricing
  • Self-Hosted Options: No per-token costs for high-volume use

For Sensitive Data

  • Self-Hosted Models: Complete data privacy and control
  • On-Premises Deployment: Keep data within your infrastructure
  • Private Cloud: Dedicated instances for maximum security

For Specialized Tasks

  • Code Tasks: Models fine-tuned for programming and development
  • Long Context: Models with extended context windows for large documents
  • Multilingual: Models optimized for multiple languages
  • Domain-Specific: Models trained for specific industries or use cases

Performance Considerations

Latency Requirements

  • Real-time Applications: Use faster, smaller models optimized for speed
  • Batch Processing: Can use larger, more capable models
  • Interactive Workflows: Balance response time and quality

Cost Management

  • High Volume: Consider self-hosted solutions to avoid per-token fees
  • Variable Load: Pay-per-use cloud models for flexibility
  • Budget Constraints: Use smaller, more efficient models

Quality Requirements

  • Critical Decisions: Use the most capable models available
  • Simple Tasks: Smaller models may be sufficient
  • Quality vs. Cost: Balance based on business impact and requirements

Model Configuration

API-Based Models (Cloud Providers)

Required Information

  • API Key: Authentication credential from provider
  • Endpoint URL: API service endpoint (usually default)
  • Model Name: Specific model identifier to use
  • API Version: Provider-specific version (if required)

Configuration Options

  • Temperature: Controls response creativity/randomness
  • Max Tokens: Limits response length
  • Top P: Alternative to temperature for response variation
  • Frequency Penalty: Reduces repetition in responses

Self-Hosted Models (Ollama)

Required Information

  • Endpoint URL: Ollama server address
  • Model Name: Downloaded model identifier
  • Connection Settings: Timeout and retry configurations

Setup Requirements

  • Ollama server running and accessible
  • Models downloaded and available
  • Network connectivity from Vectense to Ollama

Model Management

Model Testing

Before using models in production workflows:

  • Connectivity Test: Verify API access and authentication
  • Response Quality: Test with sample prompts
  • Performance Measurement: Check response times and consistency
  • Error Handling: Test failure scenarios and error responses

Model Monitoring

Track model performance and usage:

  • Usage Metrics: Monitor token consumption and costs
  • Response Quality: Track output quality over time
  • Error Rates: Monitor failed requests and timeouts
  • Performance Trends: Analyze response times and throughput

Model Updates

Keep models current and optimized:

  • Version Updates: Update to newer model versions when available
  • Configuration Tuning: Optimize parameters based on usage patterns
  • Cost Optimization: Review and optimize model usage for cost efficiency
  • Security Updates: Update API keys and security settings regularly

Cost Management

Understanding Costs

Token-Based Pricing (Cloud Providers)

  • Input Tokens: Cost for processing input text
  • Output Tokens: Cost for generated responses
  • Model Pricing: Different models have different per-token costs
  • Volume Discounts: Some providers offer volume-based pricing

Self-Hosted Costs (On-Premises)

  • Infrastructure: Server hardware and maintenance costs
  • No Per-Use Fees: No token-based charges
  • Scaling Costs: Additional hardware for increased capacity

Cost Optimization Strategies

Model Selection

  • Use appropriate model size for each task
  • Consider faster, cheaper models for simple tasks
  • Use self-hosted models for high-volume processing

Prompt Optimization

  • Write efficient prompts that minimize token usage
  • Use clear, concise instructions
  • Avoid unnecessary context or examples

Workflow Design

  • Cache results when possible
  • Batch similar requests
  • Use conditional logic to avoid unnecessary AI calls

Security and Privacy

Data Privacy

Cloud Models (External Providers)

  • Data sent to external providers
  • Review provider privacy policies
  • Understand data retention policies
  • Consider data classification and sensitivity

Self-Hosted Models (On-Premises)

  • Complete data privacy and control
  • No external data transmission
  • Full compliance with data governance
  • Suitable for sensitive or regulated data

Security Best Practices

API Key Management

  • Store API keys securely
  • Rotate keys regularly
  • Monitor API key usage
  • Restrict key permissions when possible

Access Control

  • Limit model access to authorized users
  • Use role-based permissions
  • Monitor model usage patterns
  • Audit model access logs

Network Security

  • Use HTTPS for all API communications
  • Implement network firewalls and restrictions
  • Monitor network traffic for anomalies
  • Secure internal model deployments

Troubleshooting

Common Issues

Connection Failures

  • Verify API keys and credentials
  • Check network connectivity
  • Confirm endpoint URLs
  • Review firewall and proxy settings

Poor Response Quality

  • Review and refine prompts
  • Adjust model parameters
  • Check input data quality
  • Consider different model selection

High Costs

  • Monitor token usage patterns
  • Optimize prompt efficiency
  • Consider alternative models
  • Implement usage limits

Performance Issues

  • Check model response times
  • Monitor concurrent request limits
  • Consider model alternatives
  • Optimize request patterns

Getting Help

  • Documentation: Reference provider-specific guides
  • Community: Join user forums for model discussions
  • Support: Contact technical support for complex issues
  • Monitoring: Use built-in monitoring tools for diagnostics

Best Practices

Model Configuration

  • Start Simple: Begin with well-known models and standard settings
  • Test Thoroughly: Validate model performance with real data
  • Document Settings: Keep records of successful configurations
  • Version Control: Track configuration changes over time

Usage Optimization

  • Right-Size Models: Use appropriate model capabilities for each task
  • Monitor Performance: Track usage metrics and costs regularly
  • Optimize Prompts: Continuously improve prompt efficiency
  • Scale Appropriately: Plan for growth and changing requirements

Security and Compliance

  • Data Classification: Understand what data is sent to models
  • Compliance Requirements: Ensure model usage meets regulatory needs
  • Regular Audits: Review model access and usage patterns
  • Privacy Protection: Implement appropriate data protection measures

Ready to configure your first model? Start with Introduction to understand model concepts, then proceed to Connect a Model for step-by-step setup instructions.