Connect a Model

This guide walks you through connecting AI models to your Vectense Platform workspace, enabling intelligent processing in your workflows.

Before You Begin

Prerequisites

  • Active Workspace: Access to a workspace with model creation permissions
  • Provider Account: Account with your chosen AI provider (for cloud models)
  • Access Keys: Authentication keys from your provider
  • Network Access: Internet connectivity for cloud models, or a network route to your self-hosted model server

Required Information

Gather the following information before starting:

  • Provider Type: Choose from supported providers
  • Access Keys: Authentication keys from your chosen provider
  • Model Names: Specific model identifiers you want to use
  • Connection Settings: Endpoint URLs and optional parameters; the defaults work for most setups

Connecting Cloud Models

OpenAI Models

Step 1: Navigate to Models

  1. Go to "Models" in the main navigation
  2. Click "Configure new Model"
  3. Select "OpenAI" as the provider

Step 2: Configure Basic Information

  • Name: Give your model configuration a descriptive name
    • Example: "OpenAI Model for Document Analysis"
  • Description: Optional description of the model's intended use

Step 3: Configure OpenAI Settings

  • OpenAI Model: Select from available models
    • Choose based on your use case requirements
    • Consider performance, cost, and capability trade-offs
    • Refer to OpenAI documentation for latest model options
  • OpenAI Endpoint: Use default connection or custom deployment
  • OpenAI Access Token: Enter your OpenAI access token
    • Obtain it from the API keys page of your OpenAI account
    • Keep it secure: store it in a password manager or secrets vault
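
Before saving, you can sanity-check the key outside Vectense. The following is a minimal sketch that calls OpenAI's model-listing endpoint with curl, assuming the key is exported as OPENAI_API_KEY:

# List the models your key can access; a JSON model list confirms the key works
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"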

Step 4: Test Connection

  • Click "Test Connection" to verify setup
  • Review test response for quality
  • Adjust settings if needed

Step 5: Save Configuration

  • Click "Save" to store the model configuration
  • Model is now available for use in workflows

Anthropic Models

Step 1: Choose Anthropic Provider

  1. Navigate to Models → "Configure new Model"
  2. Select "Anthropic" as the provider

Step 2: Configure Anthropic Settings

  • Name: Descriptive model configuration name
  • Anthropic Model: Select from available Claude models
    • Choose based on your performance and cost requirements
    • Refer to Anthropic documentation for latest model options
    • Consider capability vs. cost trade-offs for your use case
  • Anthropic Endpoint: Use default or custom endpoint
  • Anthropic API Key: Enter your Anthropic API key
  • Anthropic API Version: Refer to Anthropic documentation for current version
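
You can verify the key independently before saving. A minimal sketch against Anthropic's messages endpoint, assuming the key is exported as ANTHROPIC_API_KEY; the model name here is illustrative, and the version header should match whatever Anthropic currently documents:

# Model name is an example; pick one your account can access
curl -s https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "claude-3-5-sonnet-latest", "max_tokens": 50, "messages": [{"role": "user", "content": "Hello"}]}'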

Step 3: Test and Save

  • Test connection to verify setup
  • Save configuration when working correctly

Mistral AI Models

Step 1: Choose Mistral Provider

  1. Navigate to Models → "Configure new Model"
  2. Select "Mistral" as the provider

Step 2: Configure Mistral Settings

  • Name: Descriptive model configuration name
  • Mistral Model: Select model size
    • mistral-large: Best performance
    • mistral-medium: Balanced option
    • mistral-small: Fast and economical
  • Mistral Endpoint: Use default endpoint
  • Mistral API Key: Enter your Mistral API key
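
A direct call to Mistral's chat completions endpoint confirms both the key and the model name; a minimal sketch, assuming the key is exported as MISTRAL_API_KEY:

# Adjust the model name to the size you selected above
curl -s https://api.mistral.ai/v1/chat/completions \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "mistral-small", "messages": [{"role": "user", "content": "Hello"}]}'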

Step 3: Test and Save

  • Verify connection with test
  • Save working configuration

Connecting Self-Hosted Models (Ollama)

Prerequisites for Ollama

  • Ollama Server: Running Ollama instance
  • Downloaded Models: Models downloaded on Ollama server
  • Network Access: Vectense must be able to reach the Ollama server
  • Model Compatibility: Models in a format Ollama supports

Step 1: Prepare Ollama Server

Install Ollama (if not already installed)

# Download and install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Start Ollama service
ollama serve

Download Models

# Download Llama 2 model
ollama pull llama2

# Download Mistral model
ollama pull mistral

# Download Code Llama for coding tasks
ollama pull codellama

# List available models
ollama list

Step 2: Configure Ollama in Vectense

Navigate to Model Configuration

  1. Go to Models → "Configure new Model"
  2. Select "Ollama" as the provider

Configure Ollama Settings

  • Name: Descriptive name for the configuration
  • Ollama Model: Enter the model name from your Ollama server
    • Use the exact name reported by the ollama list command
    • Examples: llama2, mistral, codellama
  • Ollama Endpoint: Enter your Ollama server URL
    • Local: http://localhost:11434
    • Remote: http://your-server:11434
    • Docker: Use the Ollama container's name or IP address
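
Before testing in Vectense, you can confirm that the endpoint and model name line up by calling the Ollama API directly; a minimal sketch against a local server:

# A JSON response confirms the endpoint is reachable and the model can generate
curl -s http://localhost:11434/api/generate \
  -d '{"model": "llama2", "prompt": "Hello", "stream": false}'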

Step 3: Test Connection

  • Click "Test Connection" to verify
  • Check that the Ollama server is reachable
  • Verify the model is available and responding

Step 4: Save Configuration

  • Save the working configuration
  • Model is ready for workflow use

Advanced Configuration

Model Parameters

Temperature (0.0 - 2.0)

  • 0.0-0.3: Very deterministic, consistent responses
  • 0.4-0.7: Balanced creativity and consistency
  • 0.8-2.0: More creative and varied responses
  • Default: 0.7 for most use cases

Max Tokens

  • Limits response length
  • Higher values allow longer responses but increase costs
  • Set based on expected output length
  • Typical values: 500-4000 tokens

Top P (0.0 - 1.0)

  • Alternative to temperature for controlling randomness
  • Lower values: more focused responses
  • Higher values: more diverse word choices
  • Default: 0.9 for most use cases
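
All three parameters travel in the request body alongside the prompt. A minimal sketch of how they combine, using an OpenAI-style chat completions call with an illustrative model name:

# Illustrative values; other providers use similar field names
curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Summarize this document."}], "temperature": 0.7, "max_tokens": 1000, "top_p": 0.9}'

Because temperature and top_p both control randomness, a common recommendation is to adjust one while leaving the other at its default.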

Performance Tuning

Timeout Settings

  • Set appropriate timeout values for your use case
  • Longer timeouts for complex tasks
  • Shorter timeouts for simple, fast responses

Retry Configuration

  • Configure automatic retries for failed requests
  • Set retry limits and delays
  • Handle transient network issues
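
When scripting calls for testing, a per-request timeout plus a simple backoff loop covers both of the points above; a minimal sketch, using a local Ollama endpoint as an example:

# Give each attempt 30 seconds; retry up to 3 times with exponential backoff
for attempt in 1 2 3; do
  curl -sf --max-time 30 http://localhost:11434/api/tags > /dev/null && break
  echo "Attempt $attempt failed"
  [ "$attempt" -lt 3 ] && sleep $((2 ** (attempt - 1)))
done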

Rate Limiting

  • Configure request rate limits
  • Prevent overwhelming the model provider
  • Manage costs and usage

Model Testing

Basic Functionality Test

Test basic model functionality:

  1. Use the built-in test feature
  2. Send a simple prompt like "Hello, please introduce yourself"
  3. Verify response quality and format
  4. Check response time
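
The same check can be run from a shell, which also makes the response-time measurement in step 4 straightforward; a minimal sketch against a local Ollama model:

# time reports how long the full request takes
time curl -s http://localhost:11434/api/generate \
  -d '{"model": "llama2", "prompt": "Hello, please introduce yourself", "stream": false}'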

Advanced Testing

For production readiness:

  1. Task-Specific Testing: Test with prompts similar to your use case
  2. Load Testing: Test with multiple concurrent requests
  3. Error Handling: Test with invalid inputs and edge cases
  4. Performance Testing: Measure response times under various conditions
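
For a rough load test, a few lines of shell fire concurrent requests; a minimal sketch with 10 parallel calls to a local Ollama server:

# Launch 10 requests in parallel and wait for all of them to return
for i in $(seq 1 10); do
  curl -s http://localhost:11434/api/generate \
    -d '{"model": "llama2", "prompt": "Hello", "stream": false}' > /dev/null &
done
wait
echo "All requests completed"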

Quality Assessment

Evaluate model outputs:

  • Accuracy: Responses are factually correct
  • Relevance: Responses address the prompt appropriately
  • Consistency: Similar prompts produce consistent responses
  • Style: Response tone and format match requirements

Troubleshooting

Common Connection Issues

API Key Problems

  • Verify API key is correct and not expired
  • Check API key permissions and quotas
  • Ensure API key has necessary scopes

Network Connectivity

  • Test network connectivity to provider endpoints
  • Check firewall and proxy settings
  • Verify DNS resolution for provider domains
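
A few standard commands cover these checks; a minimal sketch using OpenAI's domain as an example:

# Verify DNS resolution for the provider domain
nslookup api.openai.com

# Test HTTPS connectivity end to end; even a 401 status proves the route works
curl -s -o /dev/null -w "%{http_code}\n" https://api.openai.com/v1/models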

Endpoint Configuration

  • Verify endpoint URLs are correct
  • Check for typos in configuration
  • Ensure using correct API version

Ollama-Specific Issues

Server Not Reachable

# Check if Ollama is running
ps aux | grep ollama

# Test connection manually
curl http://localhost:11434/api/tags

# Restart Ollama if needed
ollama serve

Model Not Found

# List downloaded models
ollama list

# Download missing model
ollama pull model-name

Permission Issues

  • Ensure Ollama has necessary file permissions
  • Check that ports are open and accessible
  • Verify user permissions for model files
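
On Linux, two quick checks cover the port and the model files; a minimal sketch, assuming Ollama's default model directory:

# Confirm the Ollama port is listening
ss -tlnp | grep 11434

# Check permissions on the default model directory
ls -l ~/.ollama/models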

Performance Issues

Slow Response Times

  • Check network latency to provider
  • Consider using faster models
  • Optimize prompt length and complexity
  • Check server resources for self-hosted models

High Error Rates

  • Monitor API quotas and limits
  • Implement proper retry logic
  • Check for rate limiting issues
  • Verify model availability

Security Best Practices

API Key Management

  • Secure Storage: Never store API keys in plain text
  • Regular Rotation: Change API keys periodically
  • Limited Scope: Use API keys with minimal necessary permissions
  • Monitoring: Monitor API key usage for anomalies
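
One common pattern for keeping keys out of plain-text configuration is to load them from the environment at startup; a minimal sketch (the secrets file path is illustrative):

# Load keys from a restricted file instead of hard-coding them
# ~/.secrets/models.env contains lines like: OPENAI_API_KEY=sk-...
chmod 600 ~/.secrets/models.env
set -a
. ~/.secrets/models.env
set +a

# Never print the key itself; check only that it is present
[ -n "$OPENAI_API_KEY" ] && echo "OPENAI_API_KEY is set"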

Network Security

  • HTTPS: Always use encrypted connections
  • VPN: Consider VPN for self-hosted model access
  • Firewall: Restrict access to necessary ports only
  • Authentication: Implement proper authentication for Ollama

Data Privacy

  • Cloud Models: Review provider data handling policies
  • Self-Hosted: Ensure proper data isolation
  • Logging: Be careful about logging sensitive data
  • Compliance: Ensure configuration meets regulatory requirements

Model Management

Monitoring

After connecting models:

  • Usage Tracking: Monitor token consumption and costs
  • Performance Metrics: Track response times and success rates
  • Error Monitoring: Set up alerts for failures
  • Quality Monitoring: Regularly assess output quality

Updates and Maintenance

  • Provider Updates: Stay informed about model updates
  • Configuration Reviews: Periodically review settings
  • Performance Optimization: Adjust parameters based on usage
  • Security Updates: Update credentials and access controls

Cost Management

  • Usage Monitoring: Track model usage and costs
  • Budget Alerts: Set up alerts for usage thresholds
  • Model Optimization: Use appropriate models for each task
  • Efficiency Improvements: Optimize prompts and workflows

Next Steps

After successfully connecting your model:

  1. Test in Workflows: Use your model in a simple workflow
  2. Configure Knowledge Bases: Add context to your AI
  3. Monitor Performance: Track model usage and performance
  4. Optimize Usage: Learn advanced optimization techniques

Your AI model is now ready to power intelligent workflows in Vectense Platform!