voyage-3-large


voyage-3-large is Voyage AI's state-of-the-art general-purpose and multilingual embedding model, released in January 2025. It ranks first on average across 8 evaluation domains spanning 100 datasets, including law, finance, and code, outperforming OpenAI and Cohere by 9.74% and 20.71% on average.

Performance Advantages

voyage-3-large outperforms competitors across multiple dimensions:

  • vs OpenAI text-embedding-3-large: +9.74% average performance
  • vs Cohere Embed v3-English: +20.71% average performance
  • vs voyage-3: +4.14% average performance
  • vs voyage-3-lite: +7.68% average performance

The model is particularly strong in specialized domains such as law, finance, and code, setting the retrieval performance benchmark for 2025.

Core Features

Flexible Dimensions

Supports the following output dimensions:

  • 2048 dimensions: Highest quality
  • 1024 dimensions (default): Balanced performance and cost
  • 512 dimensions: Faster inference, reduced storage
  • 256 dimensions: Maximum compression
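
Because the model is trained with Matryoshka learning, a lower-dimensional embedding can be derived from a higher-dimensional one by truncating the leading dimensions and renormalizing. A minimal NumPy sketch (the 2048-dim vector here is random stand-in data, not a real voyage-3-large embedding):

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` Matryoshka dimensions and L2-renormalize."""
    truncated = vec[:dim]
    return truncated / np.linalg.norm(truncated)

rng = np.random.default_rng(0)
full = rng.standard_normal(2048)   # stand-in for a 2048-dim embedding
full /= np.linalg.norm(full)       # embeddings are unit-normalized

for dim in (1024, 512, 256):
    small = truncate_embedding(full, dim)
    print(dim, small.shape)        # each truncation is again a unit vector
```

This is why one model can serve all four dimension tiers: the smaller vectors are prefixes of the larger ones, so you can store 2048-dim vectors and downsample later without re-embedding.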

Quantization Support

Through Matryoshka learning and quantization-aware training, voyage-3-large supports smaller dimensions and int8/binary quantization that dramatically reduce vector database costs with minimal impact on retrieval quality.

  • int8 quantization: 4x storage cost reduction
  • Binary quantization: Up to 200x storage cost reduction with minimal quality loss
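
The reduction factors follow directly from bytes per dimension: float32 uses 4 bytes, int8 uses 1, and binary uses 1 bit. A quick arithmetic check (no Voyage API involved):

```python
def bytes_per_vector(dim: int, fmt: str) -> float:
    """Storage for one embedding vector in the given format."""
    sizes = {"float32": 4.0, "int8": 1.0, "binary": 1.0 / 8.0}  # bytes per dimension
    return dim * sizes[fmt]

full = bytes_per_vector(2048, "float32")          # 8192 bytes
print(full / bytes_per_vector(2048, "int8"))      # 4.0  -> int8 is 4x smaller
print(full / bytes_per_vector(2048, "binary"))    # 32.0 -> binary is 32x smaller
# Reaching "up to 200x" additionally assumes a reduced output dimension,
# e.g. binary at 256 dims: 8192 / 32 bytes = 256x
print(full / bytes_per_vector(256, "binary"))     # 256.0
```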

Long Context Support

  • Context length: 32K tokens
  • Matryoshka learning for flexible sizing

Multiple Data Types

voyage-3-large supports int8, uint8, binary, and ubinary data types for extreme storage and compute optimization.
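
As an illustration of why the binary type is so compact, the sketch below packs the sign bits of a float vector into bytes and compares packed vectors by Hamming distance; the helper names are hypothetical, and real binary embeddings come pre-quantized from the API rather than being derived client-side like this:

```python
import numpy as np

def to_binary(vec: np.ndarray) -> np.ndarray:
    """Pack the sign bits of a float vector into uint8 bytes (8 dims per byte)."""
    bits = (vec > 0).astype(np.uint8)
    return np.packbits(bits)

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Hamming distance between two packed binary embeddings."""
    return int(np.unpackbits(a ^ b).sum())

rng = np.random.default_rng(1)
x = rng.standard_normal(1024)      # stand-in embeddings
y = rng.standard_normal(1024)

xb, yb = to_binary(x), to_binary(y)
print(xb.nbytes)                   # 128 -> 1024 dims fit in 128 bytes
print(hamming(xb, xb))             # 0  -> identical vectors
```

Hamming distance over packed bits needs only XOR and popcount, which is also why binary search is fast at query time, not just cheap to store.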

Performance Metrics

Latency and Throughput

  • Latency: 90ms for a single query with up to 100 tokens
  • Throughput: 12.6M tokens per hour at $0.22 per 1M tokens on ml.g6.xlarge

Domain-Specific Advantages

In specialized domains like law, finance, medical, and code, voyage-3-large demonstrates significant advantages over general embedding models.

Use Cases

  • Specialized Domain Retrieval: High-precision retrieval in law, finance, medical, code
  • Large-Scale Vector Databases: Dramatically reduce costs using quantization
  • High Performance Requirements: Applications needing cutting-edge retrieval performance
  • Cost Optimization: 200x storage reduction with binary quantization
  • Long Document Processing: 32K token context length support

Pricing

Based on AWS Marketplace data:

  • Base pricing: $0.22 per 1M tokens (on ml.g6.xlarge instance)
  • Specific pricing may vary by deployment method and scale
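
At the listed rate, token-based costs are simple to estimate. A sketch using the quoted numbers (actual AWS Marketplace billing may differ by instance count and commitment):

```python
PRICE_PER_MILLION_TOKENS = 0.22  # USD, per the AWS Marketplace listing

def embedding_cost(tokens: int) -> float:
    """Estimated cost in USD for embedding `tokens` tokens."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# One hour at the quoted 12.6M tokens/hour throughput:
print(round(embedding_cost(12_600_000), 2))        # 2.77

# A 10M-document corpus averaging 500 tokens per document:
print(round(embedding_cost(10_000_000 * 500), 2))  # 1100.0
```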

Pros & Cons

Pros:

  • 2025 SOTA Performance: Ranks first across 100 datasets
  • Domain-Specific Advantages: Exceptional in law, finance, code
  • Extreme Quantization: 200x storage reduction with binary quantization
  • Flexible Dimensions: Supports 256-2048 dimension options
  • Long Context: 32K token support

Cons:

  • Newer Model: Released in January 2025, so the community ecosystem is still small
  • Pricing: Usage-based API fees, unlike self-hostable open-source models
  • Documentation: As a new model, docs and best practices are still accumulating

Cost Optimization

Binary Quantization Benefits

Storage for 1 billion 2048-dimensional float32 vectors:

  • Unquantized: ~8.2TB
  • Binary quantized at 2048 dimensions: ~256GB (32x reduction)
  • Binary quantized at a reduced dimension (e.g. 256): ~32GB, which is where the "up to 200x" figure comes from

For large-scale vector databases, this cost reduction is revolutionary.

Conclusion

voyage-3-large is the first choice for cutting-edge retrieval performance, particularly suited for:

  • Specialized domain applications (law, finance, medical)
  • Large-scale vector databases needing extreme cost optimization
  • Scenarios with highest retrieval quality requirements
  • Long document processing (32K tokens) applications

For general scenarios, OpenAI text-embedding-3-large offers a more mature ecosystem. For multilingual and open-source needs, BGE-M3 is better. But for specialized domains and maximum performance, voyage-3-large is the best choice in 2025.
