The Gemini API offers text embedding models to generate embeddings for words, phrases, sentences, and code. These foundational embeddings power advanced NLP tasks such as semantic search, classification, and clustering, providing more accurate, context-aware results than keyword-based approaches.
Building Retrieval Augmented Generation (RAG) systems is a common use case for embeddings. Embeddings plays a key role in significantly enhancing model outputs with improved factual accuracy, coherence, and contextual richness. They efficiently retrieve relevant information from knowledge bases, represented by embeddings, which are then passed as additional context in the input prompt to language models, guiding it to generate more informed and accurate responses.
Generating embeddings
Use the embedContent
method to generate text embeddings:
Python
from google import genai
client = genai.Client()
result = client.models.embed_content(
model="gemini-embedding-001",
contents="What is the meaning of life?")
print(result.embeddings)
JavaScript
import { GoogleGenAI } from "@google/genai";
async function main() {
const ai = new GoogleGenAI({});
const response = await ai.models.embedContent({
model: 'gemini-embedding-001',
contents: 'What is the meaning of life?',
});
console.log(response.embeddings);
}
main();
Go
package main
import (
"context"
"encoding/json"
"fmt"
"log"
"google.golang.org/genai"
)
func main() {
ctx := context.Background()
client, err := genai.NewClient(ctx, nil)
if err != nil {
log.Fatal(err)
}
contents := []*genai.Content{
genai.NewContentFromText("What is the meaning of life?", genai.RoleUser),
}
result, err := client.Models.EmbedContent(ctx,
"gemini-embedding-001",
contents,
nil,
)
if err != nil {
log.Fatal(err)
}
embeddings, err := json.MarshalIndent(result.Embeddings, "", " ")
if err != nil {
log.Fatal(err)
}
fmt.Println(string(embeddings))
}
REST
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H 'Content-Type: application/json' \
-d '{"model": "models/gemini-embedding-001",
"content": {"parts":[{"text": "What is the meaning of life?"}]}
}'
You can also generate embeddings for multiple chunks at once by passing them in as a list of strings.
Python
from google import genai
client = genai.Client()
result = client.models.embed_content(
model="gemini-embedding-001",
contents= [
"What is the meaning of life?",
"What is the purpose of existence?",
"How do I bake a cake?"
])
for embedding in result.embeddings:
print(embedding)
JavaScript
import { GoogleGenAI } from "@google/genai";
async function main() {
const ai = new GoogleGenAI({});
const response = await ai.models.embedContent({
model: 'gemini-embedding-001',
contents: [
'What is the meaning of life?',
'What is the purpose of existence?',
'How do I bake a cake?'
],
});
console.log(response.embeddings);
}
main();
Go
package main
import (
"context"
"encoding/json"
"fmt"
"log"
"google.golang.org/genai"
)
func main() {
ctx := context.Background()
client, err := genai.NewClient(ctx, nil)
if err != nil {
log.Fatal(err)
}
contents := []*genai.Content{
genai.NewContentFromText("What is the meaning of life?"),
genai.NewContentFromText("How does photosynthesis work?"),
genai.NewContentFromText("Tell me about the history of the internet."),
}
result, err := client.Models.EmbedContent(ctx,
"gemini-embedding-001",
contents,
nil,
)
if err != nil {
log.Fatal(err)
}
embeddings, err := json.MarshalIndent(result.Embeddings, "", " ")
if err != nil {
log.Fatal(err)
}
fmt.Println(string(embeddings))
}
REST
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H 'Content-Type: application/json' \
-d '{"model": "models/gemini-embedding-001",
"content": [
{"parts": [{"text": "What is the meaning of life?"}]},
{"parts": [{"text": "What is the purpose of existence?"}]},
{"parts": [{"text": "How do I bake a cake?"}]}
]
}'
Specify task type to improve performance
You can use embeddings for a wide range of tasks from classification to document search. Specifying the right task type helps optimize the embeddings for the intended relationships, maximizing accuracy and efficiency. For a complete list of supported task types, see the Supported task types table.
The following example shows how you can use
SEMANTIC_SIMILARITY
to check how similar in meaning strings of texts are.
Python
from google import genai
from google.genai import types
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
client = genai.Client()
texts = [
"What is the meaning of life?",
"What is the purpose of existence?",
"How do I bake a cake?"]
result = [
np.array(e.values) for e in client.models.embed_content(
model="gemini-embedding-001",
contents=texts,
config=types.EmbedContentConfig(task_type="SEMANTIC_SIMILARITY")).embeddings
]
# Calculate cosine similarity. Higher scores = greater semantic similarity.
embeddings_matrix = np.array(result)
similarity_matrix = cosine_similarity(embeddings_matrix)
for i, text1 in enumerate(texts):
for j in range(i + 1, len(texts)):
text2 = texts[j]
similarity = similarity_matrix[i, j]
print(f"Similarity between '{text1}' and '{text2}': {similarity:.4f}")
JavaScript
import { GoogleGenAI } from "@google/genai";
import * as cosineSimilarity from "compute-cosine-similarity";
async function main() {
const ai = new GoogleGenAI({});
const texts = [
"What is the meaning of life?",
"What is the purpose of existence?",
"How do I bake a cake?",
];
const response = await ai.models.embedContent({
model: 'gemini-embedding-001',
contents: texts,
taskType: 'SEMANTIC_SIMILARITY'
});
const embeddings = response.embeddings.map(e => e.values);
for (let i = 0; i < texts.length; i++) {
for (let j = i + 1; j < texts.length; j++) {
const text1 = texts[i];
const text2 = texts[j];
const similarity = cosineSimilarity(embeddings[i], embeddings[j]);
console.log(`Similarity between '${text1}' and '${text2}': ${similarity.toFixed(4)}`);
}
}
}
main();
JavaScript
import { GoogleGenAI } from "@google/genai";
import * as cosineSimilarity from "compute-cosine-similarity";
async function main() {
const ai = new GoogleGenAI({});
const texts = [
"What is the meaning of life?",
"What is the purpose of existence?",
"How do I bake a cake?",
];
const response = await ai.models.embedContent({
model: 'gemini-embedding-001',
contents: texts,
taskType: 'SEMANTIC_SIMILARITY'
});
const embeddings = response.embeddings.map(e => e.values);
for (let i = 0; i < texts.length; i++) {
for (let j = i + 1; j < texts.length; j++) {
const text1 = texts[i];
const text2 = texts[j];
const similarity = cosineSimilarity(embeddings[i], embeddings[j]);
console.log(`Similarity between '${text1}' and '${text2}': ${similarity.toFixed(4)}`);
}
}
}
main();
Go
package main
import (
"context"
"fmt"
"log"
"math"
"google.golang.org/genai"
)
// cosineSimilarity calculates the similarity between two vectors.
func cosineSimilarity(a, b []float32) (float64, error) {
if len(a) != len(b) {
return 0, fmt.Errorf("vectors must have the same length")
}
var dotProduct, aMagnitude, bMagnitude float64
for i := 0; i < len(a); i++ {
dotProduct += float64(a[i] * b[i])
aMagnitude += float64(a[i] * a[i])
bMagnitude += float64(b[i] * b[i])
}
if aMagnitude == 0 || bMagnitude == 0 {
return 0, nil
}
return dotProduct / (math.Sqrt(aMagnitude) * math.Sqrt(bMagnitude)), nil
}
func main() {
ctx := context.Background()
client, _ := genai.NewClient(ctx, nil)
defer client.Close()
texts := []string{
"What is the meaning of life?",
"What is the purpose of existence?",
"How do I bake a cake?",
}
var contents []*genai.Content
for _, text := range texts {
contents = append(contents, genai.NewContentFromText(text, genai.RoleUser))
}
result, _ := client.Models.EmbedContent(ctx,
"gemini-embedding-001",
contents,
&genai.EmbedContentRequest{TaskType: genai.TaskTypeSemanticSimilarity},
)
embeddings := result.Embeddings
for i := 0; i < len(texts); i++ {
for j := i + 1; j < len(texts); j++ {
similarity, _ := cosineSimilarity(embeddings[i].Values, embeddings[j].Values)
fmt.Printf("Similarity between '%s' and '%s': %.4f\n", texts[i], texts[j], similarity)
}
}
}
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"contents": [
{"parts": [{"text": "What is the meaning of life?"}]},
{"parts": [{"text": "What is the purpose of existence?"}]},
{"parts": [{"text": "How do I bake a cake?"}]}
],
"embedding_config": {
"task_type": "SEMANTIC_SIMILARITY"
}
}'
The following shows an example output from this code snippet:
Similarity between 'What is the meaning of life?' and 'What is the purpose of existence?': 0.9481
Similarity between 'What is the meaning of life?' and 'How do I bake a cake?': 0.7471
Similarity between 'What is the purpose of existence?' and 'How do I bake a cake?': 0.7371
Supported task types
Task type | Description | Examples |
---|---|---|
SEMANTIC_SIMILARITY | Embeddings optimized to assess text similarity. | Recommendation systems, duplicate detection |
CLASSIFICATION | Embeddings optimized to classify texts according to preset labels. | Sentiment analysis, spam detection |
CLUSTERING | Embeddings optimized to cluster texts based on their similarities. | Document organization, market research, anomaly detection |
RETRIEVAL_DOCUMENT | Embeddings optimized for document search. | Indexing articles, books, or web pages for search. |
RETRIEVAL_QUERY |
Embeddings optimized for general search queries.
Use RETRIEVAL_QUERY for queries; RETRIEVAL_DOCUMENT for documents to be retrieved.
|
Custom search |
CODE_RETRIEVAL_QUERY |
Embeddings optimized for retrieval of code blocks based on natural language queries.
Use CODE_RETRIEVAL_QUERY for queries; RETRIEVAL_DOCUMENT for code blocks to be retrieved.
|
Code suggestions and search |
QUESTION_ANSWERING |
Embeddings for questions in a question-answering system, optimized for finding documents that answer the question.
Use QUESTION_ANSWERING for questions; RETRIEVAL_DOCUMENT for documents to be retrieved.
|
Chatbox |
FACT_VERIFICATION |
Embeddings for statements that need to be verified, optimized for retrieving documents that contain evidence supporting or refuting the statement.
Use FACT_VERIFICATION for the target text; RETRIEVAL_DOCUMENT for documents to be retrieved
|
Automated fact-checking systems |
Controlling Embedding Size
The Gemini embedding model, gemini-embedding-001
, is trained using the
Matryoshka Representation Learning (MRL) technique which teaches a model to
learn high-dimensional embeddings that have initial segments (or prefixes) which
are also useful, simpler versions of the same data. You can choose to use the
full 3072-dimensional embedding, or you can truncate it to a smaller size
without losing quality to save storage space. For best quality, we recommend
using the first 768 and 1536.
By using the output_dimensionality
parameter, users can control the size of
the output embedding vector. Selecting a smaller output dimensionality can save
storage space and increase computational efficiency for downstream applications,
while sacrificing little in terms of quality.
Python
from google import genai
from google.genai import types
client = genai.Client()
result = client.models.embed_content(
model="gemini-embedding-001",
contents="What is the meaning of life?",
config=types.EmbedContentConfig(output_dimensionality=768)
)
[embedding_obj] = result.embeddings
embedding_length = len(embedding_obj.values)
print(f"Length of embedding: {embedding_length}")
JavaScript
import { GoogleGenAI } from "@google/genai";
async function main() {
const ai = new GoogleGenAI({});
const response = await ai.models.embedContent({
model: 'gemini-embedding-001',
content: 'What is the meaning of life?',
outputDimensionality: 768,
});
const embeddingLength = response.embedding.values.length;
console.log(`Length of embedding: ${embeddingLength}`);
}
main();
Go
package main
import (
"context"
"fmt"
"log"
"google.golang.org/genai"
)
func main() {
ctx := context.Background()
// The client uses Application Default Credentials.
// Authenticate with 'gcloud auth application-default login'.
client, err := genai.NewClient(ctx, nil)
if err != nil {
log.Fatal(err)
}
defer client.Close()
contents := []*genai.Content{
genai.NewContentFromText("What is the meaning of life?", genai.RoleUser),
}
result, err := client.Models.EmbedContent(ctx,
"gemini-embedding-001",
contents,
&genai.EmbedContentRequest{OutputDimensionality: 768},
)
if err != nil {
log.Fatal(err)
}
embedding := result.Embeddings[0]
embeddingLength := len(embedding.Values)
fmt.Printf("Length of embedding: %d\n", embeddingLength)
}
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent" \
-H "x-goog-api-key: YOUR_GEMINI_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"contents": [
{"parts": [{"text": "What is the meaning of life?"}]}
],
"embedding_config": {
"output_dimensionality": 768
}
}'
Example Output:
Length of embedding: 768
The 3072 dimension embedding is normalized. Normalized embeddings produce more accurate semantic similarity by comparing vector direction, not magnitude. For other dimensions, including 768 and 1536, you need to normalize the embeddings as follows:
Python
import numpy as np
from numpy.linalg import norm
embedding_values_np = np.array(embedding_obj.values)
normed_embedding = embedding_values_np / np.linalg.norm(embedding_values_np)
print(f"Normed embedding length: {len(normed_embedding)}")
print(f"Norm of normed embedding: {np.linalg.norm(normed_embedding):.6f}") # Should be very close to 1
Example Output:
Normed embedding length: 768
Norm of normed embedding: 1.000000
Use cases
Text embeddings are crucial for a variety of common AI use cases, such as:
- Retrieval-Augmented Generation (RAG): Embeddings enhance the quality of generated text by retrieving and incorporating relevant information into the context of a model.
Information Retrieval: Use embeddings to search for semantically similar text or documents given a piece of input text.
Anomaly detection: Comparing groups of embeddings can help identify hidden trends or outliers.
Classification: Automatically categorize text based on its content, such as sentiment analysis or spam detection
Clustering: An effective way to grasp relationships is to create clusters and visualizations of your embeddings.
Storing embeddings
As you take embeddings to production, it is common to use vector databases to efficiently store, index, and retrieve high-dimensional embeddings. Google Cloud offers managed data services that can be used for this purpose including BigQuery, AlloyDB, and Cloud SQL.
The following tutorials show how to use other third party vector databases with Gemini Embedding.
Embedding models
Generally available model
Legacy models
embedding-001
(Deprecating on August 14, 2025)text-embedding-004
(Deprecating on January 14, 2026)
Using embeddings
Unlike generative AI models that create new content, the Gemini Embedding model is only intended to transform the format of your input data into a numerical representation. While Google is responsible for providing an embedding model that transforms the format of your input data to the numerical-format requested, users retain full responsibility for the data they input and the resulting embeddings. By using the Gemini Embedding model you confirm that you have the necessary rights to any content that you upload. Do not generate content that infringes on others’ intellectual property or privacy rights. Your use of this service is subject to our Prohibited Use Policy and Google's Terms of Service.
Start building with embeddings
Check out the embeddings quickstart notebook to explore the model capabilities and learn how to customize and visualize your embeddings.