
Generating content with the Gemini API

Before you begin: Set up your project and API key

Before calling the Gemini API, you need to set up your project and configure your API key.

Get and secure your API key

You need an API key to call the Gemini API. If you don't already have one, create a key in RockAPI.

Create an API key

Install the SDK package and configure your API key

Note: This section shows setup steps for a local Python environment.

The Python SDK for the Gemini API is contained in the google-generativeai package.

  1. Install the dependency using pip:

    pip install -U google-generativeai
  2. Import the package and configure the service with your API key:

    import os

    import google.generativeai as genai
    from google.api_core.client_options import ClientOptions

    genai.configure(
        api_key=os.environ["ROCKAPI_API_KEY"],
        transport="rest",
        client_options=ClientOptions(api_endpoint="https://api.rockapi.ru/google-ai-studio"))

Generate text from text-only input

The simplest way to generate text using the Gemini API is to provide the model with a single text-only input, as shown in this example:

import os

import google.generativeai as genai
from google.api_core.client_options import ClientOptions

genai.configure(
    api_key=os.environ["ROCKAPI_API_KEY"],
    transport="rest",
    client_options=ClientOptions(api_endpoint="https://api.rockapi.ru/google-ai-studio"))

# Choose a model that's appropriate for your use case.
model = genai.GenerativeModel('gemini-1.5-flash')

prompt = "Write a story about a magic backpack."

response = model.generate_content(prompt)

print(response.text)

In this case, the prompt ("Write a story about a magic backpack") doesn't include any output examples, system instructions, or formatting information. It's a zero-shot approach. For some use cases, a one-shot or few-shot prompt might produce output that's more aligned with user expectations. In some cases, you might also want to provide system instructions to help the model understand the task or follow specific guidelines.
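A few-shot prompt simply embeds one or more worked input/output pairs in the prompt text before the actual task. A minimal sketch of how such a prompt can be assembled (the example pair and the helper function are invented for illustration, not part of the SDK):

```python
# Hypothetical example pair; in practice use real, high-quality samples.
examples = [
    ("Write a story about a magic backpack.",
     "Lila found the backpack on a rainy Tuesday, and it hummed softly..."),
]

def build_few_shot_prompt(examples, task):
    """Concatenate example prompt/response pairs ahead of the actual task."""
    parts = []
    for prompt_text, sample_output in examples:
        parts.append(f"Prompt: {prompt_text}\nStory: {sample_output}\n")
    # End with the real task so the model continues after "Story:".
    parts.append(f"Prompt: {task}\nStory:")
    return "\n".join(parts)

prompt = build_few_shot_prompt(examples, "Write a story about a talking umbrella.")
# Then pass the assembled prompt to the model as before:
# response = model.generate_content(prompt)
```

The assembled string is then sent exactly like the zero-shot prompt above; only the prompt content changes.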

Generate text from text-and-image input

The Gemini API supports multimodal inputs that combine text with media files. The following example shows how to generate text from text-and-image input:

import os
import pathlib

import google.generativeai as genai
from google.api_core.client_options import ClientOptions

genai.configure(
    api_key=os.environ["ROCKAPI_API_KEY"],
    transport="rest",
    client_options=ClientOptions(api_endpoint="https://api.rockapi.ru/google-ai-studio"))

# Choose a model that's appropriate for your use case.
model = genai.GenerativeModel('gemini-1.5-flash')

image1 = {
    'mime_type': 'image/jpeg',
    'data': pathlib.Path('image1.jpg').read_bytes()
}

image2 = {
    'mime_type': 'image/jpeg',
    'data': pathlib.Path('image2.jpg').read_bytes()
}

prompt = "What's different between these pictures?"

response = model.generate_content([prompt, image1, image2])

print(response.text)

As with text-only prompting, multimodal prompting can involve various approaches and refinements. Depending on the output from this example, you might want to add steps to the prompt or be more specific in your instructions. To learn more, see File prompting strategies.
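If you pass many images, the inline dictionaries above can be factored into a small helper. This is only a convenience sketch over the same `mime_type`/`data` structure shown in the example; the helper name is our own, and it falls back to JPEG when the extension is unrecognized:

```python
import mimetypes
import pathlib

def make_image_part(path):
    """Read an image file into the dict format used in the example above."""
    mime_type, _ = mimetypes.guess_type(path)
    return {
        'mime_type': mime_type or 'image/jpeg',  # fallback assumption
        'data': pathlib.Path(path).read_bytes(),
    }

# Usage with the same call as above:
# response = model.generate_content(
#     ["What's different between these pictures?",
#      make_image_part('image1.jpg'),
#      make_image_part('image2.jpg')])
```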

Generate a text stream

By default, the model returns a response only after completing the entire text generation process. For faster interactions, you can use streaming to handle partial results as they become available instead of waiting for the entire result.

The following example shows how to implement streaming using the streamGenerateContent method to generate text from a text-only input prompt.

import os

import google.generativeai as genai
from google.api_core.client_options import ClientOptions

genai.configure(
    api_key=os.environ["ROCKAPI_API_KEY"],
    transport="rest",
    client_options=ClientOptions(api_endpoint="https://api.rockapi.ru/google-ai-studio"))

# Choose a model that's appropriate for your use case.
model = genai.GenerativeModel('gemini-1.5-flash')

prompt = "Write a story about a magic backpack."

response = model.generate_content(prompt, stream=True)

for chunk in response:
    print(chunk.text)
    print("_" * 80)
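If you also need the complete text after streaming finishes, you can accumulate the chunks as they arrive. A minimal sketch (the helper name is our own; it works with any iterable of chunks that expose a `.text` attribute, as the streaming response does):

```python
def collect_stream(chunks):
    """Print each partial result as it arrives and return the full text."""
    pieces = []
    for chunk in chunks:
        print(chunk.text)
        pieces.append(chunk.text)
    return "".join(pieces)

# Usage with the streaming call shown above:
# full_text = collect_stream(model.generate_content(prompt, stream=True))
```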

Build an interactive chat

You can use the Gemini API to build interactive chat experiences for your users. Using the chat feature of the API lets you collect multiple rounds of questions and responses, allowing users to step incrementally toward answers or get help with multipart problems. This feature is ideal for applications that require ongoing communication, such as chatbots, interactive tutors, or customer support assistants.

The following code example shows a basic chat implementation:

import os

import google.generativeai as genai
from google.api_core.client_options import ClientOptions

genai.configure(
    api_key=os.environ["ROCKAPI_API_KEY"],
    transport="rest",
    client_options=ClientOptions(api_endpoint="https://api.rockapi.ru/google-ai-studio"))

model = genai.GenerativeModel('gemini-1.5-flash')
chat = model.start_chat(history=[])

response = chat.send_message(
    'In one sentence, explain how a computer works to a young child.'
)

print(response.text)

response = chat.send_message(
    'Okay, how about a more detailed explanation to a high schooler?'
)

print(response.text)
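The two fixed turns above generalize to a simple read-send-print loop. The sketch below injects the send function as a parameter so the loop itself has no API dependency; the function name and quit convention are our own:

```python
def chat_loop(send_message, read_input=input, quit_word="quit"):
    """Repeatedly read a user message, send it, and print the reply."""
    while True:
        user_text = read_input("> ")
        if user_text.strip().lower() == quit_word:
            break
        # send_message returns a response whose .text holds the reply,
        # matching chat.send_message in the example above.
        print(send_message(user_text).text)

# Usage with the chat session created above:
# chat_loop(chat.send_message)
```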

Configure text generation

Every prompt you send to the model includes parameters that control how the model generates responses. You can use GenerationConfig to configure these parameters. If you don't configure the parameters, the model uses default options, which can vary by model.

The following example shows how to configure two of the available options: temperature and maxOutputTokens.

model = genai.GenerativeModel(
    'gemini-1.5-flash',
    generation_config=genai.GenerationConfig(
        max_output_tokens=2000,
        temperature=0.9,
    )
)

temperature controls the randomness of the output. Use higher values for more creative responses, and lower values for more deterministic responses. Values range from 0.0 to 2.0.

maxOutputTokens sets the maximum number of tokens to include in a candidate.
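When the token limit is reached, the response is cut off, which you can detect from the candidate's finish reason. The attribute path (`response.candidates[0].finish_reason`) follows the google-generativeai SDK; the helper below is a sketch that compares by enum name to stay robust across SDK versions:

```python
def hit_token_limit(response):
    """Return True if generation stopped because max_output_tokens was reached."""
    finish_reason = response.candidates[0].finish_reason
    # finish_reason may be an enum; fall back to str() if it has no .name.
    name = getattr(finish_reason, "name", str(finish_reason))
    return name == "MAX_TOKENS"

# if hit_token_limit(response):
#     print("Output was truncated; consider raising max_output_tokens.")
```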

You can also configure individual calls to generateContent:

response = model.generate_content(
    'Write a story about a magic backpack.',
    generation_config=genai.GenerationConfig(
        max_output_tokens=1000,
        temperature=0.1,
    )
)

Any values set on the individual call override values on the model constructor.

What's next

This guide shows how to use generateContent and streamGenerateContent to generate text outputs from text-only and text-and-image inputs. To learn more about generating text using the Gemini API, see the following resources:

  • Prompting with media files: The Gemini API supports prompting with text, image, audio, and video data, also known as multimodal prompting.
  • System instructions: System instructions let you steer the behavior of the model based on your specific needs and use cases.
  • Safety guidance: Sometimes generative AI models produce unexpected outputs, such as outputs that are inaccurate, biased, or offensive. Post-processing and human evaluation are essential to limit the risk of harm from such outputs.