Intelligent Image Generator

Turn casual prompts into professional-quality images with AI enhancement

18 stars, 5 forks, 9 releases

Overview

An MCP server that uses Google's Gemini models for image generation and editing through AI assistants such as Codex, Cursor, and Claude Code. It runs a two-stage pipeline: first, Gemini 2.0 Flash optimizes the prompt, adding lighting, composition, and atmosphere details while preserving the original intent; then Gemini 3 Pro Image generates the image from the enhanced prompt. The server supports 2K and 4K output and a range of aspect ratios (1:1, 16:9, 9:16, 21:9, and others). Existing images can be edited with natural language instructions, and edits are context-aware, preserving the source style and visual consistency. Output is saved as PNG, JPEG, or WebP to a configurable directory. Advanced options include multi-image blending, character consistency across variations, and world-knowledge integration; Google Search grounding can optionally be enabled for factual prompts. The server is set up through standard MCP configuration for Codex, Cursor, and Claude Code, with an emphasis on secure API key handling and clear path requirements.
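The two-stage flow described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the `PromptEnhancer` and `ImageGenerator` types, the `generateImage` function, and the fallback to the original prompt when enhancement fails are all assumptions made for the example. The two stages are passed in as functions so that no particular Gemini SDK call is implied.

```typescript
// Hypothetical function shapes; the real server's internals are not shown in this listing.
type PromptEnhancer = (prompt: string) => Promise<string>;
type ImageGenerator = (prompt: string) => Promise<Uint8Array>;

interface PipelineResult {
  originalPrompt: string;
  enhancedPrompt: string;
  image: Uint8Array;
}

// Stage 1 enriches the prompt; stage 2 renders the enriched prompt.
async function generateImage(
  prompt: string,
  enhance: PromptEnhancer,
  render: ImageGenerator,
): Promise<PipelineResult> {
  const trimmed = prompt.trim();
  if (trimmed.length === 0) {
    throw new Error("prompt must not be empty");
  }

  let enhancedPrompt = trimmed;
  try {
    enhancedPrompt = await enhance(trimmed);
  } catch {
    // Assumed behavior: if enhancement fails, continue with the original prompt.
  }

  const image = await render(enhancedPrompt);
  return { originalPrompt: trimmed, enhancedPrompt, image };
}
```

Injecting both stages keeps the sketch testable with stubs; a real implementation would wrap the model calls behind these two function types.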

Details

Owner
shinpr
Language
TypeScript
License
MIT License
Updated
2025-12-07

Features

AI-Powered Image Generation

Create images from text prompts using Gemini 3 Pro Image (Nano Banana Pro).

Intelligent Prompt Enhancement

Automatically optimizes prompts with Gemini 2.0 Flash, enriching lighting, composition, and atmosphere while preserving intent.

Image Editing

Transform existing images with natural language instructions, preserving style and visual consistency.

High-Resolution Output

Supports 2K and 4K image generation for fine detail and legible text rendering.

Flexible Aspect Ratios

Multiple aspect ratio options (1:1, 16:9, 9:16, 21:9, and more).
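As a rough illustration of how a `W:H` aspect ratio string maps to pixel dimensions, the sketch below parses the ratio and fits it to a chosen long edge. The helper names `parseAspectRatio` and `sizeForRatio` are hypothetical, and the long-edge values used in the usage lines (2048 and 4096) are assumed stand-ins for 2K and 4K; the actual dimensions the model returns may differ.

```typescript
interface Size {
  width: number;
  height: number;
}

// Parse "W:H" into two positive integers, or return null when malformed.
function parseAspectRatio(ratio: string): [number, number] | null {
  const match = /^(\d+):(\d+)$/.exec(ratio.trim());
  if (!match) return null;
  const w = Number(match[1]);
  const h = Number(match[2]);
  return w > 0 && h > 0 ? [w, h] : null;
}

// Give the longer side `longEdge` pixels and round the shorter side.
function sizeForRatio(ratio: string, longEdge: number): Size {
  const parsed = parseAspectRatio(ratio);
  if (!parsed) throw new Error(`invalid aspect ratio: ${ratio}`);
  const [w, h] = parsed;
  if (w >= h) {
    return { width: longEdge, height: Math.round((longEdge * h) / w) };
  }
  return { width: Math.round((longEdge * w) / h), height: longEdge };
}

// Usage (long-edge values are assumptions):
const landscape = sizeForRatio("16:9", 2048); // { width: 2048, height: 1152 }
const portrait = sizeForRatio("9:16", 2048);  // { width: 1152, height: 2048 }
```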

Advanced Options

Multi-image blending, character consistency, and world knowledge integration.

Multiple Output Formats

Exports images as PNG, JPEG, or WebP.

File Output

Images are saved to a configurable directory for easy access and integration.
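One way to picture the configurable output directory together with the three output formats is a small path builder. This is a sketch under stated assumptions: the `buildOutputPath` function, the slug-plus-timestamp naming scheme, and the extension mapping are hypothetical and do not describe the file names the server actually writes.

```typescript
type OutputFormat = "png" | "jpeg" | "webp";

// Assumed file extensions for each supported format.
const EXTENSIONS: Record<OutputFormat, string> = {
  png: ".png",
  jpeg: ".jpg",
  webp: ".webp",
};

// Build a file path under outputDir. The naming scheme here is hypothetical.
function buildOutputPath(
  outputDir: string,
  prompt: string,
  format: OutputFormat,
  timestamp: number,
): string {
  // Turn the prompt into a short, filesystem-safe slug.
  const slug =
    prompt
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .slice(0, 40)
      .replace(/^-+|-+$/g, "") || "image";
  // Drop trailing slashes so the joined path has exactly one separator.
  const dir = outputDir.replace(/[\\/]+$/, "");
  return `${dir}/${slug}-${timestamp}${EXTENSIONS[format]}`;
}

// Usage:
const examplePath = buildOutputPath("/tmp/images/", "A Red Fox, at dawn!", "jpeg", 1700000000000);
// "/tmp/images/a-red-fox-at-dawn-1700000000000.jpg"
```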

Audience

AI developers: Integrates image generation and editing into MCP-compatible assistants and toolchains such as Codex, Cursor, and Claude Code.

Tags

image generation, image editing, Gemini 3 Pro Image, Gemini 2.0 Flash, MCP server, Codex, Cursor, Claude Code, 2K, 4K, prompt enhancement, multi-image blending, aspect ratio, output formats, file output, AI tooling