
SiliconFlow - AI Model Platform: AI Tool Tutorial and Review
Freemium
SiliconFlow positions itself as a comprehensive AI cloud platform designed to accelerate AI development by removing infrastructure complexity. The platform offers serverless inference, dedicated GPU resources, and fine-tuning capabilities for developers and enterprises building AI-powered applications. Its core value proposition centers on fast inference speeds, predictable pricing, and full OpenAI API compatibility, while supporting a diverse ecosystem of models including DeepSeek, Qwen, GLM, Kimi, MiniMax, and OpenAI's GPT series.
The platform serves multiple use cases spanning coding assistance, agentic workflows, retrieval-augmented generation (RAG), content generation across text/image/video, AI assistants, and intelligent search. Target audiences include AI startups seeking cost-effective model access, enterprise developers building production applications, researchers requiring high-performance inference, and teams needing to fine-tune models for specialized domains without managing underlying infrastructure.
Serverless Inference: Run any model instantly through a single API call without infrastructure setup, with automatic scaling to handle traffic spikes and pay-per-use billing that eliminates idle resource costs.
Dedicated GPU Endpoints: Reserve guaranteed compute resources including NVIDIA H100/H200 and AMD MI300 GPUs for stable, high-volume production workloads requiring isolated infrastructure and predictable performance.
One-Click Fine-Tuning: Customize powerful models to specific use cases by uploading datasets through UI or API, configuring training parameters, and deploying to production with integrated monitoring and metrics tracking.
AI Gateway: Access unified model routing with intelligent load balancing, rate limiting, and cost control mechanisms that simplify multi-model management and optimize spending across different providers.
Multimodal Model Support: Generate and process text, images, video, and audio through a single platform, including state-of-the-art models for image generation (FLUX), video generation (Wan2.2), and speech synthesis (Fish-Speech).
Full OpenAI Compatibility: Use existing OpenAI SDK code and integrations without modification, enabling seamless migration and reducing integration friction for teams already familiar with OpenAI's API patterns.
Elastic GPU Deployment: Deploy flexible function-as-a-service inference with reliable scaling that adapts to variable workloads without manual capacity planning or infrastructure management.
Privacy-First Architecture: Ensure no data storage occurs on platform servers, keeping proprietary training data and inference inputs under user control with enterprise-grade security isolation.
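The multi-model management idea behind the AI Gateway can be illustrated with a small client-side sketch that picks a model per task type. The model names come from this article; the routing logic and the exact platform identifiers are illustrative assumptions, since actual routing, load balancing, and rate limiting happen server-side.

```python
# Illustrative client-side model selection, mirroring the multi-model
# routing idea behind the AI Gateway. Model names are taken from this
# article; real platform identifiers may differ.
TASK_MODELS = {
    "reasoning": "DeepSeek-R1",       # advanced reasoning specialist
    "coding": "DeepSeek-V3.2",        # high-performance coding model
    "agentic": "GLM-5",               # agentic workflows
    "long-context": "Kimi-K2.5",      # 262K-context research and synthesis
}

def pick_model(task: str, default: str = "DeepSeek-V3.2") -> str:
    """Return a model name for the given task type, falling back to a default."""
    return TASK_MODELS.get(task, default)

print(pick_model("long-context"))  # Kimi-K2.5
print(pick_model("translation"))   # DeepSeek-V3.2 (fallback)
```

In practice the gateway applies rate limits and cost controls on top of this kind of selection, so client code only needs to name the model.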
Create an account: Sign up at cloud.siliconflow.com to receive $1 in free credits and access the developer dashboard.
Obtain API credentials: Generate your API key from the account dashboard, which will authenticate all requests to the platform's inference endpoints.
Select your deployment mode: Choose between serverless inference for flexible usage, reserved GPUs for predictable workloads, or fine-tuning for custom model training based on your application requirements.
Integrate the API: Use the OpenAI-compatible REST API or SDK with your existing code, simply changing the base URL and API key to point to SiliconFlow's endpoints.
Configure model and parameters: Specify your chosen model (such as DeepSeek-V3.2, GLM-5, or Kimi-K2.5), set context length requirements, and adjust inference parameters like temperature and max tokens.
Monitor usage and costs: Track token consumption, request volumes, and spending through the dashboard, setting monthly spending limits to prevent unexpected charges.
Scale and optimize: Adjust deployment configurations as usage patterns emerge, leveraging volume discounts for high-scale applications and contacting sales for custom enterprise arrangements.
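The model-and-parameter configuration step above can be sketched as an OpenAI-style chat completion body. The exact model identifier string is an assumption; the article names the model family (DeepSeek-V3.2), but the platform's identifier format may differ, so check the model catalog in the dashboard.

```python
import json

# Sketch of the "configure model and parameters" step as an
# OpenAI-compatible chat completion request body. The model ID string
# is an assumption; verify it against the platform's model catalog.
payload = {
    "model": "DeepSeek-V3.2",
    "messages": [
        {"role": "user", "content": "Explain retrieval-augmented generation briefly."}
    ],
    "temperature": 0.7,   # sampling randomness
    "max_tokens": 512,    # response length cap (model offers a 164K context)
}

body = json.dumps(payload)
print(body)
```

Because the API is OpenAI-compatible, this same body works unchanged with existing OpenAI client code once the base URL and API key are swapped.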
Superior inference speed: Achieve low response latency for both language and multimodal models through SiliconFlow's self-developed inference engine with end-to-end optimization, which matters most for real-time applications.
Transparent, competitive pricing: Pay only for actual usage with no hidden fees, minimum commitments, or upfront costs, with per-token rates significantly lower than direct provider pricing (e.g., DeepSeek-V3.2 at $0.27/M input tokens).
Zero infrastructure lock-in: Maintain full flexibility to switch between deployment modes, models, or even platforms entirely due to complete OpenAI API compatibility and no proprietary format requirements.
Comprehensive model ecosystem: Access cutting-edge open-source models from DeepSeek, Qwen, Z.ai, Moonshot AI, and MiniMax alongside commercial options through a single integration point, eliminating multi-vendor complexity.
Enterprise-grade reliability: Benefit from guaranteed GPU capacity for production workloads, automatic failover mechanisms, and isolated infrastructure that ensures consistent performance under demanding conditions.
Developer-centric experience: Reduce time-to-production with comprehensive documentation, code examples, and a unified API that eliminates learning curves when experimenting with new models or deployment strategies.
| Tier / Model | Price | Description |
|---|---|---|
| Serverless (Pay-per-use) | Variable per token/image/video | Input/output tokens priced per 1M tokens; images per generation; videos per creation; no minimum commitment; $1 free credits to start |
| DeepSeek-V3.2 | $0.27/M input, $0.42/M output | 164K context, high-performance reasoning and coding model |
| DeepSeek-R1 | $0.50/M input, $2.18/M output | 164K context, advanced reasoning specialist |
| GLM-5 | $0.30/M input, $2.55/M output | 205K context, state-of-the-art open-source agentic model |
| Kimi-K2.5 | $0.23/M input, $3.00/M output | 262K context, long-context leader for research and synthesis |
| FLUX 1.1 [pro] | $0.04/image | High-quality image generation from text prompts |
| Wan2.2-T2V-A14B | $0.29/video | Text-to-video generation with dynamic output |
| Reserved GPUs | Contact Sales | Guaranteed capacity with significant savings vs. on-demand for long-running workloads |
| Volume Discounts | Custom pricing | Available for high-usage customers with substantial token consumption |
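The per-token rates in the table above make back-of-the-envelope cost estimates straightforward. The sketch below hard-codes the rates as listed (per 1M tokens); actual billing may include volume discounts not modeled here.

```python
# Cost estimator using the per-1M-token rates from the pricing table above.
RATES = {  # model: (input $/M tokens, output $/M tokens)
    "DeepSeek-V3.2": (0.27, 0.42),
    "DeepSeek-R1": (0.50, 2.18),
    "GLM-5": (0.30, 2.55),
    "Kimi-K2.5": (0.23, 3.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for a given token volume, before any discounts."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# e.g. 2M input + 1M output tokens on DeepSeek-V3.2:
print(round(estimate_cost("DeepSeek-V3.2", 2_000_000, 1_000_000), 2))  # 0.96
```

Note the asymmetry: output tokens on the reasoning and agentic models (DeepSeek-R1, GLM-5, Kimi-K2.5) cost several times their input rate, so long generations dominate the bill.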
Documentation Portal: Access comprehensive API reference, integration guides, and code examples at docs.siliconflow.com covering all deployment modes and model-specific parameters.
Community Discord: Join the active developer community at discord.com/invite/7Ey3dVNFpT for peer support, implementation discussions, and platform announcements with typically fast response times from both users and staff.
Sales and Enterprise Support: Contact the sales team through siliconflow.com/contact for custom pricing, reserved GPU provisioning, volume discount negotiations, and dedicated technical account management for large-scale deployments.
Social Media and Blog: Follow updates on X/Twitter @SiliconFlowAI, LinkedIn, and Medium @siliconflowai for new model releases, feature announcements, and technical deep-dives.
Web Application: SiliconFlow operates as a cloud-native platform accessible directly through browser at cloud.siliconflow.com — no desktop or mobile client download is required.
API Integration: Access all services through REST API and OpenAI-compatible SDKs; comprehensive integration examples are provided in the documentation for Python, JavaScript, and other languages.
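The REST integration described above can be sketched with the standard library alone. The base URL, endpoint path, and model identifier below follow the usual OpenAI API layout and are assumptions; consult docs.siliconflow.com for the authoritative values.

```python
import json
import os
import urllib.request

# Assumed base URL following the OpenAI layout; verify against the docs.
BASE_URL = "https://api.siliconflow.com/v1"

def build_request(prompt: str,
                  model: str = "DeepSeek-V3.2",
                  api_key: str = "") -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request (not yet sent)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def chat(prompt: str) -> str:
    """Send the request and extract the assistant reply."""
    req = build_request(prompt, api_key=os.environ["SILICONFLOW_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# Only attempt a live call when a key is configured:
if os.environ.get("SILICONFLOW_API_KEY"):
    print(chat("Hello!"))
```

Teams already on the OpenAI SDK can skip the raw HTTP layer entirely and point the SDK's base URL and API key at these endpoints, per the compatibility guarantee above.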