Integrate Ollama with an ASP.NET Core API

Artificial Intelligence is rapidly becoming a core part of modern web applications. From chatbots and content generation to code assistants and document analysis, developers are integrating Large Language Models (LLMs) into applications faster than ever.

While cloud AI services from providers such as OpenAI and Google are popular, many developers now prefer to run models locally for:

Better privacy
Lower API cost
Offline access
Faster experimentation
Full control over models

This is where Ollama becomes extremely useful.

In this article, you will learn how to integrate Ollama with an ASP.NET Core API and build your own AI-powered backend service.

What is Ollama?

Ollama is a lightweight tool that allows you to run Large Language Models locally on your machine.

It supports models like:

Llama 3
Mistral
Gemma
DeepSeek
Phi
CodeLlama

Ollama exposes a local REST API, making it easy to integrate with .NET applications.

Why Use Ollama with .NET?

Benefits of integrating Ollama with ASP.NET Core:

Feature	Benefit
Local AI Processing	No external API dependency
Privacy	Data stays on your machine
No Token Billing	No per-request charges
Easy Integration	REST-based API
Fast Development	Works with HttpClient
Offline Support	No internet needed once the model is downloaded

Prerequisites

Before starting, install:

.NET 8 SDK
Ollama
Visual Studio 2022 / VS Code

Install Ollama

Download Ollama from:

https://ollama.com

After installation, verify:

ollama --version

Pull an AI Model

Example using Llama 3:

ollama pull llama3

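Optionally, chat with the model in the terminal to confirm that it works (type /bye to exit):

ollama run llama3

The API does not depend on this interactive session. The Ollama desktop app or background service runs the API server. If it is not running, start it with ollama serve. By default it listens on:

http://localhost:11434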

Create ASP.NET Core Web API

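Create a new project. In the .NET 8 SDK, the --use-controllers option generates a controller-based project instead of minimal APIs:

dotnet new webapi --use-controllers -n OllamaApi

Move into the project folder:

cd OllamaApi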

Create Request Models

Create Models/OllamaRequest.cs

namespace OllamaApi.Models
{
    public class OllamaRequest
    {
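        // Initialized to avoid nullable warnings (CS8618) in .NET 8 projects.
        public string Model { get; set; } = string.Empty;
        public string Prompt { get; set; } = string.Empty;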
        public bool Stream { get; set; } = false;
    }
}
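A note on JSON casing: JsonSerializer.Serialize with default options writes property names exactly as declared (Model, Prompt, Stream), while the Ollama API documents them in lowercase. One way to make the wire format explicit is to annotate each property. The sketch below uses a hypothetical OllamaRequestDto class for illustration, not the article's OllamaRequest itself:

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Serialize a sample request and print the JSON that would be sent.
var sampleJson = JsonSerializer.Serialize(
    new OllamaRequestDto { Model = "llama3", Prompt = "Hello" });
Console.WriteLine(sampleJson);

// Hypothetical variant of OllamaRequest with explicit JSON names.
public class OllamaRequestDto
{
    [JsonPropertyName("model")]
    public string Model { get; set; } = string.Empty;

    [JsonPropertyName("prompt")]
    public string Prompt { get; set; } = string.Empty;

    [JsonPropertyName("stream")]
    public bool Stream { get; set; }
}
```

An alternative with the same effect is to serialize with new JsonSerializerOptions(JsonSerializerDefaults.Web), which applies camelCase naming to every property.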

Create Response Model

Create Models/OllamaResponse.cs

namespace OllamaApi.Models
{
    public class OllamaResponse
    {
        public string Response { get; set; } = string.Empty;
    }
}
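A non-streaming reply from /api/generate carries more than the generated text: it also includes fields such as model, created_at and done. The sketch below parses a trimmed sample reply; the values are made up for illustration, and GenerateReply is a hypothetical class name:

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Trimmed sample reply with illustrative values.
const string sampleReply =
    "{\"model\":\"llama3\",\"created_at\":\"2024-01-01T00:00:00Z\"," +
    "\"response\":\"Hello!\",\"done\":true}";

var parsed = JsonSerializer.Deserialize<GenerateReply>(sampleReply);
Console.WriteLine($"{parsed?.Response} (done: {parsed?.Done})");

// Hypothetical reply type; JSON fields without a matching property are ignored.
public class GenerateReply
{
    [JsonPropertyName("model")]
    public string Model { get; set; } = string.Empty;

    [JsonPropertyName("response")]
    public string Response { get; set; } = string.Empty;

    [JsonPropertyName("done")]
    public bool Done { get; set; }
}
```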

Create Ollama Service

Create Services/OllamaService.cs

using System.Net.Http.Json;
using OllamaApi.Models;

namespace OllamaApi.Services
{
    public class OllamaService
    {
        private readonly HttpClient _httpClient;

        public OllamaService(HttpClient httpClient)
        {
            _httpClient = httpClient;
        }

        public async Task<string> GenerateAsync(
            string prompt,
            CancellationToken cancellationToken = default)
        {
            var request = new OllamaRequest
            {
                Model = "llama3",
                Prompt = prompt,
                Stream = false
            };

            // PostAsJsonAsync serializes with web defaults, so the body
            // uses camelCase names: "model", "prompt", "stream".
            using var response = await _httpClient.PostAsJsonAsync(
                "api/generate",
                request,
                cancellationToken);

            // Throws HttpRequestException on a non-2xx status,
            // for example when the model has not been pulled.
            response.EnsureSuccessStatusCode();

            // Web defaults match the "response" field to the
            // Response property case-insensitively.
            var result = await response.Content
                .ReadFromJsonAsync<OllamaResponse>(cancellationToken);

            return result?.Response ?? string.Empty;
        }
    }
}
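To exercise a service like this without a running Ollama instance, one option is to hand HttpClient a stub message handler that returns a canned reply. The sketch below is an assumption about how such a test double could look; StubOllamaHandler is a hypothetical name and the reply text is made up:

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Json;
using System.Text;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

var handler = new StubOllamaHandler();
using var client = new HttpClient(handler)
{
    BaseAddress = new Uri("http://localhost:11434/")
};

// No network traffic happens here: the stub answers the request.
using var reply = await client.PostAsJsonAsync(
    "api/generate",
    new { model = "llama3", prompt = "Hi", stream = false });

using var doc = JsonDocument.Parse(await reply.Content.ReadAsStringAsync());
Console.WriteLine(doc.RootElement.GetProperty("response").GetString());
Console.WriteLine(handler.LastRequestBody);

// Hypothetical test double: records the request body and returns a fixed reply.
public sealed class StubOllamaHandler : HttpMessageHandler
{
    public string LastRequestBody { get; private set; } = string.Empty;

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        LastRequestBody = request.Content is null
            ? string.Empty
            : await request.Content.ReadAsStringAsync(cancellationToken);

        return new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent(
                "{\"response\":\"stubbed answer\",\"done\":true}",
                Encoding.UTF8,
                "application/json")
        };
    }
}
```

In the article's project, the same client could be passed to new OllamaService(client) so that GenerateAsync runs against the stub.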

Register HttpClient

Update Program.cs

using OllamaApi.Services;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();

builder.Services.AddHttpClient<OllamaService>(client =>
{
    client.BaseAddress =
        new Uri("http://localhost:11434/");
});

builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

var app = builder.Build();

app.UseSwagger();
app.UseSwaggerUI();

app.MapControllers();

app.Run();
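The trailing slash in the base address matters because HttpClient resolves relative request URIs against BaseAddress using standard URI resolution rules. A small sketch of that behavior follows; gateway.local is a made-up host standing in for a reverse proxy in front of Ollama, and EndpointResolver is a hypothetical helper:

```csharp
using System;

// Plain local install: the path is simply appended.
Console.WriteLine(EndpointResolver.Resolve("http://localhost:11434/", "api/generate"));

// Without a trailing slash, the last path segment ("ollama") is replaced.
Console.WriteLine(EndpointResolver.Resolve("http://gateway.local/ollama", "api/generate"));

// With a trailing slash, the segment is kept.
Console.WriteLine(EndpointResolver.Resolve("http://gateway.local/ollama/", "api/generate"));

public static class EndpointResolver
{
    // Combines a base address and a relative path the way System.Uri does.
    public static string Resolve(string baseAddress, string relativePath) =>
        new Uri(new Uri(baseAddress), relativePath).AbsoluteUri;
}
```

For the same reason, the relative path passed to the client ("api/generate") is best written without a leading slash.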

Create API Controller

Create Controllers/AiController.cs

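using Microsoft.AspNetCore.Mvc;
using OllamaApi.Services;

namespace OllamaApi.Controllers
{
    // Shape of the JSON request body: { "prompt": "..." }
    public record GenerateRequest(string Prompt);

    [ApiController]
    [Route("api/[controller]")]
    public class AiController : ControllerBase
    {
        private readonly OllamaService _ollamaService;

        public AiController(
            OllamaService ollamaService)
        {
            _ollamaService = ollamaService;
        }

        [HttpPost("generate")]
        public async Task<IActionResult> Generate(
            [FromBody] GenerateRequest request)
        {
            // Reject empty prompts before calling the model.
            if (string.IsNullOrWhiteSpace(request.Prompt))
            {
                return BadRequest(new
                {
                    success = false,
                    error = "Prompt is required."
                });
            }

            var result =
                await _ollamaService.GenerateAsync(request.Prompt);

            return Ok(new
            {
                success = true,
                data = result
            });
        }
    }
}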

Run the API

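Start the ASP.NET Core project:

dotnet run

Swagger UI is served under /swagger on the address that dotnet run prints to the console. The port comes from Properties/launchSettings.json, so it may differ from this example:

https://localhost:5001/swagger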

Test API Request

Send a POST request to /api/ai/generate with a JSON body:

{
  "prompt": "Explain dependency injection in .NET"
}

Example response:

{
  "success": true,
  "data": "Dependency Injection is a design pattern..."
}

Advanced Features

You can extend the system with:

Feature	Description
Streaming Responses	Real-time token generation
Multi-Model Support	Switch models dynamically
Chat Memory	Store conversation history
Vector Database	Semantic search
RAG Pipeline	Retrieval-Augmented Generation
AI Agents	Automated workflows
Function Calling	Execute server functions
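
As one illustration of the chat memory row, Ollama's /api/chat endpoint accepts the whole conversation as a messages array, so "memory" can be as simple as a list that grows with each turn. A minimal sketch under that assumption; ChatHistory, ChatMessage and ChatRequest are hypothetical names, not part of the project above:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

var history = new ChatHistory();
history.AddUser("What is dependency injection?");
history.AddAssistant("A way to supply a class with its dependencies.");
history.AddUser("Show a short example.");

// Prints the JSON body that would be posted to /api/chat.
Console.WriteLine(history.ToRequestJson("llama3"));

public record ChatMessage(string Role, string Content);

public record ChatRequest(
    string Model,
    IReadOnlyList<ChatMessage> Messages,
    bool Stream);

public sealed class ChatHistory
{
    // Web defaults write camelCase names: model, messages, role, content, stream.
    private static readonly JsonSerializerOptions Options =
        new(JsonSerializerDefaults.Web);

    private readonly List<ChatMessage> _messages = new();

    public void AddUser(string text) =>
        _messages.Add(new ChatMessage("user", text));

    public void AddAssistant(string text) =>
        _messages.Add(new ChatMessage("assistant", text));

    // Builds a request body containing every turn so far.
    public string ToRequestJson(string model) =>
        JsonSerializer.Serialize(
            new ChatRequest(model, _messages, Stream: false),
            Options);
}
```

A real application would also cap or summarize the history, since every stored turn is sent again with each request.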

Using Streaming Responses

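Ollama supports token streaming. When "stream" is true (the API default when the field is omitted), /api/generate returns a sequence of newline-delimited JSON objects. Each object carries a fragment of the text in "response", and the final one has "done": true.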

Request example:

{
  "model": "llama3",
  "prompt": "Write an article about .NET",
  "stream": true
}

Streaming is useful for:

Chat applications
Live typing effect
AI assistants
Content generation tools
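
A streamed reply arrives as one JSON object per line, so the client reads line by line and appends each fragment. The sketch below shows that parsing step on a simulated stream with made-up fragments; StreamParser is a hypothetical helper name:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Simulated stream: three newline-delimited chunks (illustrative values).
var simulated = string.Join('\n',
    "{\"response\":\"Hel\",\"done\":false}",
    "{\"response\":\"lo\",\"done\":false}",
    "{\"response\":\"\",\"done\":true}");

Console.WriteLine(StreamParser.Collect(new StringReader(simulated)));

public static class StreamParser
{
    // Yields each text fragment until a chunk reports done = true.
    public static IEnumerable<string> ReadFragments(TextReader reader)
    {
        for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
        {
            if (string.IsNullOrWhiteSpace(line))
            {
                continue;
            }

            using var doc = JsonDocument.Parse(line);
            var root = doc.RootElement;

            if (root.TryGetProperty("response", out var fragment))
            {
                yield return fragment.GetString() ?? string.Empty;
            }

            if (root.TryGetProperty("done", out var done) && done.GetBoolean())
            {
                yield break;
            }
        }
    }

    // Joins all fragments into the complete text.
    public static string Collect(TextReader reader) =>
        string.Concat(ReadFragments(reader));
}
```

Against a live server, the same method could read from a StreamReader over the response stream, with the request sent using HttpCompletionOption.ResponseHeadersRead so the body is not buffered first.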

Best Practices

Use Background Services

For heavy AI tasks:

Queue requests
Use Hosted Services
Avoid blocking HTTP requests
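
One possible shape for such a queue is a bounded channel: the HTTP endpoint enqueues the prompt and returns a job id immediately, while a worker loop drains the channel and stores results for later lookup. This is a sketch under stated assumptions; PromptJobQueue is a hypothetical class, the generate delegate stands in for the call to Ollama, and in ASP.NET Core the worker loop would typically live in a BackgroundService:

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

var jobs = new PromptJobQueue(capacity: 100);
var worker = jobs.RunWorkerAsync(p => Task.FromResult($"echo: {p}"));

var jobId = await jobs.EnqueueAsync("Explain DI");
jobs.Complete();   // no more work: lets the worker loop finish
await worker;
Console.WriteLine(jobs.GetResult(jobId));

public sealed class PromptJobQueue
{
    private readonly Channel<(Guid Id, string Prompt)> _channel;
    private readonly ConcurrentDictionary<Guid, string> _results = new();

    public PromptJobQueue(int capacity)
    {
        // A bounded channel applies back-pressure when the queue is full.
        _channel = Channel.CreateBounded<(Guid Id, string Prompt)>(capacity);
    }

    // Called by the endpoint: queues the prompt and returns a job id.
    public async ValueTask<Guid> EnqueueAsync(
        string prompt, CancellationToken ct = default)
    {
        var id = Guid.NewGuid();
        await _channel.Writer.WriteAsync((id, prompt), ct);
        return id;
    }

    // Returns the finished text, or null while the job is pending.
    public string? GetResult(Guid id) =>
        _results.TryGetValue(id, out var text) ? text : null;

    public void Complete() => _channel.Writer.Complete();

    // Processes queued prompts one at a time until the channel completes.
    public async Task RunWorkerAsync(
        Func<string, Task<string>> generate, CancellationToken ct = default)
    {
        await foreach (var (id, prompt) in _channel.Reader.ReadAllAsync(ct))
        {
            _results[id] = await generate(prompt);
        }
    }
}
```

The sketch keeps results in memory forever and does not record failures; a production version would need expiry and error states.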

Add Timeout Handling

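HttpClient times out after 100 seconds by default, which a long generation can exceed. Raise the limit inside the AddHttpClient callback in Program.cs:

client.Timeout = TimeSpan.FromMinutes(5);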
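
A client-wide timeout can also be combined with a shorter per-call limit by passing a cancellation token that cancels after a deadline. The sketch below wraps any cancellable operation this way and returns a fallback when the deadline passes; Timeouts.WithTimeoutAsync is a hypothetical helper, and the delayed task stands in for a slow model call:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// The simulated call takes 5 seconds, but the limit is 100 ms.
var text = await Timeouts.WithTimeoutAsync(
    async ct =>
    {
        await Task.Delay(TimeSpan.FromSeconds(5), ct);
        return "finished";
    },
    TimeSpan.FromMilliseconds(100),
    fallback: "timed out");

Console.WriteLine(text);

public static class Timeouts
{
    // Runs the operation with a token that cancels after the given timeout.
    public static async Task<string> WithTimeoutAsync(
        Func<CancellationToken, Task<string>> operation,
        TimeSpan timeout,
        string fallback)
    {
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            return await operation(cts.Token);
        }
        catch (OperationCanceledException)
        {
            // Covers TaskCanceledException, which derives from it.
            return fallback;
        }
    }
}
```

Whether to return a fallback or surface an error status to the caller is a design choice; the fallback here only keeps the example short.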

Validate User Prompts

Prevent:

Prompt injection
Abuse
Extremely large requests
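
The size check is the easiest of these to automate. The sketch below rejects empty and oversized prompts before they reach the model; PromptValidator is a hypothetical helper and the 4,000-character limit is an arbitrary assumption to tune for your model. A length check does not by itself stop prompt injection or abuse, which need measures such as rate limiting and careful handling of model output:

```csharp
#nullable enable
using System;

Console.WriteLine(PromptValidator.TryValidate("Explain DI", out var error1) + " " + error1);
Console.WriteLine(PromptValidator.TryValidate("   ", out var error2) + " " + error2);

public static class PromptValidator
{
    // Arbitrary illustrative limit, not an Ollama requirement.
    public const int MaxLength = 4000;

    // Returns true for a usable prompt; otherwise sets a reason in error.
    public static bool TryValidate(string? prompt, out string error)
    {
        if (string.IsNullOrWhiteSpace(prompt))
        {
            error = "Prompt is required.";
            return false;
        }

        if (prompt.Length > MaxLength)
        {
            error = $"Prompt must be at most {MaxLength} characters.";
            return false;
        }

        error = string.Empty;
        return true;
    }
}
```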

Cache Responses

Useful for:

Repeated prompts
SEO article generation
FAQ systems
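
A minimal in-memory version of this idea keys each entry on a hash of the model name and prompt, so an identical request reuses the earlier answer. This is only a sketch: PromptCache is a hypothetical class, entries never expire, and a failed generation would be cached as well. In an ASP.NET Core app, IMemoryCache with an expiration policy is a more likely fit:

```csharp
using System;
using System.Collections.Concurrent;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;

var calls = 0;
Task<string> FakeGenerate(string prompt)
{
    calls++;   // counts how often the "model" is actually invoked
    return Task.FromResult($"answer to: {prompt}");
}

var cache = new PromptCache();
var first = await cache.GetOrAddAsync("llama3", "What is DI?", FakeGenerate);
var second = await cache.GetOrAddAsync("llama3", "What is DI?", FakeGenerate);
Console.WriteLine($"same answer: {first == second}, model calls: {calls}");

public sealed class PromptCache
{
    // Lazy ensures the generate delegate runs once per stored entry.
    private readonly ConcurrentDictionary<string, Lazy<Task<string>>> _entries = new();

    public Task<string> GetOrAddAsync(
        string model, string prompt, Func<string, Task<string>> generate)
    {
        var key = BuildKey(model, prompt);
        return _entries
            .GetOrAdd(key, _ => new Lazy<Task<string>>(() => generate(prompt)))
            .Value;
    }

    // Hashing keeps keys short even for very long prompts.
    private static string BuildKey(string model, string prompt)
    {
        var bytes = SHA256.HashData(Encoding.UTF8.GetBytes(model + "\n" + prompt));
        return Convert.ToHexString(bytes);
    }
}
```

Caching only makes sense when a repeated prompt should give the same answer; with a non-zero sampling temperature the model would otherwise vary its output between calls.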

Real-World Use Cases

Use Case	Description
AI Chatbot	Customer support
Auto Blogging	Generate articles
Code Assistant	Generate code snippets
Q&A Platform	AI answer suggestions
Document Summarizer	Summarize PDFs
AI Search	Semantic search
Email Writer	Generate professional emails

Ollama vs Cloud AI APIs

Feature	Ollama	Cloud APIs
Internet Required	Only to download models	Yes
Cost	No per-token fees (you pay for hardware)	Pay-per-token
Privacy	High (data stays local)	Depends on the provider
Setup Complexity	Medium	Easy
Scalability	Local hardware limited	High
Offline Usage	Yes	No

Conclusion

Integrating Ollama with ASP.NET Core is a powerful way to build AI-enabled applications without depending entirely on cloud providers.

With only a few lines of C# code, you can create:

AI chat systems
Auto content generators
AI search engines
Smart assistants
Code generation tools

For developers already working in the .NET ecosystem, Ollama provides a simple and cost-effective way to bring local AI capabilities into production-ready applications.
