Implement Ollama with ASP.NET Core API

Ravi Vishwakarma · 09 May 2026 · Updated 11 May 2026

Artificial Intelligence is rapidly becoming a core part of modern web applications. From chatbots and content generation to code assistants and document analysis, developers are integrating Large Language Models (LLMs) into applications faster than ever.

While cloud AI services like OpenAI and Google are popular, many developers now prefer running AI models locally for:

  • Better privacy
  • Lower API costs
  • Offline access
  • Faster experimentation
  • Full control over models

This is where Ollama becomes extremely useful.

In this article, you will learn how to integrate Ollama with an ASP.NET Core API and build your own AI-powered backend service.

What is Ollama?

Ollama is a lightweight tool that allows you to run Large Language Models locally on your machine.

It supports models like:

  • Llama 3
  • Mistral
  • Gemma
  • DeepSeek
  • Phi
  • CodeLlama

Ollama exposes a local REST API, making it easy to integrate with .NET applications.

Why Use Ollama with .NET?

Benefits of integrating Ollama with ASP.NET Core:

Feature               Benefit
Local AI Processing   No external API dependency
Privacy               Data stays on your machine
No Token Billing      No per-request charges
Easy Integration      REST-based API
Fast Development      Works with HttpClient
Offline Support       Internet not required

Prerequisites

Before starting, install:

  • .NET 8 SDK
  • Ollama
  • Visual Studio 2022 / VS Code

Install Ollama

Download Ollama from the official website:

https://ollama.com

After installation, verify:

ollama --version

Pull an AI Model

Example using Llama 3:

ollama pull llama3

Run the model:

ollama run llama3

Once Ollama is installed and running, it automatically serves a local REST API on:

http://localhost:11434
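You can confirm the server is reachable with a quick request from a terminal (this assumes the llama3 model pulled above):

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Hello",
  "stream": false
}'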

Create ASP.NET Core Web API

Create a new project:

dotnet new webapi -n OllamaApi

Move into the project directory:

cd OllamaApi

Create Request Model

Create Models/OllamaRequest.cs

using System.Text.Json.Serialization;

namespace OllamaApi.Models
{
    public class OllamaRequest
    {
        // Ollama's API expects lowercase JSON keys ("model", "prompt", "stream"),
        // so the C# properties are mapped explicitly.
        [JsonPropertyName("model")]
        public string Model { get; set; } = string.Empty;

        [JsonPropertyName("prompt")]
        public string Prompt { get; set; } = string.Empty;

        [JsonPropertyName("stream")]
        public bool Stream { get; set; } = false;
    }
}

Create Response Model

Create Models/OllamaResponse.cs

using System.Text.Json.Serialization;

namespace OllamaApi.Models
{
    public class OllamaResponse
    {
        // Maps the "response" field in Ollama's JSON reply.
        [JsonPropertyName("response")]
        public string Response { get; set; } = string.Empty;
    }
}

Create Ollama Service

Create Services/OllamaService.cs

using System.Text;
using System.Text.Json;
using OllamaApi.Models;

namespace OllamaApi.Services
{
    public class OllamaService
    {
        private readonly HttpClient _httpClient;

        public OllamaService(HttpClient httpClient)
        {
            _httpClient = httpClient;
        }

        public async Task<string> GenerateAsync(string prompt)
        {
            var request = new OllamaRequest
            {
                Model = "llama3",
                Prompt = prompt,
                Stream = false
            };

            var json = JsonSerializer.Serialize(request);

            var content = new StringContent(
                json,
                Encoding.UTF8,
                "application/json");

            // BaseAddress (http://localhost:11434/) is configured in Program.cs.
            var response = await _httpClient.PostAsync(
                "api/generate",
                content);

            response.EnsureSuccessStatusCode();

            var result = await response.Content.ReadAsStringAsync();

            // With Stream = false, Ollama returns a single JSON object whose
            // "response" property holds the complete completion text.
            var ollamaResponse =
                JsonSerializer.Deserialize<OllamaResponse>(result);

            return ollamaResponse?.Response ?? string.Empty;
        }
    }
}

Register HttpClient

Update Program.cs

using OllamaApi.Services;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();

builder.Services.AddHttpClient<OllamaService>(client =>
{
    client.BaseAddress =
        new Uri("http://localhost:11434/");
});

builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

var app = builder.Build();

app.UseSwagger();
app.UseSwaggerUI();

app.MapControllers();

app.Run();

Create API Controller

Create Controllers/AiController.cs

using Microsoft.AspNetCore.Mvc;
using OllamaApi.Services;

namespace OllamaApi.Controllers
{
    // Request DTO so a JSON body like { "prompt": "..." } binds cleanly.
    public record GenerateRequest(string Prompt);

    [ApiController]
    [Route("api/[controller]")]
    public class AiController : ControllerBase
    {
        private readonly OllamaService _ollamaService;

        public AiController(
            OllamaService ollamaService)
        {
            _ollamaService = ollamaService;
        }

        [HttpPost("generate")]
        public async Task<IActionResult> Generate(
            [FromBody] GenerateRequest request)
        {
            if (string.IsNullOrWhiteSpace(request.Prompt))
                return BadRequest(new { success = false, error = "Prompt is required." });

            var result =
                await _ollamaService.GenerateAsync(request.Prompt);

            return Ok(new
            {
                success = true,
                data = result
            });
        }
    }
}

Run the API

Start the ASP.NET Core project:

dotnet run

Open the Swagger UI (the exact port comes from launchSettings.json and may differ):

https://localhost:5001/swagger

Test API Request

Example request:

{
  "prompt": "Explain dependency injection in .NET"
}

Example response:

{
  "success": true,
  "data": "Dependency Injection is a design pattern..."
}
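You can also exercise the endpoint from a terminal (adjust the port to your launch profile; -k skips dev-certificate validation):

curl -k -X POST https://localhost:5001/api/ai/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Explain dependency injection in .NET"}'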

Advanced Features

You can extend the system with:

Feature               Description
Streaming Responses   Real-time token generation
Multi-Model Support   Switch models dynamically
Chat Memory           Store conversation history
Vector Database       Semantic search
RAG Pipeline          Retrieval-Augmented Generation
AI Agents             Automated workflows
Function Calling      Execute server functions

Using Streaming Responses

Ollama supports token streaming.

Request example:

{
  "model": "llama3",
  "prompt": "Write article about .NET",
  "stream": true
}
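As a rough sketch, the stream can be consumed inside OllamaService like this (an assumption-based extension of the service above; Ollama emits newline-delimited JSON chunks, each carrying a partial "response" token and a final "done": true):

public async IAsyncEnumerable<string> StreamAsync(string prompt)
{
    var request = new HttpRequestMessage(HttpMethod.Post, "api/generate")
    {
        Content = new StringContent(
            JsonSerializer.Serialize(new { model = "llama3", prompt, stream = true }),
            Encoding.UTF8,
            "application/json")
    };

    // ResponseHeadersRead lets us read chunks before the body completes.
    using var response = await _httpClient.SendAsync(
        request, HttpCompletionOption.ResponseHeadersRead);
    response.EnsureSuccessStatusCode();

    using var stream = await response.Content.ReadAsStreamAsync();
    using var reader = new StreamReader(stream);

    // Each line is one JSON chunk; yield tokens as they arrive.
    while (await reader.ReadLineAsync() is { } line)
    {
        if (string.IsNullOrWhiteSpace(line)) continue;

        using var doc = JsonDocument.Parse(line);
        if (doc.RootElement.TryGetProperty("response", out var token))
            yield return token.GetString() ?? string.Empty;

        if (doc.RootElement.TryGetProperty("done", out var done) && done.GetBoolean())
            yield break;
    }
}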

Streaming is useful for:

  • Chat applications
  • Live typing effect
  • AI assistants
  • Content generation tools

Best Practices

Use Background Services

For heavy AI tasks:

  • Queue requests
  • Use Hosted Services
  • Avoid blocking HTTP requests
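As a minimal sketch (the type names here are hypothetical, not part of the project above), requests can be queued with System.Threading.Channels and drained by a hosted worker:

using System.Threading.Channels;

// Bounded queue: controllers enqueue prompts instead of waiting on the model.
public class PromptQueue
{
    private readonly Channel<string> _channel =
        Channel.CreateBounded<string>(capacity: 100);

    public ValueTask EnqueueAsync(string prompt) =>
        _channel.Writer.WriteAsync(prompt);

    public IAsyncEnumerable<string> ReadAllAsync(CancellationToken ct) =>
        _channel.Reader.ReadAllAsync(ct);
}

// Hosted worker that processes prompts off the HTTP request path.
public class PromptWorker : BackgroundService
{
    private readonly PromptQueue _queue;
    private readonly OllamaService _ollamaService;

    public PromptWorker(PromptQueue queue, OllamaService ollamaService)
    {
        _queue = queue;
        _ollamaService = ollamaService;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        await foreach (var prompt in _queue.ReadAllAsync(stoppingToken))
        {
            // Persist or publish the result however your app needs.
            var result = await _ollamaService.GenerateAsync(prompt);
        }
    }
}

// In Program.cs:
// builder.Services.AddSingleton<PromptQueue>();
// builder.Services.AddHostedService<PromptWorker>();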

Add Timeout Handling

_httpClient.Timeout = TimeSpan.FromMinutes(5);
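With a typed client, the same timeout can be set once at registration in Program.cs:

builder.Services.AddHttpClient<OllamaService>(client =>
{
    client.BaseAddress = new Uri("http://localhost:11434/");
    // Local models can be slow, especially on first load.
    client.Timeout = TimeSpan.FromMinutes(5);
});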

Validate User Prompts

Prevent:

  • Prompt injection
  • Abuse
  • Extremely large requests
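In this project the simplest place for such checks is the Generate action; for example (the length cap is an illustrative assumption):

// Inside AiController.Generate, before calling the service:
if (string.IsNullOrWhiteSpace(request.Prompt))
    return BadRequest(new { success = false, error = "Prompt is required." });

if (request.Prompt.Length > 4000) // illustrative cap; tune per model
    return BadRequest(new { success = false, error = "Prompt is too long." });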

Cache Responses

Useful for:

  • Repeated prompts
  • SEO article generation
  • FAQ systems
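One option is a thin caching wrapper over the service using the built-in IMemoryCache (a sketch: the wrapper type is hypothetical, the cache must be registered with builder.Services.AddMemoryCache(), and the one-hour expiry is an arbitrary choice):

using Microsoft.Extensions.Caching.Memory;
using OllamaApi.Services;

public class CachedOllamaService
{
    private readonly OllamaService _inner;
    private readonly IMemoryCache _cache;

    public CachedOllamaService(OllamaService inner, IMemoryCache cache)
    {
        _inner = inner;
        _cache = cache;
    }

    public async Task<string> GenerateAsync(string prompt)
    {
        // Identical prompts reuse the cached completion for one hour.
        return await _cache.GetOrCreateAsync(prompt, async entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromHours(1);
            return await _inner.GenerateAsync(prompt);
        }) ?? string.Empty;
    }
}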

Real-World Use Cases

Use Case              Description
AI Chatbot            Customer support
Auto Blogging         Generate articles
Code Assistant        Generate code snippets
Q&A Platform          AI answer suggestions
Document Summarizer   Summarize PDFs
AI Search             Semantic search
Email Writer          Generate professional emails

Ollama vs Cloud AI APIs

Feature               Ollama                      Cloud APIs
Internet Required     No                          Yes
Cost                  Free                        Pay-per-token
Privacy               High                        Medium
Setup Complexity      Medium                      Easy
Scalability           Limited by local hardware   High
Offline Usage         Yes                         No

Conclusion

Integrating Ollama with ASP.NET Core is a powerful way to build AI-enabled applications without depending entirely on cloud providers.

With only a few lines of C# code, you can create:

  • AI chat systems
  • Auto content generators
  • AI search engines
  • Smart assistants
  • Code generation tools

For developers already working in the .NET ecosystem, Ollama provides a simple and cost-effective way to bring local AI capabilities into production-ready applications.

Official Resources

  • Ollama website: https://ollama.com
  • Ollama on GitHub: https://github.com/ollama/ollama
  • Ollama API reference: https://github.com/ollama/ollama/blob/main/docs/api.md

Ravi Vishwakarma


Ravi Vishwakarma is a dedicated Software Developer with a passion for crafting efficient and innovative solutions. With a keen eye for detail and years of experience, he excels in developing robust software systems that meet client needs. His expertise spans across multiple programming languages and technologies, making him a valuable asset in any software development project.