The fastest way to get started is with the CPU image:

```bash
# With Docker
docker run -p 8080:8080 --name local-ai -ti localai/localai:latest

# Or with Podman
podman run -p 8080:8080 --name local-ai -ti localai/localai:latest
```

This will:

- Start LocalAI (you’ll need to install models separately)
- Make the API available at http://localhost:8080

For images with pre-configured models, see the All-in-One images below.
## Image Types

LocalAI provides several image types to suit different needs. These images work with both Docker and Podman.
### Standard Images

Standard images don’t include pre-configured models. Use these if you want to configure models manually.

#### CPU Image

```bash
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 localai/localai:latest
```
#### GPU Images

NVIDIA CUDA 13:

```bash
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-13

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 --device nvidia.com/gpu=all localai/localai:latest-gpu-nvidia-cuda-13
```

NVIDIA CUDA 12:

```bash
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 --device nvidia.com/gpu=all localai/localai:latest-gpu-nvidia-cuda-12
```

AMD GPU (ROCm):

```bash
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-gpu-hipblas

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 --device rocm.com/gpu=all localai/localai:latest-gpu-hipblas
```

Intel GPU:

```bash
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-intel

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 --device gpu.intel.com/all localai/localai:latest-gpu-intel
```

Vulkan:

```bash
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-vulkan

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-vulkan
```

NVIDIA Jetson (L4T ARM64):

CUDA 12 (for NVIDIA AGX Orin and similar platforms):

```bash
docker run -ti --name local-ai -p 8080:8080 --runtime nvidia --gpus all localai/localai:latest-nvidia-l4t-arm64
```

CUDA 13 (for NVIDIA DGX Spark):

```bash
docker run -ti --name local-ai -p 8080:8080 --runtime nvidia --gpus all localai/localai:latest-nvidia-l4t-arm64-cuda-13
```
### All-in-One (AIO) Images

Recommended for beginners: these images come pre-configured with models and backends, ready to use immediately.

#### CPU Image

```bash
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu
```

#### GPU Images

NVIDIA CUDA 13:

```bash
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-13

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 --device nvidia.com/gpu=all localai/localai:latest-aio-gpu-nvidia-cuda-13
```

NVIDIA CUDA 12:

```bash
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-12

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 --device nvidia.com/gpu=all localai/localai:latest-aio-gpu-nvidia-cuda-12
```

AMD GPU (ROCm):

```bash
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-aio-gpu-hipblas

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 --device rocm.com/gpu=all localai/localai:latest-aio-gpu-hipblas
```

Intel GPU:

```bash
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-gpu-intel

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 --device gpu.intel.com/all localai/localai:latest-aio-gpu-intel
```
## Using Compose

For a more manageable setup, especially with persistent volumes, use Docker Compose or Podman Compose:

```bash
docker compose up -d

# Or with Podman:
podman-compose up -d
```
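As a starting point, a minimal compose file for the AIO CPU image might look like the sketch below. The service name, host volume path, and restart policy are illustrative choices, not requirements:

```yaml
# compose.yaml — minimal sketch for running the AIO CPU image
services:
  local-ai:
    image: localai/localai:latest-aio-cpu
    container_name: local-ai
    ports:
      - "8080:8080"
    volumes:
      # Persist downloaded models on the host
      - ./models:/models
    restart: unless-stopped
```

Swap the `image` tag for one of the GPU variants above if you have accelerator hardware (GPU images additionally need the device flags shown earlier, expressed via `deploy` or `devices` in compose).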
## Persistent Storage

To persist models and configurations, mount a volume:

```bash
docker run -ti --name local-ai -p 8080:8080 \
  -v $PWD/models:/models \
  localai/localai:latest-aio-cpu

# Or with Podman:
podman run -ti --name local-ai -p 8080:8080 \
  -v $PWD/models:/models \
  localai/localai:latest-aio-cpu
```

Or use a named volume:

```bash
docker volume create localai-models
docker run -ti --name local-ai -p 8080:8080 \
  -v localai-models:/models \
  localai/localai:latest-aio-cpu

# Or with Podman:
podman volume create localai-models
podman run -ti --name local-ai -p 8080:8080 \
  -v localai-models:/models \
  localai/localai:latest-aio-cpu
```
## What’s Included in AIO Images

All-in-One images come pre-configured with:

- Text Generation: LLM models for chat and completion
- Image Generation: Stable Diffusion models
- Text to Speech: TTS models
- Speech to Text: Whisper models
- Embeddings: Vector embedding models
- Function Calling: Support for OpenAI-compatible function calling

The AIO images use OpenAI-compatible model names (like `gpt-4` and `gpt-4-vision-preview`) but are backed by open-source models. See the container images documentation for the complete mapping.
## Next Steps

After installation:

- Access the WebUI at http://localhost:8080
- Check available models: `curl http://localhost:8080/v1/models`
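Once the server is up, you can exercise the OpenAI-compatible API from any HTTP client. The sketch below builds a chat-completion request using only Python's standard library; the model name `gpt-4` assumes an AIO image (with a standard image, substitute a model you have installed), and the base URL assumes the default port mapping from the commands above:

```python
import json
import urllib.request

# OpenAI-compatible chat-completion payload. "gpt-4" is the alias
# the AIO images map to an open-source model; adjust as needed.
payload = {
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}],
}

def chat(base_url: str = "http://localhost:8080") -> dict:
    """POST the payload to LocalAI and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    try:
        reply = chat()
        print(reply["choices"][0]["message"]["content"])
    except OSError as exc:  # server not running, connection refused, etc.
        print(f"LocalAI not reachable: {exc}")
```

The same request shape works with any OpenAI client SDK by pointing its base URL at `http://localhost:8080/v1`.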
LocalAI can be built as a container image or as a single, portable binary. Note that some model architectures require Python libraries, which are not included in the binary.

LocalAI’s extensible architecture lets you add your own backends, which can be written in any language. For this reason, the container images also ship the Python dependencies needed to run all the available backends (for example, backends like Diffusers, which generate images and videos from text).

This section contains instructions on how to build LocalAI from source.
## Build LocalAI locally

### Requirements

In order to build LocalAI locally, you need the following:

- Golang >= 1.21
- GCC
- GRPC

To install the dependencies, follow the instructions below.

On macOS:

```bash
# Install Xcode from the App Store first, then:
brew install go protobuf protoc-gen-go protoc-gen-go-grpc wget
```

On Debian/Ubuntu:

```bash
apt install golang make protobuf-compiler-grpc
```

After you have Golang installed and working, install the binaries required for compiling the Golang protobuf components:

```bash
go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@1958fcbe2ca8bd93af633f11e97d44e567e945af
```
### Build

To build LocalAI with make:

```bash
git clone https://github.com/go-skynet/LocalAI
cd LocalAI
make build
```

This should produce the binary `local-ai`.
## Container image

Requirements:

- Docker, Podman, or another container engine

To build the LocalAI container image locally, you can use docker, for example:

```bash
docker build -t localai .
docker run localai
```
### Example: Build on Mac

Building on Mac (M1, M2 or M3) works, but you may need to install some prerequisites using brew. The steps below have been tested by one Mac user and found to work. Note that this doesn’t use Docker to run the server.

- Install Xcode from the App Store (needed for metalkit).
- If you encounter errors about a missing `metal` utility, install Xcode from the App Store.
- If, after installing Xcode, you still get the error `xcrun: error: unable to find utility "metal", not a developer tool or in PATH`, you may have installed the Xcode command line tools before installing Xcode itself; the standalone tools point to an incomplete SDK.
- If completions are slow, ensure that `gpu-layers` in your model YAML matches the number of layers of the model in use (or simply use a high number such as 256).
- If you get the compile error `error: only virtual member functions can be marked 'final'`, reinstall the necessary brew packages, clean the build, and try again:

```bash
brew reinstall go grpc protobuf wget
make clean
make build
```
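To illustrate the GPU-layers tuning mentioned above, a model configuration might look like the following sketch. The model name and file are placeholders, and the exact field name may vary between LocalAI versions and backends, so check the YAML of a model you have installed:

```yaml
# Illustrative model config; "my-model" and model.gguf are placeholders.
name: my-model
backend: llama-cpp
parameters:
  model: model.gguf
# Number of layers to offload to the GPU; a high value such as 256
# effectively offloads all layers.
gpu_layers: 256
```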
## Build backends

LocalAI has several backends available for installation from the backend gallery. Backends can also be built from source. Since backends vary in the languages and dependencies they require, this documentation provides generic guidance for a few of them, which can be applied with slight modifications to the others.

### Manually

Typically, each backend includes a Makefile which allows you to package the backend.

In the LocalAI repository, for instance, you can build a backend by doing:

```bash
git clone https://github.com/go-skynet/LocalAI.git
make -C LocalAI/backend/python/vllm
```
### With Docker

Building with Docker is simpler, as it abstracts away all the requirements and focuses on building the final OCI images that are available in the gallery. This also lets you, for instance, build a backend locally and install it with LocalAI. You can refer to Backends for general guidance on how to install and develop backends.

In the LocalAI repository, you can build a backend by doing:

```bash
git clone https://github.com/go-skynet/LocalAI.git
make docker-build-<backend-name>
```

Note that make is used only for convenience; in reality it just runs a simple docker command such as: