
Self-Hosted

Deployment

Spalce ships a self-hosted distribution for customers with strict residency, regulatory, or operational requirements. The self-hosted edition is functionally identical to our managed offering — same APIs, same dashboards, same SDKs — but you own the operational responsibility. This guide is the entry point for sizing, installing, and operating a self-hosted cluster.

Warning

Self-hosted is a serious commitment. If you do not have a 24/7 platform team, talk to us about a managed-in-region option before you start.

Reference topology

A minimum production deployment has three tiers: a Kubernetes cluster for stateless services, a managed Postgres database for transactional state, and a managed object store for blobs. We support AWS, GCP, and Azure as first-class targets, plus VMware vSphere for on-premises deployments. The same Helm chart works across all of them; only the values files differ.

bash

# Minimum recommended sizing for 10k req/min sustained
# Kubernetes cluster
  workers: 6 x (8 vCPU, 32 GiB RAM)
  pod CIDR: /16

# Postgres
  primary: 8 vCPU, 32 GiB RAM, 500 GiB io2
  replicas: 2 (same shape)

# Object store
  class: standard, lifecycle policy enabled
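
As a rough sanity check on the sizing above, the aggregate worker capacity and the per-worker load can be worked out with shell arithmetic. This is only a sketch: the variable names are illustrative, and the even spread of requests across workers is an assumption that the sizing table does not state.

```shell
#!/bin/sh
# Figures copied from the sizing table above.
workers=6
vcpu_per_worker=8
ram_gib_per_worker=32
req_per_min=10000

# Aggregate worker capacity.
total_vcpu=$(( workers * vcpu_per_worker ))
total_ram_gib=$(( workers * ram_gib_per_worker ))

# Sustained load per second (integer division rounds down).
req_per_sec=$(( req_per_min / 60 ))

# Assumption: requests spread evenly across workers.
req_per_sec_per_worker=$(( req_per_sec / workers ))

echo "cluster: ${total_vcpu} vCPU, ${total_ram_gib} GiB RAM"
echo "load: about ${req_per_sec} req/s total, about ${req_per_sec_per_worker} req/s per worker"
```

With the figures above this comes to 48 vCPU and 192 GiB of worker capacity for roughly 166 requests per second.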

Installing with Helm

We publish a signed Helm chart at oci://charts.spalce.dev/spalce. The chart manages every Spalce-owned workload, does not touch resources owned by your platform team, and supports both blue-green and rolling upgrades. We recommend installing into a dedicated namespace so that RBAC and network policies are easy to scope.

bash

kubectl create namespace spalce
helm install spalce oci://charts.spalce.dev/spalce \
  --namespace spalce \
  --version 4.12.0 \
  --values prod.values.yaml
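
After the install returns, it can help to confirm that the release is deployed and its pods are ready. The sketch below uses only standard `helm` and `kubectl` subcommands. The `run` and `verify_install` helpers, the `DRY_RUN` switch, and the 300-second timeout are illustrative assumptions, not part of the Spalce tooling. Note that `kubectl wait` on all pods will time out if the namespace contains completed Job pods, so narrow the selector if the chart runs any.

```shell
#!/bin/sh
set -eu

# Release and namespace names match the install command above.
RELEASE="${RELEASE:-spalce}"
NAMESPACE="${NAMESPACE:-spalce}"

# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: check release status, wait for pods, then list them.
verify_install() {
  run helm status "$RELEASE" --namespace "$NAMESPACE"
  # Assumed timeout; tune it to how long your images take to pull and start.
  run kubectl wait --for=condition=Ready pods --all \
    --namespace "$NAMESPACE" --timeout=300s
  run kubectl get pods --namespace "$NAMESPACE"
}
```

Calling `DRY_RUN=1 verify_install` prints the three commands without touching the cluster; calling `verify_install` runs them.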

Operating responsibilities

On a self-hosted deployment, you own four operational responsibilities that we own on the managed platform: backups, capacity planning, security patching, and upgrade scheduling. We provide playbooks, dashboards, and a quarterly health review with our SRE team, but you decide when a change ships.

Backups — Postgres point-in-time recovery plus daily snapshots, retained 30 days.
Capacity — review request volume, queue depth, and database IOPS weekly.
Patching — apply security fixes within 14 days for high severity, 30 for medium.
Upgrades — minor versions monthly, majors quarterly, with a maintenance window.
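
The patching windows above reduce to simple date arithmetic, which can be scripted for ticket automation. This is a minimal sketch, assuming GNU `date` (the `-d` flag is not available in BSD or macOS `date`) and assuming the clock starts on the day the fix is released, which the list above does not specify. The `patch_deadline` function name is hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the latest date by which a fix should be applied.
# Usage: patch_deadline <high|medium> <YYYY-MM-DD>
patch_deadline() {
  severity="$1"
  start_date="$2"
  case "$severity" in
    high)   days=14 ;;   # high severity: within 14 days
    medium) days=30 ;;   # medium severity: within 30 days
    *)
      echo "no patching window defined for severity: $severity" >&2
      return 1
      ;;
  esac
  # GNU date relative-date syntax; -u avoids daylight-saving surprises.
  date -u -d "$start_date +$days days" +%Y-%m-%d
}
```

For example, `patch_deadline high 2024-03-01` prints `2024-03-15`.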

Observability

Every service emits Prometheus metrics on /metrics, OpenTelemetry traces over OTLP, and structured JSON logs to stdout. We ship Grafana dashboards as code in the chart, plus a curated set of alerts that map to severity definitions. Hook them into your existing on-call rotation rather than building a parallel one.

# Drop-in alert rule for slow authorization path (Prometheus)
# File: alerts/authorization-slow.yaml
groups:
  - name: spalce.authorization
    rules:
      - alert: AuthorizationP99Slow
        expr: histogram_quantile(0.99,
          sum(rate(http_request_duration_seconds_bucket{job="auth"}[5m]))
          by (le)) > 0.2
        for: 10m
        labels: { severity: page }
        annotations:
          summary: "Authorization P99 above 200ms for 10m"
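
Before loading a rule file like this one, it is worth validating it with `promtool`, which ships with Prometheus. The sketch below wraps `promtool check rules` so it can run as a CI step. The `run` and `check_alert_rules` helpers and the `DRY_RUN` switch are illustrative assumptions; the file path comes from the comment in the rule above.

```shell
#!/bin/sh
set -eu

# Path taken from the "File:" comment in the alert rule above.
RULES_FILE="${RULES_FILE:-alerts/authorization-slow.yaml}"

# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: fail if the rule file has syntax or expression errors.
check_alert_rules() {
  run promtool check rules "$RULES_FILE"
}
```

In CI, call `check_alert_rules` after sourcing the script; a non-zero exit from `promtool` fails the step.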

Tip

Run our preflight CLI in CI for every infra PR. It catches 80% of misconfigurations before they reach apply.

Upgrades and rollback

A Helm upgrade can be rolled back to at least the immediately preceding version. Every release ships with a downgrade script and a migration plan. Most upgrades are forward-compatible at the database layer, but some require a brief read-only window. We post a release calendar 30 days in advance and never gate critical security fixes behind a feature flag.
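
At the Helm level, an upgrade and a rollback might look like the sketch below. It uses only standard `helm` subcommands and reuses the release name, namespace, chart URL, and values file from the install step. The `run`, `upgrade_release`, and `rollback_release` helpers, the `DRY_RUN` switch, and the 10-minute timeout are illustrative assumptions. `helm rollback` restores Kubernetes manifests only; it does not undo database migrations, so consult the release's downgrade script and migration plan first.

```shell
#!/bin/sh
set -eu

RELEASE="${RELEASE:-spalce}"
NAMESPACE="${NAMESPACE:-spalce}"
CHART="${CHART:-oci://charts.spalce.dev/spalce}"
VALUES="${VALUES:-prod.values.yaml}"

# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Upgrade to the chart version passed as $1 and wait for workloads to settle.
upgrade_release() {
  run helm upgrade "$RELEASE" "$CHART" \
    --namespace "$NAMESPACE" \
    --version "$1" \
    --values "$VALUES" \
    --wait --timeout 10m
}

# Show the revision history, then roll back to the Helm revision passed as $1.
rollback_release() {
  run helm history "$RELEASE" --namespace "$NAMESPACE"
  run helm rollback "$RELEASE" "$1" --namespace "$NAMESPACE" --wait
}
```

The argument to `rollback_release` is a Helm revision number from `helm history`, not a chart version.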

Have a question this didn't answer?

Talk to an engineer. We respond within one business day.

