Platform-wide patterns that every service implements consistently. These are not optional — they are contracts enforced through the ss3-quarkus extension, shared proto schemas, and code review.
Audit Trail
All state-mutating actions across the platform are recorded in a dedicated audit-service. Domain services never write to audit storage directly. Instead, they publish a structured audit.event Kafka event; audit-service consumes and persists it to an append-only PostgreSQL store.
Every audit record captures: actor identity (actor_id, actor_type), action name, target entity (type + ID), originating service, OTel trace_id, timestamp, and optional before/after state snapshots.
GDPR interaction: audit records are retained on customer erasure. The action record itself is kept for financial and compliance integrity. Only the actor_id of a CUSTOMER-type actor is replaced with the sentinel value ANONYMISED. Staff actor identities are never anonymised — staff actions on financial records must remain attributable.
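The erasure rule above can be sketched in plain Java. This is a minimal illustration, not the audit-service implementation: the `AuditRecord` shape, the `ActorType` enum (including the `STAFF` value), and the `anonymiseFor` method are hypothetical names derived from the fields listed earlier, and snapshots are omitted for brevity.

```java
import java.time.Instant;

// Hypothetical actor types; the source only names CUSTOMER explicitly.
enum ActorType { CUSTOMER, STAFF }

// Hypothetical record shape mirroring the audit fields listed above.
record AuditRecord(
        String actorId,
        ActorType actorType,
        String action,
        String targetType,
        String targetId,
        String originService,
        String traceId,
        Instant timestamp) {

    static final String ANONYMISED = "ANONYMISED";

    // Returns a copy with actor_id replaced by the sentinel, but only when the
    // record was produced by the erased customer. Staff records are untouched.
    AuditRecord anonymiseFor(String erasedCustomerId) {
        boolean matches = actorType == ActorType.CUSTOMER
                && actorId.equals(erasedCustomerId);
        if (!matches) {
            return this;
        }
        return new AuditRecord(ANONYMISED, actorType, action, targetType,
                targetId, originService, traceId, timestamp);
    }
}
```

Everything except `actorId` survives the call, which matches the requirement that the action record itself is kept.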
PII in snapshots: before/after state snapshots must never contain raw PII. Publishers are responsible for stripping PII fields using the @AuditExclude annotation and SnapshotBuilder utility provided by the ss3-quarkus extension.
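To illustrate the idea of annotation-driven PII stripping, here is a dependency-free sketch. The `AuditExclude` annotation declared below is a local stand-in, and `SnapshotSketch` and `CustomerView` are hypothetical; the real annotation and `SnapshotBuilder` ship with the ss3-quarkus extension and their actual signatures may differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Local stand-in for the extension's annotation (assumption, not the real definition).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface AuditExclude {}

// Hypothetical entity view used only for this illustration.
class CustomerView {
    String id = "cust-1";
    String tier = "GOLD";
    @AuditExclude String email = "jane@example.com"; // raw PII, must not be snapshotted
}

final class SnapshotSketch {
    // Copies every instance field that is not marked @AuditExclude into a map.
    static Map<String, Object> of(Object entity) {
        Map<String, Object> snapshot = new LinkedHashMap<>();
        for (Field field : entity.getClass().getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue; // skip compiler- or tool-generated and static fields
            }
            if (field.isAnnotationPresent(AuditExclude.class)) {
                continue; // excluded PII field
            }
            field.setAccessible(true);
            try {
                snapshot.put(field.getName(), field.get(entity));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return snapshot;
    }
}
```

Calling `SnapshotSketch.of(new CustomerView())` yields a map containing `id` and `tier` but not `email`.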
See Audit Service for the full service specification.
Soft Delete
Entities with financial, GDPR, or referential integrity implications use a deleted_at TIMESTAMPTZ column (NULL = live, non-null = soft-deleted). There is no is_deleted boolean column anywhere on the platform.
A Hibernate @Filter named excludeDeleted (with default condition deleted_at IS NULL) is declared on all soft-deletable entities. The ss3-quarkus extension activates this filter automatically on every request. Admin queries that need to include soft-deleted records must disable it explicitly.
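The effect of the filter can be pictured with a plain-Java analogy. This sketch does not use Hibernate; `ProductRow` and `SoftDeleteQuery` are hypothetical names, and the `includeDeleted` flag stands in for an admin query that explicitly disables `excludeDeleted`.

```java
import java.time.Instant;
import java.util.List;

// Hypothetical row: deletedAt == null means live, non-null means soft-deleted.
record ProductRow(String id, Instant deletedAt) {
    boolean isLive() {
        return deletedAt == null;
    }
}

final class SoftDeleteQuery {
    // includeDeleted = false mirrors the default "deleted_at IS NULL" condition;
    // includeDeleted = true mirrors an admin query with the filter disabled.
    static List<ProductRow> find(List<ProductRow> rows, boolean includeDeleted) {
        return rows.stream()
                .filter(row -> includeDeleted || row.isLive())
                .toList();
    }
}
```

The default path never sees soft-deleted rows, so ordinary request handlers need no `deleted_at` predicate of their own.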
Hard deletes apply only to records with no downstream references and no compliance requirement: short-lived records such as superseded sandbox drafts, expired OTP codes, and cart session artifacts.
| Entity category | Strategy | Reason |
|---|---|---|
| Products, variants, price lists | Soft | Referenced by historical order snapshots |
| Customers, addresses | Soft | GDPR retention + order history integrity |
| Orders, order items | Soft | Financial records |
| Staff accounts | Soft | Audit trail attribution |
| Promotions, gift cards | Soft | Referenced by applied discounts |
| Superseded sandbox drafts | Hard | No references; TTL-cleaned |
| OTP / verification codes | Hard | Short-lived; no references |
Error Response Format
All REST error responses use RFC 9457 Problem Details (application/problem+json). The Quarkus extension quarkus-resteasy-problem maps exceptions to this format automatically.

    {
      "type": "https://errors.shopstar.io/catalog/product-not-found",
      "title": "Product Not Found",
      "status": 404,
      "detail": "Product with id=abc123 does not exist in store=store-1.",
      "instance": "/api/v1/products/abc123"
    }

Validation errors extend the base shape with a violations array. Rate-limited responses (429) include a Retry-After header.
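As a rough illustration of how a validation error might extend the base shape, the sketch below serialises a problem document by hand. The `Problem` and `Violation` records are hypothetical, and the `field`/`message` members of each violation are an assumption: the source only states that a `violations` array exists, not what its entries contain.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical violation entry; member names are assumptions.
record Violation(String field, String message) {}

// Base Problem Details members plus an optional violations list.
record Problem(String type, String title, int status, String detail,
               String instance, List<Violation> violations) {

    // Quotes a string, escaping only backslashes and double quotes (sketch-level escaping).
    private static String q(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Hand-rolled JSON to keep the example dependency-free.
    String toJson() {
        StringBuilder sb = new StringBuilder("{")
                .append("\"type\":").append(q(type)).append(',')
                .append("\"title\":").append(q(title)).append(',')
                .append("\"status\":").append(status).append(',')
                .append("\"detail\":").append(q(detail)).append(',')
                .append("\"instance\":").append(q(instance));
        if (!violations.isEmpty()) {
            sb.append(",\"violations\":[")
              .append(violations.stream()
                      .map(v -> "{\"field\":" + q(v.field())
                              + ",\"message\":" + q(v.message()) + "}")
                      .collect(Collectors.joining(",")))
              .append(']');
        }
        return sb.append('}').toString();
    }
}
```

In a real service the exception mapper produces this document; the sketch only shows the shape a client should expect to parse.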
gRPC calls use standard gRPC status codes (NOT_FOUND, INVALID_ARGUMENT, FAILED_PRECONDITION, etc.) — Problem Details do not apply to the gRPC layer. GraphQL errors follow the GraphQL spec errors[] format.
Typed platform exception classes (EntityNotFoundException, StoreNotFoundException, ForbiddenException, ConflictException) are provided by the ss3-quarkus extension and map to their corresponding Problem Details shapes automatically.
Rate Limiting
Rate limiting is enforced in two layers. Individual domain services are never rate-limit-aware.
| Layer | Where | Mechanism | Scope |
|---|---|---|---|
| Coarse | gateway-service | Bucket4j + Redis sliding window | Per-store and per-client-IP |
| Fine-grained | Istio EnvoyFilter | Envoy rate limit filter | Per-route, per-service |
Store-level limits are configurable in config-service and resolved at request time. Platform defaults apply to stores without a custom config. Throttled requests receive a 429 Problem Details response with a Retry-After header.
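To make the relationship between a sliding window and the `Retry-After` value concrete, here is an in-memory sketch. It is not the gateway-service implementation and does not use Bucket4j or Redis; `SlidingWindowLimiter` and its parameters are hypothetical, and the caller supplies the clock value so the behaviour is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class SlidingWindowLimiter {
    private final int limit;          // max requests per window
    private final long windowMillis;  // window length
    private final Map<String, Deque<Long>> hits = new HashMap<>();

    SlidingWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Returns 0 when the request is allowed; otherwise the number of whole
    // seconds (rounded up) the caller should send in the Retry-After header.
    synchronized long tryAcquire(String key, long nowMillis) {
        Deque<Long> timestamps = hits.computeIfAbsent(key, k -> new ArrayDeque<>());
        // Drop timestamps that have slid out of the window.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= nowMillis - windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < limit) {
            timestamps.addLast(nowMillis);
            return 0;
        }
        long waitMillis = timestamps.peekFirst() + windowMillis - nowMillis;
        return (waitMillis + 999) / 1000;
    }
}
```

A key such as a store ID or client IP selects the bucket, so per-store and per-IP limits can share the same limiter type with different configured limits.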
Feature Flags
Feature flags have two distinct scopes managed independently:
| Scope | Storage | Managed by | When to use |
|---|---|---|---|
| Application-level | Vault (ss3/kv/shared/features/*) | Platform engineering | Infrastructure kill switches, experimental rollouts, migrations |
| Store-level | store-service (DB + Redis cache) | Store admins via admin UI | Per-store merchant features (loyalty points, gift cards, referral program, etc.) |
Store-level flags are resolved at request time by calling store-service over gRPC using the generated stubs from ss3-protos. Each service that needs feature flag resolution injects its own @GrpcClient("store-service") stub directly — there is no shared wrapper. Caching strategy (e.g. Redis TTL) is the service’s own concern.
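Since caching is left to each service, one possible approach is a small TTL cache in front of the remote lookup. The sketch below is an assumption-laden illustration: `CachedFlagResolver` is hypothetical, the `BiFunction` stands in for the call made through the generated store-service stub, and an in-process map stands in for Redis.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

final class CachedFlagResolver {
    private record Entry(boolean enabled, long expiresAtMillis) {}

    // Stand-in for the gRPC call: (storeId, flagName) -> enabled.
    private final BiFunction<String, String, Boolean> remote;
    private final long ttlMillis;
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    CachedFlagResolver(BiFunction<String, String, Boolean> remote, long ttlMillis) {
        this.remote = remote;
        this.ttlMillis = ttlMillis;
    }

    // Serves from cache while the entry is fresh; otherwise calls the remote
    // resolver and caches the result for ttlMillis.
    boolean isEnabled(String storeId, String flag, long nowMillis) {
        String key = storeId + "/" + flag;
        Entry entry = cache.get(key);
        if (entry != null && nowMillis < entry.expiresAtMillis()) {
            return entry.enabled();
        }
        boolean fresh = remote.apply(storeId, flag);
        cache.put(key, new Entry(fresh, nowMillis + ttlMillis));
        return fresh;
    }
}
```

The TTL bounds how long a flag change made in the admin UI can go unnoticed by a service, so shorter TTLs trade extra gRPC calls for faster propagation.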
Idempotency
All state-mutating gRPC methods across the platform require an idempotency_key (UUID v4 string) field in their request proto. This is a platform-wide contract — not a per-service option.
The ss3-quarkus extension provides an @Idempotent CDI interceptor. Annotating a gRPC service method with @Idempotent activates the full check-execute-record lifecycle automatically:
- Extract idempotency_key from the request (convention: all mutating protos expose getIdempotencyKey()).
- Check the idempotency_log table; if a non-expired entry exists, return the cached response immediately without re-executing.
- Execute the method, then persist the response to idempotency_log with a 24-hour TTL.
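The check-execute-record lifecycle above can be sketched without any framework. This is not the `@Idempotent` interceptor itself: `IdempotencySketch` is hypothetical, an in-memory map stands in for the `idempotency_log` table, and the sketch ignores the race between two concurrent calls carrying the same key, which a real implementation would have to handle.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class IdempotencySketch {
    private record Entry(Object response, long expiresAtMillis) {}

    static final long TTL_MILLIS = Duration.ofHours(24).toMillis();

    // Stand-in for the idempotency_log table.
    private final Map<String, Entry> log = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    <T> T execute(String idempotencyKey, long nowMillis, Supplier<T> method) {
        // Check: return the cached response if a non-expired entry exists.
        Entry cached = log.get(idempotencyKey);
        if (cached != null && nowMillis < cached.expiresAtMillis()) {
            return (T) cached.response();
        }
        // Execute: run the actual method.
        T response = method.get();
        // Record: persist the response with a 24-hour TTL.
        log.put(idempotencyKey, new Entry(response, nowMillis + TTL_MILLIS));
        return response;
    }
}
```

A retried call with the same key inside the TTL returns the first response and never re-runs the method; after expiry the method runs again.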
Kafka consumers use the KafkaIdempotencyGuard bean to deduplicate against a processed_messages table with the same TTL pattern. Both tables are created automatically in each service’s database by Flyway migrations shipped with the extension.
Callers (checkout-service, saga orchestrators) are responsible for generating UUIDs and passing them with every mutating call. A recommended key format: {entity-type}-{entity-id}:{action}.
Distributed Tracing and Correlation
The OTel trace_id is the single correlation handle across the platform. No separate business-level correlation ID is used.
| Transport | Propagation |
|---|---|
| HTTP (REST, admin SPA) | W3C traceparent header — auto-injected by quarkus-opentelemetry |
| gRPC (service-to-service) | W3C traceparent in gRPC metadata — auto-injected by quarkus-opentelemetry |
| Kafka (async) | W3C traceparent Kafka message header — consumers continue the trace by extracting it |
All structured log lines carry trace_id, span_id, service.name, and service.version. To trace a business action end-to-end — for example, a checkout flowing through cart, inventory, payment, and order services — filter Datadog by trace_id. All synchronous spans and all async Kafka processing legs appear in a single trace view.
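For cases where the trace_id has to be read by hand, for example when inspecting a raw Kafka header, the sketch below extracts it from a W3C `traceparent` value. It handles only the version `00` layout (`version-traceid-parentid-flags`), and the `TraceParent` class is a hypothetical helper, not part of quarkus-opentelemetry.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TraceParent {
    // Layout: 2 hex version, 32 hex trace-id, 16 hex parent-id, 2 hex flags.
    private static final Pattern FORMAT = Pattern.compile(
            "([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

    // Returns the trace-id, or empty if the header is missing or malformed.
    static Optional<String> traceId(String header) {
        if (header == null) {
            return Optional.empty();
        }
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        String version = m.group(1);
        String traceId = m.group(2);
        String parentId = m.group(3);
        // Version ff and all-zero IDs are invalid under W3C Trace Context.
        if (version.equals("ff")
                || traceId.equals("0".repeat(32))
                || parentId.equals("0".repeat(16))) {
            return Optional.empty();
        }
        return Optional.of(traceId);
    }
}
```

For the header `00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01`, the extracted value is `4bf92f3577b34da6a3ce929d0e0e4736`, which is the string to filter on in the tracing backend.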