Cross-Cutting Concerns

Platform-wide patterns that every service implements consistently. These are not optional — they are contracts enforced through the ss3-quarkus extension, shared proto schemas, and code review.

Audit Trail

All state-mutating actions across the platform are recorded in a dedicated audit-service. Domain services never write to audit storage directly. Instead, they publish a structured audit.event Kafka event; audit-service consumes and persists it to an append-only PostgreSQL store.

Every audit record captures: actor identity (actor_id, actor_type), action name, target entity (type + ID), originating service, OTel trace_id, timestamp, and optional before/after state snapshots.

GDPR interaction: audit records are retained on customer erasure. The action record itself is kept for financial and compliance integrity. Only the actor_id of a CUSTOMER-type actor is replaced with the sentinel value ANONYMISED. Staff actor identities are never anonymised — staff actions on financial records must remain attributable.
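The erasure rule above can be illustrated with a minimal Java sketch. The `AuditRecord` shape, the `ActorType` enum, and the `anonymise` helper are hypothetical names chosen for illustration; they are not the audit-service API, and only a subset of the captured fields is modelled.

```java
import java.time.Instant;

/** Sketch of the erasure rule: keep the record, mask only CUSTOMER actor ids. */
final class AuditAnonymiserSketch {

    static final String ANONYMISED = "ANONYMISED";

    enum ActorType { CUSTOMER, STAFF }

    // Hypothetical subset of the audit record fields (snapshots omitted).
    record AuditRecord(String actorId, ActorType actorType, String action,
                       String targetType, String targetId, String service,
                       String traceId, Instant timestamp) {}

    /** Returns a copy with actor_id masked when it belongs to the erased customer. */
    static AuditRecord anonymise(AuditRecord r, String erasedCustomerId) {
        boolean erasedCustomer = r.actorType() == ActorType.CUSTOMER
                && r.actorId().equals(erasedCustomerId);
        if (!erasedCustomer) {
            return r; // staff actions and other customers stay attributable
        }
        // Everything except actor_id is preserved for financial integrity.
        return new AuditRecord(ANONYMISED, r.actorType(), r.action(), r.targetType(),
                r.targetId(), r.service(), r.traceId(), r.timestamp());
    }

    public static void main(String[] args) {
        AuditRecord byCustomer = new AuditRecord("cust-42", ActorType.CUSTOMER, "order.cancel",
                "order", "ord-1", "order-service", "4bf92f3577b34da6a3ce929d0e0e4736",
                Instant.now());
        System.out.println(anonymise(byCustomer, "cust-42").actorId()); // prints ANONYMISED
    }
}
```

The record is copied rather than removed, so the action, target, and trace_id remain queryable after erasure.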

PII in snapshots: before/after state snapshots must never contain raw PII. Publishers are responsible for stripping PII fields using the @AuditExclude annotation and SnapshotBuilder utility provided by the ss3-quarkus extension.
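As a rough illustration of annotation-driven PII stripping, the sketch below builds a snapshot map from a record and skips annotated components. The real signatures of @AuditExclude and SnapshotBuilder are not shown in this document, so `PiiExcluded`, `CustomerState`, and `snapshot` are hypothetical stand-ins written against the JDK only.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.RecordComponent;
import java.util.LinkedHashMap;
import java.util.Map;

final class SnapshotSketch {

    /** Hypothetical stand-in for the extension's @AuditExclude. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.RECORD_COMPONENT)
    @interface PiiExcluded {}

    // Hypothetical entity state; email and phone are treated as raw PII.
    record CustomerState(String id, String tier,
                         @PiiExcluded String email,
                         @PiiExcluded String phone) {}

    /** Copies every non-excluded record component into an ordered map. */
    static Map<String, Object> snapshot(Record state) {
        Map<String, Object> out = new LinkedHashMap<>();
        try {
            for (RecordComponent c : state.getClass().getRecordComponents()) {
                if (c.isAnnotationPresent(PiiExcluded.class)) {
                    continue; // PII never reaches the audit event
                }
                Method accessor = c.getAccessor();
                accessor.setAccessible(true);
                out.put(c.getName(), accessor.invoke(state));
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not read snapshot field", e);
        }
        return out;
    }

    public static void main(String[] args) {
        CustomerState before = new CustomerState("c1", "gold", "a@example.com", "555-0100");
        System.out.println(snapshot(before).keySet()); // prints [id, tier]
    }
}
```

Marking fields at the declaration keeps the exclusion next to the data model, so a newly added PII field is excluded by annotating it once rather than by updating every publisher.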

See Audit Service for the full service specification.

Soft Delete

Entities with financial, GDPR, or referential integrity implications use a deleted_at TIMESTAMPTZ column (NULL = live, non-null = soft-deleted). There is no is_deleted boolean column anywhere on the platform.

A Hibernate @Filter named excludeDeleted (with default condition deleted_at IS NULL) is declared on all soft-deletable entities. The ss3-quarkus extension activates this filter automatically on every request. Admin queries that need to include soft-deleted records must disable it explicitly.
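The filter semantics can be modelled without Hibernate. The sketch below is a plain-Java, in-memory stand-in and not the extension's mechanism: `SoftDeleteSketch`, `Product`, and `disableExcludeDeleted` are hypothetical names, and the boolean toggle only mimics a filter that is active by default and disabled explicitly.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

final class SoftDeleteSketch {

    // deletedAt == null means live, mirroring "deleted_at IS NULL".
    record Product(String id, Instant deletedAt) {
        boolean isLive() { return deletedAt == null; }
    }

    private final List<Product> rows = new ArrayList<>();
    private boolean excludeDeleted = true; // active by default, like the request filter

    void save(Product p) { rows.add(p); }

    /** Marks a live row as deleted instead of removing it. */
    void softDelete(String id, Instant when) {
        rows.replaceAll(p -> p.id().equals(id) && p.isLive() ? new Product(id, when) : p);
    }

    /** Admin paths opt out explicitly to see soft-deleted rows. */
    void disableExcludeDeleted() { excludeDeleted = false; }

    List<Product> findAll() {
        return rows.stream().filter(p -> !excludeDeleted || p.isLive()).toList();
    }

    public static void main(String[] args) {
        SoftDeleteSketch repo = new SoftDeleteSketch();
        repo.save(new Product("p1", null));
        repo.save(new Product("p2", null));
        repo.softDelete("p2", Instant.now());
        System.out.println(repo.findAll().size()); // prints 1
        repo.disableExcludeDeleted();
        System.out.println(repo.findAll().size()); // prints 2
    }
}
```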

Hard deletes apply only to short-lived records with no downstream references and no compliance requirement, such as superseded sandbox drafts, expired OTP codes, and cart session artifacts.

| Entity category | Strategy | Reason |
| --- | --- | --- |
| Products, variants, price lists | Soft | Referenced by historical order snapshots |
| Customers, addresses | Soft | GDPR retention + order history integrity |
| Orders, order items | Soft | Financial records |
| Staff accounts | Soft | Audit trail attribution |
| Promotions, gift cards | Soft | Referenced by applied discounts |
| Superseded sandbox drafts | Hard | No references; TTL-cleaned |
| OTP / verification codes | Hard | Short-lived; no references |

Error Response Format

All REST error responses use RFC 9457 Problem Details (application/problem+json). The Quarkus extension quarkus-resteasy-problem maps exceptions to this format automatically.

{
  "type": "https://errors.shopstar.io/catalog/product-not-found",
  "title": "Product Not Found",
  "status": 404,
  "detail": "Product with id=abc123 does not exist in store=store-1.",
  "instance": "/api/v1/products/abc123"
}

Validation errors extend the base shape with a violations array. Rate-limited responses (429) include a Retry-After header.
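To make the extended shape concrete, here is a small Java sketch that serialises a Problem Details body with a violations array. The member names inside each violation (`field`, `message`) and the example type URI are assumptions for illustration; this document does not specify them, and the hand-rolled JSON writer is a stand-in for whatever serialiser the service actually uses.

```java
import java.util.List;
import java.util.stream.Collectors;

final class ProblemSketch {

    // Assumed violation members; the platform's real shape may differ.
    record Violation(String field, String message) {}

    record Problem(String type, String title, int status, String detail,
                   String instance, List<Violation> violations) {

        /** Emits the base members, plus "violations" only when present. */
        String toJson() {
            String base = "\"type\":" + q(type) + ",\"title\":" + q(title)
                    + ",\"status\":" + status + ",\"detail\":" + q(detail)
                    + ",\"instance\":" + q(instance);
            if (violations.isEmpty()) {
                return "{" + base + "}";
            }
            String items = violations.stream()
                    .map(v -> "{\"field\":" + q(v.field())
                            + ",\"message\":" + q(v.message()) + "}")
                    .collect(Collectors.joining(","));
            return "{" + base + ",\"violations\":[" + items + "]}";
        }
    }

    // Minimal escaping for the sketch: backslash and double quote only.
    static String q(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        Problem p = new Problem(
                "https://errors.shopstar.io/catalog/validation-failed", // hypothetical URI
                "Validation Failed", 400, "Request body failed validation.",
                "/api/v1/products",
                List.of(new Violation("price", "must be greater than 0")));
        System.out.println(p.toJson());
    }
}
```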

gRPC calls use standard gRPC status codes (NOT_FOUND, INVALID_ARGUMENT, FAILED_PRECONDITION, etc.) — Problem Details do not apply to the gRPC layer. GraphQL errors follow the GraphQL spec errors[] format.

Typed platform exception classes (EntityNotFoundException, StoreNotFoundException, ForbiddenException, ConflictException) are provided by the ss3-quarkus extension and map to their corresponding Problem Details shapes automatically.

Rate Limiting

Rate limiting is enforced in two layers. Individual domain services are never rate-limit-aware.

| Layer | Where | Mechanism | Scope |
| --- | --- | --- | --- |
| Coarse | gateway-service | Bucket4j + Redis sliding window | Per-store and per-client-IP |
| Fine-grained | Istio EnvoyFilter | Envoy rate limit filter | Per-route, per-service |

Store-level limits are configurable in config-service and resolved at request time. Platform defaults apply to stores without a custom config. Throttled requests receive a 429 Problem Details response with a Retry-After header.
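The relationship between a sliding window and the Retry-After value can be sketched in plain Java. This is an in-memory illustration only, not the Bucket4j + Redis setup used by gateway-service; `SlidingWindowSketch` and `tryAcquire` are hypothetical names, and the caller supplies the clock so the behaviour is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class SlidingWindowSketch {

    private final int limit;
    private final long windowMillis;
    // One timestamp log per key, e.g. "store:store-1" or "ip:203.0.113.7".
    private final Map<String, Deque<Long>> hits = new HashMap<>();

    SlidingWindowSketch(int limit, long windowMillis) {
        if (limit < 1 || windowMillis < 1) {
            throw new IllegalArgumentException("limit and window must be positive");
        }
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns 0 when allowed, otherwise the Retry-After value in whole seconds. */
    synchronized long tryAcquire(String key, long nowMillis) {
        Deque<Long> log = hits.computeIfAbsent(key, k -> new ArrayDeque<>());
        // Drop requests that have slid out of the window.
        while (!log.isEmpty() && nowMillis - log.peekFirst() >= windowMillis) {
            log.pollFirst();
        }
        if (log.size() < limit) {
            log.addLast(nowMillis);
            return 0;
        }
        // A slot frees up when the oldest request leaves the window.
        long waitMillis = windowMillis - (nowMillis - log.peekFirst());
        return (waitMillis + 999) / 1000; // round up to whole seconds
    }

    public static void main(String[] args) {
        SlidingWindowSketch limiter = new SlidingWindowSketch(2, 10_000);
        System.out.println(limiter.tryAcquire("store:store-1", 0));     // prints 0
        System.out.println(limiter.tryAcquire("store:store-1", 1_000)); // prints 0
        System.out.println(limiter.tryAcquire("store:store-1", 2_000)); // prints 8
    }
}
```

A non-zero return would translate into a 429 response with that value in the Retry-After header.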

Feature Flags

Feature flags have two distinct scopes managed independently:

| Scope | Storage | Managed by | When to use |
| --- | --- | --- | --- |
| Application-level | Vault (ss3/kv/shared/features/*) | Platform engineering | Infrastructure kill switches, experimental rollouts, migrations |
| Store-level | store-service (DB + Redis cache) | Store admins via admin UI | Per-store merchant features (loyalty points, gift cards, referral program, etc.) |

Store-level flags are resolved at request time by calling store-service over gRPC using the generated stubs from ss3-protos. Each service that needs feature flag resolution injects its own @GrpcClient("store-service") stub directly — there is no shared wrapper. Caching strategy (e.g. Redis TTL) is the service’s own concern.
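Since caching is left to each service, one possible shape is a small TTL cache in front of the remote lookup. In this sketch the gRPC stub is replaced by a `BiPredicate` so the example runs on the JDK alone; `StoreFlagCacheSketch`, `isEnabled`, and the in-process map (instead of Redis) are all assumptions for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiPredicate;

final class StoreFlagCacheSketch {

    private record Entry(boolean enabled, long expiresAtMillis) {}

    // Stand-in for the store-service gRPC call: (storeId, flagName) -> enabled.
    private final BiPredicate<String, String> remoteLookup;
    private final long ttlMillis;
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    StoreFlagCacheSketch(BiPredicate<String, String> remoteLookup, long ttlMillis) {
        this.remoteLookup = remoteLookup;
        this.ttlMillis = ttlMillis;
    }

    /** Serves a cached value while it is fresh, otherwise asks the remote side. */
    boolean isEnabled(String storeId, String flag, long nowMillis) {
        String key = storeId + "/" + flag;
        Entry cached = cache.get(key);
        if (cached != null && nowMillis < cached.expiresAtMillis()) {
            return cached.enabled();
        }
        boolean fresh = remoteLookup.test(storeId, flag);
        cache.put(key, new Entry(fresh, nowMillis + ttlMillis));
        return fresh;
    }

    public static void main(String[] args) {
        StoreFlagCacheSketch flags =
                new StoreFlagCacheSketch((store, flag) -> flag.equals("gift-cards"), 30_000);
        System.out.println(flags.isEnabled("store-1", "gift-cards", 0)); // prints true
    }
}
```

The TTL bounds how long a flag change made in the admin UI can take to reach a service, so a short TTL trades extra gRPC calls for faster propagation.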

Idempotency

All state-mutating gRPC methods across the platform require an idempotency_key (UUID v4 string) field in their request proto. This is a platform-wide contract — not a per-service option.

The ss3-quarkus extension provides an @Idempotent CDI interceptor. Annotating a gRPC service method with @Idempotent activates the full check-execute-record lifecycle automatically:

  1. Extract idempotency_key from the request (convention: all mutating protos expose getIdempotencyKey()).
  2. Check the idempotency_log table — if a non-expired entry exists, return the cached response immediately without re-executing.
  3. Execute the method, then persist the response to idempotency_log with a 24-hour TTL.
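The three steps above can be sketched in plain Java. This is not the @Idempotent interceptor itself: an in-memory map stands in for the idempotency_log table, the clock is passed in by the caller, and `IdempotencySketch` and `execute` are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class IdempotencySketch {

    static final long TTL_MILLIS = 24L * 60 * 60 * 1000; // 24-hour TTL

    private record LogEntry(Object response, long expiresAtMillis) {}

    // Stand-in for the idempotency_log table, keyed by idempotency_key.
    private final Map<String, LogEntry> log = new ConcurrentHashMap<>();

    /**
     * Check, execute, record. Not atomic: a real implementation would need
     * the check and the record to be safe under concurrent duplicates.
     */
    @SuppressWarnings("unchecked")
    <T> T execute(String idempotencyKey, long nowMillis, Supplier<T> method) {
        LogEntry hit = log.get(idempotencyKey);                  // check
        if (hit != null && nowMillis < hit.expiresAtMillis()) {
            return (T) hit.response();                           // cached, no re-execution
        }
        T response = method.get();                               // execute
        log.put(idempotencyKey, new LogEntry(response, nowMillis + TTL_MILLIS)); // record
        return response;
    }

    public static void main(String[] args) {
        IdempotencySketch guard = new IdempotencySketch();
        int[] charges = {0};
        Supplier<String> charge = () -> "charged #" + (++charges[0]);
        System.out.println(guard.execute("key-1", 0, charge));     // prints charged #1
        System.out.println(guard.execute("key-1", 5_000, charge)); // prints charged #1
    }
}
```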

Kafka consumers use the KafkaIdempotencyGuard bean to deduplicate against a processed_messages table with the same TTL pattern. Both tables are created automatically in each service’s database by Flyway migrations shipped with the extension.

Callers (checkout-service, saga orchestrators) are responsible for generating UUIDs and passing them with every mutating call. A recommended key format: {entity-type}-{entity-id}:{action}.

Distributed Tracing and Correlation

The OTel trace_id is the single correlation handle across the platform. No separate business-level correlation ID is used.

| Transport | Propagation |
| --- | --- |
| HTTP (REST, admin SPA) | W3C traceparent header; auto-injected by quarkus-opentelemetry |
| gRPC (service-to-service) | W3C traceparent in gRPC metadata; auto-injected by quarkus-opentelemetry |
| Kafka (async) | W3C traceparent Kafka message header; consumers continue the trace by extracting it |

All structured log lines carry trace_id, span_id, service.name, and service.version. To trace a business action end-to-end — for example, a checkout flowing through cart, inventory, payment, and order services — filter Datadog by trace_id. All synchronous spans and all async Kafka processing legs appear in a single trace view.
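For reference, the trace_id used for correlation is the second field of the W3C traceparent value. The sketch below parses a version 00 header with the JDK only; it is an illustration of the header format, not what quarkus-opentelemetry does internally, and `TraceparentSketch` and `TraceContext` are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TraceparentSketch {

    // Version 00 layout: version - 32 hex trace-id - 16 hex parent-id - 2 hex flags.
    private static final Pattern V00 =
            Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    record TraceContext(String traceId, String parentSpanId, boolean sampled) {}

    /** Parses a traceparent value; returns empty for malformed or all-zero ids. */
    static Optional<TraceContext> parse(String header) {
        if (header == null) {
            return Optional.empty();
        }
        Matcher m = V00.matcher(header.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        String traceId = m.group(1);
        String parentSpanId = m.group(2);
        if (isAllZero(traceId) || isAllZero(parentSpanId)) {
            return Optional.empty(); // all-zero ids are invalid in the W3C format
        }
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) == 1;
        return Optional.of(new TraceContext(traceId, parentSpanId, sampled));
    }

    private static boolean isAllZero(String hex) {
        return hex.chars().allMatch(ch -> ch == '0');
    }

    public static void main(String[] args) {
        parse("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
                .ifPresent(ctx -> System.out.println(ctx.traceId()));
        // prints 4bf92f3577b34da6a3ce929d0e0e4736
    }
}
```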