Date of Incident: March 27, 2025 09:29 UTC
Resolved: March 27, 2025 10:27 UTC
Duration: 58 minutes
Severity: P2
Data Reliability Features:
During the 4.2.0 release deployment, a database migration failure in the Catalog Service caused partial
system degradation. The issue stemmed from a schema validation mismatch between the database
layer and application ORM, specifically around tenant_id field length validation.
Database vs. ORM Inconsistency:
Layer | Field | Constraint |
---|---|---|
Database | tenant_id | VARCHAR(64) |
Exposed ORM | tenant_id | @Column(length=34) |
The migration failed when processing a tenant_id with 37 characters.
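import org.jetbrains.exposed.dao.id.IntIdTable

// Exposed table definition for the "tenants" table
object Tenants : IntIdTable("tenants") {
    // ORM-side limit of 34 characters, narrower than the database's VARCHAR(64)
    val tenantId = varchar("tenant_id", 34)
}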
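Because the ORM capped tenant_id at 34 characters while the database column allowed up to 64, the migration failed as soon as it encountered a value longer than 34 characters.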
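The check that would have caught this ahead of the migration can be sketched in plain Kotlin. This is a minimal illustration, not the hotfix itself: the constant `ORM_MAX_LENGTH` and the function `findOversizedTenantIds` are hypothetical names, and the sample values are synthetic stand-ins for real tenant_id data.

```kotlin
// Hypothetical constant mirroring the ORM constraint from the table above.
const val ORM_MAX_LENGTH = 34

/**
 * Returns every tenant_id that is longer than the ORM allows.
 * Such values are valid in the database (VARCHAR(64)) but would be rejected by the ORM.
 */
fun findOversizedTenantIds(
    tenantIds: List<String>,
    ormLimit: Int = ORM_MAX_LENGTH
): List<String> = tenantIds.filter { it.length > ormLimit }

fun main() {
    // Synthetic sample: lengths 34 (allowed), 37 and 64 (both too long for the ORM).
    val sample = listOf("a".repeat(34), "b".repeat(37), "c".repeat(64))
    val offenders = findOversizedTenantIds(sample)
    println("Lengths over the ORM limit: ${offenders.map { it.length }}")
    // prints: Lengths over the ORM limit: [37, 64]
}
```

Running a scan like this against existing tenant_id values before a release would surface offending rows without touching the schema.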
Time (UTC) | Duration | Action |
---|---|---|
09:29 | - | Deployment failure detected |
09:32 | 3 mins | Team mobilisation |
09:37 | 5 mins | Root cause identified |

Time (UTC) | Duration | Action |
---|---|---|
09:40 | - | Hotfix development started |
09:45 | 5 mins | Core fix implemented |
09:50 | 5 mins | Validation logic added |
09:52 | 2 mins | Unit tests completed |

Time (UTC) | Duration | Action |
---|---|---|
09:55 | - | CI pipeline triggered |
09:58 | 3 mins | Build artifacts ready |
10:00 | 2 mins | Staging deployment |
10:10 | 10 mins | Staging verification |
10:15 | 5 mins | Production rollout |
10:25 | 10 mins | Full deployment |
10:27 | 2 mins | Systems normal |
Deployed hotfix with:
Verified all tenant_id values in production
Validated cross-service compatibility
Action Item | Owner |
---|---|
Schema validation pre-checks | Data Eng |
Testing Gap: Migrations need to be tested against real production data.
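One possible shape for the schema validation pre-check action item is a comparison of declared column lengths on both sides before a migration runs. The sketch below is an assumption about how such a check might look, not the team's implementation: `ColumnSpec`, `Mismatch`, and `findLengthMismatches` are hypothetical names, and the column lists are written by hand here, whereas a real check would need to read them from the database metadata and the ORM definitions.

```kotlin
// Hypothetical description of one length-constrained column.
data class ColumnSpec(val table: String, val column: String, val maxLength: Int)

// Hypothetical record of a disagreement between the two layers.
data class Mismatch(val table: String, val column: String, val dbLength: Int, val ormLength: Int)

/** Reports columns declared in both layers whose maximum lengths differ. */
fun findLengthMismatches(db: List<ColumnSpec>, orm: List<ColumnSpec>): List<Mismatch> {
    val ormByKey = orm.associateBy { it.table to it.column }
    return db.mapNotNull { dbCol ->
        // Columns missing from the ORM side are skipped in this sketch.
        val ormCol = ormByKey[dbCol.table to dbCol.column] ?: return@mapNotNull null
        if (ormCol.maxLength != dbCol.maxLength) {
            Mismatch(dbCol.table, dbCol.column, dbCol.maxLength, ormCol.maxLength)
        } else {
            null
        }
    }
}

fun main() {
    // Values taken from the constraint table in this report.
    val dbColumns = listOf(ColumnSpec("tenants", "tenant_id", 64))
    val ormColumns = listOf(ColumnSpec("tenants", "tenant_id", 34))

    for (m in findLengthMismatches(dbColumns, ormColumns)) {
        println("${m.table}.${m.column}: database=${m.dbLength}, orm=${m.ormLength}")
    }
}
```

A check of this kind only compares declarations; it complements, rather than replaces, migration testing against real data.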