disagg: Fix unexpected object storage usage caused by pre-lock residue (#10760) by ti-chi-bot · Pull Request #10767 · pingcap/tiflash

ti-chi-bot · 2026-03-23T10:06:35Z

This is an automated cherry-pick of #10760

What problem does this PR solve?

Issue Number: close #10763

Problem Summary:

In concurrent remote write paths, PageDirectory write-group semantics could cause follower writers to miss their own applied lock-id cleanup signals.
As a result, S3LockLocalManager.pre_lock_keys could remain resident and be repeatedly written into manifest locks.
S3GC then treated many obsolete objects as still protected, leading to long-term remote storage usage inflation.

What is changed and how it works?

disagg: eliminate pre-lock key residue that leads to unexpected OSS usage

End-to-end correctness fixes for lock lifecycle
- PageDirectory::apply now returns writer-scoped applied_data_files for both write-group owner and followers, so each writer gets its own cleanup signal.
- UniversalPageStorage::write uses those per-writer ids to clean pre-locks reliably after apply.
- Added explicit failure cleanup path: cleanPreLockKeysOnWriteFailure(...) is invoked when remote write/apply fails.
- createS3LockForWriteBatch was adjusted to avoid partial pre-lock residue on partial lock-creation failures (append to pre_lock_keys after lock-creation pass), and its return value is now aligned with "newly appended keys" semantics.
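
The lock lifecycle described in the bullets above can be illustrated with a minimal, self-contained sketch. It is not the TiFlash implementation: `PreLockRegistry`, `createLocks`, `cleanAppliedKeys`, and the `create_lock` callback are hypothetical names chosen for illustration, and only the in-memory key set is modeled. The sketch shows two ideas: keys are published to the resident set only after the whole lock-creation pass succeeds, and each writer later removes exactly the keys it was told were applied.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a pre-lock manager; not the TiFlash class.
class PreLockRegistry
{
public:
    // Tries to create a lock for every key. Keys are published to the
    // resident set only after the whole pass succeeds, so a failure in the
    // middle leaves no partial in-memory residue.
    // Returns only the keys that were newly appended by this call.
    std::vector<std::string> createLocks(
        const std::vector<std::string> & keys,
        const std::function<bool(const std::string &)> & create_lock)
    {
        // Phase 1: create locks without touching the resident set.
        for (const auto & key : keys)
        {
            if (!create_lock(key))
                return {}; // nothing was published
        }
        // Phase 2: publish, reporting only keys that were not resident yet.
        std::vector<std::string> appended;
        std::lock_guard<std::mutex> guard(mtx);
        for (const auto & key : keys)
        {
            if (pre_lock_keys.insert(key).second)
                appended.push_back(key);
        }
        return appended;
    }

    // Removes the keys reported as applied for one writer.
    // Returns how many of them were actually resident (the "hit" count).
    std::size_t cleanAppliedKeys(const std::vector<std::string> & applied)
    {
        std::lock_guard<std::mutex> guard(mtx);
        std::size_t hit = 0;
        for (const auto & key : applied)
            hit += pre_lock_keys.erase(key);
        return hit;
    }

    std::size_t residentCount() const
    {
        std::lock_guard<std::mutex> guard(mtx);
        return pre_lock_keys.size();
    }

private:
    mutable std::mutex mtx;
    std::unordered_set<std::string> pre_lock_keys;
};
```

In this sketch a writer would call `createLocks` before the remote write and `cleanAppliedKeys` with its own applied keys afterwards; on a failed write it would pass the keys returned by `createLocks` to the same cleanup call. Whether the real code reports failure by return value or by exception is not stated in this PR description, so the empty-vector return here is an assumption.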
Test coverage and regression guards
- Added write-group concurrency tests in PageDirectory and UniversalPageStorage paths.
- Added focused S3LockLocalManager tests for partial cleanup, failure cleanup, lock-return semantics, and partial-failure atomicity.
- Updated SyncPoint-based async tests to use std::launch::async to avoid deferred scheduling risk.
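
The deferred scheduling risk mentioned in the last bullet comes from standard C++ behavior: without an explicit policy, `std::async` may defer the task until `get()` or `wait()` is called, so a test that first waits for the task to reach a sync point could block indefinitely. The following sketch uses only the standard library to show the difference; the helper names `runsOnOtherThread` and `statusBeforeGet` are made up for illustration and do not appear in the PR.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Returns true when a task launched with the given policy ran on a thread
// other than the caller's.
inline bool runsOnOtherThread(std::launch policy)
{
    const auto caller = std::this_thread::get_id();
    auto fut = std::async(policy, [] { return std::this_thread::get_id(); });
    return fut.get() != caller;
}

// Reports the future's status before get() is called. A deferred task has
// not started at that point, so wait_for returns future_status::deferred
// immediately instead of waiting.
inline std::future_status statusBeforeGet(std::launch policy)
{
    auto fut = std::async(policy, [] { return 1; });
    const auto status = fut.wait_for(std::chrono::seconds(5));
    fut.get(); // run (if deferred) and consume the result
    return status;
}
```

With `std::launch::async` the task starts on its own thread right away, which is what a test needs when it synchronizes with the task before calling `get()`. With `std::launch::deferred` the task runs lazily on the thread that calls `get()`.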
Observability and operations improvements
- Most observability changes are split into a separate PR, "disagg: Add O11y on object store usage summary of each tiflash store" (#10764), to keep the logical changes in this PR clean.
- Added lock-manager metrics to track pre-lock residency and cleanup outcomes (hit/miss/remaining).
- Added owner-only periodic S3 storage summary in S3GCManagerService.
- Added per-store S3 summary gauge:
  - tiflash_storage_s3_store_summary_bytes{store_id, type=data_file_bytes|dt_file_bytes}
- Added setting remote_summary_interval_seconds and wired it through TMTContext; <= 0 disables periodic summary task registration.
- Updated Grafana panels for the new S3 summary metric.

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)

# Run chbenchmark workload and check the metrics of `prelock_keys` and OSS usage
tiup bench ch --host 10.2.12.81 -P 8081 --warehouses 8000 run -D chbenchmark8k -T 50 -t 0 --time 30m --ignore-error --queries q1
# Before the fix, from 23:29 to 00:00, the number of prelock_keys in memory would accumulate and increase with the write load; after the fix, from 02:00 to 02:30, there was no longer any persistent residue of prelock_keys in memory.
# You can also check the newly added Grafana panel "Remote Store Summary (Disagg arch)"

No code

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Fix an issue in disaggregated remote-write paths where pre-lock keys could remain resident under write-group concurrency or partial failure, causing S3GC to retain obsolete objects and inflate remote storage usage. Also add configurable periodic S3 storage summary and per-store summary metrics.

Summary by CodeRabbit

Bug Fixes
- Resolved S3 pre-lock key cleanup on write failures to prevent orphaned lock keys.
- Improved remote write error handling with enhanced exception logging.
New Features
- Added S3 lock manager metrics for monitoring lock creation, cleanup, and status.
- Extended S3 store summary metrics tracking.
Improvements
- Enhanced checkpoint operation logging for better visibility.
- Refined concurrent write batch processing with improved lock key tracking.

Signed-off-by: ti-chi-bot <[email protected]>

ti-chi-bot · 2026-03-23T10:06:39Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign guo-shaoge for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2026-03-23T10:06:39Z

This cherry pick PR is for a release branch and has not yet been approved by triage owners.
Adding the do-not-merge/cherry-pick-not-approved label.

To merge this cherry pick:

It must first be LGTMed and approved by the reviewers.
For pull requests to TiDB-x branches, it must have no failed tests.
AFTER it has lgtm and approved labels, please wait for the cherry-pick merging approval from triage owners.

coderabbitai · 2026-03-23T10:07:15Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (3)

release-8.5
release-7.5
release-8.1

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 8b49cbf3-ec25-4022-b6b2-7d6cd3fd05f6

This is an automated cherry-pick of pingcap#10760

fdb8f44

Signed-off-by: ti-chi-bot <[email protected]>

ti-chi-bot added the release-note (denotes a PR that will be considered when it comes time to generate release notes), size/XXL (denotes a PR that changes 1000+ lines, ignoring generated files), and type/cherry-pick-for-release-nextgen-20251011 labels Mar 23, 2026

ti-chi-bot bot added the do-not-merge/cherry-pick-not-approved label Mar 23, 2026

ti-chi-bot mentioned this pull request Mar 23, 2026

disagg: Fix unexpected object storage usage caused by pre-lock residue #10760

Merged

12 tasks

ti-chi-bot assigned JaySon-Huang Mar 23, 2026

ti-chi-bot wants to merge 1 commit into pingcap:release-nextgen-20251011 from ti-chi-bot:cherry-pick-10760-to-release-nextgen-20251011

2 participants