fix(htlc): use proposal timestamp for deadline validation instead of time.Now()#1537

Closed
Aman-Cool wants to merge 1 commit into hyperledger-labs:main from Aman-Cool:fix/htlc-deadline-nondeterminism

Conversation

@Aman-Cool
Contributor

What's the problem?

HTLC claim and reclaim transactions can fail near the deadline boundary in real deployments. The validator calls time.Now() inside the chaincode, which means each endorsing peer checks the deadline against its own clock. If two peers have even a small clock difference, one can accept the transaction while the other rejects it. Endorsement fails, and the locked tokens go nowhere.

The same thing happens between the FSC node and the peer: the node thinks "deadline passed, let me reclaim" and assembles the transaction, but the peer's clock disagrees and rejects it. Funds end up stuck with no clean way to recover.

What's the fix?

Fabric already embeds a client-supplied timestamp in every proposal (stub.GetTxTimestamp()). It's the same value on every peer for the same transaction; that's exactly what we need here. The chaincode now injects this timestamp into the validation context, and both TransferHTLCValidate implementations use it instead of time.Now().

Non-chaincode paths (local FSC validation, unit tests) are unaffected: they fall back to time.Now() if no timestamp is in the context.

Why does this matter?

Any app running HTLC with a short deadline, or on a network where peer clocks aren't perfectly in sync (which is every real network), could hit this. Tokens locked and never claimable or reclaimable is about as bad as it gets.

@Aman-Cool
Contributor Author

Hi @adecaro, I found that both TransferHTLCValidate implementations call time.Now() inside the chaincode, so peers with slightly different clocks disagree on claim vs. reclaim near the deadline, endorsement fails, and tokens get stuck.

I switched to stub.GetTxTimestamp(), which is the same across all peers for a given proposal. It's a small diff, but the bug can be pretty nasty in prod. Non-chaincode paths fall back to time.Now() as before.

Let me know if you'd want anything changed :)

@Aman-Cool Aman-Cool force-pushed the fix/htlc-deadline-nondeterminism branch from 35cc29e to d8f77cb Compare April 13, 2026 04:22
@adecaro
Contributor

adecaro commented Apr 13, 2026

Hi @Aman-Cool, great effort. I'll review ASAP.
Please open a GitHub issue.
Many thanks 🙏

@Aman-Cool
Contributor Author

Created issue: #1538

@Aman-Cool Aman-Cool force-pushed the fix/htlc-deadline-nondeterminism branch from d8f77cb to 50a7097 Compare April 13, 2026 05:09
@adecaro adecaro self-requested a review April 13, 2026 05:42
@adecaro adecaro self-assigned this Apr 13, 2026
@adecaro adecaro added the bug Something isn't working label Apr 13, 2026
@adecaro adecaro added this to the Q2/26 milestone Apr 13, 2026
…time.Now()

Signed-off-by: Aman-Cool <aman017102007@gmail.com>
@adecaro adecaro force-pushed the fix/htlc-deadline-nondeterminism branch from 50a7097 to 348048d Compare April 13, 2026 12:12
@adecaro
Contributor

adecaro commented Apr 13, 2026

Hi @Aman-Cool , very interesting PR. I think we can learn something interesting here about the complexity of a distributed system and the many ways it can be attacked.

The endorsers cannot trust the client to submit accurate information. On the contrary, a client must be assumed malicious. Indeed, if we merge your change, a malicious client can make the endorser believe that a deadline has passed when it hasn't.

Tracking time in Fabric is a well-known problem. Whoever deploys this needs to be very careful.

So, if this convinces you, I would close this PR. What do you think?

@Aman-Cool
Contributor Author

Hey @adecaro, thanks for taking the time to explain this so thoughtfully; this is exactly the kind of feedback I was hoping for when I opened the PR :)

I'll be honest, when I was debugging the non-determinism issue, I kind of tunnel-visioned on "make all endorsers see the same clock" and reached for the proposal timestamp without fully thinking through who controls it. Your point about assuming a malicious client is a really fundamental shift in how I was thinking about this: I was treating it more like a race condition fix than a security boundary.

So if I'm understanding correctly: the original time.Now() problem hurts honest clients (endorsement mismatches, retries), but it doesn't open a new attack surface, whereas my fix solves that but hands the attacker a lever they didn't have before. Trading an operational headache for a security hole is not a great trade.

That said, I'm curious; is there a right way to solve this class of problem on Fabric? I was thinking about two directions:

  1. Keep the proposal timestamp but have the endorser validate it's within some tight skew window of its own wall clock (so a malicious client can only manipulate by a few minutes, not arbitrarily). Since HTLC deadlines are usually hours or days out, would that be "good enough" or does it still feel too leaky to you?
  2. Move deadlines from wall-clock time to block heights; orderer-assigned, not client-controlled. Breaks the API but removes the trust problem entirely.

Or is this genuinely one of those situations where there's no clean answer and the right move is to document it as a known limitation for deployers to account for?

I'm leaning toward closing this if neither of those directions feels right to you; I'd rather ship nothing than ship something that makes the system easier to attack. But I wanted to ask before I did, because I feel like I'm learning something real here about distributed systems trust models and I'd hate to miss the lesson...

@adecaro
Contributor

adecaro commented Apr 16, 2026

Hi @Aman-Cool, sorry for the late reply. I would definitely make sure that the developer can initialize the validator with a function that retrieves the time, with time.Now() as the default. This way, anyone can change this behavior to whatever suits their environment best.

What do you think? 😄

@Aman-Cool
Contributor Author

Hey @adecaro, that makes a lot of sense to me. Here's how I'm thinking about the implementation:

Add a TimeFunc func() time.Time field to the Validator struct (defaulting to time.Now if nil), thread it into the validation Context, and have TransferHTLCValidate call it instead of time.Now() directly. I'd also revert the stub.GetTxTimestamp() injection in tcc.go entirely; that's the piece that hands control to the client.

The result: secure by default (endorser calls time.Now()), but an operator who controls their environment can swap in any time source when constructing the validator.

Does that match what you had in mind? If so, I'll close this PR and open a fresh one with that approach.

@adecaro
Contributor

adecaro commented Apr 21, 2026

@Aman-Cool , yes, please, go ahead. Thanks much 🙏

@Aman-Cool
Contributor Author

@adecaro Thanks for your guidance on this🙏

@Aman-Cool Aman-Cool closed this Apr 22, 2026

Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

[BUG]: HTLC claim/reclaim fails near deadline due to time.Now() in chaincode validator

2 participants