22 Comments
May 9 · Liked by Neo Kim

I love your work!

author

thank you for the kind words, Akram.


Great to see how they solved an initial problem of retries!

Still, exponential backoff + jitter could create a retry storm if there are calls multiple layers deep. So use it with care! If those multiple layers exist, implementing some retry budget with a token bucket or circuit breakers can help to avoid taking everything down on retries.
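
A retry budget of the kind mentioned here could be sketched with a token bucket. This is only an illustration, not anything taken from the article or from Stripe: the `RetryBudget` class, its parameters, and the injectable `clock` are all hypothetical names chosen for the example.

```python
import time


class RetryBudget:
    """Token bucket that caps how many retries a caller may issue.

    Hypothetical sketch: each retry spends one token, and tokens refill
    at a fixed rate up to `capacity`.
    """

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = capacity
        self.last_refill = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, up to capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now

    def try_acquire(self):
        """Return True and spend one token if a retry is allowed, else False."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A caller would check `budget.try_acquire()` before each retry and give up (or fail fast) when it returns False, so a burst of failures cannot multiply into unbounded retries across layers.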

author

thanks. Implementing exponential backoff + jitter on each inner layer could also help.
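
For reference, exponential backoff with jitter might look roughly like the sketch below. It uses the "full jitter" variant (a random delay between 0 and the exponential ceiling); the function names, the default `base` and `cap` values, and the choice of `ConnectionError` as the retryable error are assumptions made for illustration.

```python
import random
import time


def backoff_delay(attempt, base=0.1, cap=10.0, rng=random.random):
    """Full-jitter delay: a random value in [0, min(cap, base * 2**attempt))."""
    ceiling = min(cap, base * 2 ** attempt)
    return rng() * ceiling


def call_with_retries(operation, max_attempts=5, sleep=time.sleep):
    """Call `operation`, retrying on ConnectionError with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(attempt))
```

The jitter spreads retries from many clients over time, so they do not all hit the server again at the same instant.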


Idempotency is a key feature for any payment system. Thank you for writing this article. Could you also write an article about how Stripe implements dashboard search? I believe they use Elasticsearch, but I'm curious about how they manage to search across different datasets.

author

I added it to my list and will try to include it when I find time, thank you so much.


Great topic and good summary. It would have been great if it went more in depth on what the data looks like, with flow diagrams for the different use cases.

author

thanks for the feedback, I'll try to cover it in a future post.


Thanks for the mention!

author

you're welcome


I wonder how to ensure consistency between steps 2 and 3 above.

author

what steps are you referring to?


Maybe in the "Using Idempotency Key to Prevent Double Payment" diagram. What if, at step 2, the server successfully stores the idempotency key in the in-memory DB, but then fails to make the network request to the ACID DB at step 3?

author

I see. I don't have insight into their implementation details, but I'd guess:

1) They store the idempotency key only after the ACID DB write succeeds.

2) Or they remove the idempotency key if the ACID DB request fails.

Does that help? Perhaps I'm overlooking some details or misunderstood the question.
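
The first guess (cache the key only after the durable write succeeds) could be sketched as follows. This is a guess at one possible ordering, not Stripe's implementation: `PaymentService`, the `charge_in_db` callable standing in for the ACID DB, and the plain dict standing in for the in-memory DB are all hypothetical.

```python
class PaymentService:
    """Hypothetical handler that records a key only after the DB write succeeds."""

    def __init__(self, charge_in_db):
        self._charge_in_db = charge_in_db  # stand-in for the ACID DB write
        self._key_cache = {}               # stand-in for the in-memory DB

    def handle(self, idempotency_key, payload):
        # Replay: the key was stored after an earlier successful write.
        if idempotency_key in self._key_cache:
            return self._key_cache[idempotency_key]
        # If this raises, nothing is cached, so the client can safely retry.
        result = self._charge_in_db(payload)
        self._key_cache[idempotency_key] = result
        return result
```

Note that this sketch ignores concurrent requests carrying the same key; two of them could both pass the cache check before either one writes.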


I was thinking that too: a communication failure between the cache and the ACID DB. I imagine they would have a unique constraint on the idempotency key in the actual DB, in case 2 requests arrive too close together for the in-memory cache to update and reject the second one.
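
A unique constraint like the one imagined here can be illustrated with SQLite from the Python standard library. The table name, column names, and `record_payment` helper are made up for the example; nothing here reflects Stripe's actual schema.

```python
import sqlite3


def open_db():
    """Create an in-memory SQLite DB with the key as primary key (unique)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments ("
        "idempotency_key TEXT PRIMARY KEY, "
        "amount_cents INTEGER NOT NULL)"
    )
    return conn


def record_payment(conn, key, amount_cents):
    """Return True if the payment was inserted, False if the key already exists."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO payments (idempotency_key, amount_cents) VALUES (?, ?)",
                (key, amount_cents),
            )
        return True
    except sqlite3.IntegrityError:
        # The database rejected the duplicate key, even if a cache missed it.
        return False
```

With this in place, the database is the final arbiter: a second insert with the same key fails regardless of what the cache saw.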


Great content! I learned a lot.

author

thanks


Does rolling back a transaction include removing the idempotency key from the in-memory DB? Why do clients have to wait 24 hours to retry?

author

24 hours is the expiry time of a specific idempotency key, so a client can retry many times with the same key within 24 hours.

> Does rolling back a transaction include removing the idempotency key from the in-memory db

this would be my guess.
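
To make the two points concrete, here is a minimal sketch of a key store with a 24-hour expiry and an explicit removal step that a rollback could call. The `ExpiringKeyStore` class and its method names are hypothetical, and removal on rollback is only the guess stated above, not confirmed behavior.

```python
import time

DAY_SECONDS = 24 * 60 * 60


class ExpiringKeyStore:
    """Hypothetical in-memory store whose keys expire after a fixed TTL."""

    def __init__(self, ttl_seconds=DAY_SECONDS, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, response)

    def put(self, key, response):
        self._entries[key] = (self._clock() + self._ttl, response)

    def get(self, key):
        """Return the saved response, or None if the key is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazily drop the expired key
            return None
        return response

    def discard(self, key):
        """Remove a key, e.g. when the matching transaction is rolled back."""
        self._entries.pop(key, None)
```

Within the TTL, every retry with the same key gets the saved response; after expiry, the same key is treated as new.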


I like the post, but unfortunately it has a few errors. Based on https://docs.stripe.com/api/idempotent_requests, I see:

- "So they create a unique string (UUID) to use as the idempotency key."

It is clear from the Stripe docs that the client is responsible for generating the UUID, and it can potentially be any value. On another page they provide 2 strategies that can be used to generate the UUID.

- "Also they generate a new UUID whenever the request payload changes." Again, it is the client's responsibility to generate a new UUID for each logically new request. Stripe itself provides a safety net: it seems they store a pair in their in-memory DB, UUID -> request parameters, and they return an error when a new request comes in with the same UUID but different parameters.

Another question is when to write to the in-memory DB and, in general, how to keep a consistent view across the 2 data sources: the ACID DB and the in-memory DB. In the Stripe docs they claim "We save results only after the execution of an endpoint begins", but according to their design, request parameter validation and detection of a request that conflicts with another request are not part of the API endpoint logic. So if either of those checks fails, they do not store the UUID in the in-memory DB, and the client can retry the request without any problem. So it may be that they put the UUID with its parameters in the in-memory DB before the "true" endpoint logic executes. They probably update the in-memory DB with the response status for the given UUID after the API endpoint finishes.
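
The flow speculated about here (store UUID -> request parameters before the endpoint logic runs, attach the response afterwards, and reject a reused UUID with different parameters) might be sketched like this. Everything in it is an assumption for illustration: the `IdempotencyStore` class, the `begin`/`finish` method names, the SHA-256 fingerprint of the parameters, and the `IdempotencyConflict` error are not taken from Stripe.

```python
import hashlib
import json


class IdempotencyConflict(Exception):
    """Raised when a key is reused with different request parameters."""


def fingerprint(params):
    """Stable hash of the request parameters (key order does not matter)."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotencyStore:
    def __init__(self):
        self._records = {}  # key -> {"fingerprint": str, "response": object}

    def begin(self, key, params):
        """Register the key before endpoint logic runs.

        Returns (is_replay, saved_response). The saved response is None
        while the first request for this key has not finished yet.
        """
        fp = fingerprint(params)
        record = self._records.get(key)
        if record is None:
            self._records[key] = {"fingerprint": fp, "response": None}
            return (False, None)
        if record["fingerprint"] != fp:
            raise IdempotencyConflict("key reused with different parameters")
        return (True, record["response"])

    def finish(self, key, response):
        """Attach the endpoint's response to the key after execution."""
        self._records[key]["response"] = response
```

In this sketch, validation failures would happen before `begin` is called, so they leave no record behind and the client can retry with the same key.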


You said the idempotency key changes when the payload changes. So the idempotency key is generated on the client side, right? Also, what if they revert to the original payload? How will we get the same UUID?

author

yes, the client generates the UUID. I think the client could keep a state object or cache.

Does that answer your question?
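
One way such a client-side cache could work is to remember the UUID issued for each distinct payload, so reverting to an earlier payload yields the earlier key. The `ClientKeyCache` class below is a hypothetical sketch; whether reusing the old key is actually desirable depends on whether the reverted request is meant to be the same logical request.

```python
import hashlib
import json
import uuid


class ClientKeyCache:
    """Hypothetical client-side cache mapping each payload to one UUID."""

    def __init__(self):
        self._keys = {}  # payload digest -> idempotency key

    def key_for(self, payload):
        """Return the key for this payload, creating a UUID on first use."""
        canonical = json.dumps(payload, sort_keys=True)
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        if digest not in self._keys:
            self._keys[digest] = str(uuid.uuid4())
        return self._keys[digest]
```

Calling `key_for` with the original payload, then a changed payload, then the original again returns the first key on the third call.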
