feature request: Continue an existing transaction #329

Closed
Moortiii opened this issue Feb 28, 2024 · 3 comments · Fixed by #331

Comments

@Moortiii
Contributor

Moortiii commented Feb 28, 2024

I've come across a case where a transaction needs to be shared across multiple systems. If we wrap the REST API we can easily achieve this by setting the x-arango-trx-id header. However, we would like to be able to receive transaction IDs on both ends and continue the transaction seamlessly using the python-arango interface, instead of crudely performing raw queries against /_cursor.
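
For reference, the header-based approach can be sketched with only the standard library. This is a minimal sketch, not python-arango code: the helper name build_cursor_request, the base URL, the database name, and the full /_api/cursor path are assumptions made for illustration, and authentication is left out. The request is only built, not sent.

```python
import json
import urllib.request


def build_cursor_request(base_url, database, query, transaction_id):
    """Build (but do not send) an AQL cursor request tied to a transaction.

    Hypothetical helper: the URL layout and payload are assumptions.
    """
    url = f"{base_url}/_db/{database}/_api/cursor"
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The header that attaches the request to an existing transaction.
            "x-arango-trx-id": transaction_id,
        },
    )


req = build_cursor_request("http://localhost:8529", "_system", "RETURN 1", "1234")
```

Every system that shares the transaction would have to attach the same header to each request, which is the raw-query plumbing the proposal is trying to avoid.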

I've come up with the following hack, which does work, but given that _executor is private, and _executor.id specifically doesn't have a setter, I'm guessing there may be a reason it's discouraged:

from copy import deepcopy

from arango.client import ArangoClient
from arango.database import StandardDatabase, TransactionDatabase

def continue_transaction(db: StandardDatabase, transaction_id):
    trx = TransactionDatabase(connection=deepcopy(db.conn))
    trx._executor._id = transaction_id
    return trx

db = ArangoClient(...).db(...)
trx = continue_transaction(db=db, transaction_id="1234")
trx.collection("vertex").insert({"_key": "test"})
trx.commit_transaction() # Alternatively, don't commit here and let the client who provided the transaction commit it themselves.

Would it make sense to support something like this directly? It seems to me like a reasonable use-case. If so, I'm happy to take a stab at developing a PR for this myself.

@apetenchea
Member

apetenchea commented Mar 10, 2024

Hi @Moortiii,

I understand your proposal, and I think it is quite sensible. Updating the same transaction concurrently can cause some uncertainty due to timing issues, but when done carefully, I can imagine some valid use-cases.

As you pointed out, the _executor.id is indeed private. While adding a setter would be the easy way out of this, it would potentially allow users to write code like this:

trx = db.begin_transaction()
col1 = trx.collection("col1")
trx._executor._id = another_transaction
col2 = trx.collection("col2")

Not only can the transaction ID easily get lost, preventing one from ever accessing the initial transaction again, but the problem can also be easily overlooked, as it is hidden in just one line of code. Frankly, I believe even the x-arango-trx-id setting trick is way better - it may look weird, but it's "loud and clear", and there will be no problem figuring out what you wrote there and why.

Following up, here is what I would consider a reasonable solution:

  • Modify the TransactionAPIExecutor constructor such that it contains a new field, transaction_id, which can be None or str (basically an Optional[str]). In case it is a str, the constructor should no longer send a request to /_api/transaction/begin, but set the _id property directly and check the status() of the transaction in order to validate it really exists.
  • The same parameter should be added to the TransactionDatabase, which would forward it to the executor. This is straightforward.
  • The StandardDatabase should get a fetch_transaction method, which takes the transaction ID and returns a TransactionDatabase. I'm suggesting fetch_transaction because it implies that a transaction may (or may not) be there, rather than continuing one (which is not necessarily "paused").
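
The three points above can be sketched as a toy model. This is not python-arango's actual implementation: FakeConnection, SketchTransactionExecutor, and the module-level fetch_transaction function are hypothetical stand-ins that keep transactions in memory, so that the control flow (begin a new transaction unless an ID is supplied, then validate the supplied ID via a status lookup) can be run without a server.

```python
from typing import Optional


class FakeConnection:
    """In-memory stand-in for a server connection (hypothetical)."""

    def __init__(self) -> None:
        self.transactions = {}
        self._next_id = 1

    def begin(self) -> str:
        # Plays the role of a request to /_api/transaction/begin.
        trx_id = str(self._next_id)
        self._next_id += 1
        self.transactions[trx_id] = "running"
        return trx_id

    def status(self, trx_id: str) -> str:
        # Plays the role of the status() check; fails for unknown IDs.
        if trx_id not in self.transactions:
            raise LookupError(f"transaction {trx_id} not found")
        return self.transactions[trx_id]


class SketchTransactionExecutor:
    """Executor that either begins a transaction or attaches to one."""

    def __init__(self, connection: FakeConnection,
                 transaction_id: Optional[str] = None) -> None:
        self._conn = connection
        if transaction_id is None:
            # No ID given: start a fresh transaction.
            self._id = connection.begin()
        else:
            # ID given: do not begin anything, only verify it exists.
            connection.status(transaction_id)
            self._id = transaction_id

    @property
    def id(self) -> str:
        # Read-only on purpose: the ID cannot be swapped after construction.
        return self._id


def fetch_transaction(connection: FakeConnection,
                      transaction_id: str) -> SketchTransactionExecutor:
    """Return an executor bound to an existing transaction."""
    return SketchTransactionExecutor(connection, transaction_id=transaction_id)


conn = FakeConnection()
original = SketchTransactionExecutor(conn)
continued = fetch_transaction(conn, original.id)
```

Because the ID is fixed at construction time, the "lost transaction" problem from the setter-based variant cannot occur: attaching to another transaction always yields a new object.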

Testing
Introduce a test case in test_transaction.py, something simple, just to check that we're able to use both the initial transaction and the "continued" object.

def test_transaction_fetch(db, col, docs):
    txn_db = db.begin_transaction(write=col.name)
    txn_col = txn_db.collection(col.name)
    txn_db2 = db.fetch_transaction(txn_db.transaction_id)
    # insert some documents using both txn's
    # ...

Docs
A small edit in transaction.rst would be great to showcase how fetch_transaction is supposed to be used.

I'm ready to implement the above. Or, if you want to give it a go, I'm perfectly fine with that, but don't feel pressured, I'm just mentioning since you offered. Let me know how you want to proceed.

@Moortiii
Contributor Author

Moortiii commented Mar 11, 2024 via email

Moortiii added a commit to Moortiii/python-arango that referenced this issue Mar 11, 2024
@Moortiii
Contributor Author

I've opened a PR @apetenchea.

I did consider something like this as well:

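# Ask the server for the status of the existing transaction.
request = Request(
    method="get",
    endpoint=f"/_api/transaction/{transaction_id}",
)
resp = self._conn.send_request(request)

# The status request itself failed.
if not resp.is_success:
    raise TransactionInitError(resp, request)

result = resp.body["result"]

# Refuse to attach to a transaction that is already committed or aborted.
if result["status"] != "running":
    raise TransactionInitError(resp, request)

self._id = transaction_id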

My intention was to prevent a user from 'continuing' a transaction that is already committed or aborted, which would be mostly pointless. However, since the response from the API when fetching the status is 200 OK, raising a TransactionInitError and feeding it the response produced results that would be confusing to the end user. I also realized that perhaps checking the status of a transaction in an external system (that may be committed already) could be useful in some niche cases.
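
One way around the confusing error output would be a dedicated exception that reports the observed status instead of wrapping a 200 OK response. This is only a sketch under assumptions: TransactionNotRunningError and ensure_running are hypothetical names that do not exist in python-arango, and the body shape simply mirrors the result/status fields used in the snippet above.

```python
class TransactionNotRunningError(Exception):
    """Hypothetical error for a transaction that exists but is not running."""

    def __init__(self, transaction_id: str, status: str) -> None:
        super().__init__(
            f"transaction {transaction_id} is {status!r}, expected 'running'"
        )
        self.transaction_id = transaction_id
        self.status = status


def ensure_running(transaction_id: str, body: dict) -> str:
    """Return the ID if the status body says 'running', else raise."""
    status = body["result"]["status"]
    if status != "running":
        raise TransactionNotRunningError(transaction_id, status)
    return transaction_id


# A running transaction passes through unchanged.
checked_id = ensure_running("1234", {"result": {"id": "1234", "status": "running"}})
```

Callers that do want to inspect a committed or aborted transaction could skip this check, which keeps the niche use-case mentioned above available.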

As a side note, I noticed that the docs on contributing in the Sphinx documentation appear to be outdated. I had to follow the contribution guidelines directly in the repository to get anywhere.

Moortiii added a commit to Moortiii/python-arango that referenced this issue Mar 11, 2024