obj: handle ENOMEM during vector allocations #5537
bfaccini wants to merge 4 commits into pmem:master
struct user_buffer_def *userbuf;
VEC_FOREACH_BY_PTR(userbuf, &tx->redo_userbufs)
operation_add_user_buffer(tx->lane->external, userbuf);
struct operation_context *ctx = tx->lane->external;
As I indicated earlier in the issue, I don't think we need to check for this error here. Instead, we simply need to ensure that operation_add_user_buffer never fails (as you've already done in tx_construct_user_buffer).
To keep the FATAL/ASSERT that you added, you can change operation_add_user_buffer to return an error (-1) if VEC_PUSH_BACK(&ctx->next, buffer_offset); fails. This would let you avoid making all the memops structs public.
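For illustration only, the suggestion could look roughly like the sketch below. Everything in it is a hypothetical stand-in: `struct offset_vec`, `offset_vec_push_back()`, the `max_capacity` field (used to simulate ENOMEM) and `add_user_buffer_sketch()` are not the PMDK `VEC_*` macros or the real `operation_add_user_buffer()`. It only shows the shape of the idea: the vector stays private to its module and the caller sees just 0 or -1.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable vector of offsets; NOT PMDK's VEC. */
struct offset_vec {
	uint64_t *data;
	size_t size;
	size_t capacity;
	size_t max_capacity; /* artificial limit used to simulate ENOMEM */
};

/* Returns 0 on success, -1 if the vector cannot grow. */
int
offset_vec_push_back(struct offset_vec *v, uint64_t value)
{
	assert(v != NULL);

	if (v->size == v->capacity) {
		size_t ncap = v->capacity ? v->capacity * 2 : 4;
		if (ncap > v->max_capacity)
			return -1; /* simulated allocation failure */

		uint64_t *ndata = realloc(v->data, ncap * sizeof(*ndata));
		if (ndata == NULL)
			return -1; /* real allocation failure */

		v->data = ndata;
		v->capacity = ncap;
	}

	v->data[v->size++] = value;
	return 0;
}

/*
 * Hypothetical analogue of operation_add_user_buffer(): the failure of the
 * push is reported as -1, so callers never touch the vector directly.
 */
int
add_user_buffer_sketch(struct offset_vec *next, uint64_t buffer_offset)
{
	if (offset_vec_push_back(next, buffer_offset) != 0)
		return -1;

	return 0;
}
```

A caller that must not fail can then turn the -1 into a FATAL/ASSERT on its side, without needing access to the struct layout.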
I'm not sure I fully follow your reasoning, but I will try to push a new commit based on what I think I have understood...
};

struct operation_context;
enum operation_state {
It would be better if we kept the structs private and instead relied on some inter-module API if necessary.
struct user_buffer_def *userbuf;
struct operation_context *ctx = tx->lane->external;
uint64_t buf_off;
Oops, sorry to have missed that
lukaszstolarczuk
left a comment
Reviewable status: 0 of 5 files reviewed, 6 unresolved discussions (waiting on @bfaccini and @pbalcer)
-- commits line 20 at r3:
you can squash all the commits into a single one, please
src/libpmemobj/memops.h line 2 at r3 (raw file):
/* SPDX-License-Identifier: BSD-3-Clause */
/* Copyright 2016-2020, Intel Corporation */
Since you're using an Intel e-mail address for your commits, you have to update all dates to -2023.
E.g., in this file: 2016-2023
src/libpmemobj/tx.c line 714 at r3 (raw file):
userbuf.size - TX_INTENT_LOG_BUFFER_OVERHEAD;
} else {
if (operation_add_user_buffer(ctx, &userbuf) == -1);
is this ; expected here?
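As a side note on why the `;` matters: in C, `if (cond);` has an empty statement as its body, so whatever follows runs regardless of the check. The sketch below shows the intended form with the semicolon dropped. `add_buffer_sketch()` and `append_buffer_sketch()` are hypothetical stand-ins, not the real `operation_add_user_buffer()` or the code in tx.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: returns 0 on success, -1 on (simulated) ENOMEM. */
int
add_buffer_sketch(int simulate_enomem)
{
	return simulate_enomem ? -1 : 0;
}

/*
 * Written as `if (add_buffer_sketch(x) == -1);` the if-body would be the
 * empty statement and the lines after it would run unconditionally.
 * Without the semicolon, the error path is taken only on failure.
 */
int
append_buffer_sketch(int simulate_enomem, int *nappended)
{
	assert(nappended != NULL);

	if (add_buffer_sketch(simulate_enomem) == -1)
		return -1; /* reached only when the add failed */

	(*nappended)++;
	return 0;
}
```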
Ok
Ok
Oops, sorry about the typo :-( All these changes will be in the next commit.
When missing, handle ENOMEM during vector allocations to avoid crashing later.
Ref: pmem/issues#5515
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
6b9e943 to be023c9
lukaszstolarczuk
left a comment
Reviewed 5 of 5 files at r4, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @bfaccini and @pbalcer)
src/libpmemobj/memops.c line 618 at r4 (raw file):
if (VEC_PUSH_BACK(&ctx->next, buffer_offset) != 0)
	return -1;
just wondering, should we clean something up in this case? I can see we're persisting something above 😉
Well, I thought this kind of concern could arise when I started to work on how to fix this, as usual for error paths...
@bfaccini are you going to fix this PR to pass tests?
Looks like doing so affects VEC_PUSH_BACK() execution, surprisingly for non-debug builds only, preventing the next vector allocation...
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Looks like embedding VEC_PUSH_BACK() in ASSERT() affects VEC_PUSH_BACK() execution, surprisingly for non-debug builds only, preventing the next vector allocation...
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
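A plausible explanation, sketched below, is that a debug-only assertion macro does not evaluate its argument in non-debug builds, so a call with side effects placed inside it silently disappears. This is an assumption about the mechanism, modelled on how standard `assert()` behaves under `NDEBUG`; `SKETCH_ASSERT`, `SKETCH_DEBUG`, `struct counter_vec` and the functions here are hypothetical and are not PMDK's `ASSERT()` or `VEC_PUSH_BACK()`.

```c
#include <assert.h> /* for comparison: assert() acts this way under NDEBUG */
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical assertion macro: checks the condition in debug builds and
 * drops its argument, unevaluated, otherwise.
 */
#ifdef SKETCH_DEBUG
#define SKETCH_ASSERT(cnd) do { if (!(cnd)) abort(); } while (0)
#else
#define SKETCH_ASSERT(cnd) ((void)0)
#endif

/* Hypothetical fixed-size vector. */
struct counter_vec {
	int items[8];
	size_t size;
};

/* Returns 0 on success, -1 when the vector is full. */
int
counter_vec_push(struct counter_vec *v, int item)
{
	if (v->size == sizeof(v->items) / sizeof(v->items[0]))
		return -1;

	v->items[v->size++] = item;
	return 0;
}

/* Fragile: without SKETCH_DEBUG the push is never executed. */
void
push_inside_assert(struct counter_vec *v, int item)
{
	SKETCH_ASSERT(counter_vec_push(v, item) == 0);
}

/* Robust: the push always runs; only the check is debug-only. */
void
push_then_assert(struct counter_vec *v, int item)
{
	int ret = counter_vec_push(v, item);
	SKETCH_ASSERT(ret == 0);
	(void)ret; /* avoid an unused-variable warning in non-debug builds */
}
```

Built without `SKETCH_DEBUG`, `push_inside_assert()` leaves the vector empty while `push_then_assert()` stores the item, which matches the "non-debug only" symptom described above.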
Well, now that I have fixed the original errors/SEGVs during "./RUNTESTS -b nondebug obj_action -s TEST0", I get new asserts/aborts, but unfortunately I am unable to reproduce any of them locally :-(
Testing whether the return value of VEC_PUSH_BACK() equals 0 to detect an error is wrong.
Ref: pmem/issues#5515
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Now that I have found that these asserts/aborts were due to the "wrong/reverse test" in my previous changes, some workflows are canceled and one has failed, but I am not able to find any reason in the logs... Can somebody help?
???
Codecov Report
@@ Coverage Diff @@
## master #5537 +/- ##
==========================================
+ Coverage 71.62% 72.18% +0.56%
==========================================
Files 162 146 -16
Lines 24256 22655 -1601
Branches 0 3778 +3778
==========================================
- Hits 17373 16354 -1019
+ Misses 6883 6301 -582
I restarted the CI and most things pass, and the things that fail seem unrelated: Can you rebase to see if this goes away?
Unfortunately, this is a known issue (platform setup issue), it is not related to your PR. We are working on it.
But should I rebase finally?
No, everything seems fine. LGTM overall.
lukaszstolarczuk
left a comment
Reviewed 2 of 2 files at r7, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @bfaccini and @pbalcer)
src/libpmemobj/memops.c line 618 at r4 (raw file):
@wlemkows, can you please take a look at this piece of code...?
I allow myself to quote bfaccini's words (they were posted outside this issue's thread):
Well, I thought this kind of concern could arise when I started to work on how to fix this, as usual for error paths...
The problem is that I don't know exactly what pmemops_persist() does or provisions, nor have I been able to identify which method/operation can revert its effects...
Also, and strangely (!!), I only found one error path after pmemops_persist() has been called (other places do not show any error path at all !!...), in src/libpmemobj/obj.c:obj_runtime_init(), and there is no clean-up there either !!
When missing, handle ENOMEM during vector allocations to avoid crashing later.
Ref: pmem/issues#5515
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>