Removed artificial PATH_MAX limits. #2025
petterreinholdtsen wants to merge 2 commits into uxlfoundation:master
Conversation
39d0f6a to a6123da
3a6c982 to 69fad16
[Alexandr-Konovalov]
What is the reason for the memory management being C-style instead of using
`unique_ptr` etc.?
Mostly because I am most familiar with C style, and could not get
unique_ptr with char* working with the surrounding code. Can you get it
working?
…--
Happy hacking
Petter Reinholdtsen
[Petter Reinholdtsen]
Mostly because I am most familiar with C style, and could not get
unique_ptr with char* working with the surrounding code. Can you get it
working?
It is worth noting that the surrounding code uses char* and snprintf()
instead of std::string and associated operations, which made me believe C
style was the intended and wanted code style. If C++ is wanted, I
suspect the entire file should be rewritten to use std::string. I
decided against a complete rewrite to keep the change to the minimum
required to drop the arbitrary limit. I was unable to drop the limit on
Windows, where the API call used seems to require the limit, rather than
the calling code.
…--
Happy hacking
Petter Reinholdtsen
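For illustration, one way to combine `unique_ptr` with `char*` while keeping the surrounding `snprintf()` style might look like the sketch below; `make_cgroup_path` is a hypothetical helper name, not the actual patch code:

```cpp
#include <cstdio>
#include <cstring>
#include <memory>

// Hypothetical sketch: std::unique_ptr<char[]> owns the buffer, so it is
// released automatically on every return path, while the formatting still
// uses the C-style std::snprintf() the file already relies on.
static std::unique_ptr<char[]> make_cgroup_path(const char* dir, const char* file) {
    std::size_t len = std::strlen(dir) + 1 /* '/' */ + std::strlen(file) + 1 /* NUL */;
    std::unique_ptr<char[]> path(new char[len]);
    if (std::snprintf(path.get(), len, "%s/%s", dir, file) < 0)
        return nullptr; // the buffer is freed automatically here
    return path;
}
```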
aleksei-fedotov
left a comment
Given that the patch does not take into account the hard limitation on the length of a relative path, I conclude that the limitation on the maximum path length is not the real motivation for the patch. Therefore, to avoid the cumbersome process of measuring the performance penalties potentially implied by dynamic memory allocations, I would suggest simply defining the PATH_MAX constant on the systems where it is missing.
Below are the comments to start with if we still want to go that way.
```diff
 static bool try_read_cgroup_v1_num_cpus_from(const char* dir, int& num_cpus) {
-    char path[PATH_MAX] = {0};
-    if (std::snprintf(path, PATH_MAX, "%s/cpu.cfs_quota_us", dir) < 0)
+    std::size_t pathlen = strlen(dir) + 30;
```
Why +30? The longest addition is only 18 bytes, if I am not mistaken.
src/tbb/cgroup_info.h
Outdated
```diff
-    char path[PATH_MAX] = {0};
-    if (std::snprintf(path, PATH_MAX, "%s/cpu.cfs_quota_us", dir) < 0)
+    std::size_t pathlen = strlen(dir) + 30;
+    char *path = (char*)calloc(pathlen, 1);
```
Usually, allocating memory dynamically takes far more time than working with static arrays. Running over arrays with strlen can also add some penalty. We need to look at the performance data to make a weighted decision.
src/tbb/cgroup_info.h
Outdated
```cpp
std::size_t pathlen = strlen(dir) + 30;
char *path = (char*)calloc(pathlen, 1);
if (std::snprintf(path, pathlen, "%s/cpu.cfs_quota_us", dir) < 0) {
    free(path);
```
Please use an RAII approach. It is safer and cleaner.
[Aleksei Fedotov]
Given that the patch does not take into account hard limitation on the
length of a relative path, I conclude that the limitations on the
maximum path length is not a real motivator for the patch.
Your conclusion is wrong. The reason I try porting software to GNU
Hurd is that it identifies artificial limits in Linux software that have
no real purpose. Given that the limit is also not accurate, as the
limits differ between file system types, using PATH_MAX seems to me
like a useless limitation to include in programs.
Below are the comments to start with if we still want to go that way.
Thank you for the feedback.
Why +30? The longest addition is only 18 bytes, if I am not mistaken.
I picked a number I was sure was larger than the smallest value that
would work. If it is important for the oneTBB developers to stay at the
exact limit, I can do more detailed calculations.
Usually, allocating memory dynamically takes far more time than working
with static arrays. Running over arrays with `strlen` can also add
some penalty. We need to look at the performance data to make a
weighted decision.
Yes. But as far as I can tell, the operations affected are performed
once or only a few times per program invocation, so the performance hit
should be negligible and the advantage (i.e. not hard-coding a limit
that actually depends on the file system) should outweigh the disadvantages.
Please use an RAII approach. It is safer and cleaner.
What do you mean?
```suggestion
std::size_t len = abs_path(library, NULL, 1);
```
No problem. I noticed a somewhat inconsistent style in the code base.
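The suggestion above (calling the function first with a null buffer to learn the required length) follows a common two-pass sizing pattern. A minimal sketch of the same idea with `snprintf`, using a hypothetical `build_path` helper rather than the actual `abs_path` code:

```cpp
#include <cstdio>
#include <string>

// Hypothetical two-pass sizing sketch: the first snprintf() call with a
// null buffer returns the number of characters needed, so the second call
// can write into a buffer of exactly the right size. Not oneTBB code.
std::string build_path(const std::string& dir, const std::string& file) {
    int needed = std::snprintf(nullptr, 0, "%s/%s", dir.c_str(), file.c_str());
    std::string path(static_cast<std::size_t>(needed) + 1, '\0');
    std::snprintf(&path[0], path.size(), "%s/%s", dir.c_str(), file.c_str());
    path.resize(static_cast<std::size_t>(needed)); // drop the extra NUL slot
    return path;
}
```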
…--
Happy hacking
Petter Reinholdtsen
[Alexandr-Konovalov]
Can we take one step further and look at `PATH_MAX` semantics under
Linux? According to my understanding, it limits the number of bytes to be
written by the "system" to a buffer of unspecified size. As far as I can
tell, TBB doesn't use such an API, so for TBB `PATH_MAX` is used only as
the size of a stack buffer that is large enough. So, can we remove
`PATH_MAX` and set it manually to an OS-independent value? There would be
no need for dynamic memory allocation after that.
I assume you are talking about this:
Pathname Variable Values
The values in the following list may be constants within an
implementation or may vary from one pathname to another. For example,
file systems or directories may have different characteristics.
A definition of one of the symbolic constants in the following list
shall be omitted from the <limits.h> header on specific
implementations where the corresponding value is equal to or greater
than the stated minimum, but where the value can vary depending on the
file to which it is applied. The actual value supported for a specific
pathname shall be provided by the pathconf() function.
[...]
{PATH_MAX} - Maximum number of bytes the implementation stores as a
pathname in a user-supplied buffer of unspecified size,
including the terminating null character. Minimum number
the implementation shall accept as the maximum number of
bytes in a pathname.
Minimum Acceptable Value: {_POSIX_PATH_MAX}
[...]
{PATH_MAX} - IEEE PASC Interpretation 1003.1 #15 addressed the
inconsistency in the standard with the definition of
pathname and the description of {PATH_MAX}, allowing
application developers to allocate either {PATH_MAX} or
{PATH_MAX}+1 bytes. The inconsistency has been removed by
correction to the {PATH_MAX} definition to include the
null character. With this change, applications that
previously allocated {PATH_MAX} bytes will continue to
succeed.
As far as I know, PATH_MAX usage is not limited to what the system
writes; it is also intended to limit the maximum path a file on disk
can have, but the only proper way to figure out the latter is to use the
pathconf() function once one knows which file system the program will be
operating on. Such a procedure is rarely worth it when the alternative is
simply not to have a limit and to handle whatever string size is passed
from the user or system dynamically, which is what my patch implements.
How do you imagine determining "enough size"? What if the program is
ported to a system with longer paths in the future? Do you want
to keep extending the limit indefinitely to avoid handling any size
dynamically?
…--
Happy hacking
Petter Reinholdtsen
```diff
 static struct ap_data_t {
-    char _path[PATH_MAX+1];
+    char *_path;
+    std::size_t _capacity;
```
Does _capacity really need to be saved? If I understand correctly, it is only used for doubling the buffer, so it can probably be local.
I believe it is required on Windows, and for consistency I used it the same way on non-Windows platforms. I was tempted to switch also this to use std::string, but decided against it as std::string is not much used in this library.
I believe it is required on Windows.
I'm not sure about that. It seems it is only used to keep the value of MAX_PATH and is then zeroed in case of an unsuccessful allocation. Which part of the code are you talking about?
It can be something like this, I think.
b5223e1 to c3a2c7b
Rewrote all code using PATH_MAX to instead dynamically allocate the required buffers. As the actual max path length depends on the file system used, it is not a good idea to pretend a compiled-in buffer limit will work everywhere. This also fixes build problems detected on GNU Hurd.
c3a2c7b to a2e08e4
I switched to unique_ptr everywhere except in ap_data. I do not fully understand the purpose of the internal ap_data_t structure, and am unsure about the consequences for this code. |
[Alexandr-Konovalov]
But the developers are not happy with possible impact of dynamic
memory allocation on performance, so thinking about possibility to
keep buffers on stack somehow.
So, how can this fix be moved forward? It is unclear to me who the
decision makers are and what I can do to assist further.
…--
Happy hacking
Petter Reinholdtsen
Rewrote all code using PATH_MAX to instead dynamically allocate the required buffers. As the actual max path length depends on the file system used, it is not a good idea to pretend a compiled-in buffer limit will work everywhere.
This also fixes build problems detected on GNU Hurd.
Type of change
Tests
Documentation
Breaks backward compatibility