fix #30#254
Conversation
|
Will review, thank you! |
headius
left a comment
There was a problem hiding this comment.
Minor suggestions to improve the code... at least the formatting should be fixed, and any refinement you can make for the other comments would be great!
| public String getString(long offset, int maxLength, Charset cs) { | ||
| final byte[] bytes = IO.getZeroTerminatedByteArray(address() + offset, maxLength); | ||
| return cs.decode(ByteBuffer.wrap(bytes)).toString(); | ||
| if(cs == StandardCharsets.UTF_8){ |
There was a problem hiding this comment.
Please reformat according to more common Java formatting standards:
if (cond) {
...
} else {
...
for (stuff) {
...Otherwise we get a mix of formats and future patches will likely include unhelpful formatting updates.
| public String getString(long offset, int maxLength, Charset cs) { | ||
| final byte[] bytes = IO.getZeroTerminatedByteArray(address() + offset, maxLength); | ||
| return cs.decode(ByteBuffer.wrap(bytes)).toString(); | ||
| if(cs == StandardCharsets.UTF_8){ |
There was a problem hiding this comment.
I think this logic could apply to any Charset that has a minimum character width of 1 byte, correct? All such encodings should use a single-byte \0 for C string termination I think?
At the very least this should include the ISO-8859 encodings, which are all single-byte (their CharsetEncoder.maxBytesPerChar will all be 1.0).
There was a problem hiding this comment.
agree that I should include other single byte terminator charsets, but CharsetEncoder.maxBytesPerChar won't be sufficient because it does not resolve to 1.0 for utf-8
| }else{ | ||
| byte[] bytes = new byte[maxLength]; | ||
| IO.getByteArray(address() + offset, bytes, 0, maxLength); | ||
| final byte[] nullCharBytes = new String("\0").getBytes(cs); |
There was a problem hiding this comment.
This could possibly be cached for common encodings (UTF-16, UTF-32) or it might be valid to just use a width of \0 equal to the encoding's CharsetEncoder.maxBytesPerChar value.
…tly with more charsets, fix DirectMemoryIO.getString()
|
Request another review when you are ready. Thanks for keeping at it! |
|
this part of the code is quite tricky, will keep you posted, thanks |
|
btw is the irc channel still alive ? |
…ngth() that works with wide chars, default max length to max int
|
@demon36 Ah no I expect the IRC channel, being on now-defunct FreeNode, is probably dead. I should set up a new channel on Matrix or libera. For now if you want to chat with those of us maintaining these libraries, stop by the #jruby channel on Matrix! |
|
@headius good to know that, the PR is ready for a review, please also take a look at Struct.java |
headius
left a comment
There was a problem hiding this comment.
This is looking near completion. Main issues are:
- formatting throughout does not match Java conventions or the rest of the codebase
- no tests provided for the behavior
Almost there!
| return cs.decode(ByteBuffer.wrap(bytes)).toString(); | ||
| long baseAddress = address() + offset; | ||
| int nullTermSize = StringUtil.terminatorWidth(cs); | ||
| if(nullTermSize == 1) { |
There was a problem hiding this comment.
Formatting throughout should match the rest of the code, which is intended to match typical Java coding conventions:
- four space indentation for blocks of code, 8-space indentation for line continuation
- space between keywords like
ifandwhileand their parenthesized conditionals - spaces around operators like
+and%
It's nitpicky but if we don't maintain consistent code formatting then we end up with future commits and PRs that have lots of unrelated changes.
| // Find the NUL terminator and limit to that, so the | ||
| // StringBuffer/StringBuilder does not have superfluous NUL chars | ||
| int end = indexOf(buffer, (byte) 0); | ||
| final byte[] nullCharBytes = new byte[StringUtil.terminatorWidth(charset)]; |
There was a problem hiding this comment.
Perhaps we can cache the three known lengths and not reallocate this byte[] every time?
|
Looks like this might need a merge or rebase from master to pick up that missing import too. |
|
@demon36 Should we just close this in light of the work to do this all natively in jffi? I am inclined to keep this as fallback code when we do not have an updated jffi binary, but what are your thoughts going forward? |
fix #30, DirectMemoryIO.getString() fails for non UTF-8