-
Notifications
You must be signed in to change notification settings - Fork 465
Windows portability #496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Windows portability #496
Changes from 4 commits
4de08ce
c00892b
55af531
b117a87
352a28f
7bef41d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,7 +1,9 @@ | ||
| *.o | ||
| *.pico | ||
| *.obj | ||
| *.dSYM | ||
| *.exe | ||
| *.dll | ||
| *.pc.tmp | ||
| *-uninstalled.pc | ||
| /version.h | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -54,23 +54,39 @@ int main(int argc, char *argv[]) | |
|
|
||
| while ((c = getopt(argc, argv, "IbDCSl:t:i:o:N:BZ:@:")) >= 0) { | ||
| switch (c) { | ||
| case 'S': flag |= 1; break; | ||
| case 'I': ignore_sam_err = 1; break; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Adding usage is most welcomed, thanks, but as for prior comment, why have the options changed order? It doesn't look to be for consistency with the usage statement.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I changed the order in the switch to be consistent with the string passed to getopt, right above the switch statement. For the usage instructions I aimed at what would make most sense for the user. Ideally they should all be consistent, but I didn't touch the getopt string because I'm not familiar with how it's used. I can revert the change of order, or change all of them to match the usage, whichever you prefer. |
||
| case 'b': flag |= 2; break; | ||
| case 'D': flag |= 4; break; | ||
| case 'C': flag |= 8; break; | ||
| case 'B': benchmark = 1; break; | ||
| case 'S': flag |= 1; break; | ||
| case 'l': clevel = atoi(optarg); flag |= 2; break; | ||
| case 't': fn_ref = optarg; break; | ||
| case 'I': ignore_sam_err = 1; break; | ||
| case 'i': if (hts_opt_add(&in_opts, optarg)) return 1; break; | ||
| case 'o': if (hts_opt_add(&out_opts, optarg)) return 1; break; | ||
| case 'N': nreads = atoi(optarg); break; | ||
| case 'B': benchmark = 1; break; | ||
| case 'Z': extra_hdr_nuls = atoi(optarg); break; | ||
| case '@': nthreads = atoi(optarg); break; | ||
| } | ||
| } | ||
| if (argc == optind) { | ||
| fprintf(stderr, "Usage: samview [-bSCSIB] [-N num_reads] [-l level] [-o option=value] [-Z hdr_nuls] <in.bam>|<in.sam>|<in.cram> [region]\n"); | ||
| fprintf(stderr, "Usage: test_view [-IbDCS] [-l level] [-t fn_ref] [-i option=value] [-o option=value] [-N num_reads] [-B] [-Z hdr_nuls] [-@ num_threads] <in.bam>|<in.sam>|<in.cram> [region]\n"); | ||
| fprintf(stderr, "\n"); | ||
| fprintf(stderr, "-D: read CRAM format (mode 'c')\n"); | ||
| fprintf(stderr, "-S: read compressed BCF, BAM, FAI (mode 'b')\n"); | ||
| fprintf(stderr, "-I: ignore SAM parsing errors\n"); | ||
| fprintf(stderr, "-t: fn_ref: load CRAM references from the specificed fasta file instead of @SQ headers when writing a CRAM file\n"); | ||
| fprintf(stderr, "-i: option=value: set an option for CRAM input\n"); | ||
| fprintf(stderr, "\n"); | ||
| fprintf(stderr, "-b: write compressed BCF, BAM, FAI (mode 'b')\n"); | ||
| fprintf(stderr, "-C: write CRAM format (mode 'c')\n"); | ||
| fprintf(stderr, "-l 0-9: set zlib compression level\n"); | ||
| fprintf(stderr, "-o option=value: set an option for CRAM output\n"); | ||
| fprintf(stderr, "-N: num_reads: limit the output to the first num_reads reads\n"); | ||
| fprintf(stderr, "\n"); | ||
| fprintf(stderr, "-B: enable benchmarking\n"); | ||
| fprintf(stderr, "-Z hdr_nuls: append specified number of null bytes to the SAM header\n"); | ||
| fprintf(stderr, "-@ num_threads: use thread pool with specified number of threads\n"); | ||
| return 1; | ||
| } | ||
| strcpy(moder, "r"); | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incorrect. See commentary in #283.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this patch, the test cases in test_bgzf.c which use the function try_bgzf_dopen fail on Windows, because the files are opened in text mode, and then the file descriptors are passed into bgzf_dopen and hdopen.
I considered changing the test cases instead, but couldn't really see why it would be a bad thing to always set binary mode on the file descriptors. Are there any cases where it would be a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmarshall, can you please clarify the intention here for binary files. You state "The general policy in htslib is to open everything in binary mode", but that is precisely what this change does. Do you mean that all opens should be "rb", "wb" or similar and specifying it at that point? (Mostly we do that already of course.) That only covers fopen style APIs though and not the Unix
openfunction used in the test case. Thebgzf_dopenfunction should perhaps add "b" to the mode string before callinghdopen, but this doesn't really seem to fit the "everything is binary" philosophy if we're only doing it for bgzf and not for SAM.Stdin/stdout as you note are the exceptions which clearly need a
setmodecall to change, but again we're turning stdin/stdout to an hfile via the dopen call once more.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relevant part from #283 (quoting @jmarshall):
So we should fix test_bgzf.c, and note in the documentation for
hdopen()that the file descriptor passed in must already be in binary mode.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[TL;DR: what Rob said.]
As noted in the #283 commentary that I already pointed you both to, there are at least two problematic cases:
hdopen()can be too late to set binary mode, and it is invalid to callsetmode()on some types of file descriptor given tohdopen().If you are going to call
hdopen()orbgzf_dopen()then it is your responsibility to have opened the file descriptor in binary mode.The bug is in test_bgzf.c's
try_bgzf_dopen()which should be settingO_BINARYashfile_oflags()does, and probably (being part of htslib so kosher to include hfile_internal.h) it should just callhfile_oflags()instead of duplicating that code.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, "open" vs "dopen" in binary mode; agreed fixing the original open is correct.
We should probably document this somewhere too in the doxygen comments before the hdopen function as it's currently intuitive to believe mode should contain "b" and handle it there.