-
Notifications
You must be signed in to change notification settings - Fork 466
Windows portability #496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Windows portability #496
Changes from 3 commits
4de08ce
c00892b
55af531
b117a87
352a28f
7bef41d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,6 +2,8 @@ | |
| *.pico | ||
| *.dSYM | ||
| *.exe | ||
| *.dll | ||
| *.obj | ||
| *.pc.tmp | ||
| *-uninstalled.pc | ||
| /version.h | ||
|
|
@@ -47,5 +49,6 @@ lib*.so.* | |
| /test/thrash_threads[1-6] | ||
| /test/*.tmp | ||
| /test/*.tmp.* | ||
| /test/*.new | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, removing |
||
|
|
||
| /TAGS | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -573,6 +573,9 @@ static hFILE *hopen_fd(const char *filename, const char *mode) | |
|
|
||
| hFILE *hdopen(int fd, const char *mode) | ||
| { | ||
| #if defined HAVE_SETMODE && defined O_BINARY | ||
| if (setmode(fd, O_BINARY) < 0) return NULL; | ||
| #endif | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is incorrect. See commentary in #283.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Without this patch, the test cases in test_bgzf.c which use the function try_bgzf_dopen fail on Windows, because the files are opened in text mode, and then the file descriptors are passed into bgzf_dopen and hdopen. I considered changing the test cases instead, but couldn't really see why it would be a bad thing to always set binary mode on the file descriptors. Are there any cases where it would be a problem?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jmarshall, can you please clarify the intention here for binary files. You state "The general policy in htslib is to open everything in binary mode", but that is precisely what this change does. Do you mean that all opens should be "rb", "wb" or similar and specifying it at that point? (Mostly we do that already of course.) That only covers fopen style APIs though and not the Unix Stdin/stdout as you note are the exceptions which clearly need a
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Relevant part from #283 (quoting @jmarshall):
So we should fix test_bgzf.c, and note in the documentation for
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [TL;DR: what Rob said.] As noted in the #283 commentary that I already pointed you both to, there are at least two problematic cases: If you are going to call The bug is in test_bgzf.c's
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point, "open" vs "dopen" in binary mode; agreed fixing the original open is correct. We should probably document this somewhere too in the doxygen comments before the hdopen function as it's currently intuitive to believe mode should contain "b" and handle it there. |
||
| hFILE_fd *fp = (hFILE_fd*) hfile_init(sizeof (hFILE_fd), mode, blksize(fd)); | ||
| if (fp == NULL) return NULL; | ||
|
|
||
|
|
@@ -594,9 +597,6 @@ static hFILE *hopen_fd_fileuri(const char *url, const char *mode) | |
| static hFILE *hopen_fd_stdinout(const char *mode) | ||
| { | ||
| int fd = (strchr(mode, 'r') != NULL)? STDIN_FILENO : STDOUT_FILENO; | ||
| #if defined HAVE_SETMODE && defined O_BINARY | ||
| if (setmode(fd, O_BINARY) < 0) return NULL; | ||
| #endif | ||
| return hdopen(fd, mode); | ||
| } | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| # Remove the text attribute from reference files, so that git doesn't convert | ||
| # line separators on Windows machines. It causes the index files to become out | ||
| # of sync with the fasta files. | ||
| *.fa* -text | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Put this in the top-level .gitattributes file. But wouldn't it be better to regenerate the .fai files instead?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the problem here is that git clone itself changes these files. We can't really regenerate the .fai files on a git clone, unless we have the Makefile always regenerate .fai from .fa. Even this is problematic as there is no htslib command-line interface to fai_build: our only calls to it are over in
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Clearly by regenerate I'm referring to having test.pl generate the derived .fai files (and removing them from source control), just as samtools's test.pl's
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The problem with regenerating them ourselves is that then we are testing our own output, which can lead to bugs cancelling each other out. On the whole, I'd prefer to keep the .fai files in version control for now. I agree that the change would be best in the top-level file.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Moving to top-level .gitattributes. |
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -61,16 +61,17 @@ char *slurp(const char *filename) | |
| { | ||
| char *text; | ||
| struct stat sbuf; | ||
| size_t filesize; | ||
| FILE *f = fopen(filename, "r"); | ||
| if (f == NULL) fail("fopen(\"%s\", \"r\")", filename); | ||
| size_t filesize, readsize; | ||
| FILE *f = fopen(filename, "rb"); | ||
| if (f == NULL) fail("fopen(\"%s\", \"rb\")", filename); | ||
| if (fstat(fileno(f), &sbuf) != 0) fail("fstat(\"%s\")", filename); | ||
| filesize = sbuf.st_size; | ||
|
|
||
| text = (char *) malloc(filesize + 1); | ||
| if (text == NULL) fail("malloc(text)"); | ||
|
|
||
| if (fread(text, 1, filesize, f) != filesize) fail("fread"); | ||
| readsize = fread(text, 1, filesize, f); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. While I've no major objections to this style (use of a variable over inline checking), I don't particularly see the need for changing it and such non-changes make it harder to evaluate the PR.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The reason for splitting the line in the first case was that I wanted to disable the check on Windows. But that wasn't necessary as long as the file is opened in binary mode. Reverting this one. |
||
| if (readsize != filesize) fail("fread"); | ||
| fclose(f); | ||
|
|
||
| text[filesize] = '\0'; | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -54,23 +54,39 @@ int main(int argc, char *argv[]) | |
|
|
||
| while ((c = getopt(argc, argv, "IbDCSl:t:i:o:N:BZ:@:")) >= 0) { | ||
| switch (c) { | ||
| case 'S': flag |= 1; break; | ||
| case 'I': ignore_sam_err = 1; break; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Adding usage is most welcomed, thanks, but as for prior comment, why have the options changed order? It doesn't look to be for consistency with the usage statement.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I changed the order in the switch to be consistent with the string passed to getopt, right above the switch statement. For the usage instructions I aimed at what would make most sense for the user. Ideally they should all be consistent, but I didn't touch the getopt string because I'm not familiar with how it's used. I can revert the change of order, or change all of them to match the usage, whichever you prefer. |
||
| case 'b': flag |= 2; break; | ||
| case 'D': flag |= 4; break; | ||
| case 'C': flag |= 8; break; | ||
| case 'B': benchmark = 1; break; | ||
| case 'S': flag |= 1; break; | ||
| case 'l': clevel = atoi(optarg); flag |= 2; break; | ||
| case 't': fn_ref = optarg; break; | ||
| case 'I': ignore_sam_err = 1; break; | ||
| case 'i': if (hts_opt_add(&in_opts, optarg)) return 1; break; | ||
| case 'o': if (hts_opt_add(&out_opts, optarg)) return 1; break; | ||
| case 'N': nreads = atoi(optarg); break; | ||
| case 'B': benchmark = 1; break; | ||
| case 'Z': extra_hdr_nuls = atoi(optarg); break; | ||
| case '@': nthreads = atoi(optarg); break; | ||
| } | ||
| } | ||
| if (argc == optind) { | ||
| fprintf(stderr, "Usage: samview [-bSCSIB] [-N num_reads] [-l level] [-o option=value] [-Z hdr_nuls] <in.bam>|<in.sam>|<in.cram> [region]\n"); | ||
| fprintf(stderr, "Usage: test_view [-IbDCS] [-l level] [-t fn_ref] [-i option=value] [-o option=value] [-N num_reads] [-B] [-Z hdr_nuls] [-@ num_threads] <in.bam>|<in.sam>|<in.cram> [region]\n"); | ||
| fprintf(stderr, "\n"); | ||
| fprintf(stderr, "-D: read CRAM format (mode 'c')\n"); | ||
| fprintf(stderr, "-S: read compressed BCF, BAM, FAI (mode 'b')\n"); | ||
| fprintf(stderr, "-I: ignore SAM parsing errors\n"); | ||
| fprintf(stderr, "-t: fn_ref: load CRAM references from the specificed fasta file instead of @SQ headers when writing a CRAM file\n"); | ||
| fprintf(stderr, "-i: option=value: set an option for CRAM input\n"); | ||
| fprintf(stderr, "\n"); | ||
| fprintf(stderr, "-b: write compressed BCF, BAM, FAI (mode 'b')\n"); | ||
| fprintf(stderr, "-C: write CRAM format (mode 'c')\n"); | ||
| fprintf(stderr, "-l 0-9: set zlib compression level\n"); | ||
| fprintf(stderr, "-o option=value: set an option for CRAM output\n"); | ||
| fprintf(stderr, "-N: num_reads: limit the output to the first num_reads reads\n"); | ||
| fprintf(stderr, "\n"); | ||
| fprintf(stderr, "-B: enable benchmarking\n"); | ||
| fprintf(stderr, "-Z hdr_nuls: append specified number of null bytes to the SAM header\n"); | ||
| fprintf(stderr, "-@ num_threads: use thread pool with specified number of threads\n"); | ||
| return 1; | ||
| } | ||
| strcpy(moder, "r"); | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.gitignore already has appropriate patterns for
.dll, no? To date, Windows support has been aiming at Cygwin and GCC so*.ohas sufficed, but it would be good to put.objalongside.o/.pico.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are "hfile_*.dll", "cyg*.dll", and "lib*.dll" patterns but nothing to catch pthreadVC2.dll, the pthreads library for Windows. I don't see why one would ever want to check in dll's in git, so I suggest that we keep the more generic *.dll pattern instead of the more specific ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this pthreadVC2.dll of which you speak? Why would it appear in the htslib source directory rather than some compiler directory? (If there's some larger pull request you're working towards, showing the draft end product would be illuminating…)