Miscellaneous VCF formatting and typo fixes #391
Fixes samtools#382. Fix formatting of ID and REF+ALT as separate table rows.
In particular, use a string that is not subject to change! Hat tip Thomas Colthurst. Closes samtools#363. Also fix a|b formatting and note that the inline/overflow boundary is >= 15 rather than > 15.
Also avoid using \textbackslash within \texttt{}.
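A minimal sketch of why this matters, assuming the default OT1 font encoding: there, `\textbackslash` is typically drawn from a symbol font rather than the typewriter font, so it can look out of place inside `\texttt{}`. The string `\n` and the two alternatives shown are illustrative only and are not necessarily what this commit uses.

```latex
\documentclass{article}
\begin{document}
% Under OT1, this backslash is not the typewriter font's own glyph:
\texttt{\textbackslash n}

% Two possible alternatives that do use the typewriter backslash
% (illustrative; the commit may use a different construct):
\verb|\n| \quad \texttt{\char`\\n}
\end{document}
```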
Move the example \begin{enumerate} up to the otherwise blank half page
before the landscape example SV VCF.
Use "ht" in \begin{figure}[ht] where necessary, and keep some of the
associated paragraphs together.
Use \makecell to avoid an underfull hbox.
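As a rough illustration of the technique (not the actual table from the spec), `\makecell` from the `makecell` package lets a cell contain explicit line breaks without asking TeX to stretch a short line to the full column width. The column layout and cell text below are made up.

```latex
\documentclass{article}
\usepackage{makecell}
\begin{document}
\begin{tabular}{l l}
Field & Description \\
\hline
% [l] left-aligns the stacked lines; \\ inside \makecell breaks the cell text.
ID & \makecell[l]{Semicolon-separated list \\ of unique identifiers} \\
\end{tabular}
\end{document}
```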
Also make "dbsnp" and "138" verbatim, as the double quotes are part of what is being shown.
Compress the list of columns and encourage a pagebreak in "4 REF" to get
more vertical space for Table 1 on page 8. Use \longtable so Table 1 can
be broken across pages; don't use \begin{table}, so that the tables are
positioned within their related ("8 INFO" / "Genotype fields") text.
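A minimal sketch of the `longtable` approach described above, assuming a three-column layout. The caption and the two rows are placeholders rather than the real contents of Table 1. Because `longtable` is not a float, it stays where it is written in the text and can break across pages.

```latex
\documentclass{article}
\usepackage{longtable}
\begin{document}
\begin{longtable}{l l p{8cm}}
\caption{Placeholder caption}\\
Key & Type & Description \\ \hline
\endfirsthead
% Header repeated at the top of each continuation page:
Key & Type & Description \\ \hline
\endhead
AA & String & Ancestral allele \\
DP & Integer & Combined depth across samples \\
\end{longtable}
\end{document}
```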
Keep initial "INFO keys used for structural variants" paragraphs together
so they don't creep back to page 11.
(Previously it was very easy to misread the INFO keys table as pertaining
to the genotype text surrounding it.)
The bar is only for genotype fields. Fixes samtools#390. Also for the genotype fields, use plain | rather than \mid, which adds excess space around the delimiter.
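The spacing difference can be seen in a small sketch like the following, which assumes the genotype is typeset in math mode. `\mid` is a relation symbol and gets padding on both sides, whereas a plain `|` is an ordinary symbol and is set tight. (Under OT1, a bare `|` in ordinary roman text does not print as a bar, so the example stays in math mode.)

```latex
\documentclass{article}
\begin{document}
% \mid is spaced as a relation, which looks too loose for a phased genotype:
With \verb|\mid|: $0 \mid 1$

% A plain bar is set without the extra space:
With a plain bar: $0|1$
\end{document}
```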
cyenyxe left a comment:
The changes to header line and all the tables look great, thanks! Just one comment on my side.
  Multiple bases are permitted.
  The value in the POS field refers to the position of the first base in the String.
- For simple insertions and deletions in which either the REF or one of the ALT alleles would otherwise be null/empty, the REF and ALT Strings must include the base before the event (which must be reflected in the POS field), unless the event occurs at position 1 on the contig in which case it must include the base after the event; this padding base is not required (although it is permitted) for e.g. complex substitutions or other events where all alleles have at least one base represented in their Strings.
+ For simple insertions and deletions in which either the REF or one of the ALT alleles would otherwise be null/empty, the REF and ALT Strings must include the base before the event (which must be reflected in the POS field), unless the event occurs at position 1 on the contig in which case it must include the base after the event; \pagebreak[1] this padding base is not required (although it is permitted) for e.g.\ complex substitutions or other events where all alleles have at least one base represented in their Strings.
Given that this is a pretty plain paragraph, I'm not sure the \pagebreak[1] command is very useful.
From the commit message: encourage a pagebreak in "4 REF" to get more vertical space for Table 1 on page 8. Without this, there is wasted space on the previous page and Table 1 is split more awkwardly.
(Also just pushed one more fix.)
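For context, a small sketch of how `\pagebreak` with an optional argument behaves in standard LaTeX. The argument runs from 0 to 4: low values are a mild hint, and 4 (the default) forces the break. Inside a paragraph the break, if taken, happens after the line containing the command. The sentence text here is abbreviated and purely illustrative.

```latex
\documentclass{article}
\begin{document}
% [1] is only a mild suggestion; TeX breaks the page here only if that
% gives a better page than the alternatives.
First clause of a long table cell or paragraph; \pagebreak[1]
second clause, which may start the next page.

% [4], or no argument at all, would force the break instead:
% \pagebreak[4]
\end{document}
```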
Thanks. I would have merged that with a merge commit, as it's a sequence of distinct commits that tell a story. Squashing them into a single commit makes it a bit harder for future us to look back and understand the history, but YMMV.
I actually prefer merge commits, but I thought the standard policy was always squashing...
I don't think we have a policy as such. What often happens in this repository is that a single-commit PR gets improved and workshopped and acquires a bunch of “Fixed typos” and “Respond to review comments” commits. Clearly the best practice in that case is to squash, as the fixup commits are not of later historical interest. Multi-commit PRs containing distinct fixes are a different case.
Fix the BCF encoding errors noted in #382, and various formatting improvements.
No changes to the format or descriptions, so this should be straightforward to review and merge.
The non-TeX quotes etc in §1.4.7 Contig field format are left untouched so as not to conflict with lines changed in PR #379.