Package picard.sam.markduplicates.util
Class AbstractMarkDuplicatesCommandLineProgram
- java.lang.Object
  - picard.cmdline.CommandLineProgram
    - picard.sam.markduplicates.util.AbstractOpticalDuplicateFinderCommandLineProgram
      - picard.sam.markduplicates.util.AbstractMarkDuplicatesCommandLineProgram
- Direct Known Subclasses: MarkDuplicates, MarkDuplicatesWithMateCigar
public abstract class AbstractMarkDuplicatesCommandLineProgram extends AbstractOpticalDuplicateFinderCommandLineProgram
Abstract class that holds parameters and methods common to classes that perform duplicate detection and/or marking within SAM/BAM/CRAM files.
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
- static class AbstractMarkDuplicatesCommandLineProgram.SamHeaderAndIterator
  Little class used to package up a header and an iterable/iterator.
-
Field Summary
Fields (Modifier and Type, Field, Description)
- htsjdk.samtools.SAMFileHeader.SortOrder ASSUME_SORT_ORDER
- boolean ASSUME_SORTED
  Deprecated.
- List<String> COMMENT
- htsjdk.samtools.DuplicateScoringStrategy.ScoringStrategy DUPLICATE_SCORING_STRATEGY
- List<String> INPUT
- File METRICS_FILE
- File OUTPUT
- protected Set<String> pgIdsSeen
  The program groups that have been seen during the course of examining the input records.
- protected PGTagArgumentCollection pgTagArgumentCollection
- String PROGRAM_GROUP_COMMAND_LINE
- String PROGRAM_GROUP_NAME
- String PROGRAM_GROUP_VERSION
- String PROGRAM_RECORD_ID
- boolean REMOVE_DUPLICATES
-
Fields inherited from class picard.sam.markduplicates.util.AbstractOpticalDuplicateFinderCommandLineProgram
LOG, MAX_OPTICAL_DUPLICATE_SET_SIZE, OPTICAL_DUPLICATE_PIXEL_DISTANCE, opticalDuplicateFinder, READ_NAME_REGEX
-
Fields inherited from class picard.cmdline.CommandLineProgram
COMPRESSION_LEVEL, CREATE_INDEX, CREATE_MD5_FILE, GA4GH_CLIENT_SECRETS, MAX_ALLOWABLE_ONE_LINE_SUMMARY_LENGTH, MAX_RECORDS_IN_RAM, QUIET, REFERENCE_SEQUENCE, referenceSequence, specialArgumentsCollection, TMP_DIR, USE_JDK_DEFLATER, USE_JDK_INFLATER, VALIDATION_STRINGENCY, VERBOSITY
-
-
Constructor Summary
Constructors
- AbstractMarkDuplicatesCommandLineProgram()
-
Method Summary
Methods (Modifier and Type, Method, Description)
- static void addDuplicateReadToMetrics(htsjdk.samtools.SAMRecord rec, DuplicationMetrics metrics)
- static DuplicationMetrics addReadToLibraryMetrics(htsjdk.samtools.SAMRecord rec, htsjdk.samtools.SAMFileHeader header, LibraryIdGenerator libraryIdGenerator)
- static void addSingletonToCount(LibraryIdGenerator libraryIdGenerator)
- static void finalizeAndWriteMetrics(LibraryIdGenerator libraryIdGenerator, htsjdk.samtools.metrics.MetricsFile<DuplicationMetrics,Double> metricsFile, File outputFile)
  Writes the metrics given by the libraryIdGenerator to the outputFile.
- protected Map<String,String> getChainedPgIds(htsjdk.samtools.SAMFileHeader outputHeader)
  We have to re-chain the program groups based on this algorithm.
- protected AbstractMarkDuplicatesCommandLineProgram.SamHeaderAndIterator openInputs(boolean eagerlyDecode)
  Since this may read its inputs more than once this method does all the opening and checking of the inputs.
- static void trackOpticalDuplicates(List<? extends ReadEnds> ends, ReadEnds keeper, OpticalDuplicateFinder opticalDuplicateFinder, LibraryIdGenerator libraryIdGenerator)
  Looks through the set of reads and identifies how many of the duplicates are in fact optical duplicates, and stores the data in the instance level histogram.
-
Methods inherited from class picard.sam.markduplicates.util.AbstractOpticalDuplicateFinderCommandLineProgram
customCommandLineValidation, setupOpticalDuplicateFinder
-
Methods inherited from class picard.cmdline.CommandLineProgram
checkRInstallation, doWork, getCommandLine, getCommandLineParser, getCommandLineParserForArgs, getDefaultHeaders, getFaqLink, getMetricsFile, getPGRecord, getStandardUsagePreamble, getStandardUsagePreamble, getVersion, hasWebDocumentation, instanceMain, instanceMainWithExit, makeReferenceArgumentCollection, parseArgs, requiresReference, setDefaultHeaders, useLegacyParser
-
-
-
-
Field Detail
-
pgTagArgumentCollection
@ArgumentCollection protected final PGTagArgumentCollection pgTagArgumentCollection
-
INPUT
@Argument(shortName="I", doc="One or more input SAM, BAM or CRAM files to analyze. Must be coordinate sorted.") public List<String> INPUT
-
OUTPUT
@Argument(shortName="O", doc="The output file to write marked records to") public File OUTPUT
-
METRICS_FILE
@Argument(shortName="M", doc="File to write duplication metrics to") public File METRICS_FILE
-
REMOVE_DUPLICATES
@Argument(doc="If true do not write duplicates to the output file instead of writing them with appropriate flags set.") public boolean REMOVE_DUPLICATES
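The effect of REMOVE_DUPLICATES on the output can be illustrated with a small sketch. It does not use Picard or htsjdk: `Read` and `selectForOutput` are hypothetical stand-ins invented for this example, and only the duplicate flag bit (0x400 in the SAM flag field) is taken from the SAM format. The real program writes full SAM records and does considerably more work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveDuplicatesSketch {
    /** SAM flag bit 0x400: the read is a PCR or optical duplicate. */
    static final int DUPLICATE_FLAG = 0x400;

    /** Hypothetical stand-in for a SAM record; only name and flags are modeled. */
    static final class Read {
        final String name;
        final int flags;

        Read(String name, int flags) {
            this.name = name;
            this.flags = flags;
        }

        boolean isDuplicate() {
            return (flags & DUPLICATE_FLAG) != 0;
        }
    }

    /**
     * Returns the reads that would be written out. With removeDuplicates=false every
     * read is kept (duplicates stay flagged); with removeDuplicates=true flagged
     * reads are dropped.
     */
    static List<Read> selectForOutput(List<Read> markedReads, boolean removeDuplicates) {
        List<Read> out = new ArrayList<>();
        for (Read read : markedReads) {
            if (removeDuplicates && read.isDuplicate()) {
                continue; // skip the duplicate instead of writing it
            }
            out.add(read);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Read> reads = Arrays.asList(new Read("r1", 0), new Read("r2", DUPLICATE_FLAG));
        System.out.println("kept without removal: " + selectForOutput(reads, false).size());
        System.out.println("kept with removal: " + selectForOutput(reads, true).size());
    }
}
```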
-
ASSUME_SORTED
@Deprecated @Argument(shortName="AS", doc="If true, assume that the input file is coordinate sorted even if the header says otherwise. Deprecated, used ASSUME_SORT_ORDER=coordinate instead.", mutex="ASSUME_SORT_ORDER") public boolean ASSUME_SORTED
Deprecated.
-
ASSUME_SORT_ORDER
@Argument(shortName="ASO", doc="If not null, assume that the input file has this order even if the header says otherwise.", optional=true, mutex="ASSUME_SORTED") public htsjdk.samtools.SAMFileHeader.SortOrder ASSUME_SORT_ORDER
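Because ASSUME_SORTED and ASSUME_SORT_ORDER are declared mutually exclusive, one plausible way to resolve the sort order a program should act on is sketched below. This is an illustration under assumptions, not Picard's code: the `SortOrder` enum is a local stand-in for `htsjdk.samtools.SAMFileHeader.SortOrder`, and `effectiveSortOrder` is a hypothetical helper. In Picard itself the mutual exclusion is enforced by the argument parser rather than by code like this.

```java
final class SortOrderSketch {
    /** Local stand-in for htsjdk.samtools.SAMFileHeader.SortOrder. */
    enum SortOrder { unsorted, queryname, coordinate, duplicate, unknown }

    /**
     * Hypothetical helper: picks the sort order to trust.
     * An explicit ASSUME_SORT_ORDER wins; the deprecated ASSUME_SORTED=true is
     * treated as ASSUME_SORT_ORDER=coordinate; otherwise the header value is used.
     */
    static SortOrder effectiveSortOrder(SortOrder headerOrder,
                                        boolean assumeSorted,
                                        SortOrder assumeSortOrder) {
        if (assumeSorted && assumeSortOrder != null) {
            throw new IllegalArgumentException(
                "ASSUME_SORTED and ASSUME_SORT_ORDER are mutually exclusive");
        }
        if (assumeSortOrder != null) {
            return assumeSortOrder;
        }
        if (assumeSorted) {
            return SortOrder.coordinate;
        }
        return headerOrder;
    }

    public static void main(String[] args) {
        System.out.println(effectiveSortOrder(SortOrder.unsorted, false, SortOrder.queryname));
        System.out.println(effectiveSortOrder(SortOrder.unsorted, true, null));
        System.out.println(effectiveSortOrder(SortOrder.coordinate, false, null));
    }
}
```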
-
DUPLICATE_SCORING_STRATEGY
@Argument(shortName="DS", doc="The scoring strategy for choosing the non-duplicate among candidates.") public htsjdk.samtools.DuplicateScoringStrategy.ScoringStrategy DUPLICATE_SCORING_STRATEGY
-
PROGRAM_RECORD_ID
@Argument(shortName="PG", doc="The program record ID for the @PG record(s) created by this program. Set to null to disable PG record creation. This string may have a suffix appended to avoid collision with other program record IDs.", optional=true) public String PROGRAM_RECORD_ID
-
PROGRAM_GROUP_VERSION
@Argument(shortName="PG_VERSION", doc="Value of VN tag of PG record to be created. If not specified, the version will be detected automatically.", optional=true) public String PROGRAM_GROUP_VERSION
-
PROGRAM_GROUP_COMMAND_LINE
@Argument(shortName="PG_COMMAND", doc="Value of CL tag of PG record to be created. If not supplied the command line will be detected automatically.", optional=true) public String PROGRAM_GROUP_COMMAND_LINE
-
PROGRAM_GROUP_NAME
@Argument(shortName="PG_NAME", doc="Value of PN tag of PG record to be created.") public String PROGRAM_GROUP_NAME
-
COMMENT
@Argument(shortName="CO", doc="Comment(s) to include in the output file\'s header.", optional=true) public List<String> COMMENT
-
-
Method Detail
-
getChainedPgIds
protected Map<String,String> getChainedPgIds(htsjdk.samtools.SAMFileHeader outputHeader)
We have to re-chain the program groups based on this algorithm. This returns the map from existing program group ID to new program group ID.
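The shape of the returned map can be sketched as follows. This is a simplified illustration, not the Picard implementation: it assumes that each program group ID seen in the input gets its own new ID derived from PROGRAM_RECORD_ID, and that collisions are avoided by appending a numeric suffix. The `.1`, `.2` suffix scheme and the names `nonCollidingId` and `chain` are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class PgChainingSketch {
    /** Returns base, or base plus a numeric suffix, so that the result is not in taken. */
    static String nonCollidingId(String base, Set<String> taken) {
        String candidate = base;
        int suffix = 1;
        while (taken.contains(candidate)) {
            candidate = base + "." + suffix++;
        }
        taken.add(candidate); // reserve the ID so later calls cannot reuse it
        return candidate;
    }

    /**
     * Builds a map from each existing program group ID to a new, unique ID.
     * A record carrying the existing ID would then be re-pointed at the new one.
     */
    static Map<String, String> chain(Collection<String> pgIdsSeen,
                                     Collection<String> idsAlreadyInHeader,
                                     String programRecordId) {
        Set<String> taken = new HashSet<>(idsAlreadyInHeader);
        Map<String, String> chained = new LinkedHashMap<>();
        for (String existingId : pgIdsSeen) {
            chained.put(existingId, nonCollidingId(programRecordId, taken));
        }
        return chained;
    }

    public static void main(String[] args) {
        Map<String, String> chained = chain(
            Arrays.asList("bwa", "MarkDuplicates"),
            Arrays.asList("bwa", "MarkDuplicates"),
            "MarkDuplicates");
        System.out.println(chained);
    }
}
```

In this sketch the input already contains a `MarkDuplicates` program group, so both new IDs receive a suffix, which matches the PROGRAM_RECORD_ID note that a suffix may be appended to avoid collisions.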
-
finalizeAndWriteMetrics
public static void finalizeAndWriteMetrics(LibraryIdGenerator libraryIdGenerator, htsjdk.samtools.metrics.MetricsFile<DuplicationMetrics,Double> metricsFile, File outputFile)
Writes the metrics given by the libraryIdGenerator to the outputFile.
- Parameters:
libraryIdGenerator - A LibraryIdGenerator object that contains the map from library to DuplicationMetrics for that library
metricsFile - An empty MetricsFile object that will be filled with "finalized" metrics and written out. It needs to be generated from a non-static context so that various command-line information is added to the header when CommandLineProgram.getMetricsFile() is called.
outputFile - The file to write the metrics to
-
addReadToLibraryMetrics
public static DuplicationMetrics addReadToLibraryMetrics(htsjdk.samtools.SAMRecord rec, htsjdk.samtools.SAMFileHeader header, LibraryIdGenerator libraryIdGenerator)
-
addDuplicateReadToMetrics
public static void addDuplicateReadToMetrics(htsjdk.samtools.SAMRecord rec, DuplicationMetrics metrics)
-
openInputs
protected AbstractMarkDuplicatesCommandLineProgram.SamHeaderAndIterator openInputs(boolean eagerlyDecode)
Since this may read its inputs more than once this method does all the opening and checking of the inputs.
-
trackOpticalDuplicates
public static void trackOpticalDuplicates(List<? extends ReadEnds> ends, ReadEnds keeper, OpticalDuplicateFinder opticalDuplicateFinder, LibraryIdGenerator libraryIdGenerator)
Looks through the set of reads and identifies how many of the duplicates are in fact optical duplicates, and stores the data in the instance level histogram. Additionally sets the transient isOpticalDuplicate flag on each read end that is identified as an optical duplicate.
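The idea of flagging optical duplicates relative to a kept read can be sketched as below. This is a deliberately simplified model and not the OpticalDuplicateFinder algorithm: it assumes a duplicate counts as optical when it lies on the same tile as the keeper and within a pixel distance on both axes. The `End` class, its fields, and `flagOpticalDuplicates` are hypothetical stand-ins for ReadEnds and the real method, and the per-library histogram update is omitted.

```java
import java.util.Arrays;
import java.util.List;

final class OpticalDuplicateSketch {
    /** Hypothetical stand-in for a read end with a physical location on the flowcell. */
    static final class End {
        final int tile;
        final int x;
        final int y;
        boolean isOpticalDuplicate; // set when the end is flagged as optical

        End(int tile, int x, int y) {
            this.tile = tile;
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Flags every end (other than the keeper) that is on the keeper's tile and within
     * maxPixelDistance of it on both axes. Returns the number of ends flagged.
     */
    static int flagOpticalDuplicates(List<End> ends, End keeper, int maxPixelDistance) {
        int flagged = 0;
        for (End end : ends) {
            if (end == keeper) {
                continue; // the keeper itself is never a duplicate
            }
            boolean close = end.tile == keeper.tile
                && Math.abs(end.x - keeper.x) <= maxPixelDistance
                && Math.abs(end.y - keeper.y) <= maxPixelDistance;
            if (close) {
                end.isOpticalDuplicate = true;
                flagged++;
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        End keeper = new End(1, 1000, 1000);
        End near = new End(1, 1050, 1020);
        End far = new End(1, 5000, 1000);
        End otherTile = new End(2, 1000, 1000);
        int flagged = flagOpticalDuplicates(Arrays.asList(keeper, near, far, otherTile), keeper, 100);
        System.out.println("optical duplicates flagged: " + flagged);
    }
}
```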
-
addSingletonToCount
public static void addSingletonToCount(LibraryIdGenerator libraryIdGenerator)
-
-