Tools¶
bam_utils¶
-
class
mg_common.tool.bam_utils.
bamUtils
[source]¶ Tool for handling bam files
-
static
bam_copy
(bam_in, bam_out)[source]¶ Wrapper function to copy from one bam file to another
Parameters: - bam_in (str) – Location of the input bam file
- bam_out (str) – Location of the output bam file
-
static
bam_count_reads
(bam_file, aligned=False)[source]¶ Wrapper to count the number of (aligned) reads in a bam file
-
static
bam_filter
(bam_file, bam_file_out, filter_name)[source]¶ Wrapper for filtering out reads from a bam file
Parameters: - bam_file (str) –
- bam_file_out (str) –
- filter_name (str) –
- One of:
- duplicate - Read is PCR or optical duplicate (1024)
- unmapped - Read is unmapped or not the primary alignment (260)
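The two filter values correspond to SAM flag bit masks: 1024 is the duplicate bit, and 260 is the combination of the unmapped bit (4) and the not-primary-alignment bit (256). The following is a minimal sketch of how such a mask can be tested against a read's flag. The names `FILTER_MASKS`, `is_filtered_out` and `keep_reads` are hypothetical and are not part of `bamUtils`; how `bam_filter` applies the mask internally is not shown in this documentation.

```python
# SAM flag bits, as defined in the SAM specification.
FLAG_UNMAPPED = 0x4       # 4: read is unmapped
FLAG_SECONDARY = 0x100    # 256: not the primary alignment
FLAG_DUPLICATE = 0x400    # 1024: PCR or optical duplicate

# Hypothetical lookup table mirroring the documented filter names.
FILTER_MASKS = {
    "duplicate": FLAG_DUPLICATE,                 # 1024
    "unmapped": FLAG_UNMAPPED | FLAG_SECONDARY,  # 4 + 256 = 260
}


def is_filtered_out(flag, filter_name):
    """Return True if any bit of the filter mask is set in the read's flag."""
    return bool(flag & FILTER_MASKS[filter_name])


def keep_reads(flags, filter_name):
    """Return only the flags of reads that survive the named filter."""
    return [flag for flag in flags if not is_filtered_out(flag, filter_name)]
```

A read is excluded as soon as one bit of the mask is set, which is why the "unmapped" filter removes both unmapped reads and secondary alignments.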
-
static
bam_index
(bam_file, bam_idx_file)[source]¶ Wrapper for the pysam SAMtools index function
Parameters: - bam_file (str) – Location of the bam file that is to be indexed
- bam_idx_file (str) – Location of the bam index file (.bai)
-
static
bam_list_chromosomes
(bam_file)[source]¶ Wrapper to list the chromosome names that are present within the bam file
Parameters: bam_file (str) – Location of the bam file
Returns: List of the names of the chromosomes that are present in the bam file
Return type: list
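In the SAM/BAM header, each reference sequence is declared on an `@SQ` line whose `SN` field holds its name. The sketch below extracts those names from header text. It assumes the chromosome list is taken from the header, which this documentation does not confirm, and `reference_names` is a hypothetical helper rather than part of `bamUtils`.

```python
def reference_names(header_text):
    """Collect the SN (sequence name) field from every @SQ header line."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs after the record type.
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "@SQ\tSN:chr2\tLN:500"
)
chromosomes = reference_names(example_header)
```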
-
static
bam_merge
(*args)[source]¶ Wrapper for the pysam SAMtools merge function
Parameters: - bam_file_1 (str) – Location of the bam file to merge into
- bam_file_2 (str) – Location of the bam file that is to get merged into bam_file_1
-
static
bam_sort
(bam_file)[source]¶ Wrapper for the pysam SAMtools sort function
Parameters: bam_file (str) – Location of the bam file to sort
-
static
bam_split
(bam_file_in, bai_file, chromosome, bam_file_out)[source]¶ Wrapper to extract a single chromosome's worth of reads into a new bam file
Parameters: - bam_file_in (str) – Location of the input bam file
- bai_file (str) – Location of the bam index file. This needs to be in the same directory as bam_file_in
- chromosome (str) – Name of the chromosome whose alignments are to be extracted
- bam_file_out (str) – Location of the output bam file
-
static
bam_stats
(bam_file)[source]¶ Wrapper for the pysam SAMtools flagstat function
Parameters: bam_file (str) – Location of the bam file
Returns: list –
qc_passed : int
qc_failed : int
description : str
Return type: dict
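The `qc_passed`, `qc_failed` and `description` fields match the layout of a samtools flagstat report, where each line reads `<passed> + <failed> <description>`. The following sketch shows one way such text could be turned into a list of dictionaries. It assumes that layout and is not necessarily how `bam_stats` is implemented; `parse_flagstat` and the sample text are hypothetical.

```python
import re

# One flagstat line: "<qc passed> + <qc failed> <description>"
FLAGSTAT_LINE = re.compile(r"^(\d+) \+ (\d+) (.+)$")


def parse_flagstat(text):
    """Convert flagstat-style text into a list of dicts, one per matching line."""
    results = []
    for line in text.splitlines():
        match = FLAGSTAT_LINE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognised lines
        results.append({
            "qc_passed": int(match.group(1)),
            "qc_failed": int(match.group(2)),
            "description": match.group(3),
        })
    return results


# Illustrative input only; the numbers are made up for the example.
example = """2000 + 10 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
1900 + 5 mapped (95.00% : 50.00%)"""
stats = parse_flagstat(example)
```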
@Task Helper Functions¶
The following are helper functions for bam_utils so that the functions can operate on tasks where the files are in COMPSs but have not been returned to the user's workspace.
-
class
mg_common.tool.bam_utils.
bamUtils
[source] Tool for handling bam files
-
static
bam_copy
(bam_in, bam_out)[source] Wrapper function to copy from one bam file to another
Parameters: - bam_in (str) – Location of the input bam file
- bam_out (str) – Location of the output bam file
-
static
bam_count_reads
(bam_file, aligned=False)[source] Wrapper to count the number of (aligned) reads in a bam file
-
static
bam_filter
(bam_file, bam_file_out, filter_name)[source] Wrapper for filtering out reads from a bam file
Parameters: - bam_file (str) –
- bam_file_out (str) –
- filter_name (str) –
- One of:
- duplicate - Read is PCR or optical duplicate (1024)
- unmapped - Read is unmapped or not the primary alignment (260)
-
static
bam_index
(bam_file, bam_idx_file)[source] Wrapper for the pysam SAMtools index function
Parameters: - bam_file (str) – Location of the bam file that is to be indexed
- bam_idx_file (str) – Location of the bam index file (.bai)
-
static
bam_list_chromosomes
(bam_file)[source] Wrapper to list the chromosome names that are present within the bam file
Parameters: bam_file (str) – Location of the bam file
Returns: List of the names of the chromosomes that are present in the bam file
Return type: list
-
static
bam_merge
(*args)[source] Wrapper for the pysam SAMtools merge function
Parameters: - bam_file_1 (str) – Location of the bam file to merge into
- bam_file_2 (str) – Location of the bam file that is to get merged into bam_file_1
-
static
bam_paired_reads
(bam_file)[source] Wrapper to test if a bam file contains paired end reads
-
static
bam_sort
(bam_file)[source] Wrapper for the pysam SAMtools sort function
Parameters: bam_file (str) – Location of the bam file to sort
-
static
bam_split
(bam_file_in, bai_file, chromosome, bam_file_out)[source] Wrapper to extract a single chromosome's worth of reads into a new bam file
Parameters: - bam_file_in (str) – Location of the input bam file
- bai_file (str) – Location of the bam index file. This needs to be in the same directory as bam_file_in
- chromosome (str) – Name of the chromosome whose alignments are to be extracted
- bam_file_out (str) – Location of the output bam file
-
static
bam_stats
(bam_file)[source] Wrapper for the pysam SAMtools flagstat function
Parameters: bam_file (str) – Location of the bam file
Returns: list –
qc_passed : int
qc_failed : int
description : str
Return type: dict
-
static
check_header
(bam_file)[source] Wrapper for pysam SAMtools that checks whether a bam file is sorted
Returns: bool – True if the file has been sorted
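The SAM specification records the sort order in the `SO` field of the `@HD` header line, with the values `unknown`, `unsorted`, `queryname` and `coordinate`. The sketch below reads that field from header text. It assumes that "sorted" means coordinate-sorted, which this documentation does not state, and both function names are hypothetical rather than part of `bamUtils`.

```python
def header_sort_order(header_text):
    """Return the SO value from the @HD line, or None if it is absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


def looks_coordinate_sorted(header_text):
    """Assumption: treat only SO:coordinate as 'sorted'."""
    return header_sort_order(header_text) == "coordinate"
```

Note that the header only reports what the file claims; it does not verify that the reads are actually in that order.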
-
static
sam_to_bam
(sam_file, bam_file)[source] Function for converting sam files to bam files
common¶
-
class
mg_common.tool.common.
common
[source]¶ Common functions that can be used generically across tools and pipelines
-
static
to_output_file
(input_file, output_file, empty=True)[source]¶ When handling output files within the @task function, results should be copied into the correct output files by reading from the source and writing to the destination rather than by renaming.
In cases where there is a known set of output files, if the input file is missing then a blank file should be created and handled by the run() function of the tool. If an empty file should not be created then the empty parameter should be set to False.
Parameters: - input_file (str) – Location of the input file
- output_file (str) – Location of the output file
- empty (bool) – In cases where the input_file is missing an empty output_file is created. Should be set to False if no file should be created.
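The behaviour described above can be sketched with the standard library alone. This is an illustration under stated assumptions and not the mg_common implementation: the name `to_output_file_sketch` is hypothetical, and the boolean return value is an assumption, since the documentation does not say what the function returns.

```python
import os


def to_output_file_sketch(input_file, output_file, empty=True):
    """Copy by reading and writing (never renaming); optionally create a blank file.

    Returns True if output_file was written, False otherwise (assumed convention).
    """
    if os.path.isfile(input_file):
        # Read from the source and write to the destination, leaving the source in place.
        with open(input_file, "rb") as f_in, open(output_file, "wb") as f_out:
            f_out.write(f_in.read())
        return True

    if empty:
        # Input is missing: create a blank placeholder for the tool's run() to handle.
        with open(output_file, "w"):
            pass
        return True

    return False
```

With `empty=False` and a missing input, nothing is written, so the caller can detect the absent output file itself.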