Genomics in Practice 02: Software Installation and GATK Data Download

Download the GATK genomics data

FTP

/hc/en-us/articles/360035890811-Resource-bundle

Too slow from China (23.0K/s)

# install lftp
sudo apt -y install lftp
# log in to the FTP server; there is no password (just press Enter)
lftp ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/
# at the lftp prompt, download the entire hg38 directory
mirror hg38
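
At 23.0K/s an interrupted transfer is costly, so it can help to run the mirror non-interactively with resume enabled. The following is a minimal sketch, not part of the original walkthrough: the function name mirror_bundle, the DRY_RUN switch and the default of 4 parallel transfers are hypothetical choices for illustration. It relies on lftp's -e option and on the --continue and --parallel options of mirror. lftp may still ask for a password; press Enter as in the interactive session.

```shell
# Build (and optionally run) a non-interactive lftp mirror command.
# Hypothetical helper: set DRY_RUN=1 to print the command instead of
# contacting the server.
mirror_bundle() {
    remote_dir="$1"        # directory to mirror, e.g. hg38
    parallel="${2:-4}"     # assumed default: 4 parallel transfers
    url="ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/"
    # --continue resumes partially downloaded files after an interruption
    script="mirror --continue --parallel=$parallel $remote_dir; quit"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "lftp -e \"$script\" $url"
    else
        lftp -e "$script" "$url"
    fi
}
```

Running DRY_RUN=1 mirror_bundle hg38 prints the command for inspection; mirror_bundle hg38 starts the download, and running it again after a dropped connection picks up where it stopped.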

Google Cloud

Much faster (35M/s)

https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0/

# create a dedicated environment for gsutil
micromamba create -n gsutil
micromamba activate gsutil
micromamba install -y -c conda-forge python=3.4 gsutil
# download into a local GATK resource directory
mkdir -p ~/DataHub/Genomics/GATK
cd ~/DataHub/Genomics/GATK
gsutil -m cp -r \
 "gs://genomics-public-data/resources/broad/hg38/v0/1000G.phase3.integrated.sites_only.no_MATCHED_REV.hg38.vcf" \
 "gs://genomics-public-data/resources/broad/hg38/v0/1000G.phase3.integrated.sites_only.no_MATCHED_REV.hg38.vcf.idx" \
 "gs://genomics-public-data/resources/broad/hg38/v0/1000G_omni2.5.hg38.vcf.gz" \
 "gs://genomics-public-data/resources/broad/hg38/v0/1000G_omni2.5.hg38.vcf.gz.tbi" \
 "gs://genomics-public-data/resources/broad/hg38/v0/1000G_phase1.snps.high_confidence.hg38.vcf.gz" \
 "gs://genomics-public-data/resources/broad/hg38/v0/1000G_phase1.snps.high_confidence.hg38.vcf.gz.tbi" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Axiom_Exome_Plus.genotypes.all_populations.poly.hg38.vcf.gz" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Axiom_Exome_Plus.genotypes.all_populations.poly.hg38.vcf.gz.tbi" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf.idx" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.dict" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta.64.alt" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta.64.amb" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta.64.ann" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta.64.bwt" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta.64.pac" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta.64.sa" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta.fai" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.known_indels.vcf.gz" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.known_indels.vcf.gz.tbi" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz" \
 "gs://genomics-public-data/resources/broad/hg38/v0/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz.tbi" \
 "gs://genomics-public-data/resources/broad/hg38/v0/hapmap_3.3.hg38.vcf.gz" \
 "gs://genomics-public-data/resources/broad/hg38/v0/hapmap_3.3.hg38.vcf.gz.tbi" \
 "gs://genomics-public-data/resources/broad/hg38/v0/scattered_calling_intervals" \
 "gs://genomics-public-data/resources/broad/hg38/v0/wgs_calling_regions.hg38.interval_list" \
 .
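
After a large multi-file copy it is worth confirming that every VCF arrived together with its index. The sketch below assumes the pairing visible in the file names above (*.vcf.gz with *.vcf.gz.tbi, and *.vcf with *.vcf.idx); the function name missing_vcf_indexes is a hypothetical helper, not part of gsutil or GATK.

```shell
# Report VCF files in a directory whose companion index is absent or empty.
# Assumption: *.vcf.gz pairs with *.vcf.gz.tbi and *.vcf pairs with *.vcf.idx.
missing_vcf_indexes() {
    dir="${1:-.}"
    for vcf in "$dir"/*.vcf.gz "$dir"/*.vcf; do
        [ -e "$vcf" ] || continue      # skip glob patterns that matched nothing
        case "$vcf" in
            *.vcf.gz) idx="$vcf.tbi" ;;
            *)        idx="$vcf.idx" ;;
        esac
        # -s is true only when the index exists and is not empty
        [ -s "$idx" ] || echo "missing index: $idx"
    done
}
```

Calling missing_vcf_indexes ~/DataHub/Genomics/GATK prints one line per missing index and nothing when every pair is complete.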

BWA index files (plus the reference FASTA and its sequence dictionary)
Homo_sapiens_assembly38.fasta
Homo_sapiens_assembly38.fasta.64.amb
Homo_sapiens_assembly38.fasta.64.ann
Homo_sapiens_assembly38.fasta.64.bwt
Homo_sapiens_assembly38.fasta.64.pac
Homo_sapiens_assembly38.fasta.64.sa
Homo_sapiens_assembly38.dict
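
Alignment fails if any index file is missing, so a quick completeness check before mapping saves time. This is a minimal sketch assuming the five .64.* suffixes listed above; bwa_index_complete is a hypothetical helper name, not a BWA command.

```shell
# Return 0 when all five BWA index files exist (non-empty) for a FASTA prefix.
# Assumption: the index uses the ".64." infix shown in the list above.
bwa_index_complete() {
    prefix="$1"
    status=0
    for suffix in amb ann bwt pac sa; do
        if [ ! -s "$prefix.64.$suffix" ]; then
            echo "missing or empty: $prefix.64.$suffix" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, bwa_index_complete ~/DataHub/Genomics/GATK/Homo_sapiens_assembly38.fasta exits with status 0 only when the full index is present, so it can guard an alignment step in a script.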

Prepare the environment

Python 2
micromamba create -n dna2 python=2
micromamba activate dna2
micromamba install -y -c bioconda bwa samtools bcftools vcftools snpeff fastqc qualimap gatk4 tabix multiqc

Python 3
micromamba create -n dna3
micromamba activate dna3
micromamba install -y -c conda-forge python=3.10 python_abi xopen
micromamba install -y -c bioconda cutadapt=4.3 trim-galore
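
Once both environments are built, a quick check that the expected executables are on PATH catches a failed or partial install early. The sketch below uses only the shell builtin command -v; check_tools is a hypothetical helper name. Executable names do not always match package names (for example, it is assumed here that gatk4 provides gatk and trim-galore provides trim_galore), so adjust the list to what is actually installed.

```shell
# Print the status of each named executable; return non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok       $tool"
        else
            echo "MISSING  $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

After micromamba activate dna2, run check_tools bwa samtools bcftools tabix; after micromamba activate dna3, run check_tools cutadapt.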
