Usage

Installation

To use fake-vcf, first clone the repository and install it:

git clone https://github.com/endast/fake-vcf.git
cd fake-vcf
make poetry-download
make install

Running

By default, fake-vcf writes to stdout:

poetry run fake-vcf generate -s 2 -r 2
##fileformat=VCFv4.2
##source=VCFake 0.2.2
##FILTER=<ID=PASS,Description="All filters passed">
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##contig=<ID=chr1>
##reference=ftp://ftp.example.com/sample.fa
##INFO=<ID=AF,Number=A,Type=Float,Description="Estimated allele frequency in the range (0,1)">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Phased Genotype">
#CHROM        POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  S0000001        S0000002
chr1  63      rs143   C       A       96      PASS    DP=10;AF=0.5;NS=2       GT      0|0     0|0
chr1  71      rs31    A       T       37      PASS    DP=10;AF=0.5;NS=2       GT      0|0     0|0
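
The output above is ordinary VCF text: meta lines start with ##, a single column header line starts with #CHROM, and each following line is one variant. As a rough illustration, here is a minimal Python sketch that splits such text into those three parts. The function name parse_vcf_text and the embedded sample are hypothetical, the sample assumes tab-separated columns as in standard VCF, and the sketch is not part of fake-vcf itself.

```python
# Sketch only: parse_vcf_text is a hypothetical helper, not part of fake-vcf.
# The sample mirrors the output shown above, assuming tab-separated columns.
SAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "##contig=<ID=chr1>",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "S0000001", "S0000002"]),
    "\t".join(["chr1", "63", "rs143", "C", "A", "96", "PASS",
               "DP=10;AF=0.5;NS=2", "GT", "0|0", "0|0"]),
    "\t".join(["chr1", "71", "rs31", "A", "T", "37", "PASS",
               "DP=10;AF=0.5;NS=2", "GT", "0|0", "0|0"]),
])


def parse_vcf_text(text):
    """Split VCF text into meta lines, sample names, and variant records."""
    meta, samples, records = [], [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("##"):
            meta.append(line)  # meta-information line
        elif line.startswith("#"):
            # Column header: the first nine columns are fixed, the rest are samples.
            samples = line.lstrip("#").split("\t")[9:]
        else:
            fields = line.split("\t")
            info = {}
            for item in fields[7].split(";"):
                key, _, value = item.partition("=")  # values are kept as strings
                info[key] = value
            records.append({
                "chrom": fields[0],
                "pos": int(fields[1]),
                "id": fields[2],
                "ref": fields[3],
                "alt": fields[4],
                "info": info,
                "genotypes": dict(zip(samples, fields[9:])),
            })
    return meta, samples, records


meta, samples, records = parse_vcf_text(SAMPLE_VCF)
print(samples)
print(records[0]["pos"], records[0]["genotypes"])
```

For real analysis a dedicated VCF library is usually a better choice; this sketch only shows how the columns line up.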

You can write to a VCF file by redirecting the output:

poetry run fake-vcf generate -s 2 -r 2 > fake_file.vcf
ls -lah
total 1
-rw-r--r--   1 magnus  staff   682B Jul 28 16:48 fake_file.vcf

Or let the script write to a file directly using -o:

poetry run fake-vcf generate -s 2 -r 2 -o fake_file.vcf

Writing to file fake_file.vcf
(No compression)
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 50942.96it/s]
Done, data written to fake_file.vcf
ls -lah
total 1
-rw-r--r--   1 magnus  staff   682B Jul 28 16:48 fake_file.vcf

If you want the file gzipped, add .gz to the file name:

poetry run fake-vcf generate -s 2 -r 2 -o fake_file.vcf.gz

Writing to file fake_file.vcf.gz
(No compression)
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 50942.96it/s]
Done, data written to fake_file.vcf.gz
ls -lah
total 2
-rw-r--r--   1 magnus  staff   682B Jul 28 16:56 fake_file.vcf
-rw-r--r--   1 magnus  staff   436B Jul 28 16:57 fake_file.vcf.gz
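
Since the output may be plain or gzipped depending on the file name, downstream code often needs to handle both. The following is a minimal Python sketch, assuming the .gz file is readable with the standard gzip module; open_vcf and count_variants are hypothetical helper names, not part of fake-vcf.

```python
import gzip
from pathlib import Path


def open_vcf(path):
    """Open a VCF file for text reading, using gzip when the name ends in .gz."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8")
    return path.open("r", encoding="utf-8")


def count_variants(path):
    """Count data lines, that is, non-empty lines that do not start with '#'."""
    with open_vcf(path) as handle:
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))
```

With the files from the example above, count_variants("fake_file.vcf") and count_variants("fake_file.vcf.gz") should both report the number of rows requested with -r.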

To see all options, use --help:

 Usage: fake-vcf generate [OPTIONS]

 Generate fake VCF data
 Args:     fake_vcf_path (Path): Path to fake VCF file or None to write to standard output.     num_rows (int): Number of rows.     num_samples (int): Number of samples.     chromosome (str): Chromosome identifier.     seed (int): Random seed for reproducibility.     sample_prefix (str): Prefix for sample
 names.     phased (bool): Simulate phased genotypes.     large_format (bool): Write large format VCF.     print_version (bool): Flag to print the version of the fake-vcf package.     reference_dir (Path): Path to directory containing imported reference_data.

╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --fake_vcf_path       -o                       PATH     Path to fake vcf file. If the path ends with .gz the file will be gzipped. [default: None]                                                                                                                                                                 │
│ --num_rows            -r                       INTEGER  Nr rows to generate (variants) [default: 10]                                                                                                                                                                                                               │
│ --num_samples         -s                       INTEGER  Nr of num_samples to generate. [default: 10]                                                                                                                                                                                                               │
│ --chromosome          -c                       TEXT     chromosome default chr1 [default: chr1]                                                                                                                                                                                                                    │
│ --seed                                         INTEGER  Random seed to use, default none. [default: None]                                                                                                                                                                                                          │
│ --sample_prefix       -p                       TEXT     Sample prefix ex: SAM =>  SAM0000001    SAM0000002 [default: S]                                                                                                                                                                                            │
│ --phased                  --no-phased                   Simulate phased [default: phased]                                                                                                                                                                                                                          │
│ --large-format            --no-large-format             Write large format vcf [default: large-format]                                                                                                                                                                                                             │
│ --version             -v                                Prints the version of the fake-vcf package.                                                                                                                                                                                                                │
│ --reference-dir-path  -f                       PATH     Path to imported refernce directory. [default: None]                                                                                                                                                                                                       │
│ --help                                                  Show this message and exit.                                                                                                                                                                                                                                │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
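
When fake-vcf is driven from a script, the options listed above can be assembled into an argument list. The sketch below is one possible way to do that in Python; build_generate_command is a hypothetical helper, and it only uses flags that appear in the help text (-r, -s, -c, --seed, -o). It assumes the command is launched through poetry run, as in the examples above.

```python
import shlex


def build_generate_command(num_rows=10, num_samples=10, chromosome="chr1",
                           seed=None, output=None):
    """Build the argument list for `fake-vcf generate` (hypothetical helper)."""
    command = [
        "poetry", "run", "fake-vcf", "generate",
        "-r", str(num_rows),      # number of variant rows
        "-s", str(num_samples),   # number of samples
        "-c", chromosome,         # chromosome identifier
    ]
    if seed is not None:
        command += ["--seed", str(seed)]  # fixed seed for reproducible output
    if output is not None:
        command += ["-o", str(output)]    # a name ending in .gz requests gzip
    return command


command = build_generate_command(num_rows=2, num_samples=2, seed=42,
                                 output="fake_file.vcf.gz")
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) from the repository directory. Passing a list rather than a single shell string avoids quoting problems with unusual file names.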