SetQC for the ATAC-seq experiments
The goal of SetQC is to provide a detailed and useful report that allows the collaborator to:
- Get a sense of the data quality
- Explore the data for an initial analysis.
Prerequisites
- results are transferred to the data storage at the end of the libQC step.
- runFastQC.sh: FastQC for each lib
- all scripts are in ${EPIGEN_FOLDER}/bin/
The procedures
Input:
- the lib numbers to include in the set
- the set number (e.g. 4 or 4_2)
- whether the pipeline included a trim step or not
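A minimal sketch of how these inputs might be sanity-checked. It relies on the note in the steps that outputs of a trimmed run contain "trim" in their names. The function names, the set-number pattern, and the example file names are assumptions for illustration and are not taken from the actual scripts.

```python
import re

# Assumed pattern: digits with an optional "_<digits>" suffix,
# matching the two examples given above ("4" and "4_2").
SET_NO_PATTERN = re.compile(r"\d+(_\d+)?")


def is_valid_set_no(set_no: str) -> bool:
    """Return True for set numbers shaped like '4' or '4_2'."""
    return SET_NO_PATTERN.fullmatch(set_no) is not None


def has_trim_step(file_names: list[str]) -> bool:
    """Guess whether adapter trimming was run: True if any name contains 'trim'."""
    return any("trim" in name.lower() for name in file_names)


# Hypothetical file names, for illustration only.
trimmed = ["lib_101_R1.trim.fastq.gz", "lib_101_R2.trim.fastq.gz"]
untrimmed = ["lib_102_R1.fastq.gz", "lib_102_R2.fastq.gz"]

print(is_valid_set_no("4_2"), has_trim_step(trimmed), has_trim_step(untrimmed))
```

Inferring the trim setting from names is only a cross-check; passing it explicitly as an input, as listed above, stays the more reliable option.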
Steps:
- runFastQC
- runMultiQC.sh: run MultiQC for the selected libs
  - get assembled pngs and data
  - need to tell whether there was an adapter trim step (name contains "trim") or not in the processing pipeline
  - cp the results to the final report folder
- Run setBamQC: To be done
- Run setPeakQC: To be done
- Run genSetTrackJson: set up the JSON files for the signal tracks
  - cp signal tracks to the VM share folder (so the browser API can access them)
  - cat all the JSON from libQC
  - use the R code sub function to remove redundant genome tracks
- Upload the report to the VM server for sharing
- generate
- Run setQCreport to generate the html report
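The genSetTrackJson step above (concatenate the per-lib JSON, then drop redundant genome tracks) can be sketched as follows. The notes say the real pipeline does this with an R sub function; this Python version is only an illustration. The track layout (a list of objects with "type" and "name" keys) and the rule that "redundant" means an identical entry are assumptions.

```python
import json


def merge_track_lists(track_lists):
    """Concatenate per-lib track lists, keeping the first copy of each identical entry."""
    merged, seen = [], set()
    for tracks in track_lists:
        for track in tracks:
            # Serialize with sorted keys so key order does not affect equality.
            key = json.dumps(track, sort_keys=True)
            if key not in seen:
                seen.add(key)
                merged.append(track)
    return merged


# Hypothetical per-lib JSON content: one shared genome track plus one signal track each.
lib_a = json.loads('[{"type": "genome", "name": "refGene"}, {"type": "signal", "name": "lib_101"}]')
lib_b = json.loads('[{"type": "genome", "name": "refGene"}, {"type": "signal", "name": "lib_102"}]')

set_tracks = merge_track_lists([lib_a, lib_b])
print(json.dumps(set_tracks, indent=2))
```

Keeping the first occurrence preserves the original track order, so the genome track stays at the top of the set-level JSON.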
setPeakQC
Use merged peak list
- identify a merged peak set for all the libs
- get the average fold-enrichment matrix (all the libs × the merged peak locations)
- Use the matrix to calculate correlation matrix and PCA
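A minimal sketch of the first and last of these steps, assuming peaks are (chrom, start, end) tuples with half-open coordinates. The fold-enrichment values are made up for illustration, and all function names are hypothetical. PCA on the same matrix would normally be done with a numerical library and is left out here.

```python
import math


def merge_peaks(peak_lists):
    """Pool peaks from all libs and merge any that overlap or touch."""
    all_peaks = sorted(p for peaks in peak_lists for p in peaks)
    merged = []
    for chrom, start, end in all_peaks:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous merged peak instead of starting a new one.
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(profiles):
    """Lib-by-lib correlation matrix; each profile is one lib's values over the merged peaks."""
    return [[pearson(a, b) for b in profiles] for a in profiles]


# Hypothetical peak calls for two libs.
lib1 = [("chr1", 100, 200), ("chr1", 500, 600)]
lib2 = [("chr1", 150, 250), ("chr2", 10, 50)]
merged = merge_peaks([lib1, lib2])

# Made-up average fold enrichment per merged peak, one row per lib.
fold = [[4.0, 6.0, 1.0], [2.0, 3.0, 0.5], [1.0, 6.0, 4.0]]
corr = correlation_matrix(fold)
print(merged)
print(corr)
```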
Paired peak overlapping fraction matrix
Essentially functions as a Venn diagram for the peaks:
- for each lib pair, get the overlapping peak set
- define overlapping (variable: one base pair, or a threshold on the overlapping fraction)
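A minimal sketch of such a matrix, assuming entry (i, j) is the fraction of lib i's peaks that overlap at least one peak of lib j. The min_bp and min_frac parameters stand in for the two overlap definitions mentioned above. The names and the example peaks are hypothetical.

```python
def overlap_bp(a, b):
    """Overlapping base pairs between two (chrom, start, end) half-open peaks."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def overlaps(a, b, min_bp=1, min_frac=0.0):
    """True if a and b share at least min_bp bases and min_frac of a's length."""
    bp = overlap_bp(a, b)
    return bp >= min_bp and bp >= min_frac * (a[2] - a[1])


def overlap_fraction_matrix(peak_sets, min_bp=1, min_frac=0.0):
    """Entry [i][j] is the fraction of peaks in set i overlapping any peak in set j."""
    matrix = []
    for peaks_i in peak_sets:
        row = []
        for peaks_j in peak_sets:
            hits = sum(
                any(overlaps(p, q, min_bp, min_frac) for q in peaks_j)
                for p in peaks_i
            )
            row.append(hits / len(peaks_i) if peaks_i else 0.0)
        matrix.append(row)
    return matrix


# Hypothetical peak calls for two libs.
lib1 = [("chr1", 100, 200), ("chr1", 500, 600)]
lib2 = [("chr1", 150, 250), ("chr2", 10, 50)]

one_bp = overlap_fraction_matrix([lib1, lib2])
strict = overlap_fraction_matrix([lib1, lib2], min_frac=0.6)
print(one_bp)
print(strict)
```

The all-against-all comparison is quadratic in the number of peaks, which is fine for a sketch. Real peak sets would likely need sorted intervals or an interval-aware tool.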
setQCReport
- Modified the default template to allow browsing on large screens (need to replace the old template in the lib folder)
- the track iframe can either be set to a fixed width or be updated to the Bootstrap responsive layout, but the max width will be restricted to the value defined in the template.
- [ ] use flexdashboard instead of the TOC structure for the report