Recent technological advancements in high-throughput barcoding of single cells now enable thousands of cells to be analysed in one experiment. However, these methods rely on short-read Illumina sequencing, in which the full-length barcoded transcript is fragmented, so sequence information from regions more than 150-300 bp away from the barcode, such as the variable regions of B-cell and T-cell receptors, is lost. Consequently, the cell's transcriptome is largely described by copy number variance. More recent technologies that attempt to overcome these limitations, for example through targeted enrichment of the VDJ region, provide only an incomplete reconstruction of the B-cell and T-cell receptors.
Here we describe “Repertoire and Gene Expression by sequencing” (RAGE-seq), a method that combines short-read Illumina sequencing of transcriptomes with long-read Oxford Nanopore sequencing of full-length B-cell and T-cell receptors, profiling thousands of single lymphocytes in parallel. To validate the method, RAGE-seq was performed on B-cell- and T-cell-enriched PBMCs spiked with BCR-expressing Raji cells. Approximately 99% of the cell barcode library was represented across both platforms. Furthermore, the full-length BCR was called with 100% accuracy in 181 of 216 Raji cells. To further improve nanopore yield and accuracy, we are employing SCuiggleplex, an unsupervised barcode clustering algorithm that operates on the raw current signal and can demultiplex cell barcodes with more than double the efficiency of sequence-based methods.
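
The two validation figures above are simple proportions, and a minimal sketch may make their definitions concrete. It is illustrative only and is not the RAGE-seq pipeline: the function names, the toy barcodes, and the assumption that "represented across both platforms" means a library barcode is detected in both the short-read and the long-read data are ours.

```python
def fraction_in_both(library, short_read, long_read):
    """Fraction of library barcodes detected on both platforms.

    Assumption: a barcode counts as "represented across both platforms"
    when it appears in the short-read set and in the long-read set.
    """
    library = set(library)
    if not library:
        return 0.0
    shared = library & set(short_read) & set(long_read)
    return len(shared) / len(library)


def recovery_rate(correct_cells, total_cells):
    """Proportion of cells whose receptor was called correctly."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return correct_cells / total_cells


# Hypothetical toy barcodes, not real data.
library = ["AAAC", "AAAG", "AACT", "AAGT"]
short_read = ["AAAC", "AAAG", "AACT", "AAGT"]
long_read = ["AAAC", "AAAG", "AACT", "TTTT"]

toy_overlap = fraction_in_both(library, short_read, long_read)

# Counts reported in the text: 181 of 216 Raji cells.
raji_recovery = recovery_rate(181, 216)
```

With the counts reported here, 181 of 216 corresponds to a recovery of roughly 84% of Raji cells.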