Expression-based species deconvolution and realignment removes misalignment error in multispecies single-cell data.
2022; 23 (1): 157
BACKGROUND: Although single-cell RNA sequencing of xenograft samples has been widely used, no comprehensive bioinformatics pipeline is available for human and mouse mixed single-cell analyses. Considering the numerous homologous genes across the human and mouse genomes, misalignment errors should be evaluated, and a new algorithm is required. We assessed the extent and effects of misalignment errors and exonic multi-mapping events when using human and mouse combined reference data and developed a new bioinformatics pipeline with expression-based species deconvolution to minimize errors. We also evaluated false-positive signals presumed to originate from ambient RNA of the other species and address the importance of removing them computationally. RESULT: Errors from using the combined reference accounted for an average of 0.78% of total reads, but such reads were concentrated in a few genes that were greatly affected. Human and mouse mixed single-cell data, analyzed using our pipeline, clustered well with unmixed data and showed higher k-nearest-neighbor batch effect test and Local Inverse Simpson's Index scores than those derived from Cell Ranger (10x Genomics). We also applied our pipeline to a multispecies multisample single-cell library containing breast cancer xenograft tissue and successfully identified all samples using genomic array and expression. Moreover, diverse cell types in the tumor microenvironment were well captured. CONCLUSION: We present our bioinformatics pipeline for mixed human and mouse single-cell data, which can also be applied to pooled libraries to obtain cost-effective single-cell data. We also address misalignment, multi-mapping error, and ambient RNA as major considerations when analyzing multispecies single-cell data.
View details for DOI 10.1186/s12859-022-04676-0
View details for PubMedID 35501695
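The idea of assigning each cell to a species from its expression, as described in the abstract above, can be illustrated with a minimal sketch. This is not the published pipeline: the function name `assign_species`, the read-fraction rule, and the 0.9 threshold are all assumptions chosen only for illustration.

```python
def assign_species(human_counts, mouse_counts, threshold=0.9):
    """Label one cell by the share of its reads assigned to each species.

    The 0.9 threshold is a hypothetical value, not one taken from the paper.
    """
    total = human_counts + mouse_counts
    if total == 0:
        return "empty"  # no reads from either species
    human_fraction = human_counts / total
    if human_fraction >= threshold:
        return "human"
    if human_fraction <= 1 - threshold:
        return "mouse"
    # Neither species dominates: possible doublet or ambient RNA contamination.
    return "mixed"
```

Cells labelled "mixed" under such a rule would be the natural candidates for the doublet and ambient RNA checks that the abstract emphasizes.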
Changes in blood Krebs von den Lungen-6 predict the mortality of patients with acute exacerbation of interstitial lung disease.
2022; 12 (1): 4916
Acute exacerbation (AE) significantly affects the prognosis of patients with interstitial lung disease (ILD). This study aimed to investigate the best prognostic biomarker for patients with AE-ILD. Clinical data obtained during hospitalization were retrospectively analyzed for 96 patients with AE-ILD at three tertiary hospitals. The mean age of all subjects was 70.1 years; the percentage of males was 66.7%. Idiopathic pulmonary fibrosis accounted for 60.4% of the cases. During follow-up (median: 88 days), in-hospital mortality was 24%. Non-survivors had higher lactate dehydrogenase and C-reactive protein (CRP) levels, a lower ratio of partial pressure of oxygen to the fraction of inspiratory oxygen (P/F ratio), and a higher relative change in Krebs von den Lungen-6 (KL-6) levels over 1 week after hospitalization than survivors. In multivariable analysis adjusted by age, the 1-week change in KL-6, along with baseline P/F ratio and CRP levels, was an independent prognostic factor for in-hospital mortality (odds ratio 1.094, P=0.025). Patients with a marked increase in KL-6 (≥10%) showed significantly worse survival (in-hospital mortality: 63.2 vs. 6.1%) than those without. In addition to baseline CRP and P/F ratio, the relative changes in KL-6 over 1 week after hospitalization might be useful for predicting in-hospital mortality in patients with AE-ILD.
View details for DOI 10.1038/s41598-022-08965-9
View details for PubMedID 35318424
Image correlation-based method to assess ciliary beat frequency in human airway organoids.
IEEE transactions on medical imaging
Ciliary movements within the human airway are essential for maintaining a clean lung environment. Motile cilia have a characteristic ciliary beat frequency (CBF). However, CBF measurement with current video microscopic techniques can be error-prone due to the use of the single-point Fourier transformation, which is often biased for ciliary measurements. Herein, we describe a new video microscopy technique that harnesses a metric of motion-contrast imaging and image correlation for CBF analysis. It can provide objective and selective CBF measurements for individual motile cilia and generate CBF maps for the imaged area. The measurement performance of our methodology was validated with in vitro human airway organoid models that simulated an actual human airway epithelium. The CBF determined for the region of interest (ROI) was equal to that obtained with manual counting. The signal redundancy problem of conventional methods was not observed. Moreover, the obtained CBF measurements were robust to optical focal shifts, and exhibited spatial heterogeneity and temperature dependence. This technique can be used to evaluate ciliary movement in respiratory tracts and determine whether it is non-synchronous or aperiodic in patients. Therefore, our observations suggest that the proposed method can be clinically adapted as a screening tool to diagnose ciliopathies.
View details for DOI 10.1109/TMI.2021.3112992
View details for PubMedID 34524956
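As a rough illustration of why a correlation-based estimate can recover a beat frequency from video, the sketch below finds the dominant period of a single pixel-intensity trace from its autocorrelation peak. It is a generic textbook approach under stated assumptions, not the motion-contrast and image-correlation method of the paper; `estimate_frequency_hz` and its parameters are hypothetical names.

```python
def estimate_frequency_hz(signal, frame_rate_hz):
    """Estimate the dominant frequency of a periodic trace via autocorrelation."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]

    def autocorr(lag):
        # Unnormalized autocorrelation of the mean-subtracted trace.
        return sum(centered[i] * centered[i + lag] for i in range(n - lag))

    # Skip the initial positive lobe around lag 0.
    lag = 1
    while lag < n // 2 and autocorr(lag) > 0:
        lag += 1
    if lag >= n // 2:
        raise ValueError("no periodicity found in the trace")

    # The highest remaining peak marks one full beat period, in frames.
    period_frames = max(range(lag, n // 2), key=autocorr)
    return frame_rate_hz / period_frames
```

For a synthetic 10 Hz sine sampled at 200 frames per second, the peak falls at a lag of 20 frames, which corresponds to 10 Hz.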
Engineered prime editors with PAM flexibility
2021; 29 (6): 2001-2007
Although prime editors are a powerful tool for genome editing, which can generate various types of mutations such as nucleotide substitutions, insertions, and deletions in the genome without double-strand breaks or donor DNA, conventional prime editors are still limited in their target scope because of the PAM preference of the Streptococcus pyogenes Cas9 (spCas9) protein. Here, we describe engineered prime editors that expand the range of target sites using various PAM-flexible Cas9 variants. Using the engineered prime editors, we could successfully generate more than 50 types of mutations with up to 51.7% prime-editing activity in HEK293T cells. In addition, we successfully introduced the BRAF V600E mutation, which could not be induced by conventional prime editors. These variants of prime editors will broaden the applicability of CRISPR-based prime editing technologies in biological research.
View details for DOI 10.1016/j.ymthe.2021.02.022
View details for Web of Science ID 000657395200011
View details for PubMedID 33636398
View details for PubMedCentralID PMC8178456
Outbreak investigation of Serratia marcescens neurosurgical site infections associated with a contaminated shaving razors
ANTIMICROBIAL RESISTANCE AND INFECTION CONTROL
2020; 9 (1): 64
Surgical site infection (SSI) is the most common healthcare-associated infection. We report an outbreak of neurosurgical site infections caused by Serratia marcescens after craniotomy in a tertiary care hospital. Between August 6 and 21, 2018, five cases of early-onset SSI caused by S. marcescens after craniotomy were recorded in a 1786-bed tertiary care hospital. Cultures were collected from potential environmental sources and healthcare workers. Whole-genome sequencing (WGS) was used to investigate the genetic relationships among S. marcescens isolates. The outbreak involved five patients; S. marcescens was isolated from the cerebrospinal fluid, pus, tissue, and blood samples from these patients. S. marcescens was also isolated from shaving razors and brushes. All S. marcescens isolates from the infected patients and razors showed the same resistance patterns on antibiotic-susceptibility tests. WGS revealed close clustering among four of five isolates from the patients and among three of four isolates from the razors. No additional patient developed S. marcescens infection after we stopped using the razors for scalp shaving. We report an outbreak of neurosurgical site infections after craniotomy, which was associated with shaving razors contaminated by S. marcescens. Shaving scalps with razors should be avoided to prevent SSI.
View details for DOI 10.1186/s13756-020-00725-6
View details for Web of Science ID 000535609700001
View details for PubMedID 32398063
View details for PubMedCentralID PMC7216399
Whole genome sequencing of Nontuberculous Mycobacterium (NTM) isolates from sputum specimens of co-habiting patients with NTM pulmonary disease and NTM isolates from their environment
2020; 21 (1): 322
Nontuberculous mycobacterium (NTM) species are ubiquitous microorganisms. NTM pulmonary disease (NTM-PD) is thought to be caused not by human-to-human transmission but by independent environmental acquisition. However, recent studies using next-generation sequencing (NGS) have reported trans-continental spread of Mycobacterium abscessus among patients with cystic fibrosis. We investigated NTM genomes through NGS to examine transmission patterns in three pairs of co-habiting patients with NTM-PD who were suspected of patient-to-patient transmission. Three pairs of patients with NTM-PD co-habiting for at least 15 years were enrolled: a mother and a daughter with M. avium-PD, a couple with M. intracellulare-PD, and a second couple, one of whom was infected with M. intracellulare and the other of whom was infected with M. abscessus. Whole genome sequencing was performed using patients' NTM isolates as well as environmental specimens. Genetic distances were estimated based on single nucleotide polymorphisms (SNPs). By comparison with the genetic distances among 78 publicly available NTM genomes, NTM isolates derived from the two pairs of patients infected with the same NTM species were not closely related to each other. In phylogenetic analysis, the NTM isolates from patients with M. avium-PD clustered with isolates from different environmental sources. In conclusion, considering the genetic distances between NTM strains, the likelihood of patient-to-patient transmission in pairs of co-habiting NTM-PD patients without overt immune deficiency is minimal.
View details for DOI 10.1186/s12864-020-6738-2
View details for Web of Science ID 000530103400004
View details for PubMedID 32326890
View details for PubMedCentralID PMC7181514
The Association Between Eosinophil Variability Patterns and the Efficacy of Inhaled Corticosteroids in Stable COPD Patients
INTERNATIONAL JOURNAL OF CHRONIC OBSTRUCTIVE PULMONARY DISEASE
2020; 15: 2061-2069
Blood eosinophils are a predictive marker for the use of inhaled corticosteroids (ICS). However, there is concern over whether a single measure of blood eosinophils is sufficient for outlining a treatment plan. Here, we evaluated the association between variability in blood eosinophils and the effects of ICS in stable COPD cohorts. COPD patients in the Korean COPD Subtype Study and the Seoul National University Airway Registry from 2011 to 2018 were analyzed. Based on blood eosinophils at baseline and at 1-year follow-up, the patients were classified into four groups with 250/μL as a cutoff value: consistently high (CH), consistently low (CL), variably increasing (VI), and variably decreasing (VD). We compared rates of acute exacerbations (AEs) according to ICS use in each group after calibration of severity using propensity score matching. Of 2,221 COPD patients, 618 were analyzed and a total of 125 (20%), 355 (57%), 63 (10%), and 75 (12%) patients were classified into the CH, CL, VI, and VD groups, respectively. After calibration, we found that ICS users tended to have a lower AE rate in the CH group (RR 0.41, 95% CI 0.21-0.74) and VI group (RR 0.45, 95% CI 0.22-0.88), but not in the CL group (RR 1.42, 95% CI 1.08-1.89) and VD group (RR 1.71, 95% CI 1.00-2.96). More than one-fifth of patients had an inconsistent blood eosinophil level after the 1-year follow-up, and the AE-COPD rate according to ICS differed based on variability in eosinophils. Regular follow-up of blood eosinophils is required for COPD patients.
View details for DOI 10.2147/COPD.S258353
View details for Web of Science ID 000565659900001
View details for PubMedID 32943859
View details for PubMedCentralID PMC7473991
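The four-group classification described in the abstract above can be written as a small function. This is a sketch under two assumptions not stated in the abstract: a count exactly at the 250/μL cutoff is treated as high, and "variably increasing" is read as low at baseline then high at follow-up (with "variably decreasing" the reverse). The function name is hypothetical.

```python
CUTOFF_PER_UL = 250  # cutoff value given in the abstract


def eosinophil_group(baseline, follow_up, cutoff=CUTOFF_PER_UL):
    """Classify a patient by blood eosinophil counts (cells/uL) at two visits."""
    # Assumption: a value equal to the cutoff counts as "high".
    high_at_baseline = baseline >= cutoff
    high_at_follow_up = follow_up >= cutoff
    if high_at_baseline and high_at_follow_up:
        return "CH"  # consistently high
    if not high_at_baseline and not high_at_follow_up:
        return "CL"  # consistently low
    if high_at_follow_up:
        return "VI"  # variably increasing: low at baseline, high at follow-up
    return "VD"      # variably decreasing: high at baseline, low at follow-up
```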
A case of immune thrombocytopenia associated with invasive thymoma successfully treated with eltrombopag
BLOOD RESEARCH
2019; 54 (1): 74-77
De novo assembly and next-generation sequencing to analyse full-length gene variants from codon-barcoded libraries
2015; 6: 8351
Interpreting epistatic interactions is crucial for understanding evolutionary dynamics of complex genetic systems and unveiling structure and function of genetic pathways. Although high resolution mapping of en masse variant libraries enables molecular biologists to address genotype-phenotype relationships, long-read sequencing technology remains indispensable to assess functional relationships between mutations that lie far apart. Here, we introduce JigsawSeq for multiplexed sequence identification of pooled gene variant libraries by combining a codon-based molecular barcoding strategy and de novo assembly of short-read data. We first validate JigsawSeq on small sub-pools and observe high precision and recall at various experimental settings. With extensive simulations, we then apply JigsawSeq to large-scale gene variant libraries to show that our method can be reliably scaled using next-generation sequencing. JigsawSeq may serve as a rapid screening tool for functional genomics and offer the opportunity to explore evolutionary trajectories of protein variants.
View details for DOI 10.1038/ncomms9351
View details for Web of Science ID 000363022000006
View details for PubMedID 26387459
View details for PubMedCentralID PMC4595759
microDuMIP: target-enrichment technique for microarray-based duplex molecular inversion probes
NUCLEIC ACIDS RESEARCH
2015; 43 (5): e28
Molecular inversion probe (MIP)-based capture is a scalable and effective target-enrichment technology that can use synthetic single-stranded oligonucleotides as probes. Unlike the straightforward use of synthetic oligonucleotides for low-throughput target capture, high-throughput MIP capture has required laborious protocols to generate thousands of single-stranded probes from DNA microarray because of multiple enzymatic steps, gel purifications and extensive PCR amplifications. Here, we developed a simple and efficient microarray-based MIP preparation protocol using only one enzyme with double-stranded probes and improved target capture yields by designing probes with overlapping targets and unique barcodes. To test our strategy, we produced 11 510 microarray-based duplex MIPs (microDuMIPs) and captured 3554 exons of 228 genes in a HapMap genomic DNA sample (NA12878). Under our protocol, capture performance and precision of calling were comparable to conventional MIP capture methods, yet overlapping targets and unique barcodes allowed us to precisely genotype with as little as 50 ng of input genomic DNA without library preparation. The microDuMIP method is simpler and cheaper, allowing broader applications and accurate target sequencing with a scalable number of targets.
View details for DOI 10.1093/nar/gku1188
View details for Web of Science ID 000352487100001
View details for PubMedID 25414325
View details for PubMedCentralID PMC4357688
Tumor evolution and intratumor heterogeneity of an epithelial ovarian cancer investigated using next-generation sequencing
2015; 15: 85
The extent to which metastatic tumors further evolve by accumulating additional mutations is unclear and has yet to be addressed extensively using next-generation sequencing of high-grade serous ovarian cancer. Eleven spatially separated tumor samples from the primary tumor and associated metastatic sites and two normal samples were obtained from a Stage IIIC ovarian cancer patient during cytoreductive surgery prior to chemotherapy. Whole exome sequencing and copy number analysis were performed. Omental exomes were sequenced with a high depth of coverage to thoroughly explore the variants in metastatic lesions. Somatic mutations were further validated by ultra-deep targeted sequencing to sort out false positives and false negatives. Based on the somatic mutations and copy number variation profiles, a phylogenetic tree was generated to explore the evolutionary relationship among tumor samples. Only 6% of the somatic mutations were present in every sample of a given case with TP53 as the only known mutant gene consistently present in all samples. Two non-spatial clusters of primary tumors (cluster P1 and P2), and a cluster of metastatic regions (cluster M) were identified. The patterns of mutations indicate that cluster P1 and P2 diverged in the early phase of tumorigenesis, and that metastatic cluster M originated from the common ancestral clone of cluster P1 with few somatic mutations and copy number variations. Although a high level of intratumor heterogeneity was evident in high-grade serous ovarian cancer, our results suggest that transcoelomic metastasis arises with little accumulation of somatic mutations and copy number alterations in this patient.
View details for DOI 10.1186/s12885-015-1077-4
View details for Web of Science ID 000350327300001
View details for PubMedID 25881093
View details for PubMedCentralID PMC4346117
Identification of somatic mutations in EGFR/KRAS/ALK-negative lung adenocarcinoma in never-smokers
2014; 6: 18
Lung adenocarcinoma is a highly heterogeneous disease with various etiologies, prognoses, and responses to therapy. Although genome-scale characterization of lung adenocarcinoma has been performed, a comprehensive somatic mutation analysis of EGFR/KRAS/ALK-negative lung adenocarcinoma in never-smokers has not been conducted. We analyzed whole exome sequencing data from 16 EGFR/KRAS/ALK-negative lung adenocarcinomas and an additional 54 tumors in two expansion cohort sets. Candidate loci were validated by target capture and Sanger sequencing. Gene set analysis was performed using Ingenuity Pathway Analysis. We identified 27 genes potentially implicated in the pathogenesis of lung adenocarcinoma. These included targetable genes involved in PI3K/mTOR signaling (TSC1, PIK3CA, AKT2) and receptor tyrosine kinase signaling (ERBB4) and genes not previously highlighted in lung adenocarcinomas, such as SETD2 and PBRM1 (chromatin remodeling), CHEK2 and CDC27 (cell cycle), CUL3 and SOD2 (oxidative stress), and CSMD3 and TFG (immune response). In the expansion cohort (N = 70), TP53 was the most frequently altered gene (11%), followed by SETD2 (6%), CSMD3 (6%), ERBB2 (6%), and CDH10 (4%). In pathway analysis, the majority of altered genes were involved in cell cycle/DNA repair (P < 0.001) and cAMP-dependent protein kinase signaling (P < 0.001). The genomic makeup of EGFR/KRAS/ALK-negative lung adenocarcinomas in never-smokers is remarkably diverse. Genes involved in cell cycle regulation/DNA repair are implicated in tumorigenesis and represent potential therapeutic targets.
View details for DOI 10.1186/gm535
View details for Web of Science ID 000334631900001
View details for PubMedID 24576404
View details for PubMedCentralID PMC3979047
Arginine deprivation therapy for malignant melanoma
CLINICAL PHARMACOLOGY-ADVANCES AND APPLICATIONS
2013; 5: 11-19
Despite recent development of promising immunotherapeutic and targeted drugs, prognosis in patients with advanced melanoma remains poor, and a cure for this disease remains elusive in most patients. The success of melanoma therapy depends on a better understanding of the biology of melanoma and development of drugs that effectively target the relevant genes or proteins essential for tumor cell survival. Melanoma cells frequently lack argininosuccinate synthetase, an essential enzyme for arginine synthesis, and as a result they become dependent on the availability of exogenous arginine. Accordingly, a therapeutic approach involving depletion of available arginine has been shown to be effective in preclinical studies. Early clinical studies have demonstrated sufficient antitumor activity to give rise to cautious optimism. In this article, the rationale for arginine deprivation therapy is discussed. Additionally, various strategies for depleting arginine are discussed and the preclinical and clinical investigations of arginine deprivation therapy in melanoma are reviewed.
View details for DOI 10.2147/CPAA.S37350
View details for Web of Science ID 000213876300002
View details for PubMedID 23293541
View details for PubMedCentralID PMC3534294
Multiple target loci assembly sequencing (mTAS)
2011; 415 (2): 218-220
Here we present multiple target loci assembly sequencing (mTAS), a method for examining multiple genomic loci in a single DNA sequencing read. The key to the success of mTAS target sequencing is the uniform amplification of multiple target genomic loci into a single DNA fragment using polymerase cycling assembly (PCA). Using this strategy, we successfully collected multiloci sequence information from a single DNA sequencing run. We applied mTAS to examine 29 different sets of human genomic loci, each containing from 2 to 11 single-nucleotide polymorphisms (SNP) present at different exons. We believe mTAS can be used to reduce the cost of Sanger sequencing-based genetic analysis.
View details for DOI 10.1016/j.ab.2011.04.012
View details for Web of Science ID 000291904700021
View details for PubMedID 21536013
Multiplex padlock targeted sequencing reveals human hypermutable CpG variations
2009; 19 (9): 1606-1615
Utilizing the full power of next-generation sequencing often requires the ability to perform large-scale multiplex enrichment of many specific genomic loci in multiple samples. Several technologies have been recently developed but await substantial improvements. We report the 10,000-fold improvement of a previously developed padlock-based approach, and apply the assay to identifying genetic variations in hypermutable CpG regions across human chromosome 21. From approximately 3 million reads derived from a single Illumina Genome Analyzer lane, approximately 94% (approximately 50,500) target sites can be observed with at least one read. The uniformity of coverage was also greatly improved; up to 93% and 57% of all targets fell within a 100- and 10-fold coverage range, respectively. Alleles at >400,000 target base positions were determined across six subjects and examined for single nucleotide polymorphisms (SNPs), and the concordance with independently obtained genotypes was 98.4%-100%. We detected >500 SNPs not currently in dbSNP, 362 of which were in targeted CpG locations. Transitions in CpG sites were at least 13.7 times more abundant than non-CpG transitions. Fractions of polymorphic CpG sites are lower in CpG-rich regions and show higher correlation with human-chimpanzee divergence within CpG versus non-CpG sites. This is consistent with the hypothesis that methylation rate heterogeneity along chromosomes contributes to mutation rate variation in humans. Our success suggests that targeted CpG resequencing is an efficient way to identify common and rare genetic variations. In addition, the significantly improved padlock capture technology can be readily applied to other projects that require multiplex sample preparation.
View details for DOI 10.1101/gr.092213.109
View details for Web of Science ID 000269482200011
View details for PubMedID 19525355
View details for PubMedCentralID PMC2752131
Genome-Wide Identification of Human RNA Editing Sites by Parallel DNA Capturing and Sequencing
2009; 324 (5931): 1210-1213
Adenosine-to-inosine (A-to-I) RNA editing leads to transcriptome diversity and is important for normal brain function. To date, only a handful of functional sites have been identified in mammals. We developed an unbiased assay to screen more than 36,000 computationally predicted nonrepetitive A-to-I sites using massively parallel target capture and DNA sequencing. A comprehensive set of several hundred human RNA editing sites was detected by comparing genomic DNA with RNAs from seven tissues of a single individual. Specificity of our profiling was supported by observations of enrichment with known features of targets of adenosine deaminases acting on RNA (ADAR) and validation by means of capillary sequencing. This efficient approach greatly expands the repertoire of RNA editing targets and can be applied to studies involving RNA editing-related human diseases.
View details for DOI 10.1126/science.1170995
View details for Web of Science ID 000266410100049
View details for PubMedID 19478186
Investigation of the Catalytic Mechanism of a Synthetic DNAzyme with Protein-like Functionality: An RNaseA Mimic?
JOURNAL OF THE AMERICAN CHEMICAL SOCIETY
2009; 131 (15): 5648-5658
The protein enzyme ribonuclease A (RNaseA) cleaves RNA with catalytic perfection, although with little sequence specificity, by a divalent metal ion (M(2+))-independent mechanism in which a pair of imidazoles provides general acid and base catalysis, while a cationic amine provides electrostatic stabilization of the transition state. Synthetic imitation of this remarkable organo-catalyst ("RNaseA mimicry") has been a longstanding goal in biomimetic chemistry. The 9(25)-11 DNAzyme contains synthetically modified nucleotides presenting both imidazole and cationic amine side chains, and catalyzes RNA cleavage with turnover in the absence of M(2+) similarly to RNaseA. Nevertheless, the catalytic roles, if any, of the "protein-like" functional groups have not been defined, and hence the question remains whether 9(25)-11 engages any of these functionalities to mimic aspects of the mechanism of RNaseA. To address this question, we report a mechanistic investigation of 9(25)-11 catalysis wherein we have employed a variety of experiments, such as DNAzyme functional group deletion, mechanism-based affinity labeling, and bridging and nonbridging phosphorothioate substitution of the scissile phosphate. Several striking parallels exist between the results presented here for 9(25)-11 and the results of analogous experiments applied previously to RNaseA. Specifically, our results implicate two particular imidazoles in general acid and base catalysis and suggest that a specific cationic amine stabilizes the transition state via diastereoselective interaction with the scissile phosphate. Overall, 9(25)-11 appears to meet the minimal criteria of an RNaseA mimic; this demonstrates how added synthetic functionality can expand the mechanistic repertoire available to a synthetic DNA-based catalyst.
View details for DOI 10.1021/ja900125n
View details for Web of Science ID 000265268100049
View details for PubMedID 20560639
Strength of Cα-H···O=C hydrogen bonds in transmembrane proteins
JOURNAL OF PHYSICAL CHEMISTRY B
2008; 112 (3): 1041-1048
A large number of Cα-H···O contacts are present in transmembrane protein structures, but the contribution of such interactions to protein stability is still not well understood. According to previous ab initio quantum calculations, the stabilization energy of a Cα-H···O contact is about 2-3 kcal/mol. However, experimental studies on two different Cα-H···O hydrogen bonds present in transmembrane proteins lead to conclusions that one contact is only weakly stabilizing and the other is not even stabilizing. We note that most previous computational studies were on optimized geometries of isolated molecules, but the experimental measurements were on those in the structural context of transmembrane proteins. In the present study, 263 Cα-H···O=C contacts in alpha-helical transmembrane proteins were extracted from X-ray crystal structures, and interaction energies were calculated with quantum mechanical methods. The average stabilization energy of a Cα-H···O=C interaction was computed to be 1.4 kcal/mol. About 13% of contacts were stabilizing by more than 3 kcal/mol, and about 11% were destabilizing. Analysis of the relationships between energy and structure revealed four interaction patterns: three types of attractive cases in which an additional Cα-H···O or N-H···O contact is present and a type of repulsive case in which repulsion between two carbonyl oxygen atoms occurs. The contribution of Cα-H···O=C contacts to protein stability is roughly estimated to be greater than 5 kcal/mol per helix pair for about 16% of transmembrane helices but for only 3% of soluble protein helices. The contribution would be larger if Cα-H···O contacts involving side chain oxygen were also considered.
View details for DOI 10.1021/jp077285n
View details for Web of Science ID 000252484200050
View details for PubMedID 18154287