Key Publications
5
*As of 2026, the Impact Factor is shown only for papers published within the last 6 years.
1
article
|
Citations: 7
·
2025 Automated and Efficient Sampling of Chemical Reaction Space
Minhyeok Lee, Umit Volkan Ucak, Jinyoung Jeong, Islambek Ashyrmamatov, Juyong Lee, Eunji Sim
IF 14.1 (2025)
Advanced Science
Machine learning interatomic potentials (MLIPs) promise quantum-level accuracy at classical force field speeds, but their performance hinges on the quality and diversity of training data. An efficient and fully automated approach to sampling chemical reaction space without relying on human intuition is presented, addressing a critical gap in MLIP development. The method combines the speed of tight-binding calculations with selective high-level refinement, generating diverse datasets that capture both equilibrium and reactive regions of potential energy surfaces. By employing single-ended growing string and nudged elastic band methods, reaction pathways previously underrepresented in MLIP training sets, particularly near transition states, are systematically explored. This approach yields datasets with rich structural and chemical diversity, essential for robust MLIP development. Open-source code is provided for the entire workflow, facilitating the integration of the approach into existing MLIP development pipelines.
https://doi.org/10.1002/advs.202409009
Chemical space
Sampling (signal processing)
Space (punctuation)
Computer science
Biochemical engineering
Environmental science
Process engineering
Chemistry
Engineering
Telecommunications
2
article
|
Citations: 1
·
2025 A survey on large language models in biology and chemistry
Islambek Ashyrmamatov, Su Ji Gwak, Su-Young Jin, Ikhyeong Jun, Umit Volkan Ucak, Jay-Yoon Lee, Juyong Lee
IF 12.9 (2025)
Experimental & Molecular Medicine
Artificial intelligence (AI) is reshaping biomedical research by providing scalable computational frameworks suited to the complexity of biological systems. Central to this revolution are bio/chemical language models, including large language models, which are reconceptualizing molecular structures as a form of 'language' amenable to advanced computational techniques. Here we critically examine the role of these models in biology and chemistry, tracing their evolution from molecular representation to molecular generation and optimization. This review covers key molecular representation strategies for both biological macromolecules and small organic compounds (ranging from protein and nucleotide sequences to single-cell data, string-based chemical formats, graph-based encodings and three-dimensional point clouds), highlighting their respective advantages and inherent limitations in AI applications. The discussion further explores core model architectures, such as bidirectional encoder representations from transformers-like encoders, generative pretrained transformer-like decoders and encoder-decoder transformers, alongside their sophisticated pretraining strategies such as self-supervised learning, multitask learning and retrieval-augmented generation. Key biomedical applications, spanning protein structure and function prediction, de novo protein design, genomic analysis, molecular property prediction, de novo molecular design, reaction prediction and retrosynthesis, are explored through representative studies and emerging trends. Finally, the review considers the emerging landscape of agentic and interactive AI systems, briefly showcasing their potential to automate and accelerate scientific discovery while addressing critical technical, ethical and regulatory considerations that will shape the future trajectory of AI in biomedicine.
https://doi.org/10.1038/s12276-025-01583-1
Representation (politics)
Key (lock)
Function (biology)
Generative grammar
Scalability
Biomedicine
Tracing
Computational model
3
article
|
Citations: 0
·
2024 Molecular basis of facilitated target search and sequence discrimination of TALE homeodomain transcription factor Meis1
Seo‐Ree Choi, Juyong Lee, Yeo‐Jin Seo, Ho-Seong Jin, Hye-Bin Ahn, Youyeon Go, Nak‐Kyoon Kim, Kyoung‐Seok Ryu, Joon‐Hwa Lee
IF 15.7 (2024)
Nature Communications
Transcription factors specifically bind to their consensus sequence motifs and regulate transcription efficiency. Transcription factors are also able to non-specifically contact the phosphate backbone of DNA through electrostatic interaction. The homeodomain of Meis1 TALE human transcription factor (Meis1-HD) recognizes its target DNA sequences via two DNA contact regions, the L1-α1 region and the α3 helix (specific binding mode). This study demonstrates that the non-specific binding mode of Meis1-HD is the energetically favored process during DNA binding, achieved by the interaction of the L1-α1 region with the phosphate backbone. An NMR dynamics study suggests that non-specific binding might set up an intermediate structure which can then rapidly and easily find the consensus region on a long section of genomic DNA in a facilitated binding process. Structural analysis using NMR and molecular dynamics shows that key structural distortions in the Meis1-HD-DNA complex are induced by various single nucleotide mutations in the consensus sequence, resulting in decreased DNA binding affinity. Collectively, our results elucidate the detailed molecular mechanism of how Meis1-HD recognizes single nucleotide mutations within its consensus sequence: (i) through the conformational features of the α3 helix; and (ii) by the dynamic features (rigid or flexible) of the L1 loop and the α3 helix. These findings enhance our understanding of how single nucleotide mutations in transcription factor consensus sequences lead to dysfunctional transcription and, ultimately, human disease.
https://doi.org/10.1038/s41467-024-51297-7
Homeobox
Transcription factor
Sequence (biology)
Computational biology
Genetics
Biology
Basis (linear algebra)
EMX2
Bioinformatics
Evolutionary biology
4
article
|
Citations: 1
·
2022 Cytosolic microRNA-inducible nuclear translocation of Cas9 protein for disease-specific genome modification
Cheol-Hee Shin, Su Chan Park, Il-Geun Park, Hye‐Rim Kim, Byoungha An, Choongil Lee, Sang‐Heon Kim, Juyong Lee, Ji Min Lee, Seung Ja Oh
IF 14.9 (2022)
Nucleic Acids Research
MicroRNA-dependent mRNA decay plays an important role in gene silencing by facilitating posttranscriptional and translational repression. Inspired by this intrinsic nature of microRNA-mediated mRNA cleavage, here, we describe a microRNA-targeting mRNA as a switch platform called mRNA bridge mimetics to regulate the translocation of proteins. We applied the mRNA bridge mimetics platform to Cas9 protein to confer on it the ability to translocate into the nucleus via cleavage of the nuclear export signal. This system performed programmed gene editing in vitro and in vivo. Combinatorial treatment with cisplatin and miR-21-EZH2 axis-targeting CRISPR Self Check-In improved sensitivity to chemotherapeutic drugs in vivo. Using the endogenous microRNA-mediated mRNA decay mechanism, our platform is able to remodel a cell's natural biology to allow the entry of precise drugs into the nucleus, devoid of non-specific translocation. The mRNA bridge mimetics strategy is promising for applications in which the reaction must be controlled via intracellular stimuli and modulates Cas9 proteins to ensure safe genome modification in diseased conditions.
https://doi.org/10.1093/nar/gkac431
Biology
Chromosomal translocation
Cytosol
Genome
microRNA
Nuclear protein
Genetics
Cas9
Cell biology
Gene
5
article
|
Citations: 102
·
2022 Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments
Umit Volkan Ucak, Islambek Ashyrmamatov, Junsu Ko, Juyong Lee
IF 16.6 (2022)
Nature Communications
Designing efficient synthetic routes for a target molecule remains a major challenge in organic synthesis. Atom environments are ideal, stand-alone, chemically meaningful building blocks providing a high-resolution molecular representation. Our approach mimics chemical reasoning, and predicts reactant candidates by learning the changes of atom environments associated with the chemical reaction. Through careful inspection of reactant candidates, we demonstrate atom environments as promising descriptors for studying reaction route prediction and discovery. Here, we present a new single-step retrosynthesis prediction method, viz. RetroTRAE, which is free from all SMILES-based translation issues and yields a top-1 accuracy of 58.3% on the USPTO test dataset; the top-1 accuracy reaches 61.6% with the inclusion of highly similar analogs, outperforming other state-of-the-art neural machine translation-based methods. Our methodology introduces a novel scheme for fragmental and topological descriptors to be used as natural inputs for retrosynthetic prediction tasks.
https://doi.org/10.1038/s41467-022-28857-w
Retrosynthetic analysis
Translation (biology)
Computer science
Machine translation
Artificial intelligence
Representation (politics)
Atom (system on chip)
Machine learning
Biological system
Chemistry
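Note on the top-1 accuracy quoted in the RetroTRAE abstract (entry 5): top-k accuracy is commonly computed as the fraction of test reactions whose reference answer appears among the model's first k ranked candidates. The sketch below illustrates that general definition only. It is not the authors' evaluation code; the function name `top_k_accuracy`, the exact-match criterion, and the toy candidate labels are all assumptions made for illustration.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of examples whose reference is among the first k candidates.

    Hypothetical helper: matching is plain string equality, which is an
    assumption and not necessarily the criterion used in the paper.
    """
    if len(ranked_predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = sum(
        ref in candidates[:k]
        for candidates, ref in zip(ranked_predictions, references)
    )
    return hits / len(references)


# Toy data with made-up labels standing in for predicted reactant sets.
predictions = [
    ["set_A", "set_B"],  # reference ranked first
    ["set_C", "set_D"],  # reference ranked second
    ["set_E", "set_F"],  # reference ranked second
    ["set_G", "set_H"],  # reference not predicted at all
]
references = ["set_A", "set_D", "set_F", "set_X"]

top1 = top_k_accuracy(predictions, references, k=1)
top2 = top_k_accuracy(predictions, references, k=2)
print(f"top-1 = {top1:.2f}, top-2 = {top2:.2f}")
```

On this toy data one of four references is ranked first and three of four fall within the first two candidates, so the script prints top-1 = 0.25 and top-2 = 0.75. A relaxed criterion such as the "highly similar analogs" mentioned in the abstract would replace the equality check with a similarity test, which is not attempted here.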