Publications

All Papers

111

71

LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked
arXiv, 2024

72

Meta Large Language Model Compiler: Foundation Models of Compiler Optimization
arXiv, 2024

73

Unlearning Bias in Language Models by Partitioning Gradients
ACL, 2023

74

ProPILE: Probing Privacy Leakage in Large Language Models
NeurIPS, 2023

75

Improving Real-world Password Guessing Attacks via Bi-directional Transformers
USENIX, 2023

76

Semantic Ranking for Automated Adversarial Technique Annotation in Security Text
ASIA CCS, 2024

77

LogBERT: Log Anomaly Detection via BERT
IJCNN, 2021

78

Universal and Transferable Adversarial Attacks on Aligned Language Models
arXiv, 2023

79

Membership Inference via Backdooring
IJCAI, 2022

80

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations
Meta, 2023