Posts tagged 'input attribution'

input attribution 1

[Paper review] Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation

https://arxiv.org/pdf/2406.13663
Generates attributions (citations) when producing an answer from the documents retrieved in RAG.
Related works
answer attribution: in RAG, identifying which of the retrieved documents support the generated answer.
https://aclanthology.org/2023.emnlp-main.398.pdf uses a hard prompt + ICL (few-shot examples) to have the LLM write citations. When sampling responses, it draws several (4), evaluates the citations with NLI as above, and selects the response with the best citation recall. https:..

mechanistic interpretability 2024.09.09
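
The sample-and-select step mentioned in the excerpt above (draw several responses, score their citations with NLI, keep the one with the best citation recall) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: `citation_recall`, `pick_best_response`, and `toy_entails` are hypothetical names, citation recall is assumed to mean the fraction of statements entailed by the concatenation of their cited documents, and the word-overlap check is a stand-in for a real NLI model, not what either paper uses.

```python
from typing import Callable, List, Sequence, Tuple

# A response is a list of (statement, indices of cited documents).
Statement = Tuple[str, List[int]]
Entails = Callable[[str, str], bool]  # (premise, hypothesis) -> bool


def citation_recall(response: Sequence[Statement],
                    docs: Sequence[str],
                    entails: Entails) -> float:
    """Fraction of statements entailed by the concatenation of their cited docs.

    Assumed definition for illustration; statements without citations count
    as unsupported.
    """
    if not response:
        return 0.0
    supported = 0
    for statement, cited in response:
        premise = " ".join(docs[i] for i in cited)
        if cited and entails(premise, statement):
            supported += 1
    return supported / len(response)


def pick_best_response(candidates: Sequence[Sequence[Statement]],
                       docs: Sequence[str],
                       entails: Entails) -> Sequence[Statement]:
    """Return the sampled candidate with the highest citation recall."""
    return max(candidates, key=lambda c: citation_recall(c, docs, entails))


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Hypothetical stand-in for an NLI model: every hypothesis word
    must appear somewhere in the premise text."""
    premise = premise.lower()
    return all(word in premise for word in hypothesis.lower().split())


docs = ["paris is the capital of france",
        "berlin is the capital of germany"]
wrong_citation = [("paris is the capital of france", [1])]
right_citation = [("paris is the capital of france", [0])]
best = pick_best_response([wrong_citation, right_citation], docs, toy_entails)
```

In practice `toy_entails` would be replaced by a call to an actual NLI classifier, and the candidates would be the sampled LLM responses parsed into statements and their citation markers.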

mech. interp blogpost

mechanistic interpretability. Research on reverse-engineering deep learning models. alien neuroscience :)

