Posts tagged 'supporting factor'

supporting factor 1

[Paper review] Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation

https://arxiv.org/pdf/2406.13663
Generates attributions (citations) when an answer is produced from the documents retrieved in RAG.
Related works
answer attribution
In RAG, this means identifying which of the retrieved documents support the generated answer.
In https://aclanthology.org/2023.emnlp-main.398.pdf, a hard prompt plus ICL (few-shot examples) is used to make the LLM write citations. When sampling, several responses (4) are drawn, their citations are evaluated with NLI as described above, and the response with the best citation recall is selected. https:..
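The sample-then-select step in the excerpt above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation from either paper: citation recall is assumed here to be the fraction of statements whose cited documents jointly entail them, and `toy_entails` is a hypothetical word-overlap stand-in for a real NLI model. `Statement`, `citation_recall`, and `select_best` are names made up for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Statement:
    text: str
    cited: Tuple[int, ...]  # indices into the retrieved documents


# (premise, hypothesis) -> True if the premise entails the hypothesis
Entails = Callable[[str, str], bool]


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Hypothetical stand-in for an NLI model: every hypothesis word must occur in the premise."""
    premise_words = set(premise.lower().split())
    return all(word in premise_words for word in hypothesis.lower().split())


def citation_recall(response: Sequence[Statement], docs: Sequence[str], entails: Entails) -> float:
    """Assumed definition: share of statements entailed by the concatenation of their cited docs."""
    if not response:
        return 0.0
    supported = 0
    for statement in response:
        premise = " ".join(docs[i] for i in statement.cited)
        if statement.cited and entails(premise, statement.text):
            supported += 1
    return supported / len(response)


def select_best(
    responses: Sequence[Sequence[Statement]], docs: Sequence[str], entails: Entails
) -> Sequence[Statement]:
    """Pick the sampled response with the highest citation recall (first one wins on ties)."""
    return max(responses, key=lambda r: citation_recall(r, docs, entails))


# Toy data: two sampled responses that differ only in which document they cite.
docs = [
    "paris is the capital of france",
    "berlin is the capital of germany",
]
candidates = [
    [Statement("paris is the capital of france", (1,))],  # cites the wrong document
    [Statement("paris is the capital of france", (0,))],  # cites the supporting document
]
best = select_best(candidates, docs, toy_entails)
```

In a real setup the entailment check would be replaced by an actual NLI model, and the candidates would be the (for example 4) responses sampled from the LLM.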

mechanistic interpretability 2024.09.09

mech. interp blogpost

Mechanistic interpretability: research on reverse-engineering deep learning models. alien neuroscience :)

XAI, input attribution, future lens, patch patching, tuned lens, multitoken, paper review, linear representation hypothesis, controllable generation, representation engineering, activation steering, toxicity, answer attribution, multi-hop qa, linear representation, reft, supporting factor, mechanistical interpretability, activation patching, mechanistic interpretability
