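A paper co-authored by Professor Bamman of UC Berkeley's School of Information. The authors give a language model a passage from a novel and ask it to guess the protagonist's name, that is, to fill in a blanked-out word (a task known as fill-mask). Accuracy differs across GPT-4, ChatGPT, and BERT. They also find that a novel being famous does not by itself guarantee high accuracy.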
In this work, we carry out a data archaeology to infer books that are known to ChatGPT and GPT-4 using a name cloze membership inference query. We find that OpenAI models have memorized a wide collection of copyrighted materials, and that the degree of memorization is tied to the frequency with which passages of those books appear on the web. The ability of these models to memorize an unknown set of books complicates assessments of measurement validity for cultural analytics by contaminating test data; we show that models perform much better on memorized books than on non-memorized books for downstream tasks. We argue that this supports a case for open models whose training data is known.
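The name cloze query described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the `[MASK]` token, the function names `make_cloze` and `cloze_accuracy`, the example passages, and the `toy_predict` stand-in for a real model are all assumptions made for this example.

```python
import re
from typing import Callable, Iterable, Tuple

MASK = "[MASK]"  # assumed mask token; the paper's exact prompt format may differ


def make_cloze(passage: str, name: str) -> str:
    """Replace whole-word occurrences of `name` in `passage` with the mask token."""
    return re.sub(rf"\b{re.escape(name)}\b", MASK, passage)


def cloze_accuracy(
    examples: Iterable[Tuple[str, str]],
    predict: Callable[[str], str],
) -> float:
    """Fraction of passages for which `predict` recovers the masked name.

    `predict` is a hypothetical callable that takes a masked passage and
    returns a single guessed name (for instance, a wrapper around a model).
    """
    examples = list(examples)
    if not examples:
        return 0.0
    hits = 0
    for passage, name in examples:
        guess = predict(make_cloze(passage, name))
        # Case-insensitive exact match on the name.
        if guess.strip().lower() == name.lower():
            hits += 1
    return hits / len(examples)


# Invented passages for illustration only; they are not taken from any book.
examples = [
    ("Alice opened the door and stepped into the garden.", "Alice"),
    ("The letter was addressed to Marlow, who never read it.", "Marlow"),
]


def toy_predict(masked_passage: str) -> str:
    """Stand-in for a real model: always guesses the same name."""
    return "Alice"


if __name__ == "__main__":
    print(make_cloze(*examples[0]))
    print(cloze_accuracy(examples, toy_predict))
```

In a real run, `toy_predict` would be swapped for a call to the model under test, and per-book accuracy would then serve as the signal for whether that book is likely to have been memorized.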
Chang, K. K., Cramer, M., Soni, S., & Bamman, D. (2023). Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4. arXiv preprint arXiv:2305.00118.
[2305.00118] Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4 (arxiv.org)