Logan Bolton | Portfolio

[ACCV 2024 Oral]
Vision Language Models are Blind

This paper shows the limitations of VLMs, such as GPT-4o, in extremely simple, abstract vision tasks. Despite their high scores on multimodal benchmarks, these models often fail on very basic cases.

This research has been featured by OpenAI, TechCrunch, and Ars Technica.

[CVPR 2025 Workshop]
Understanding Generative AI Capabilities in Everyday Image Editing Tasks

Native multimodal image-editing models like GPT-4o and Gemini have shown an impressive ability to edit images through natural language prompts. This paper examines the strengths and weaknesses of these models in one-to-one matchups against human editors.

[arXiv Preprint]
HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs

This paper explores how letting LLMs highlight their chain of thought, visually linking information in the question and the answer, can increase LLM accuracy while also improving the user experience.

AI/ML Experience

AI Undergraduate Researcher

Explainable AI Lab - Auburn University

I am an undergraduate researcher in Dr. Anh Nguyen's Explainable AI Lab. Since joining in July 2024, I have co-authored three papers on understanding the strengths and weaknesses of multimodal LLMs and image-editing models, and on improving the reliability and interpretability of LLMs.

This work has led to my being named an Auburn Undergraduate Research Fellow and receiving an honorable mention for the Outstanding Undergraduate Researcher Award from the Computing Research Association.

Machine Learning Intern

Corvid Technologies

I am a Summer 2025 intern at the defense contractor Corvid Technologies, where I am applying deep learning to develop more sophisticated methods for predicting how 3D objects affect radar response signals.

Biomechanics Deep Learning Lab Assistant

Sport Biomechanics Lab - Auburn University

I have worked as a lab assistant at the Auburn Sport Biomechanics Lab since Fall 2024. I have helped advise and develop a cross-disciplinary project that uses skeletal pose estimation models to study how different movements can affect the long-term health of canines.

Projects

Reverse Engineering Information from LLM Attention Values

For the final project of my graph theory class, I trained a model to reconstruct the adjacency matrix of a graph from the attention values of a small LLM. This project shows how the attention map patterns of LLMs can be used to help interpret the inner workings of language models.

Neural Network from Scratch

I created a neural network from scratch using Python and NumPy, without modern libraries like PyTorch. I trained an MLP on the MNIST dataset and achieved 92% accuracy.
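To give a sense of what "from scratch" involves, here is a minimal sketch of a one-hidden-layer MLP with a hand-written backward pass in NumPy. It is not the project's actual code. The function names (init_params, forward, train_step, predict, demo), the layer sizes, the learning rate, and the use of two synthetic Gaussian blobs in place of MNIST are all assumptions made to keep the example small and self-contained.

```python
import numpy as np


def init_params(n_in, n_hidden, n_out, rng):
    """He-style random weights and zero biases (illustrative choice)."""
    return {
        "W1": rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, np.sqrt(2.0 / n_hidden), (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(params, X):
    """Return hidden ReLU activations and softmax class probabilities."""
    h = np.maximum(0.0, X @ params["W1"] + params["b1"])
    logits = h @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return h, exp / exp.sum(axis=1, keepdims=True)


def train_step(params, X, y, lr):
    """One full-batch gradient descent step; returns the pre-update loss."""
    n = X.shape[0]
    h, probs = forward(params, X)
    loss = -np.log(probs[np.arange(n), y] + 1e-12).mean()

    # Gradient of mean cross-entropy w.r.t. the logits: (probs - one_hot) / n
    d_logits = probs.copy()
    d_logits[np.arange(n), y] -= 1.0
    d_logits /= n

    grads = {"W2": h.T @ d_logits, "b2": d_logits.sum(axis=0)}
    d_h = d_logits @ params["W2"].T
    d_h[h <= 0] = 0.0  # ReLU passes gradient only where it was active
    grads["W1"] = X.T @ d_h
    grads["b1"] = d_h.sum(axis=0)

    for name in params:
        params[name] -= lr * grads[name]
    return loss


def predict(params, X):
    """Most probable class index for each row of X."""
    return forward(params, X)[1].argmax(axis=1)


def demo():
    """Train on two synthetic 2-D blobs (a stand-in for MNIST)."""
    rng = np.random.default_rng(0)
    X = np.concatenate([rng.normal(-2.0, 1.0, (100, 2)),
                        rng.normal(2.0, 1.0, (100, 2))])
    y = np.array([0] * 100 + [1] * 100)
    params = init_params(2, 16, 2, rng)
    losses = [train_step(params, X, y, lr=0.1) for _ in range(200)]
    accuracy = (predict(params, X) == y).mean()
    return losses[0], losses[-1], accuracy


if __name__ == "__main__":
    print(demo())
```

For MNIST, the same functions would take flattened 784-pixel inputs and 10 output classes, typically with mini-batches rather than the full-batch updates used in this sketch.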

Hi, my name is Logan Bolton

About Me

Contact
