AI learned to "look" like a human: Japanese scientists revealed the abilities of Vision Transformers


Researchers at Osaka University in Japan have presented an experiment showing that Vision Transformers (ViT), a class of artificial intelligence models for image analysis, can develop visual processing skills similar to those of humans. These abilities emerged in the models spontaneously, without explicit instructions or predefined filters, as a result of a specific training method.

In the new study, the researchers applied a self-supervised learning technique called DINO (self-distillation with no labels), which let the models form their own mechanisms for perceiving visual scenes. Instead of giving the AI fixed rules, the scientists let the systems learn from visual information in natural settings by analyzing a large body of video content.
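For readers curious about what "self-distillation with no labels" means in practice, here is a minimal NumPy sketch of the general idea: a student network is trained to match the output distribution of a teacher network on different views of the same image, and the teacher is updated as a slow moving average of the student. The function names, temperatures, and momentum value are illustrative assumptions, not details taken from the Osaka study.

```python
import numpy as np


def softmax(logits, temperature):
    """Temperature-scaled softmax along the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def log_softmax(logits, temperature):
    """Numerically stable log of the temperature-scaled softmax."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher and student output distributions.

    The teacher output is centered and sharpened (low temperature).
    No labels are involved. Temperatures are assumed values.
    """
    teacher_probs = softmax(teacher_logits - center, teacher_temp)
    student_log_probs = log_softmax(student_logits, student_temp)
    return float(-(teacher_probs * student_log_probs).sum(axis=-1).mean())


def ema_update(teacher_weights, student_weights, momentum=0.996):
    """Teacher weights follow the student as an exponential moving average."""
    return momentum * teacher_weights + (1.0 - momentum) * student_weights


# Toy outputs for one image seen under two views (hypothetical values).
teacher_out = np.array([[5.0, 0.0, 0.0]])
center = np.zeros(3)
loss_agree = dino_loss(np.array([[5.0, 0.0, 0.0]]), teacher_out, center)
loss_disagree = dino_loss(np.array([[0.0, 5.0, 0.0]]), teacher_out, center)
print(loss_agree < loss_disagree)
```

The loss is small when the student agrees with the teacher and large when it does not, so the student is pushed toward consistent representations of the same scene without ever being told what the scene contains.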

The lead author of the study, Dr Takuto Yamamoto, explained: "Our models didn't just randomly switch between image elements. They spontaneously developed specialized functions. One group of models learned to consistently focus on faces, another on the contours of shapes, and a third on the background. This reflects the same segmentation and scene perception strategy that is typical of the human visual system."

To test the hypothesis, the researchers compared the models' visual strategies with eye-tracking data from people who watched the same video clips. The results were striking: the models trained with DINO behaved almost identically to humans. In contrast, systems that used traditional algorithms with fixed filters showed unnatural and fragmented patterns of image perception.

Notably, none of the models received prior instructions on which objects should be considered significant. Nevertheless, the AI independently began to prioritize faces, which the scientists attribute to their high information content. The senior author of the study, Professor Shigeru Kitazawa, noted: "This is strong evidence that self-supervised learning can capture something fundamental about the nature of learning in intelligent systems, both artificial and biological."

Further analysis confirmed that ViT models trained with DINO not only formed structures similar to human visual perception but also quantitatively reproduced typical gaze fixation patterns. This was especially evident in scenes involving humans, where the overlap between human and AI behavior was greatest.
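One common way to quantify how well a model's attention matches human gaze is Normalized Scanpath Saliency (NSS): the attention map is z-scored, and the score is its average value at the pixels where people actually looked. The sketch below illustrates that general metric only. The article does not say which measure the Osaka team used, and the function name and toy data are assumptions.

```python
import numpy as np


def normalized_scanpath_saliency(attention_map, fixations):
    """Mean z-scored attention at human fixation points.

    attention_map: 2D array of model attention (any non-constant scale).
    fixations: list of (row, col) pixel coordinates of human gaze.
    Scores above 0 mean the model attends to fixated locations more
    than to an average pixel; scores below 0 mean less.
    """
    attention_map = np.asarray(attention_map, dtype=float)
    if len(fixations) == 0:
        raise ValueError("at least one fixation is required")
    std = attention_map.std()
    if std == 0:
        raise ValueError("attention map is constant; NSS is undefined")
    z = (attention_map - attention_map.mean()) / std
    rows = [r for r, _ in fixations]
    cols = [c for _, c in fixations]
    return float(z[rows, cols].mean())


# Toy 4x4 attention map with a single peak (hypothetical data).
attention = np.zeros((4, 4))
attention[1, 2] = 1.0

score_on_peak = normalized_scanpath_saliency(attention, [(1, 2)])
score_off_peak = normalized_scanpath_saliency(attention, [(3, 0)])
print(score_on_peak > 0, score_off_peak < 0)
```

A gaze point that lands on the attention peak yields a positive score, while one that lands elsewhere yields a negative score, which is the kind of overlap the comparison above describes.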

This study raises new questions about the limits of artificial intelligence's ability to understand and interpret the surrounding world. The results obtained at Osaka University not only bring us closer to creating truly "sighted" machines, but also open the way to a better understanding of human perception itself.

3 months ago

Maili News

Maili.uz, a news portal of Uzbekistan.
