Speaking in the Key of Science

Foundational Models for Action and Behavior Recognition in the Animal Kingdom

The Grupo de Tratamiento de Imágenes (GTI) has successfully inaugurated its new seminar series, “Speaking in the Key of Science,” aimed at exploring the latest advancements in Artificial Intelligence, Computer Vision, Extended Reality (XR), and related fields. This initiative seeks to foster discussions on cutting-edge research and its applications across diverse domains.

A Promising Start: First Seminar on Foundational Models for Animal Behavior Recognition

The first seminar of the series, titled “Foundational Models for Action and Behavior Recognition in the Animal Kingdom,” took place on Friday, February 28, 2025, at the ETSI Telecomunicación, Universidad Politécnica de Madrid. The session featured insightful presentations by Enmin Zhong and Carlos R. del-Blanco, who shared their expertise on how deep learning, computer vision, and neuroscience are revolutionizing the recognition of animal actions and behaviors.

Key Insights and Discussions

Understanding animal behavior is a complex and multidisciplinary challenge with implications in biology, ecology, and artificial intelligence. The seminar provided an in-depth analysis of how foundational AI models are being leveraged to advance this field, addressing topics such as:

- Training methodologies for action and behavior recognition models.
- Challenges in generalizing AI models across different species.
- Ethical considerations in the use of AI for studying animal behavior.
- Practical applications in wildlife monitoring, conservation, and behavioral neuroscience.

A Well-Received Event with Engaging Discussions

The seminar attracted a diverse audience of researchers, students, and professionals interested in the intersection of AI and behavioral sciences. The engaging Q&A session allowed participants to delve deeper into the potential and limitations of current methodologies. Attendees also had the opportunity to receive a certificate of attendance, further encouraging participation in future sessions.

Looking Ahead: More Seminars to Come

The successful launch of “Speaking in the Key of Science” sets the stage for a series of thought-provoking discussions on cutting-edge technologies and scientific advancements. GTI is committed to continuing this initiative, bringing together experts from various disciplines to share knowledge and insights.
