Real-Time Pose Action Recognition on Embedded Edge Devices
TDK thesis
AI vision capabilities are stronger than ever: Vision Language Models (VLMs) can understand and describe a wide range of images with high precision and detail. However, the generality of VLMs comes at the price of a very large parameter count, which makes them unreasonably computationally expensive for many downstream tasks. The ability to run such vision capabilities on edge devices has become important for robotics and autonomous vehicles. Since their datasets vary widely across scenarios, the models must be both highly capable and able to run at a reasonable inference frequency.
Within this topic, the following tasks can be performed:
- Literature Review & Technology Selection: Investigating state-of-the-art, lightweight pose estimation frameworks and edge-compatible machine learning tools.
- Data Processing Pipeline: Creating an automated pipeline to extract temporal sequences of skeletal keypoints or other compact spatial features from video datasets, bypassing the computational cost of processing raw image pixels.
- Sequence Modeling: Designing or adapting a lightweight neural network architecture capable of classifying temporal sequences (e.g., recognizing specific actions or intentions over time) with a minimal parameter count.
- Model Optimization: Exploring and applying model compression techniques (such as quantization or pruning) to shrink the model footprint and prepare it for embedded deployment.
- Edge Deployment & Benchmarking: Deploying the end-to-end pipeline onto a target embedded edge device and analyzing the trade-offs between classification accuracy, resource utilization, and real-time inference speed.
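To make the data-processing step above more concrete, the sketch below shows one possible shape of the pipeline: per-frame keypoint vectors (as produced by a pose estimator such as MediaPipe or MoveNet, not included here) are normalized for position and scale invariance, then sliced into fixed-length temporal windows for a sequence classifier. The keypoint layout and window/stride values are illustrative assumptions, not a prescribed design.

```python
from typing import List, Sequence

def normalize(kps: Sequence[float]) -> List[float]:
    """Center keypoints on the hip midpoint and scale by torso length,
    making the features invariant to image position and camera distance.
    Layout assumption (hypothetical): flat [x0, y0, x1, y1, ...] where
    keypoint 0 is the hip midpoint and keypoint 1 is the neck."""
    hx, hy = kps[0], kps[1]
    nx, ny = kps[2], kps[3]
    torso = max(((nx - hx) ** 2 + (ny - hy) ** 2) ** 0.5, 1e-6)
    out: List[float] = []
    for i in range(0, len(kps), 2):
        out.append((kps[i] - hx) / torso)
        out.append((kps[i + 1] - hy) / torso)
    return out

def make_windows(frames: List[Sequence[float]],
                 window: int = 30,
                 stride: int = 10) -> List[List[Sequence[float]]]:
    """Slice a stream of per-frame keypoint vectors into overlapping
    fixed-length windows, the input unit for temporal classification."""
    return [frames[i:i + window]
            for i in range(0, len(frames) - window + 1, stride)]
```

Working on these compact skeletal features instead of raw pixels is what keeps the downstream sequence model small enough for an embedded target.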
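For the model-optimization task, the core idea of post-training quantization can be illustrated without any framework: weights are mapped to 8-bit integers plus a single floating-point scale, roughly quartering the memory footprint versus float32. This is a minimal symmetric-quantization sketch for intuition only; in practice a toolchain such as PyTorch's quantization utilities or TensorFlow Lite would handle this per layer.

```python
from typing import List, Sequence, Tuple

def quantize_int8(weights: Sequence[float]) -> Tuple[List[int], float]:
    """Symmetric post-training quantization: map floats to int8 so that
    w ~= q * scale. Returns the integer codes and the shared scale."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q: Sequence[int], scale: float) -> List[float]:
    """Recover approximate float weights from the int8 codes."""
    return [v * scale for v in q]
```

The benchmarking task then quantifies what this approximation costs: the reconstruction error per weight is bounded by half the scale step, and the accuracy/latency trade-off on the target device decides whether int8 (or further pruning) is acceptable.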
Requirements for the topic:
- Knowledge of Python programming
- Good English communication skills
Recommended for the topic:
- Basic knowledge of machine learning/deep learning concepts.
- Basic familiarity with Linux environments or single-board embedded systems.
- Basic mathematical foundation (linear algebra and matrix operations).
