Mirco Planamente: Transforming Vision Into Intelligence

From Academia to Industry

Research Identity

👁️
[01] Primary Focus

Egocentric Vision

First-person action recognition & understanding from wearable cameras. Bridging human activity analysis with computer vision in unconstrained environments.

CVPR · WACV · IEEE RA-L · IJCV
🎬
[02] Core Research

Video Understanding

Temporal video understanding and classification. Designing architectures that capture motion dynamics across diverse domains and perspectives.

2D/3D · Trimmed/Untrimmed · Summarization · Retrieval
🔄
[03] Generalization

Domain Adaptation

Building AI systems that generalize across domains without retraining. Unsupervised and semi-supervised transfer learning for cross-domain robustness.

UDA · Open-set · Closed-set · Partial UDA · Universal UDA
🔍
[04] Safety & QA

Anomaly Detection

Identifying rare, unusual or dangerous events in industrial settings and video streams. Unsupervised and one-class approaches for zero-shot anomaly localization.

Industry · Production
🗺️
[05] Scene Understanding

Semantic Segmentation

Pixel-wise classification for dense scene understanding. Applications in autonomous systems, industrial inspection and AR interfaces.

SAM · DINO · U-Net · Few-Shot · SegFormer · Weakly-Supervised
🎵
[06] Fusion

Multi-Modal Learning

Integrating audio, visual and sensor signals for comprehensive scene understanding. Joint representation learning across modalities.

Depth · RGB · Optical Flow · Audio · Event · IMU · Keypoints · Text
🎯
[07] Localization

Object Detection

Real-time and high-accuracy object detection pipelines for unconstrained scenes. From anchor-based to transformer-based detectors deployed in production.

YOLO · Mask R-CNN · RT-DETR · DEIM
🔎
[08] Recognition

Object Recognition

Fine-grained visual recognition of objects across categories, instances and contexts. Metric learning and embedding-based approaches for few-shot and open-set scenarios.

ConvNeXt · ViT · Swin · CLIP · DINO
🧍
[09] Body Understanding

Human Pose Estimation

Skeleton-based and heatmap-driven pose estimation for activity monitoring, action anticipation and human-machine interaction.

MoveNet · BlazePose · YOLO-Pose · MediaPipe · HRNet · ViTPose
Actual conversations. Actual pain.

Real Industrial Problems

don't panic. we'll figure it out.

my father always says: "stay calm and keep a cool head"

👷 it was working fine yesterday. today it's broken. nothing changed
🧑‍🔧 why did it fail on this one? it looks the same as all the others
👩‍💻 we have only 3 defect images. how do I train a model?
🧑‍🏭 how many images do we actually need?
👩‍🔬 the model isn't learning anything
👩‍💼 they changed the setup last week. accuracy dropped 30 points. do I need to redo everything?
🧑‍🔧 new camera arrives monday. do we start from scratch AGAIN
👩‍🏭 the model forgot everything it learned before
👷 training takes 6 hours. every time. I can't keep doing this
👩‍💻 can this run on a phone?
🧑‍🔬 how do I even know what it's actually doing under the hood?
👨‍🏭 what does fine-tuning actually mean?
🧑‍💼 can't we just use ChatGPT for this?
👩‍💼 I saw a LinkedIn post saying AI can do this in 5 minutes. why is ours taking weeks
👨‍💼 can we plug GPT-4 into the camera and just let it decide?
🧑‍💼 the CEO wants an AI strategy by friday. what do I tell him
From Google Scholar

Publications

The hard truth

Not many.

But every single one cost

blood, sweat & deadline nights.

— quality, not quantity.

Open Source & Side Projects

Projects

Speaking & Achievements

Featured Talks

🏆
CVPR Workshop · 2021–2022

EPIC-KITCHENS-100 Challenge

Unsupervised Domain Adaptation track — Top 3 for two consecutive years.

🎤
Codemotion Conference · 2025

From Pixels to Features

From hand-crafted descriptors to foundation models — one backbone to rule them all.

🐍
Py4AI Conference · 2024

Egocentric Vision: AI Through the Eyes of Users

Presented recent work on first-person action recognition to Italy's Python AI community.

Let's Connect

Interested in collaboration?

Open to research collaborations, industry projects and speaking engagements. Currently at ARGO Vision · Politecnico di Torino · IIT.