
CVPR 2024 Video Processing Roundup (Video Surveillance, Video Understanding, Video Recognition, Video Prediction, and More)

1. Video Processing Roundup

  • Learning from One Continuous Video Stream
  • Deep Video Inverse Tone Mapping Based on Temporal Clues
  • VTimeLLM: Empower LLM to Grasp Video Moments
  • Combining Frame and GOP Embeddings for Neural Video Representation
  • Learning to Predict Activity Progress by Self-Supervised Video Alignment
  • CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
  • vid-TLDR: Training Free Token Merging for Light-weight Video Transformer
    ⭐code
  • Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video
    ⭐code
  • Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement
  • Understanding Video Transformers via Universal Concept Discovery
  • Video Recognition in Portrait Mode
    🏠project
  • VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams
    🏠project
  • Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living
    ⭐code
  • A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames
  • Vista-LLaMA: Reliable Video Narrator via Equal Distance to Visual Tokens
    🏠project
  • Towards HDR and HFR Video from Rolling-Mixed-Bit Spikings
  • Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models
  • Sleep Monitoring
    • SleepVST: Sleep Staging from Near-Infrared Video Signals using Pre-Trained Transformers
  • Video Understanding
    • Compositional Video Understanding with Spatiotemporal Structure-based Transformers
    • Action Scene Graphs for Long-Form Understanding of Egocentric Videos
    • HIG: Hierarchical Interlacement Graph Approach to Scene Graph Generation in Video Understanding
    • A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives
      🏠project
    • Koala: Key Frame-Conditioned Long Video-LLM
    • MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
      ⭐code
    • Abductive Ego-View Accident Video Understanding for Safe Driving Perception
      🏠project
    • OmniVid: A Generative Framework for Universal Video Understanding
      ⭐code
    • A Unified Framework for Human-centric Point Cloud Video Understanding
    • Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection
    • MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
      🏠project
    • TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
      ⭐code
    • Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
      ⭐code
  • Video Summarization
    • Previously on ... From Recaps to Story Summarization
      🏠project
    • Scaling Up Video Summarization Pretraining with Large Language Models
    • CSTA: CNN-based Spatiotemporal Attention for Video Summarization
      ⭐code
  • Video Reconstruction
    • HDRFlow: Real-Time HDR Video Reconstruction with Large Motions
      ⭐code
  • Video Representation
    • DS-NeRV: Implicit Neural Video Representation with Decomposed Static and Dynamic Codes
      🏠project
  • Video Interpretation
    • Visual Objectification in Films: Towards a New AI Task for Video Interpretation
  • Movie Description
    • MICap: A Unified Model for Identity-Aware Movie Descriptions
      🏠project
  • Video Surveillance
    • Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges
      🌻dataset
  • Video Prediction
    • Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes
    • ExtDM: Distribution Extrapolation Diffusion Model for Video Prediction
      ⭐code
      🏠project
  • Video Stabilization
    • Harnessing Meta-Learning for Improving Full-Frame Video Stabilization
    • 3D Multi-frame Fusion for Video Stabilization
  • Video Recognition
    • OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
      ⭐code
      🏠project
  • Video Conversation
    • BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning
      ⭐code
  • Video Relighting
    • Real-time 3D-aware Portrait Video Relighting
  • Video Harmonization
    • Video Harmonization with Triplet Spatio-Temporal Variation Patterns
      👍VILP
  • Video Frame Interpolation
    • Video Frame Interpolation via Direct Synthesis with the Event-based Reference
    • IQ-VFI: Implicit Quadratic Motion Estimation for Video Frame Interpolation
    • EVS-assisted Joint Deblurring, Rolling-Shutter Correction and Video Frame Interpolation through Sensor Inverse Modeling
    • TTA-EVF: Test-Time Adaptation for Event-based Video Frame Interpolation via Reliable Pixel and Sample Estimation
    • Sparse Global Matching for Video Frame Interpolation with Large Motion
      ⭐code
    • Perception-Oriented Video Frame Interpolation via Asymmetric Blending
      ⭐code
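      👍A breakthrough in the visual quality of video frame interpolation: Shanghai Jiao Tong University proposes PerVFI, a new paradigm for video frame interpolation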
    • SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation
      🏠project
  • Video Subject Swapping
    • VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
      🏠project
  • Video Anomaly Detection
    • Open-Vocabulary Video Anomaly Detection
    • Multi-Scale Video Anomaly Detection by Multi-Grained Spatio-Temporal Representation Learning
    • Harnessing Large Language Models for Training-free Video Anomaly Detection
      ⭐code
    • Collaborative Learning of Anomalies with Privacy (CLAP) for Unsupervised Video Anomaly Detection: A New Baseline
      ⭐code
    • Prompt-Enhanced Multiple Instance Learning for Weakly Supervised Video Anomaly Detection
    • MULDE: Multiscale Log-Density Estimation via Denoising Score Matching for Video Anomaly Detection
    • PREGO: Online Mistake Detection in PRocedural EGOcentric Videos
    • Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors
      ⭐code
    • Text Prompt with Normality Guidance for Weakly Supervised Video Anomaly Detection
    • GlitchBench: Can Large Multimodal Models Detect Video Game Glitches?
      🏠project
  • Video Scene Detection
    • Neighbor Relations Matter in Video Scene Detection
  • Video Mirror Detection
    • Effective Video Mirror Detection with Inconsistent Motion Cues
  • Automated Movie Trailer Generation
    • Towards Automated Movie Trailer Generation
  • Conversational Music Recommendation for Videos
    • MuseChat: A Conversational Music Recommendation System for Videos
  • Video Paragraph Grounding
    • Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding
  • Video Grounding
    • SnAG: Scalable and Accurate Video Grounding
      ⭐code
    • Context-Guided Spatio-Temporal Video Grounding
      ⭐code
    • Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding
    • What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
