The field of artificial intelligence is advancing quickly. Staying up to date inevitably means reading research papers and, where it is released, their source code. This is a list of articles I came across and found interesting. Not all of them are necessarily groundbreaking or hyped; I simply considered them interesting while reading.
Q4/2024
- Cheng et al. (2024): Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss. arXiv:2410.17243
- Huang et al. (2024): LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation. arXiv:2411.04997
- Li et al. (2024): AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions. arXiv:2410.20424
- Lin et al. (2024): FrugalNeRF: Fast Convergence for Few-shot Novel View Synthesis without Learned Priors. arXiv:2410.16271
- Xu et al. (2024): LLaVA-CoT: Let Vision Language Models Reason Step-by-Step. arXiv:2411.10440
- Xu et al. (2024): No More Adam: Learning Rate Scaling at Initialization is All You Need. arXiv:2412.11768
Q3/2024
- Fleischer et al. (2024): RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation. arXiv:2408.02545
- Ruiz et al. (2024): Magic Insert: Style-Aware Drag-and-Drop. arXiv:2407.02489
- Xiao et al. (2024): Enhancing HNSW Index for Real-Time Updates: Addressing Unreachable Points and Performance Degradation. arXiv:2407.07871
Q2/2024
- Castells et al. (2024): EdgeFusion: On-Device Text-to-Image Generation. arXiv:2404.11925
- Deep et al. (2024): DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling. arXiv:2406.11617
- Faysse et al. (2024): ColPali: Efficient Document Retrieval with Vision Language Models. arXiv:2407.01449
- Gagrani et al. (2024): On Speculative Decoding for Multimodal Large Language Models. arXiv:2404.08856
Q1/2024
- Han et al. (2024): COCO is “ALL” You Need for Visual Instruction Fine-tuning. arXiv:2401.08968
- Lu et al. (2024): From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities. arXiv:2401.15071
- Ma et al. (2024): The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits. arXiv:2402.17764
- Sun et al. (2024): EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters. arXiv:2402.04252
- Wang et al. (2024): YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. arXiv:2402.13616
Q4/2023
- Alizadeh et al. (2023): LLM in a flash: Efficient Large Language Model Inference with Limited Memory. arXiv:2312.11514
- Garza and Mergenthaler-Canseco (2023): TimeGPT-1. arXiv:2310.03589
- Li et al. (2023): Domain Generalization of 3D Object Detection by Density-Resampling. arXiv:2311.10845
- Seras et al. (2023): Efficient Object Detection in Autonomous Driving using Spiking Neural Networks: Performance, Energy Consumption Analysis, and Insights into Open-set Object Discovery. arXiv:2312.07466
- Wang et al. (2023): BitNet: Scaling 1-bit Transformers for Large Language Models. arXiv:2310.11453
- Zhou et al. (2023): WaterHE-NeRF: Water-ray Tracing Neural Radiance Fields for Underwater Scene Reconstruction. arXiv:2312.06946
Q3/2023
- Ding et al. (2023): LongNet: Scaling Transformers to 1,000,000,000 Tokens. arXiv:2307.02486
- Karaev et al. (2023): CoTracker: It is Better to Track Together. arXiv:2307.07635
- Sun et al. (2023): Retentive Network: A Successor to Transformer for Large Language Models. arXiv:2307.08621
- Touvron et al. (2023): Llama 2: Open Foundation and Fine-Tuned Chat Models. arXiv:2307.09288
Q2/2023
- Barchid et al. (2023): Spiking-Fer: Spiking Neural Network for Facial Expression Recognition With Event Cameras. arXiv:2304.10211
- Bulatov et al. (2023): Scaling Transformer to 1M tokens and beyond with RMT. arXiv:2304.11062
- Ducoffe et al. (2023): LARD – Landing Approach Runway Detection – Dataset for Vision Based Landing. arXiv:2304.09938
- Kirillov et al. (2023): Segment Anything. arXiv:2304.02643
- Lv et al. (2023): DETRs Beat YOLOs on Real-time Object Detection. arXiv:2304.08069
- Pernias et al. (2023): Wuerstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models. arXiv:2306.00637
- Zhou et al. (2023): LIMA: Less Is More for Alignment. arXiv:2305.11206
Q1/2023
- Bauer et al. (2023): Human-Timescale Adaptation in an Open-Ended Task Space. arXiv:2301.07608
- Cuadrado et al. (2023): Optical Flow estimation with Event-based Cameras and Spiking Neural Networks. arXiv:2302.06492
- Li et al. (2023): BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. arXiv:2301.12597
- Sahak et al. (2023): Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild. arXiv:2302.07864
- Sauer et al. (2023): StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis. arXiv:2301.09515
- Serych and Matas (2023): Planar Object Tracking via Weighted Optical Flow. arXiv:2301.10057
- Shinn et al. (2023): Reflexion: an autonomous agent with dynamic memory and self-reflection. arXiv:2303.11366
- Trabucco et al. (2023): Effective Data Augmentation With Diffusion Models. arXiv:2302.07944
- Vallés-Pérez et al. (2023): Empirical study of the modulus as activation function in computer vision applications. arXiv:2301.05993
- Wen et al. (2023): BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects. arXiv:2303.14158
- Yang et al. (2023): Event Camera Data Pre-training. arXiv:2301.01928
- Zhang et al. (2023): Multimodal Chain-of-Thought Reasoning in Language Models. arXiv:2302.00923
Q4/2022
- Beyer et al. (2022): FlexiViT: One Model for All Patch Sizes. arXiv:2212.08013
- Défossez et al. (2022): High Fidelity Neural Audio Compression. arXiv:2210.13438
- Fan et al. (2022): Rolling Shutter Inversion: Bring Rolling Shutter Images to High Framerate Global Shutter Video. arXiv:2210.03040
- Ghiasi et al. (2022): What do Vision Transformers Learn? A Visual Exploration. arXiv:2212.06727
- Hinton (2022): The Forward-Forward Algorithm: Some Preliminary Investigations. arXiv:2212.13345
- Li et al. (2022): Rethinking Vision Transformers for MobileNet Size and Speed. arXiv:2212.08059
- Liu et al. (2022): Event-based Monocular Dense Depth Estimation with Recurrent Transformers. arXiv:2212.02791
- Radford et al. (2022): Robust Speech Recognition via Large-Scale Weak Supervision. arXiv:2212.04356
- Shaker et al. (2022): UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation. arXiv:2212.04497
- Taylor et al. (2022): Galactica: A Large Language Model for Science. arXiv:2211.09085
Q3/2022
- Boegner et al. (2022): Large Scale Radio Frequency Signal Classification. arXiv:2207.09918
- Izacard et al. (2022): Few-shot Learning with Retrieval Augmented Language Models. arXiv:2208.03299
- Hu and Li (2022): Early Stopping for Iterative Regularization with General Loss Functions. JMLR 23
- Renzulli and Grangetto (2022): Towards Efficient Capsule Networks. arXiv:2208.09203
- Singer et al. (2022): Make-A-Video: Text-to-Video Generation without Text-Video Data. arXiv:2209.14792
- Thai et al. (2022): Riesz-Quincunx-UNet Variational Auto-Encoder for Satellite Image Denoising. arXiv:2208.12810
- Wang et al. (2022): YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv:2207.02696
- Wen et al. (2022): CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net for the Single-Corpus and Cross-Corpus Speech Emotion Recognition. arXiv:2207.10644
- Wu et al. (2022): TinyViT: Fast Pretraining Distillation for Small Vision Transformers. arXiv:2207.10666
- Yao et al. (2022): Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning. arXiv:2207.04978
Q2/2022
- Balestriero et al. (2022): The Effects of Regularization and Data Augmentation are Class Dependent. arXiv:2204.03632
- Boutros et al. (2022): SFace: Privacy-friendly and Accurate Face Recognition using Synthetic Data. arXiv:2206.10520
- Cao et al. (2022): Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking. arXiv:2203.14360
- De Sousa Ribeiro et al. (2022): Learning with Capsules: A Survey. arXiv:2206.02664
- Gava et al. (2022): PUCK: Parallel Surface and Convolution-kernel Tracking for Event-Based Cameras. arXiv:2205.07657
- Imbiriba et al. (2022): Hybrid Neural Network Augmented Physics-based Models for Nonlinear Filtering. arXiv:2204.06471
- Lee et al. (2022): Fix the Noise: Disentangling Source Feature for Transfer Learning of StyleGAN. arXiv:2204.14079
- Marchisio et al. (2022): Enabling Capsule Networks at the Edge through Approximate Softmax and Squash Operations. arXiv:2206.10200
- Öztürk et al. (2022): Zero-Shot AutoML with Pretrained Models. arXiv:2206.08476
- Reed et al. (2022): A Generalist Agent. arXiv:2205.06175
- Renzulli et al. (2022): REM: Routing Entropy Minimization for Capsule Networks. arXiv:2204.01298
- Rombach et al. (2022): High-Resolution Image Synthesis with Latent Diffusion Models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 10684-10695
- Scholl (2022): RF Signal Classification with Synthetic Training Data and its Real-World Performance. arXiv:2206.12967
- Sun and Boning (2022): FreDo: Frequency Domain-based Long-Term Time Series Forecasting. arXiv:2205.12301
- Wang et al. (2022): Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey. arXiv:2205.10766
- Zhang et al. (2022): MiniViT: Compressing Vision Transformers with Weight Multiplexing. arXiv:2204.07154
- Zhang et al. (2022): OPT: Open Pre-trained Transformer Language Models. arXiv:2205.01068
Q1/2022
- An et al. (2022): Killing Two Birds with One Stone: Efficient and Robust Training of Face Recognition CNNs by Partial FC. arXiv:2203.15565
- Akyon et al. (2022): Slicing Aided Hyper Inference and Fine-tuning for Small Object Detection. arXiv:2202.06934
- Bright et al. (2022): ME-CapsNet: A Multi-Enhanced Capsule Networks with Routing Mechanism. arXiv:2203.15547
- Drefs et al. (2022): Evolutionary Variational Optimization of Generative Models. JMLR 23(21)
- Du et al. (2022): StrongSORT: Make DeepSORT Great Again. arXiv:2202.13514
- Huang et al. (2022): 1000x Faster Camera and Machine Vision with Ordinary Devices. arXiv:2201.09302
- Huang et al. (2022): Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents. arXiv:2201.07207
- Jin et al. (2022): Full RGB Just Noticeable Difference (JND) Modelling. arXiv:2203.00629
- Lämsä et al. (2022): Video2IMU: Realistic IMU features and signals from videos. arXiv:2202.06547
- Li et al. (2022): Brain-inspired Multilayer Perceptron with Spiking Neurons. arXiv:2203.14679
- Li et al. (2022): SimpleTrack: Rethinking and Improving the JDE Approach for Multi-Object Tracking. arXiv:2203.03985
- Liu et al. (2022): A ConvNet for the 2020s. arXiv:2201.03545
- Manita et al. (2022): Universal Approximation in Dropout Neural Networks. JMLR 23
- Roros et al. (2022): maskGRU: Tracking Small Objects in the Presence of Large Background Motions. arXiv:2201.00467
- Wang et al. (2022): OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework. arXiv:2202.03052
- Yang et al. (2022): Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer. arXiv:2203.03466
- Yu et al. (2022): HP-Capsule: Unsupervised Face Part Discovery by Hierarchical Parsing Capsule Network. arXiv:2203.10699
- Zhou et al. (2022): TransVOD: End-to-end Video Object Detection with Spatial-Temporal Transformers. arXiv:2201.05047
Q4/2021
- Chen and Shrivastava (2021): HR-RCNN: Hierarchical Relational Reasoning for Object Detection. arXiv:2110.13892
- Datta and Beerel (2021): Can Deep Neural Networks be Converted to Ultra Low-Latency Spiking Neural Networks?. arXiv:2112.12133
- Du et al. (2021): Learning Signal-Agnostic Manifolds of Neural Fields. arXiv:2111.06387
- Eichenberg et al. (2021): MAGMA – Multimodal Augmentation of Generative Models through Adapter-based Finetuning. arXiv:2112.05253
- Kirby et al. (2021): Reliability of Event Timing in Silicon Neurons. arXiv:2112.14134
- Koutini et al. (2021): Efficient Training of Audio Transformers with Patchout. arXiv:2110.05069
- Kovachki et al. (2021): On Universal Approximation and Error Bounds for Fourier Neural Operators. JMLR 22(290)
- Laakom et al. (2021): Learning to ignore: rethinking attention in CNNs. arXiv:2111.05684
- Vinci et al. (2021): Self-consistent stochastic dynamics for finite-size networks of spiking neurons. arXiv:2112.14867
- Yuan et al. (2021): Florence: A New Foundation Model for Computer Vision. arXiv:2111.11432
Q3/2021
- Chae et al. (2021): SiamEvent: Event-based Object Tracking via Edge-aware Similarity Learning with Siamese Networks. arXiv:2109.13456
- Guo et al. (2021): Eyes Tell All: Irregular Pupil Shapes Reveal GAN-generated Faces. arXiv:2109.00162
- He et al. (2021): Integrating Circle Kernels into Convolutional Neural Networks. arXiv:2107.02451
- Keller and Welling (2021): Topographic VAEs learn Equivariant Capsules. arXiv:2109.01394
- Liu et al. (2021): Infrared Small-Dim Target Detection with Transformer under Complex Backgrounds. arXiv:2109.14379
- Machado et al. (2021): HSMD: An object motion detection algorithm using a Hybrid Spiking Neural Network Architecture. arXiv:2109.04119
- Park et al. (2021): Is Pseudo-Lidar needed for Monocular 3D Object detection? arXiv:2108.06417
- Peng et al. (2021): Excavating the Potential Capacity of Self-Supervised Monocular Depth Estimation. arXiv:2109.12484
- Shi et al. (2021): Reinforcement Learning with Evolutionary Trajectory Generator: A General Approach for Quadrupedal Locomotion. arXiv:2109.06409
- Yao et al. (2021): Temporal-wise Attention Spiking Neural Networks for Event Streams Classification. arXiv:2107.11711
- Zhao and Cheng (2021): Capsule networks with non-iterative cluster routing. arXiv:2109.09213
- Zheng and Zhang (2021): RockGPT: Reconstructing three-dimensional digital rocks from single two-dimensional slice from the perspective of video generation. arXiv:2108.03132
Q2/2021
- Bonnaerens et al. (2021): Anchor Pruning for Object Detection. arXiv:2104.00432
- Bykov et al. (2021): NoiseGrad: enhancing explanations by introducing stochasticity to model weights. arXiv:2106.10185
- Chakraborty et al. (2021): A Fully Spiking Hybrid Neural Network for Energy-Efficient Object Detection. arXiv:2104.10719
- Chen et al. (2021): How to Accelerate Capsule Convolutions in Capsule Networks. arXiv:2104.02621
- Chen et al. (2021): “BNN - BN = ?”: Training Binary Neural Networks without Batch Normalization. arXiv:2104.08215
- Liu et al. (2021): Pay Attention to MLPs. arXiv:2105.08050
- Liu et al. (2021): Video Swin Transformer. arXiv:2106.13230
- Ney et al. (2021): HALF: Holistic Auto Machine Learning for FPGAs. arXiv:2106.14771
- Wu et al. (2021): Poisoning the Search Space in Neural Architecture Search. arXiv:2106.14406
- Xiao et al. (2021): Early Convolutions Help Transformers See Better. arXiv:2106.14881
- Zhang et al. (2021): Hallucination Improves Few-Shot Object Detection. arXiv:2105.01294
- Zhao et al. (2021): Neko: a Library for Exploring Neuromorphic Learning Rules. arXiv:2105.00324
- Zhao et al. (2021): TrTr: Visual Tracking with Transformer. arXiv:2105.03817
Q1/2021
- Ding et al. (2021): Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges. arXiv:2102.12219
- Han et al. (2021): ReDet: A Rotation-equivariant Detector for Aerial Object Detection. arXiv:2103.07733
- Jaegle et al. (2021): Perceiver: General Perception with Iterative Attention. arXiv:2103.03206
- Joseph et al. (2021): Towards Open World Object Detection. arXiv:2103.02603
- Lee et al. (2021): Detecting Micro Fractures with X-ray Computed Tomography. arXiv:2103.12821
- Li et al. (2021): Involution: Inverting the Inherence of Convolution for Visual Recognition. arXiv:2103.06255
- Liu et al. (2021): Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. arXiv:2103.14030
- Mazzia et al. (2021): Efficient-CapsNet: Capsule Network with Self-Attention Routing. arXiv:2101.12491
- Northcutt et al. (2021): Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks. arXiv:2103.14749
- Ren et al. (2021): Deep Texture-Aware Features for Camouflaged Object Detection. arXiv:2102.02996
- Runkel et al. (2021): Depthwise Separable Convolutions Allow for Fast and Memory-Efficient Spectral Normalization. arXiv:2102.06496
- Titirsha et al. (2021): Endurance-Aware Mapping of Spiking Neural Networks to Neuromorphic Hardware. arXiv:2103.05707
- Tuggener et al. (2021): Is it Enough to Optimize CNN Architectures on ImageNet? arXiv:2103.09108
- Zhou et al. (2021): Probabilistic two-stage detection. arXiv:2103.07461
Q4/2020
- Awad et al. (2020): Differential Evolution for Neural Architecture Search. arXiv:2012.06400
- Chen et al. (2020): A Group-Theoretic Framework for Data Augmentation. JMLR 21(245): 1-71
- Gerg and Monga (2020): Deep Autofocus for Synthetic Aperture Sonar. arXiv:2010.15687
- Hu et al. (2020): Multi-objective Neural Architecture Search with Almost No Training. arXiv:2011.13591
- Kedziora et al. (2020): AutonoML: Towards an Integrated Framework for Autonomous Machine Learning. arXiv:2012.12600
- Keller et al. (2020): Self Normalizing Flows. arXiv:2011.07248
- Kileel et al. (2020): Manifold learning with arbitrary norms. arXiv:2012.14172
- Li and Jordan (2020): Stochastic Approximation for Online Tensorial Independent Component Analysis. arXiv:2012.14415
- Li et al. (2020): Underwater image filtering: methods, datasets and evaluation. arXiv:2012.12258
- Lindauer and Hutter (2020): Best Practices for Scientific Research on Neural Architecture Search. JMLR 21(243): 1-18
- Liu et al. (2020): YolactEdge: Real-time Instance Segmentation on the Edge (Jetson AGX Xavier: 30 FPS, RTX 2080 Ti: 170 FPS). arXiv:2012.12259
- Luo and Jennings (2020): A Differential Privacy Mechanism that Accounts for Network Effects for Crowdsourcing Systems. JAIR 69, 1127-1164. doi:10.1613/jair.1.12158
- Neekhara et al. (2020): Adversarial Threats to DeepFake Detection: A Practical Perspective. arXiv:2011.09957
- Pang et al. (2020): TROJANZOO: Everything you ever wanted to know about neural backdoors (but were afraid to ask). arXiv:2012.09302
- Rock et al. (2020): Quantized Neural Networks for Radar Interference Mitigation. arXiv:2011.12706
- Salman et al. (2020): Unadversarial Examples: Designing Objects for Robust Vision. arXiv:2012.12235
- Schrittwieser et al. (2020): Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 588: 604-609. doi:10.1038/s41586-020-03051-4
- Sheeny (2020): All-Weather Object Recognition Using Radar and Infrared Sensing. arXiv:2010.16285
- Shen et al. (2020): DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation. arXiv:2011.09876
- Sushko et al. (2020): You Only Need Adversarial Supervision for Semantic Image Synthesis. arXiv:2012.04781
- Sun et al. (2020): Extreme Value Preserving Networks. arXiv:2011.08367
- Sun et al. (2020): Identifying Invariant Texture Violation for Robust Deepfake Detection. arXiv:2012.10580
- Svendsen et al. (2020): Deep Gaussian Processes for geophysical parameter retrieval. arXiv:2012.12099
- Wandl et al. (2020): Fast Fluid Simulations in 3D with Physics-Informed Deep Learning. arXiv:2012.11893
- Weston et al. (2020): There and Back Again: Learning to Simulate Radar Data for Real-World Applications. arXiv:2011.14389
- Xie et al. (2020): Skillearn: Machine Learning Inspired by Humans’ Learning Skills. arXiv:2012.04863
- Yu et al. (2020): HMFlow: Hybrid Matching Optical Flow Network for Small and Fast-Moving Objects. arXiv:2011.09654
- Yue et al. (2020): Effective, Efficient and Robust Neural Architecture Search. arXiv:2011.09820
- Zhang et al. (2020): FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations. arXiv:2012.12206
- Zhu et al. (2020): Integrating Deep Neural Networks with Full-waveform Inversion: Reparametrization, Regularization, and Uncertainty Quantification. arXiv:2012.11149
Q3/2020
- Agrawal et al. (2020): Wide Neural Networks with Bottlenecks are Deep Gaussian Processes. JMLR 21(175)
- Bonald et al. (2020): Scikit-network: Graph Analysis in Python. JMLR 21(185)
- Chen et al. (2020): Learning Deep ReLU Networks Is Fixed-Parameter Tractable. arXiv:2009.13512
- Chen et al. (2020): WaveGrad: Estimating Gradients for Waveform Generation. arXiv:2009.00713
- Davies et al. (2020): Overfit Neural Networks as a Compact Shape Representation. arXiv:2009.09808
- Feurer et al. (2020): Auto-Sklearn 2.0: The Next Generation. arXiv:2007.04074
- Fuchs and Pernkopf (2020): Wasserstein Routed Capsule Networks. arXiv:2007.11465
- Guo et al. (2020): Variational Temporal Deep Generative Model for Radar HRRP Target Recognition. arXiv:2009.13011
- Kidger et al. (2020): “Hey, that’s not an ODE”: Faster ODE Adjoints with 12 Lines of Code. arXiv:2009.09457
- Long et al. (2020): PP-YOLO: An Effective and Efficient Implementation of Object Detector. arXiv:2007.12099
- Morrill et al. (2020): Neural CDEs for Long Time-Series via the Log-ODE Method. arXiv:2009.08295
- Nguyen et al. (2020): Quaternion Graph Neural Networks. arXiv:2008.05089
- Obukhov et al. (2020): T-Basis: a Compact Representation for Neural Networks. arXiv:2007.06631
- Perot et al. (2020): Learning to Detect Objects with a 1 Megapixel Event Camera. arXiv:2009.13436
- Reuther et al. (2020): Survey of Machine Learning Accelerators. arXiv:2009.00993
- Shen and Savvides (2020): MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks. arXiv:2009.08453
- Tek et al. (2020): Adaptive Convolution Kernel for Artificial Neural Networks. arXiv:2009.06385
- Wunderlich and Pehle (2020): EventProp: Backpropagation for Exact Gradients in Spiking Neural Networks. arXiv:2009.08378
- Xiang et al. (2020): KIT MOMA: A Mobile Machines Dataset. arXiv:2007.04198
Q2/2020
- Ahmed et al. (2020): Reinforcement Learning based Beamforming for Massive MIMO Radar Multi-target Detection. arXiv:2005.04708
- Brown et al. (2020): Language Models are Few-Shot Learners. arXiv:2005.14165
- Carion et al. (2020): End-to-End Object Detection with Transformers. arXiv:2005.12872
- Cheng et al. (2020): Detecting and Tracking Communal Bird Roosts in Weather Radar Data. arXiv:2004.12819
- Cui et al. (2020): Fully Convolutional Online Tracking. arXiv:2004.07109
- Dogra and Redman (2020): Optimizing Neural Networks via Koopman Operator Theory. arXiv:2006.02361
- Geirhos et al. (2020): Shortcut Learning in Deep Neural Networks. arXiv:2004.07780
- Hernandez and Brown (2020): Measuring the Algorithmic Efficiency of Neural Networks. arXiv:2005.04305
- Huang et al. (2020): SQE: a Self Quality Evaluation Metric for Parameters Optimization in Multi-Object Tracking. arXiv:2004.07472
- Hupkes et al. (2020): Compositionality Decomposed: How do Neural Networks Generalise? JAIR 67, 757-795. doi:10.1613/jair.1.11674
- Lee et al. (2020): Continual Learning with Extended Kronecker-factored Approximate Curvature. arXiv:2004.07507
- Lelekas et al. (2020): Top-Down Networks: A coarse-to-fine reimagination of CNNs. arXiv:2004.07629
- Li et al. (2020): SmallBigNet: Integrating Core and Contextual Views for Video Classification. arXiv:2006.14582
- Marchisio et al. (2020): Q-CapsNets: A Specialized Framework for Quantizing Capsule Networks. arXiv:2004.07116
- Marvasti-Zadeh et al. (2020): COMET: Context-Aware IoU-Guided Network for Small Object Tracking. arXiv:2006.02597
- Mobiny et al. (2020): Radiologist-Level COVID-19 Detection Using CT Scans with Detail-Oriented Capsule Networks. arXiv:2004.07407
- Palffy et al. (2020): CNN based Road User Detection using the 3D Radar Cube. arXiv:2004.12165
- Park et al. (2020): Variational Bayes In Private Settings (VIPS). JAIR 68, 109-157. doi:10.1613/jair.1.11763
- Ouaknine et al. (2020): CARRADA Dataset: Camera and Automotive Radar with Range-Angle-Doppler Annotations. arXiv:2005.01456
- Qiu et al. (2020): Quaternion Neural Networks for Multi-channel Distant Speech Recognition. arXiv:2005.08566
- Scheiner et al. (2020): Off-the-shelf sensor vs. experimental radar – How much resolution is necessary in automotive radar classification? arXiv:2006.05485
- Shuai et al. (2020): Multi-Object Tracking with Siamese Track-RCNN. arXiv:2004.07786
- Sitzmann et al. (2020): Implicit Neural Representations with Periodic Activation Functions. arXiv:2006.09661
- Thornton et al. (2020): Deep Reinforcement Learning Control for Radar Detection and Tracking in Congested Spectral Environments. arXiv:2006.13173
- Toyer et al. (2020): ASNets: Deep Learning for Generalised Planning. JAIR 68, 1-68. doi:10.1613/jair.1.11633
- Dewil et al. (2020): Self-Supervised training for blind multi-frame video denoising. arXiv:2004.06957
- Wang et al. (2020): Residual-driven Fuzzy C-Means Clustering for Image Segmentation. arXiv:2004.07160
- Wiedemann et al. (2020): Dithered backprop: A sparse and quantized backpropagation algorithm for more efficient deep neural network training. arXiv:2004.04729
- Zhang et al. (2020): Ocean: Object-aware Anchor-free Tracking. arXiv:2006.10721
- Zhao et al. (2020): TSDM: Tracking by SiamRPN++ with a Depth-refiner and a Mask-generator. arXiv:2005.04063
Q1/2020
- Arias-Castro et al. (2020): Perturbation Bounds for Procrustes, Classical Scaling, and Trilateration, with Applications to Manifold Learning. JMLR 21
- Blondel et al. (2020): Learning with Fenchel-Young losses. JMLR 21(35): 1-69
- Danelljan et al. (2020): Probabilistic Regression for Visual Tracking. arXiv:2003.12565
- Deng et al. (2020): Self-attention-based BiGRU and capsule network for named entity recognition. arXiv:2002.00735
- Edraki et al. (2020): Subspace Capsule Network. arXiv:2002.02924
- Hadjeres and Nielsen (2020): Schoenberg-Rao distances: Entropy-based and geometry-aware statistical Hilbert distances. arXiv:2002.08345
- Jia et al. (2020): Entangled Watermarks as a Defense against Model Extraction. arXiv:2002.12200
- Kadeethum et al. (2020): Physics-informed Neural Networks for Solving Nonlinear Diffusivity and Biot’s equations. arXiv:2002.08235
- Liu et al. (2020): Are Labels Necessary for Neural Architecture Search? arXiv:2003.12056
- Manchev and Spratling (2020): Target Propagation in Recurrent Neural Networks. JMLR 21(7): 1-33
- Molnar and Culurciello (2020): Capsule Network Performance with Autonomous Navigation. arXiv:2002.03181
- Punjabi et al. (2020): Examining the Benefits of Capsule Neural Networks. arXiv:2001.10964
- Radosavovic et al. (2020): Designing Network Design Spaces. arXiv:2003.13678
- Rogers et al. (2020): A Primer in BERTology: What we know about how BERT works. arXiv:2002.12327
- Romero et al. (2020): Attentive Group Equivariant Convolutional Networks. arXiv:2002.03830
- Ruby et al. (2020): The Mertens Unrolled Network (MU-Net): A High Dynamic Range Fusion Neural Network for Through the Windshield Driver Recognition. arXiv:2002.12257
- Schmitt et al. (2020): Weakly Supervised Semantic Segmentation of Satellite Images for Land Cover Mapping – Challenges and Opportunities. arXiv:2002.08254
- Tang et al. (2020): RSL-Net: Localising in Satellite Images From a Radar on the Ground. arXiv:2001.03233
- Thornton et al. (2020): Experimental Analysis of Reinforcement Learning Techniques for Spectrum Sharing Radar. arXiv:2001.01799
- Tsai et al. (2020): Capsules with Inverted Dot-Product Attention Routing. ICLR 2020
- Vecchi et al. (2020): Compressing deep quaternion neural networks with targeted regularization. arXiv:1907.11546
- Wang et al. (2020): Multi-wavelet residual dense convolutional neural network for image denoising. arXiv:2002.08254
- Yoo and Owhadi (2020): Deep regularization and direct training of the inner layers of Neural Networks with Kernel Flows. arXiv:2002.08335
Q4/2019
- Dovesi et al. (2019): Real-Time Semantic Stereo Matching. arXiv:1910.00541
- Gu and Tresp (2019): Improving the Robustness of Capsule Networks to Image Affine Transformations. arXiv:1911.0796
- Hoogi et al. (2019): Self-Attention Capsule Networks for Object Classification. arXiv:1904.12483
- Hwang et al. (2019): SegSort: Segmentation by Discriminative Sorting of Segments. arXiv:1910.0696
- Jegorova et al. (2019): Full-Scale Continuous Synthetic Sonar Data Generation with Markov Conditional Generative Adversarial Networks. arXiv:1910.06750
- Liu et al. (2019): GPRInvNet: Deep Learning-Based Ground Penetrating Radar Data Inversion for Tunnel Lining. arXiv:1912.05759
- Nguyen et al. (2019): Use of a Capsule Network to Detect Fake Images and Videos. arXiv:1910.12467
- Scheiner et al. (2019): Seeing Around Street Corners: Non-Line-of-Sight Detection and Tracking In-the-Wild Using Doppler Radar. arXiv:1912.06613
- Varadarajan et al. (2019): Benchmark for Generic Product Detection: A strong baseline for Dense Object Detection. arXiv:1912.09476
- Wang et al. (2019): CSPNet: A New Backbone that can Enhance Learning Capability of CNN. arXiv:1911.11929
- Weissman et al. (2019): JackHammer: Efficient Rowhammer on Heterogeneous FPGA-CPU Platform. arXiv:1912.11523
- Zhang et al. (2019): 3D-Rotation-Equivariant Quaternion Neural Networks. arXiv:1911.09040
- Zhao et al. (2019): Quaternion Equivariant Capsule Networks for 3D Point Clouds. arXiv:1912.12098
Q3/2019
- Andraghetti et al. (2019): Enhancing self-supervised monocular depth estimation with traditional visual odometry. arXiv:1908.03127
- Caliva et al. (2019): Distance Map Loss Penalty Term for Semantic Segmentation. arXiv:1908.03679
- Chen et al. (2019): Fast Point R-CNN. arXiv:1908.02990
- Choi et al. (2019): Attention routing between capsules. arXiv:1907.01750
- Duggal et al. (2019): DeepPruner: Learning Efficient Stereo Matching via Differentiable PatchMatch. arXiv:1909.05845
- Garnier et al. (2019): A review on Deep Reinforcement Learning for Fluid Mechanics. arXiv:1908.04127
- Gong et al. (2019): AutoGAN: Neural Architecture Search for Generative Adversarial Networks. arXiv:1908.03835
- He et al. (2019): Constructing an Associative Memory System Using Spiking Neural Network. Front. Neurosci., doi:10.3389/fnins.2019.00650
- Huegle et al. (2019): Dynamic Input for Deep Reinforcement Learning in Autonomous Driving. arXiv:1907.10994
- Kim and Ganapathi (2019): LumièreNet: Lecture Video Synthesis from Audio. arXiv:1907.02253
- Kulhánek et al. (2019): Vision-based Navigation Using Deep Reinforcement Learning. arXiv:1908.03627
- Lee et al. (2019): On-Device Neural Net Inference with Mobile GPUs. arXiv:1907.01989
- Li et al. (2019): Deformable Tube Network for Action Detection in Videos. arXiv:1907.01847
- Li et al. (2019): Overfitting of neural nets under class imbalance: Analysis and improvements for segmentation. arXiv:1907.10982
- Li et al. (2019): Differentially Private Meta-Learning. arXiv:1909.05830
- Liu et al. (2019): On the Variance of the Adaptive Learning Rate and Beyond. arXiv:1908.03265
- Misra (2019): Mish: A Self Regularized Non-Monotonic Neural Activation Function. arXiv:1908.08681
- Qin et al. (2019): Detecting and Diagnosing Adversarial Images with Class-Conditional Capsule Reconstructions. arXiv:1907.02957
- Soures and Kudithipudi (2019): Deep Liquid State Machines With Neural Plasticity for Video Activity Recognition. Front. Neurosci., doi:10.3389/fnins.2019.00686
- Wang and Shen (2019): Flow-Motion and Depth Network for Monocular Stereo and Beyond. arXiv:1909.05452
- You et al. (2019): Tracking system of Mine Patrol Robot for Low Illumination Environment. arXiv:1907.01806
- You et al. (2019): Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks. arXiv:1909.08174
- Zhao et al. (2019): UER: An Open-Source Toolkit for Pre-training Models. arXiv:1909.05658
- Zhang et al. (2019): Lookahead Optimizer: k steps forward, 1 step back. arXiv:1907.08610
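The Lookahead optimizer above wraps any inner optimizer: fast weights take k inner steps, then slow weights move a fraction α toward them ("k steps forward, 1 step back") and the fast weights are reset. A minimal sketch of the outer update, with plain SGD standing in as the inner optimizer (all names are mine, not from the paper):

```python
def sgd_step(weights, grads, lr=0.1):
    # One plain SGD step on the fast weights (stand-in inner optimizer).
    return [w - lr * g for w, g in zip(weights, grads)]

def lookahead_update(slow, fast, alpha=0.5):
    # "1 step back": slow <- slow + alpha * (fast - slow);
    # the fast weights are then reset to the new slow weights.
    new_slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
    return new_slow, list(new_slow)
```

In training, `sgd_step` would run k times between calls to `lookahead_update`; the interpolation damps the variance of the inner trajectory.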
- Zhang et al. (2019): SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Applications. arXiv:1907.11093
- Zhou et al. (2019): One-stage Shape Instantiation from a Single 2D Image to 3D Point Cloud. arXiv:1907.10763
Q2/2019
- Alekseev and Bobe (2019): GaborNet: Gabor filters with learnable parameters in deep convolutional neural networks. arXiv:1904.13204
- Ardila et al. (2019): End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. DOI:10.1038/s41591-019-0447-x
- Bai et al. (2019): Deep Learning Based Robot for Automatically Picking up Garbage on the Grass. arXiv:1904.13034
- Balog et al. (2019): Fast Training of Sparse Graph Neural Networks on Dense Hardware. arXiv:1906.11786
- Becker et al. (2019): Deep Optimal Stopping. Journal of Machine Learning Research 20 (2019) 1-25
- Berner et al. (2019): How degenerate is the parametrization of neural networks with the ReLU activation function? arXiv:1905.09803
- Brandt J. (2019): Spatio-temporal crop classification of low-resolution satellite imagery with capsule layers and distributed attention. arXiv:1904.10130
- Danzer et al. (2019): 2D Car Detection in Radar Data with PointNets. arXiv:1904.08414
- Drori et al. (2019): Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar. arXiv:1905.10345
- Eggensperger (2019): Pitfalls and Best Practices in Algorithm Configuration. Journal of Artificial Intelligence Research 64 (2019) 861-893
- Harikrishnan and Nagaraj (2019): A Novel Chaos Theory Inspired Neural Architecture. arXiv:1905.12601
- Hoogi et al. (2019): Self-Attention Capsule Networks for Image Classification. arXiv:1904.12483
- Hu et al. (2019): Optimal Sparse Decision Trees. arXiv:1904.12847
- Hughes et al. (2019): Wave Physics as an Analog Recurrent Neural Network. arXiv:1904.12831
- Jia et al. (2019): Direct speech-to-speech translation with a sequence-to-sequence model. arXiv:1904.06037
- Klemmer et al. (2019): Augmenting correlation structures in spatial data using deep generative models. arXiv:1905.09796
- Kosiorek et al. (2019): Stacked Capsule Autoencoders. arXiv:1906.06818
- Leite and Enembreck (2019): Using Collective Behavior of Coupled Oscillators for Solving DCOP. Journal of Artificial Intelligence Research 64 (2019) 987-1023
- Li (2019): Graph Matching Networks for Learning the Similarity of Graph Structured Objects. arXiv:1904.12787
- Nguyen and Holmes (2019): Ten quick tips for effective dimensionality reduction. PLoS Comput Biol 15(6): e1006907. DOI:10.1371/journal.pcbi.1006907
- Oh et al. (2019): Speech2Face: Learning the Face Behind a Voice. arXiv:1905.09773
- Park et al. (2019): SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition. arXiv:1904.08779
- Rajasegaran et al. (2019): DeepCaps: Going Deeper with Capsule Networks. arXiv:1904.09546
- Sanyal et al. (2019): Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision. arXiv:1905.06817
- Sherry et al. (2019): Learning the Sampling Pattern for MRI. arXiv:1906.08754
- Shin (2019): Encoding Database Schemas with Relation-Aware Self-Attention for Text-to-SQL Parsers. arXiv:11790
- Sun et al. (2019): GeoCapsNet: Aerial to Ground view Image Geo-localization using Capsule Network. arXiv:1904.06281
- Thomas et al. (2019): DeLiO: Decoupled LiDAR Odometry. arXiv:1904.12667
- Valade et al. (2019): Towards Global Volcano Monitoring Using Multisensor Sentinel Missions and Artificial Intelligence: The MOUNTS Monitoring System. DOI:10.3390/rs11131528
- Wang et al. (2019): Monocular Plan View Networks for Autonomous Driving. arXiv:1905.06937
- Zhang (2019): Making Convolutional Networks Shift-Invariant Again. arXiv:1904.11486
- Zhang et al. (2019): Quaternion Knowledge Graph Embedding. arXiv:1904.10281
- Zhang et al. (2019): You Only Propagate Once: Accelerate Adversarial Training via Maximal Principle. arXiv:1905.00877
- Zhao et al. (2019): Fast Inference in Capsule Networks Using Accumulated Routing Coefficients. arXiv:1904.07304
- Zhao et al. (2019): PyOD: A Python Toolbox for Scalable Outlier Detection. JMLR 20(96):1-7. http://jmlr.org/papers/v20/19-011.html
- Zhu et al. (2019): Transferable Clean-Label Poisoning Attacks on Deep Neural Nets. arXiv:1905.05897
Q1/2019
- Barz and Denzler (2019): Deep Learning on Small Datasets without Pre-Training using Cosine Loss. arXiv:1901.09054v1
- Cheng, S. et al. (2019): MeshGAN: Non-linear 3D Morphable Models of Faces. arXiv:1903.10384
- Duarte, A. et al. (2019): Wav2Pix: Speech-conditioned Face Generation using Generative Adversarial Networks. arXiv:1903.10195
- Elser, V. et al. (2019): Monotone Learning with Rectified Wire Networks. Journal of Machine Learning Research (20), 1-42. link
- Fey, M. and Lenssen, J. E. (2019): Fast Graph Representation Learning with PyTorch Geometric. arXiv:1903.02428
- Francis, A. et al. (2019): Long-Range Indoor Navigation with PRM-RL. arXiv:1902.09458
- Ge et al. (2019): DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images. arXiv:1901.07973v1
- Hawkins et al. (2019): A Framework for Intelligence and Cortical Function Based on Grid Cells in the Neocortex. DOI:10.3389/fncir.2018.00121
- Kreiss et al. (2019): PifPaf: Composite Fields for Human Pose Estimation. arXiv:1903.06593
- Li et al. (2019): Rethinking on Multi-Stage Networks for Human Pose Estimation. arXiv:1901.00148
- Mirsky, Y. et al. (2019): CT-GAN: Malicious Tampering of 3D Medical Imagery using Deep Learning. arXiv:1901.03597
- Sonoda, S. and Murata, N. (2019): Transport Analysis of Infinitely Deep Neural Network. Journal of Machine Learning Research (20), 1-52. link
- Sun, K. et al. (2019): Deep High-Resolution Representation Learning for Human Pose Estimation. arXiv:1902.09212
- Tang, Z. and Hwang, J.-N. (2019): MOANA: An Online Learned Adaptive Appearance Model for Robust Multiple Object Tracking in 3D. arXiv:1901.02626
- Voigtlaender et al. (2019): FEELVOS: Fast End-to-End Embedding Learning for Video Object Segmentation. arXiv:1902.09513
- Wofk, D. et al. (2019): FastDepth: Fast Monocular Depth Estimation on Embedded Systems. arXiv:1903.03273
- Wu et al. (2019): Simplifying Graph Convolutional Networks. arXiv:1902.07153
- Xinyi, Z. and Chen, L. (2019): Capsule Graph Neural Network. ICLR 2019. link
- Xu, B. et al. (2019): Graph Wavelet Neural Network. ICLR 2019. link
Q4/2018
- Istrate et al. (2018): TAPAS: Train-less Accuracy Predictor for Architecture Search. https://www.ibm.com/blogs/research/2018/12/tapas/; preprint: arXiv:1806.00250
- Jang et al. (2018): Spiking Neural Networks: A Stochastic Signal Processing Perspective. arXiv:1812.03929v2
- Leike et al. (2018): Scalable agent alignment via reward modeling: a research direction. arXiv:1811.07871
- O’Keeffe et al. (2018): Adaptive Online Fault Diagnosis in Autonomous Robot Swarms. DOI:10.3389/frobt.2018.00131
- Prenger et al. (2018): WaveGlow: A Flow-based Generative Network for Speech Synthesis. arXiv:1811.00002. Code available here: https://github.com/NVIDIA/waveglow