Hugging Face ‧ Daily Papers
1. Toon3D: Seeing Cartoons from a New Perspective Ethan Weber, Riley Peterlinz, Rohan Mathur, Frederik Warburg, Alexei A. Efros, Angjoo Kanazawa
2. TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction Yunfan Jiang, Chen Wang, Ruohan Zhang, Jiajun Wu, Li Fei-Fei
3. Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection Tianhe Ren, Qing Jiang, Shilong Liu, Zhaoyang Zeng, Wenlong Liu, Han Gao, Hongjie Huang, Zhengyu Ma, Xiaoke Jiang, Yihao Chen, Yuda Xiong, Hao Zhang, Feng Li, Peijun Tang, Kent Yu, Lei Zhang
4. CAT3D: Create Anything in 3D with Multi-View Diffusion Models Ruiqi Gao, Aleksander Holynski, Philipp Henzler, Arthur Brussee, Ricardo Martin-Brualla, Pratul Srinivasan, Jonathan T. Barron, Ben Poole
5. LoRA Learns Less and Forgets Less Dan Biderman, Jose Gonzalez Ortiz, Jacob Portes, Mansheej Paul, Philip Greengard, Connor Jennings, Daniel King, Sam Havens, Vitaliy Chiley, Jonathan Frankle, Cody Blakeney, John P. Cunningham
6. Many-Shot In-Context Learning in Multimodal Foundation Models Yixing Jiang, Jeremy Irvin, Ji Hun Wang, Muhammad Ahmed Chaudhry, Jonathan H. Chen, Andrew Y. Ng
7. Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion Xinyang Li, Zhangyu Lai, Linning Xu, Jianfei Guo, Liujuan Cao, Shengchuan Zhang, Bo Dai, Rongrong Ji
8. Chameleon: Mixed-Modal Early-Fusion Foundation Models Chameleon Team

Recent history (latest 100 records)

2024-05-17 Toon3D: Seeing Cartoons from a New Perspective Ethan Weber, Riley Peterlinz, Rohan Mathur, Frederik Warburg, Alexei A. Efros, Angjoo Kanazawa
2024-05-17 TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction Yunfan Jiang, Chen Wang, Ruohan Zhang, Jiajun Wu, Li Fei-Fei
2024-05-17 Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection Tianhe Ren, Qing Jiang, Shilong Liu, Zhaoyang Zeng, Wenlong Liu, Han Gao, Hongjie Huang, Zhengyu Ma, Xiaoke Jiang, Yihao Chen, Yuda Xiong, Hao Zhang, Feng Li, Peijun Tang, Kent Yu, Lei Zhang
2024-05-17 CAT3D: Create Anything in 3D with Multi-View Diffusion Models Ruiqi Gao, Aleksander Holynski, Philipp Henzler, Arthur Brussee, Ricardo Martin-Brualla, Pratul Srinivasan, Jonathan T. Barron, Ben Poole
2024-05-17 LoRA Learns Less and Forgets Less Dan Biderman, Jose Gonzalez Ortiz, Jacob Portes, Mansheej Paul, Philip Greengard, Connor Jennings, Daniel King, Sam Havens, Vitaliy Chiley, Jonathan Frankle, Cody Blakeney, John P. Cunningham
2024-05-17 Many-Shot In-Context Learning in Multimodal Foundation Models Yixing Jiang, Jeremy Irvin, Ji Hun Wang, Muhammad Ahmed Chaudhry, Jonathan H. Chen, Andrew Y. Ng
2024-05-17 Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion Xinyang Li, Zhangyu Lai, Linning Xu, Jianfei Guo, Liujuan Cao, Shengchuan Zhang, Bo Dai, Rongrong Ji
2024-05-17 Chameleon: Mixed-Modal Early-Fusion Foundation Models Chameleon Team
2024-05-16 Naturalistic Music Decoding from EEG Data via Latent Diffusion Models Emilian Postolache, Natalia Polouliakh, Hiroaki Kitano, Akima Connelly, Emanuele Rodolà, Taketo Akama
2024-05-16 BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation Yunhao Ge, Yihe Tang, Jiashu Xu, Cem Gokmen, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Ma
2024-05-16 ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models Siwei Wang, Yifei Shen, Shi Feng, Haoran Sun, Shang-Hua Teng, Wei Chen
2024-05-16 Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model Wanting Xu, Yang Liu, Langping He, Xucheng Huang, Ling Jiang
2024-05-16 No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding Yingjie Zhai, Wenshuo Li, Yehui Tang, Xinghao Chen, Yunhe Wang
2024-05-16 Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory Xueyan Niu, Bo Bai, Lei Deng, Wei Han
2024-05-16 Understanding the performance gap between online and offline alignment algorithms Yunhao Tang, Daniel Zhaohan Guo, Zeyu Zheng, Daniele Calandriello, Yuan Cao, Eugene Tarassov, Rémi Munos, Bernardo Ávila Pires, Michal Valko, Yong Cheng, Will Dabney
2024-05-16 SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models Raghuveer Peri, Sai Muralidhar Jayanthi, Srikanth Ronanki, Anshu Bhatia, Karel Mundnich, Saket Dingliwal, Nilaksh Das, Zejiang Hou, Goeric Huybrechts, Srikanth Vishnubhotla, Daniel Garcia-Romero, Sundararajan Srinivasan, Kyu J Han, Katrin Kirchhoff
2024-05-16 SpeechVerse: A Large-scale Generalizable Audio Language Model Nilaksh Das, Saket Dingliwal, Srikanth Ronanki, Rohit Paturi, David Huang, Prashant Mathur, Jie Yuan, Dhanush Bekal, Xing Niu, Sai Muralidhar Jayanthi, Xilai Li, Karel Mundnich, Monica Sunkara, Sundararajan Srinivasan, Kyu J Han, Katrin Kirchhoff
2024-05-16 Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning Wenqi Dong, Bangbang Yang, Lin Ma, Xiao Liu, Liyuan Cui, Hujun Bao, Yuewen Ma, Zhaopeng Cui
2024-05-16 Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding Zhimin Li, Jianwei Zhang, Qin Lin, Jiangfeng Xiong, Yanxin Long, Xinchi Deng, Yingfang Zhang, Xingchao Liu, Minbin Huang, Zedong Xiao, Dayou Chen, Jiajun He, Jiahao Li, Wenyue Li, Chen Zhang, Rongwei Quan, Jianxiang Lu, Jiabin Huang, Xiaoyan Yuan, Xiaoxia
2024-05-15 Compositional Text-to-Image Generation with Dense Blob Representations Weili Nie, Sifei Liu, Morteza Mardani, Chao Liu, Benjamin Eckart, Arash Vahdat
2024-05-14 What matters when building vision-language models? Hugo Laurençon, Léo Tronchon, Matthieu Cord, Victor Sanh
2024-05-14 Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training Junqin Huang, Zhongjie Hu, Zihao Jing, Mengya Gao, Yichao Wu
2024-05-14 MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels Qi Chen, Xiubo Geng, Corby Rosset, Carolyn Buractaon, Jingwen Lu, Tao Shen, Kun Zhou, Chenyan Xiong, Yeyun Gong, Paul Bennett, Nick Craswell, Xing Xie, Fan Yang, Bryan Tower, Nikhil Rao, Anlei Dong, Wenqi Jiang, Zheng Liu, Mingqin Li, Chuanjie Liu, Zengzh
2024-05-14 LogoMotion: Visually Grounded Code Generation for Content-Aware Animation Vivian Liu, Rubaiat Habib Kazi, Li-Yi Wei, Matthew Fisher, Timothy Langlois, Seth Walker, Lydia Chilton
2024-05-14 Large Language Models as Planning Domain Generators James Oswald, Kavitha Srinivas, Harsha Kokel, Junkyu Lee, Michael Katz, Shirin Sohrabi
2024-05-14 RLHF Workflow: From Reward Modeling to Online RLHF Hanze Dong, Wei Xiong, Bo Pang, Haoxiang Wang, Han Zhao, Yingbo Zhou, Nan Jiang, Doyen Sahoo, Caiming Xiong, Tong Zhang
2024-05-14 SUTRA: Scalable Multilingual Language Model Architecture Abhijit Bendale, Michael Sapienza, Steven Ripplinger, Simon Gibbs, Jaewon Lee, Pranav Mistry
2024-05-14 Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots Chengyue Wu, Yixiao Ge, Qiushan Guo, Jiahao Wang, Zhixuan Liang, Zeyu Lu, Ying Shan, Ping Luo
2024-05-14 Scaling the AI Memory Wall with Dataflow and Composition of Experts belter
2024-05-03 Customizing Text-to-Image Models with a Single Image Pair Maxwell Jones, Sheng-Yu Wang, Nupur Kumari, David Bau, Jun-Yan Zhu
2024-05-03 FLAME: Factuality-Aware Alignment for Large Language Models Sheng-Chieh Lin, Luyu Gao, Barlas Oguz, Wenhan Xiong, Jimmy Lin, Wen-tau Yih, Xilun Chen
2024-05-03 NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment Gerald Shen, Zhilin Wang, Olivier Delalleau, Jiaqi Zeng, Yi Dong, Daniel Egert, Shengyang Sun, Jimmy Zhang, Sahil Jain, Ali Taghibakhshi, Markel Sanz Ausin, Ashwath Aithal, Oleksii Kuchaiev
2024-05-03 Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Seungone Kim, Juyoung Suk, Shayne Longpre, Bill Yuchen Lin, Jamin Shin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo
2024-05-03 LLM-AD: Large Language Model based Audio Description System Peng Chu, Jiang Wang, Andre Abrantes
2024-05-03 WildChat: 1M ChatGPT Interaction Logs in the Wild Wenting Zhao, Xiang Ren, Jack Hessel, Claire Cardie, Yejin Choi, Yuntian Deng
2024-05-03 LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Justin Zhao, Timothy Wang, Wael Abid, Geoffrey Angus, Arnav Garg, Jeffery Kinnison, Alex Sherstinsky, Piero Molino, Travis Addair, Devvret Rishi
2024-05-03 StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation Yupeng Zhou, Daquan Zhou, Ming-Ming Cheng, Jiashi Feng, Qibin Hou
2024-05-02 Automatic Creative Selection with Cross-Modal Matching Alex Kim, Jia Huang, Rob Monarch, Jerry Kwac, Anikesh Kamath, Parmeshwar Khurd, Kailash Thiyagarajan, Goodman Gu
2024-05-02 Paint by Inpaint: Learning to Add Image Objects by Removing Them First Navve Wasserman, Noam Rotstein, Roy Ganz, Ron Kimmel
2024-05-02 Self-Play Preference Optimization for Language Model Alignment Yue Wu, Zhiqing Sun, Huizhuo Yuan, Kaixuan Ji, Yiming Yang, Quanquan Gu
2024-05-02 STT: Stateful Tracking with Transformers for Autonomous Driving Longlong Jing, Ruichi Yu, Xu Chen, Zhengli Zhao, Shiwei Sheng, Colin Graber, Qi Chen, Qinru Li, Shangxuan Wu, Han Deng, Sangjin Lee, Chris Sweeney, Qiurui He, Wei-Chih Hung, Tong He, Xingyi Zhou, Farshid Moussavi, Zijian Guo, Yin Zhou, Mingxing Ta
2024-05-02 Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge Bin Xiao, Chunan Shi, Xiaonan Nie, Fan Yang, Xiangwei Deng, Lei Su, Weipeng Chen, Bin Cui
2024-05-02 SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound Haohe Liu, Xuenan Xu, Yi Yuan, Mengyue Wu, Wenwu Wang, Mark D. Plumbley
2024-05-02 Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3 Junsang Yoon, Akshat Gupta, Gopala Anumanchipalli
2024-05-02 A Careful Examination of LLM Performance on Grade School Arithmetic andy99
2024-05-02 Spectrally Pruned Gaussian Fields with Neural Compensation Runyi Yang, Zhenxin Zhu, Zhou Jiang, Baijun Ye, Xiaoxue Chen, Yifei Zhang, Yuantao Chen, Jian Zhao, Hao Zhao
2024-05-01 Lightplane: Highly-Scalable Components for Neural 3D Fields Ang Cao, Justin Johnson, Andrea Vedaldi, David Novotny
2024-05-01 MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model Wenxun Dai, Ling-Hao Chen, Jingbo Wang, Jinpeng Liu, Bo Dai, Yansong Tang
2024-05-01 KAN: Kolmogorov-Arnold Networks chuckhend
2024-05-01 Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting Paul Engstler, Andrea Vedaldi, Iro Laina, Christian Rupprecht
2024-05-01 MicroDreamer: Zero-shot 3D Generation in $\sim$20 Seconds by Score-based Iterative Reconstruction Luxi Chen, Zhengyi Wang, Chongxuan Li, Tingting Gao, Hang Su, Jun Zhu
2024-05-01 Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Yunhao Ge, Xiaohui Zeng, Jacob Samuel Huffman, Tsung-Yi Lin, Ming-Yu Liu, Yin Cui
2024-05-01 Extending Llama-3's Context Ten-Fold Overnight Peitian Zhang, Ninglu Shao, Zheng Liu, Shitao Xiao, Hongjin Qian, Qiwei Ye, Zhicheng Dou
2024-05-01 Octopus v4: Graph of language models Wei Chen, Zhiyuan Li
2024-05-01 Iterative Reasoning Preference Optimization Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho, He He, Sainbayar Sukhbaatar, Jason Weston
2024-05-01 SAGS: Structure-Aware 3D Gaussian Splatting Evangelos Ververas, Rolandos Alexandros Potamias, Jifei Song, Jiankang Deng, Stefanos Zafeiriou
2024-05-01 Better & Faster Large Language Models via Multi-token Prediction Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Rozière, David Lopez-Paz, Gabriel Synnaeve
2024-05-01 GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, Zexiang Xu
2024-05-01 DOCCI: Descriptions of Connected and Contrasting Images Yasumasa Onoe, Sunayana Rane, Zachary Berger, Yonatan Bitton, Jaemin Cho, Roopal Garg, Alexander Ku, Zarana Parekh, Jordi Pont-Tuset, Garrett Tanzer, Su Wang, Jason Baldridge
2024-05-01 InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Chanran Kim, Jeongin Lee, Shichang Joung, Bongmo Kim, Yeul-Min Baek
2024-04-30 Stylus: Automatic Adapter Selection for Diffusion Models Michael Luo, Justin Wong, Brandon Trabucco, Yanping Huang, Joseph E. Gonzalez, Zhifeng Chen, Ruslan Salakhutdinov, Ion Stoica
2024-04-30 DressCode: Autoregressively Sewing and Generating Garments from Text Guidance Kai He, Kaixin Yao, Qixuan Zhang, Jingyi Yu, Lingjie Liu, Lan Xu
2024-04-30 Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting Fangcheng Liu, Yehui Tang, Zhenhua Liu, Yunsheng Ni, Kai Han, Yunhe Wang
2024-04-30 BlenderAlchemy: Editing 3D Graphics with Vision-Language Models PaulHoule
2024-04-30 Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Pat Verga, Sebastian Hofstatter, Sophia Althammer, Yixuan Su, Aleksandra Piktus, Arkady Arkhangorodsky, Minjie Xu, Naomi White, Patrick Lewis
2024-04-30 Capabilities of Gemini Models in Medicine Khaled Saab, Tao Tu, Wei-Hung Weng, Ryutaro Tanno, David Stutz, Ellery Wulczyn, Fan Zhang, Tim Strother, Chunjong Park, Elahe Vedadi, Juanma Zambrano Chaves, Szu-Yeu Hu, Mike Schaekermann, Aishwarya Kamath, Yong Cheng, David G. T. Barrett, Cathy Cheung, B
2024-04-30 LEGENT: Open Platform for Embodied Agents Zhili Cheng, Zhitong Wang, Jinyi Hu, Shengding Hu, An Liu, Yuge Tu, Pengkai Li, Lei Shi, Zhiyuan Liu, Maosong Sun
2024-04-30 Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations Puhao Li, Tengyu Liu, Yuyang Li, Muzhi Han, Haoran Geng, Shu Wang, Yixin Zhu, Song-Chun Zhu, Siyuan Huang
2024-04-29 MaPa: Text-driven Photorealistic Material Painting for 3D Shapes Shangzhan Zhang, Sida Peng, Tao Xu, Yuanbo Yang, Tianrun Chen, Nan Xue, Yujun Shen, Hujun Bao, Ruizhen Hu, Xiaowei Zhou
2024-04-29 AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs zerojames
2024-04-29 HaLo-NeRF: Learning Geometry-Guided Semantics for Exploring Unconstrained Photo Collections Chen Dudai, Morris Alper, Hana Bezalel, Rana Hanocka, Itai Lang, Hadar Averbuch-Elor
2024-04-29 PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning Lin Xu, Yilin Zhao, Daquan Zhou, Zhijie Lin, See Kiong Ng, Jiashi Feng
2024-04-26 Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings Olivia Wiles, Chuhan Zhang, Isabela Albuquerque, Ivana Kajić, Su Wang, Emanuele Bugliarello, Yasumasa Onoe, Chris Knutsen, Cyrus Rashtchian, Jordi Pont-Tuset, Aida Nematzadeh
2024-04-26 List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs An Yan, Zhengyuan Yang, Junda Wu, Wanrong Zhu, Jianwei Yang, Linjie Li, Kevin Lin, Jianfeng Wang, Julian McAuley, Jianfeng Gao, Lijuan Wang
2024-04-26 NeRF-XL: Scaling NeRFs with Multiple GPUs Ruilong Li, Sanja Fidler, Angjoo Kanazawa, Francis Williams
2024-04-26 ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving Jiehui Huang, Xiao Dong, Wenhui Song, Hanhui Li, Jun Zhou, Yuhao Cheng, Shutao Liao, Long Chen, Yiqiang Yan, Shengcai Liao, Xiaodan Liang
2024-04-26 Make Your LLM Fully Utilize the Context Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou
2024-04-26 Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed A Aly, Beidi Chen, Carole-Jean Wu
2024-04-26 Tele-FLM Technical Report Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang
2024-04-26 How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Zhe Chen, Weiyun Wang, Hao Tian, Shenglong Ye, Zhangwei Gao, Erfei Cui, Wenwen Tong, Kongzhi Hu, Jiapeng Luo, Zheng Ma, Ji Ma, Jiaqi Wang, Xiaoyi Dong, Hang Yan, Hewei Guo, Conghui He, Zhenjiang Jin, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang,
2024-04-26 SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension Bohao Li, Yuying Ge, Yi Chen, Yixiao Ge, Ruimao Zhang, Ying Shan
2024-04-26 Interactive3D: Create What You Want by Interactive 3D Generation Shaocong Dong, Lihe Ding, Zhanpeng Huang, Zibin Wang, Tianfan Xue, Dan Xu
2024-04-26 MoDE: CLIP Data Experts via Clustering Jiawei Ma, Po-Yao Huang, Saining Xie, Shang-Wen Li, Luke Zettlemoyer, Shih-Fu Chang, Wen-Tau Yih, Hu Xu
2024-04-26 MaGGIe: Masked Guided Gradual Human Instance Matting Chuong Huynh, Seoung Wug Oh, Abhinav Shrivastava, Joon-Young Lee
2024-04-26 XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference João Monteiro, Étienne Marcotte, Pierre-André Noël, Valentina Zantedeschi, David Vázquez, Nicolas Chapados, Christopher Pal, Perouz Taslakian
2024-04-26 Editable Image Elements for Controllable Synthesis Jiteng Mu, Michaël Gharbi, Richard Zhang, Eli Shechtman, Nuno Vasconcelos, Xiaolong Wang, Taesung Park
2024-04-26 BASS: Batched Attention-optimized Speculative Sampling Haifeng Qian, Sujan Kumar Gonugondla, Sungsoo Ha, Mingyue Shang, Sanjay Krishna Gouda, Ramesh Nallapati, Sudipta Sengupta, Xiaofei Ma, Anoop Deoras
2024-04-26 CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data Sachin Mehta, Maxwell Horton, Fartash Faghri, Mohammad Hossein Sekhavat, Mahyar Najibi, Mehrdad Farajtabar, Oncel Tuzel, Mohammad Rastegari
2024-04-25 MotionMaster: Training-free Camera Motion Transfer For Video Generation Teng Hu, Jiangning Zhang, Ran Yi, Yating Wang, Hongrui Huang, Jieyu Weng, Yabiao Wang, Lizhuang Ma
2024-04-25 PuLID: Pure and Lightning ID Customization via Contrastive Alignment Zinan Guo, Yanze Wu, Zhuowei Chen, Lang Chen, Qian He
2024-04-25 ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning Weifeng Chen, Jiacheng Zhang, Jie Wu, Hefeng Wu, Xuefeng Xiao, Liang Lin
2024-04-24 Transformers Can Represent $n$-gram Language Models Anej Svete, Ryan Cotterell
2024-04-24 FlashSpeech: Efficient Zero-Shot Speech Synthesis Zhen Ye, Zeqian Ju, Haohe Liu, Xu Tan, Jianyi Chen, Yiwen Lu, Peiwen Sun, Jiahao Pan, Weizhen Bian, Shulin He, Qifeng Liu, Yike Guo, Wei Xue
2024-04-24 Pegasus-v1 Technical Report Raehyuk Jung, Hyojun Go, Jaehyuk Yi, Jiho Jang, Daniel Kim, Jay Suh, Aiden Lee, Cooper Han, Jae Lee, Jeff Kim, Jin-Young Kim, Junwan Kim, Kyle Park, Lucas Lee, Mars Ha, Minjoon Seo, Abraham Jo, Ed Park, Hassan Kianinejad, SJ Kim, Tony Moon, Wade Jeong, An
2024-04-24 Multi-Head Mixture-of-Experts Xun Wu, Shaohan Huang, Wenhui Wang, Furu Wei
2024-04-24 OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Sachin Mehta, Mohammad Hossein Sekhavat, Qingqing Cao, Maxwell Horton, Yanzi Jin, Chenfan Sun, Iman Mirzadeh, Mahyar Najibi, Dmitry Belenko, Peter Zatloukal, Mohammad Rastegari
2024-04-24 Align Your Steps: Optimizing Sampling Schedules in Diffusion Models Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis
2024-04-24 SnapKV: LLM Knows What You are Looking for Before Generation Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen
2024-04-23 Learning H-Infinity Locomotion Control Junfeng Long, Wenye Yu, Quanyi Li, Zirui Wang, Dahua Lin, Jiangmiao Pang
2024-04-23 Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer Eric Brachmann, Jamie Wynn, Shuai Chen, Tommaso Cavallari, Áron Monszpart, Daniyar Turmukhambetov, Victor Adrian Prisacariu
