Latest ML Research: Multimodal Models & Small Language Models
Check out these ML papers! From a new multimodal model by Google to teaching small language models to reason, and more.

🟢 Advancing Long-Context LLMs - an overview of methodologies for enhancing Transformer architecture modules to optimize long-context capabilities across all stages, from pre-training to inference.
Paper: https://arxiv.org/abs/2311.12351

🟢 Mirasol3B - a multimodal model for learning across audio, video, and text that decouples multimodal modeling into separate, focused autoregressive models.
Details: https://blog.research.google/2023/11/scaling-multimodal-understanding-to.html
Paper: https://arxiv.org/abs/2311.05698

🟢 Orca 2: Teaching Small LMs To Reason - shows how to teach smaller language models to reason. Specifically, the LM is taught to use reasoning techniques such as step-by-step processing, recall-then-generate, recall-reason-generate, extract-generate, and direct-answer methods.
Paper: https://arxiv.org/abs/2311.11045

🟢 The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents - a summary of CoT reasoning, the foundational mechanics underpinning CoT techniques, and their application to language agent frameworks.
Paper: https://arxiv.org/abs/2311.11797
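To make the Orca 2 reasoning strategies concrete, here is a minimal sketch of how one might map each named strategy to a prompt template. The exact instructions Orca 2 uses are defined in the paper; the template wording and the `build_prompt` helper below are hypothetical stand-ins for illustration only.

```python
# Illustrative prompt templates for the reasoning strategies named above.
# NOTE: the template text is an assumption for demonstration, not the
# actual system instructions from the Orca 2 paper.
REASONING_TEMPLATES = {
    "step-by-step": (
        "Think through the problem step by step, then state the final "
        "answer.\n\nQuestion: {question}"
    ),
    "recall-then-generate": (
        "First recall facts relevant to the question, then use them to "
        "write the answer.\n\nQuestion: {question}"
    ),
    "recall-reason-generate": (
        "Recall relevant facts, reason over them explicitly, then "
        "generate the answer.\n\nQuestion: {question}"
    ),
    "extract-generate": (
        "Extract the key information from the input, then generate the "
        "answer from it.\n\nQuestion: {question}"
    ),
    "direct-answer": (
        "Answer the question directly, without showing intermediate "
        "reasoning.\n\nQuestion: {question}"
    ),
}

def build_prompt(strategy: str, question: str) -> str:
    """Return a prompt instructing the model to use the given strategy."""
    return REASONING_TEMPLATES[strategy].format(question=question)

print(build_prompt("recall-then-generate", "Why is the sky blue?"))
```

The key idea the paper explores is that a small model can be trained to pick the right strategy per task, rather than always imitating a larger model's full chain of thought.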
