98% match · 2023 · arXiv · open access
GPT-4 Technical Report
OpenAI
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks.
95% match · 2022 · NeurIPS · open access
Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, et al.
Making language models bigger does not inherently make them better at following a user's intent. We show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback.
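As an illustration of one step in the pipeline this abstract summarizes, the sketch below shows the pairwise loss commonly used to fit a reward model on human preference comparisons. It is a minimal PyTorch sketch; the function name and batch conventions are ours, not the paper's.

```python
import torch
import torch.nn.functional as F

def reward_model_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # r_chosen / r_rejected: scalar rewards the model assigns to the
    # human-preferred and dispreferred responses to the same prompt.
    # Minimizing -log sigmoid(r_chosen - r_rejected) pushes the reward
    # of the preferred response above that of the rejected one.
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```

The fitted reward model then scores sampled responses during a reinforcement-learning fine-tuning stage.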
92% match · 2023 · arXiv · open access
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron, Thibaut Lavril, Gautier Izacard, et al.
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively.
90% match · 2022 · NeurIPS · open access
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason Wei, Xuezhi Wang, Dale Schuurmans, et al.
We explore how generating a chain of thought—a series of intermediate reasoning steps—significantly improves the ability of large language models to perform complex reasoning.
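To make the idea concrete, here is a few-shot chain-of-thought prompt in the style of the paper's arithmetic exemplars: the worked answer spells out its intermediate steps, which the model then tends to imitate on the new question. The exact wording is illustrative.

```python
# Few-shot chain-of-thought prompt: the exemplar demonstrates
# intermediate reasoning steps before stating the final answer.
COT_PROMPT = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is
6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and
bought 6 more, how many apples do they have?
A:"""
```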
88% match · 2022 · ICLR · open access
LoRA: Low-Rank Adaptation of Large Language Models
Edward J. Hu, Yelong Shen, Phillip Wallis, et al.
We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.
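A minimal PyTorch sketch of the idea: wrap a frozen linear layer with a trainable rank-r update B @ A, so only the small matrices A and B receive gradients. Class name, initialization constants, and defaults are illustrative, not the paper's reference code.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update.

    Effective weight is W + (alpha / r) * B @ A, where W is frozen and
    A (r x in_features), B (out_features x r) are the only trainable
    parameters.
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pre-trained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: update starts as a no-op
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Because B starts at zero, the wrapped layer initially behaves exactly like the pre-trained one, and fine-tuning only has to learn the low-rank correction.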
85% match · 2020 · NeurIPS · open access
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Patrick Lewis, Ethan Perez, Aleksandra Piktus, et al.
Large pre-trained language models have been shown to store factual knowledge in their parameters. However, their ability to access and precisely manipulate knowledge is still limited. We explore a general-purpose fine-tuning recipe for retrieval-augmented generation.
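The inference-time flow the abstract refers to can be sketched as retrieve-then-generate. Here `retrieve` and `generate` are hypothetical callables standing in for a dense retriever and a seq2seq generator; the paper's actual method also marginalizes over retrieved documents during training, which this sketch omits.

```python
from typing import Callable, List

def rag_answer(
    question: str,
    retrieve: Callable[[str, int], List[str]],  # hypothetical retriever API
    generate: Callable[[str], str],             # hypothetical generator API
    k: int = 5,
) -> str:
    # Fetch the top-k passages for the question, then condition the
    # generator on them alongside the question itself.
    passages = retrieve(question, k)
    context = "\n\n".join(passages)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)
```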
83% match · 2022 · arXiv · open access
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai, Saurav Kadavath, Sandipan Kundu, et al.
As AI systems become more capable, we would like to enlist their help to supervise other AIs. We experiment with methods for training a harmless AI assistant through self-improvement, without any human labels identifying harmful outputs.
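One self-improvement step from the supervised phase of this recipe can be sketched as critique-then-revise: the model critiques its own draft against a written principle, then rewrites it, with no human labels involved. The `model` callable and the prompt wording below are hypothetical stand-ins.

```python
import random
from typing import Callable, List, Tuple

def critique_and_revise(
    prompt: str,
    draft: str,
    model: Callable[[str], str],  # hypothetical text-completion callable
    principles: List[str],        # the written "constitution"
) -> Tuple[str, str]:
    # Sample one principle, ask the model to critique its own draft
    # against it, then ask for a revision addressing the critique.
    principle = random.choice(principles)
    critique = model(
        f"Prompt: {prompt}\nResponse: {draft}\n"
        f"Critique the response according to this principle: {principle}"
    )
    revision = model(
        f"Prompt: {prompt}\nResponse: {draft}\nCritique: {critique}\n"
        "Rewrite the response so it no longer has the flaws the critique identifies."
    )
    return critique, revision
```

The revised responses then serve as fine-tuning targets, so the assistant improves using its own feedback rather than human harm labels.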
80% match · 2023 · arXiv · open access
Sparks of Artificial General Intelligence: Early Experiments with GPT-4
Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, et al.
We contend that GPT-4 is part of a new cohort of LLMs that exhibit more general intelligence than previous AI models. We demonstrate GPT-4's capabilities across various domains, including mathematics, coding, vision, medicine, law, and psychology.