AIAny - NLP

ReAct: Synergizing Reasoning and Acting in Language Models

2022

Shunyu Yao, Jeffrey Zhao +5

This paper introduces ReAct, an approach that integrates reasoning and acting in large language models (LLMs). ReAct prompts an LLM to generate reasoning traces and task-specific actions in an interleaved manner. Reasoning helps the model induce, track, and update action plans, while actions let it query external sources such as knowledge bases for additional information. This mitigates the hallucination and error propagation seen in reasoning-only approaches such as chain-of-thought prompting.
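The interleaved thought, action, and observation cycle described above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the paper's implementation: `scripted_model` is a hypothetical stub standing in for an LLM call, `KNOWLEDGE_BASE` is toy data, and the `search[...]` / `finish[...]` action format is chosen here only for illustration.

```python
import re

# Toy stand-in for an external knowledge source (hypothetical data).
KNOWLEDGE_BASE = {
    "ReAct": "ReAct was introduced in a 2022 paper by Shunyu Yao and colleagues.",
}

# Matches lines such as "Action: search[ReAct]" (format assumed for this sketch).
ACTION_PATTERN = re.compile(r"Action: (\w+)\[(.*)\]")


def scripted_model(transcript: str) -> str:
    """Hypothetical stub standing in for an LLM call.

    Returns the next Thought and Action lines given the transcript so far.
    """
    if "Observation:" not in transcript:
        return "Thought: I should look up ReAct.\nAction: search[ReAct]"
    return "Thought: The observation gives the year.\nAction: finish[2022]"


def react_loop(question, model=scripted_model, max_steps=5):
    """Alternate model steps (thought + action) with environment observations."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = model(transcript)
        transcript += "\n" + step
        match = ACTION_PATTERN.search(step)
        if match is None:
            break  # the model produced no parsable action
        name, argument = match.groups()
        if name == "finish":
            return argument, transcript
        if name == "search":
            observation = KNOWLEDGE_BASE.get(argument, "No entry found.")
        else:
            observation = f"Unknown action: {name}"
        # Feed the observation back so the next thought can use it.
        transcript += "\nObservation: " + observation
    return None, transcript


answer, trace = react_loop("When was ReAct introduced?")
print(trace)
print("Answer:", answer)
```

In a real system the stub would be replaced by a prompted model and the dictionary lookup by a search or retrieval API; the loop structure is the part this sketch is meant to show.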

paper LLM NLP ai-agent google+1

LightRAG

2024

Zirui Guo, Lianghao Xia +3

LightRAG is an open-source framework designed for simple and fast Retrieval-Augmented Generation (RAG), integrating knowledge graphs, vector search, and efficient LLM-based processing to enhance question-answering over large document collections.

RAG LLM NLP github ai-development+5

fairseq

2017

Facebook AI Research (FAIR)

fairseq is an open-source sequence modeling toolkit from Facebook AI Research (FAIR), implemented in Python on top of PyTorch. It provides reference implementations for a wide range of sequence models (Transformer, LSTM, convolutional models, wav2vec, wav2vec 2.0, etc.) and supports tasks such as machine translation, summarization, language modeling, and speech processing. Key features include multi-GPU and distributed training, fast generation (beam search, sampling, diverse beam search), mixed-precision training, parameter and optimizer-state sharding, and many pre-trained models and examples. The project is MIT-licensed and documented on Read the Docs.

github NLP ASR audio translation+2

garak

2023

Leon Derczynski, Erick Galinkin +4

garak is an open-source command-line LLM vulnerability scanner and red-teaming assessment kit. It probes large language models for failures such as hallucination, data leakage, prompt injection, misinformation, toxicity, jailbreaks, and other weaknesses. garak supports many backends (Hugging Face, OpenAI, Replicate, AWS Bedrock, gguf/llama.cpp, etc.), provides a wide range of probes and detectors, and produces structured logs and JSONL reports. It is licensed under Apache-2.0.

llm NLP ai-tools github nvidia+3

LLM Transparency Tool

2023

facebookresearch, Igor Tufanov +3

LLM Transparency Tool (LLM-TT) is an open-source interactive toolkit from Facebook Research for analyzing the internal workings of Transformer-based language models. It lets users run inference, explore contribution graphs tied to selected tokens, inspect representations after any block, and drill down to attention heads, FFN blocks, and individual neurons to see how they promote or suppress output tokens. The project provides Docker and local installation instructions and a live demo hosted on Hugging Face.

github llm NLP xai ai-tools+3

Hands-On Large Language Models

2024

Jay Alammar, Maarten Grootendorst +1

Official code repository for the O'Reilly book "Hands-On Large Language Models" by Jay Alammar and Maarten Grootendorst. It provides runnable notebooks, visual explanations, and practical examples across chapters covering tokens and embeddings, transformer internals, text classification, semantic search, fine-tuning, multimodal models, and more. Running the notebooks in Google Colab is recommended for easy setup.

book llm LLM github tutorial+5

Foundations of LLMs (大模型基础)

2024

ZJU-LLMs

Foundations of LLMs is an open-source book by the ZJU-LLMs team that teaches fundamentals and advanced topics of large language models. It covers language model basics, LLM architecture evolution, prompt engineering, parameter-efficient fine-tuning, model editing, and retrieval-augmented generation. The repo provides chapter PDFs and paper lists, and is updated monthly.

book foundation LLM llm NLP+3

LangExtract

2025

Google

LangExtract is an open-source Python library (by Google) that uses large language models to extract structured information from unstructured text. It emphasizes precise source grounding, scalable processing for long documents, interactive HTML visualizations, and flexible model support (cloud LLMs like Gemini/OpenAI and local models via Ollama).

github google llm NLP ai-library+4

Neural Machine Translation by Jointly Learning to Align and Translate

2014

Dzmitry Bahdanau, Kyunghyun Cho +1

This paper introduces an attention-based encoder–decoder NMT architecture that learns soft alignments between source and target words while translating, eliminating the fixed-length bottleneck of earlier seq2seq models. The approach substantially improves BLEU, especially on long sentences, and matches phrase-based SMT on English-French without additional hand-engineered features. The attention mechanism it proposes became the foundation for virtually all subsequent NMT systems and inspired attention-centric models like the Transformer, reshaping machine translation and sequence modeling across NLP.

30u30 paper NLP translation

Attention Is All You Need

2017

Ashish Vaswani, Noam Shazeer +6

The paper “Attention Is All You Need” (2017) introduced the Transformer, a neural architecture that relies solely on attention and removes recurrence and convolutions. It improved both training speed and translation quality in machine translation, reaching 28.4 BLEU on the WMT 2014 English-to-German task, a new state of the art at the time. Its modular, parallelizable design opened the door to large-scale pretraining and fine-tuning, laying the foundation for modern language models such as BERT and GPT. The paper reshaped NLP and deep learning, making attention-based models the dominant paradigm across many tasks.
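The core operation behind the architecture described above is scaled dot-product attention, softmax(QKᵀ / √d_k)V. The following is a minimal pure-Python sketch for a single head with no masking or learned projections; the function names and the toy inputs are illustrative choices, not taken from any particular library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors.

    Returns the output rows and the attention weights for each query.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value rows.
        output = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Toy example: two identical keys, so the query attends to both equally.
outputs, weights = scaled_dot_product_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[2.0, 0.0], [4.0, 0.0]],
)
print(weights)
print(outputs)
```

Because every query row is processed independently of the others, the whole computation can be expressed as matrix products, which is what makes the design easy to parallelize.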

NLP LLM AIGC 30u30 paper+1

Relational recurrent neural networks

2018

Adam Santoro, Ryan Faulkner +8

This paper introduces a Relational Memory Core that embeds multi-head dot-product attention into recurrent memory to enable explicit relational reasoning. Evaluated on synthetic distance-sorting, program execution, partially-observable reinforcement learning and large-scale language-modeling benchmarks, it consistently outperforms LSTM and memory-augmented baselines, setting state-of-the-art results on WikiText-103, Project Gutenberg and GigaWord. By letting memories interact rather than merely store information, the approach substantially boosts sequential relational reasoning and downstream task performance.

foundation 30u30 paper NLP LLM

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

2018

Jacob Devlin, Ming-Wei Chang +2

The BERT (Bidirectional Encoder Representations from Transformers) paper introduced a pre-trained language model that uses deep bidirectional Transformers and masked language modeling to capture both left and right context. Unlike prior unidirectional models, BERT achieved state-of-the-art performance on 11 NLP tasks (spanning benchmarks such as GLUE and SQuAD) while requiring only minimal task-specific changes for fine-tuning. It set a new standard for transfer learning in NLP, improved accuracy on tasks such as question answering, sentiment analysis, and natural language inference, and inspired a wave of follow-up models like RoBERTa, ALBERT, and T5.
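Masked language modeling, mentioned above, can be illustrated with a small input-corruption routine. This sketch follows the recipe reported in the BERT paper (about 15% of tokens are selected; of those, 80% become `[MASK]`, 10% become a random token, and 10% stay unchanged), but it simplifies by selecting each token independently and by working on whitespace-split words rather than WordPiece tokens. The function name, toy vocabulary, and sample sentence are assumptions made for illustration.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocabulary, mask_rate=0.15, rng=None):
    """Return (corrupted_tokens, labels) for masked language modeling.

    labels[i] holds the original token at selected positions and None
    elsewhere, so the training loss is computed only where labels are set.
    """
    rng = rng or random.Random()
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        # Simplification: independent selection per token (an assumption).
        if rng.random() >= mask_rate:
            continue
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK_TOKEN  # 80%: replace with the mask token
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocabulary)  # 10%: random token
        # Remaining 10%: keep the original token unchanged.
    return corrupted, labels


# Toy data (hypothetical); a fixed seed keeps the run repeatable.
vocabulary = ["the", "cat", "sat", "on", "mat", "dog"]
tokens = "the cat sat on the mat".split() * 50
corrupted, labels = mask_tokens(tokens, vocabulary, rng=random.Random(0))
selected = sum(label is not None for label in labels)
print(f"selected {selected} of {len(tokens)} tokens for prediction")
```

The model sees `corrupted` as input and is trained to predict `labels` at the selected positions, which forces it to use context on both sides of each gap.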

NLP paper
