BettaFish: An Innovative Multi-Agent Public Opinion Analysis System
Overview
BettaFish, also known as 'Weiyu' (微舆), is an open-source project that aims to make public opinion analysis widely accessible. The Chinese name is a pun: 舆 is the character used for public opinion, and the name sounds like 微鱼 ('small fish'), a nod to the small but resilient betta fish. Built from the ground up without depending on external frameworks, the system lets users, regardless of technical expertise, explore social sentiment by posing questions in natural language. It draws on more than 30 major social media platforms worldwide, including Weibo, Xiaohongshu, Douyin, Kuaishou, and international ones such as Twitter/X, and works through millions of user comments to give a broad view of public discourse.
The project's core mission is to shatter information bubbles, restore the authentic shape of public opinion, forecast future trajectories, and support informed decision-making. Unlike traditional dashboards or rigid analytics tools, BettaFish operates through a conversational interface, where users input queries like 'Analyze the brand reputation of Wuhan University,' and the system autonomously generates detailed reports, including interactive HTML outputs and optional PDF exports.
Key Features and Advantages
BettaFish claims six main advantages over similar products:
- AI-Driven Global Monitoring: An AI-powered crawler cluster runs 24/7, capturing hotspots not only from top-level posts but also from the large volume of user comments on platforms like Weibo, Xiaohongshu, and short-video sites. The aim is near-real-time, broad coverage of public voices.
- Composite Analysis Engine Beyond LLMs: In addition to large language models (LLMs), the system integrates fine-tuned models, statistical analytics, and middleware for multi-model collaboration. This hybrid approach is intended to add depth, accuracy, and multiple perspectives to the analysis.
- Advanced Multimodal Capabilities: Beyond text and images, BettaFish parses short videos from Douyin and Kuaishou and extracts structured data such as weather, calendar, or stock information from search engine results. Together these give a fuller picture of fast-moving sentiment.
- Agent 'Forum' Collaboration Mechanism: Five specialized roles (Query Agent for broad searches, Media Agent for multimodal processing, Insight Agent for private database mining, Report Agent for report generation, and a forum moderator) engage in chain-of-thought debates via a 'forum' engine. This fosters collective intelligence, mitigates single-model biases, and improves decision quality through iterative discussion.
- Seamless Public-Private Data Fusion: Beyond public sentiment, it offers secure APIs to integrate internal business databases, blending external trends with proprietary insights for tailored vertical applications.
- Lightweight and Highly Extensible Framework: Built in pure Python with a modular design, it supports one-click Docker deployment. Developers can customize agents, tools, or prompts, for example adapting it to financial market analysis by swapping APIs.
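The 'forum' collaboration idea from the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not BettaFish's actual forum engine: `Post`, `Forum`, `run_forum`, `make_stub_agent`, and `moderate` are hypothetical names, and the LLM calls are replaced by string-returning stubs. It shows only the shape of the loop: agents post to a shared log, a moderator summarizes each round, and later rounds can read earlier ones.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Post:
    round_no: int
    author: str
    text: str


@dataclass
class Forum:
    """Shared message log that every agent can read and append to."""
    posts: List[Post] = field(default_factory=list)

    def publish(self, round_no: int, author: str, text: str) -> None:
        self.posts.append(Post(round_no, author, text))

    def in_round(self, round_no: int) -> List[Post]:
        return [p for p in self.posts if p.round_no == round_no]


# Hypothetical agent signature: (query, posts from earlier rounds) -> new post text.
Agent = Callable[[str, List[Post]], str]


def make_stub_agent(name: str) -> Agent:
    """Stand-in for an LLM-backed agent; it only reports how much it has read."""
    def agent(query: str, history: List[Post]) -> str:
        peers = sum(1 for p in history if p.author not in (name, "moderator"))
        return f"{name} on '{query}' (read {peers} peer posts)"
    return agent


def moderate(posts: List[Post]) -> str:
    """Stand-in for the moderator; a real one would summarize and steer the debate."""
    authors = ", ".join(p.author for p in posts)
    return f"summary of {len(posts)} posts from: {authors}"


def run_forum(query: str, agents: Dict[str, Agent], rounds: int = 2) -> Forum:
    forum = Forum()
    for round_no in range(1, rounds + 1):
        # Snapshot the log so agents in the same round see only earlier rounds.
        history = list(forum.posts)
        for name, agent in agents.items():
            forum.publish(round_no, name, agent(query, history))
        agent_posts = [p for p in forum.in_round(round_no) if p.author != "moderator"]
        forum.publish(round_no, "moderator", moderate(agent_posts))
    return forum


if __name__ == "__main__":
    names = ("query", "media", "insight")
    demo = run_forum("brand reputation", {n: make_stub_agent(n) for n in names})
    for post in demo.posts:
        print(post.round_no, post.author, "->", post.text)
```

Snapshotting the log at the start of each round is one possible design choice: it keeps the result independent of the order in which agents run, at the cost of agents reacting to each other one round late.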
Architecture and Workflow
The architecture is built around four main engines: QueryEngine (broad news/search), MediaEngine (multimodal content), InsightEngine (database insights with sentiment models), and ReportEngine (multi-round report synthesis). A fifth component, ForumEngine, orchestrates the interactions between agents.
A typical workflow begins with a user query submitted through the Flask/SSE interface:
- Parallel Initialization: Agents launch concurrently for preliminary scans.
- Iterative Deep Research: Agents refine strategies, conduct specialized searches, and collaborate in forum-style debates (multi-round loops with reflection and adjustment).
- Synthesis and Reporting: Results are aggregated into an Intermediate Representation (IR), validated, and rendered into interactive HTML reports with charts, tables, and exports.
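The first and last steps above (parallel initialization, then IR validation and HTML rendering) can be sketched as follows. This is an illustrative outline only, not the project's code: the scan functions, the dictionary shape used for the IR, and the names `parallel_scan`, `build_ir`, `validate_ir`, `render_html`, and `run_pipeline` are all assumptions made for the example. The iterative research step is left out to keep the sketch short.

```python
from concurrent.futures import ThreadPoolExecutor
from html import escape
from typing import Callable, Dict, List

Finding = Dict[str, str]


# Hypothetical stand-ins for the preliminary scans of three agents.
def query_scan(query: str) -> Finding:
    return {"source": "query", "summary": f"news coverage for {query}"}


def media_scan(query: str) -> Finding:
    return {"source": "media", "summary": f"video and image posts for {query}"}


def insight_scan(query: str) -> Finding:
    return {"source": "insight", "summary": f"database records for {query}"}


def parallel_scan(query: str, scans: List[Callable[[str], Finding]]) -> List[Finding]:
    """Run every scan concurrently; results come back in the order of `scans`."""
    with ThreadPoolExecutor(max_workers=len(scans)) as pool:
        return list(pool.map(lambda scan: scan(query), scans))


def build_ir(query: str, findings: List[Finding]) -> dict:
    """Aggregate findings into a simple intermediate representation (assumed shape)."""
    return {
        "title": f"Report: {query}",
        "sections": [{"heading": f["source"], "body": f["summary"]} for f in findings],
    }


def validate_ir(ir: dict) -> List[str]:
    """Return a list of problems; an empty list means the IR is renderable."""
    problems = []
    if not ir.get("title"):
        problems.append("missing title")
    sections = ir.get("sections") or []
    if not sections:
        problems.append("no sections")
    for index, section in enumerate(sections):
        for key in ("heading", "body"):
            if not section.get(key):
                problems.append(f"section {index}: missing {key}")
    return problems


def render_html(ir: dict) -> str:
    """Render the IR to an HTML fragment, escaping all text content."""
    parts = [f"<h1>{escape(ir['title'])}</h1>"]
    for section in ir["sections"]:
        parts.append(f"<h2>{escape(section['heading'])}</h2>")
        parts.append(f"<p>{escape(section['body'])}</p>")
    return "\n".join(parts)


def run_pipeline(query: str) -> str:
    findings = parallel_scan(query, [query_scan, media_scan, insight_scan])
    ir = build_ir(query, findings)
    problems = validate_ir(ir)
    if problems:
        raise ValueError("; ".join(problems))
    return render_html(ir)


if __name__ == "__main__":
    print(run_pipeline("brand reputation of Wuhan University"))
```

Validating the IR before rendering means malformed agent output fails early with a readable message instead of producing a broken report.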
Supporting components include MindSpider (social media crawler), SentimentAnalysisModel (fine-tuned BERT/GPT-2/Qwen for multilingual sentiment), and single-agent Streamlit apps for isolated testing.
Deployment and Usage
Getting started is straightforward:
- Docker: Clone the repo, configure `.env` (DB, LLMs like OpenAI-compatible APIs), and run `docker compose up -d`. Access at http://localhost:5000.
- Source: Use Conda/uv for Python 3.9+, install dependencies, set up PostgreSQL/MySQL, and launch `python app.py`.
- Crawling: Run MindSpider separately for data ingestion (e.g., `python main.py --complete`).
- CLI Tool: Generate reports directly with `python report_engine_only.py`.
Customization is encouraged: Modify configs for LLM providers (e.g., Claude, Gemini), sentiment models, or integrate business DBs.
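As a rough illustration of per-engine LLM configuration, the sketch below parses `.env`-style text and collects the settings for one engine. The key names (`REPORT_ENGINE_API_KEY`, `REPORT_ENGINE_BASE_URL`, `REPORT_ENGINE_MODEL_NAME`), the example URL, and the helpers `parse_env` and `llm_settings` are hypothetical and chosen for the example; check the project's own `.env` template for the real variable names.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical .env content; key names and values are placeholders.
SAMPLE_ENV = """
# LLM used for report generation (any OpenAI-compatible endpoint)
REPORT_ENGINE_API_KEY=sk-example
REPORT_ENGINE_BASE_URL="https://llm.example.com/v1"
REPORT_ENGINE_MODEL_NAME=example-model
"""


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


@dataclass(frozen=True)
class LLMSettings:
    api_key: str
    base_url: str
    model: str


def llm_settings(env: Dict[str, str], prefix: str) -> LLMSettings:
    """Collect the three settings for one engine, failing loudly if any is unset."""
    suffixes = ("API_KEY", "BASE_URL", "MODEL_NAME")
    missing = [s for s in suffixes if not env.get(f"{prefix}_{s}")]
    if missing:
        raise KeyError(f"{prefix}: missing " + ", ".join(missing))
    return LLMSettings(
        api_key=env[f"{prefix}_API_KEY"],
        base_url=env[f"{prefix}_BASE_URL"],
        model=env[f"{prefix}_MODEL_NAME"],
    )


if __name__ == "__main__":
    settings = llm_settings(parse_env(SAMPLE_ENV), "REPORT_ENGINE")
    print(settings.base_url, settings.model)
```

Under this layout, switching providers amounts to changing the base URL and model name for the relevant prefix, which is why OpenAI-compatible endpoints are convenient here.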
Impact and Community
Launched in late 2025 as a university student's coursework, BettaFish exploded in popularity, amassing 30,000+ GitHub stars within weeks and topping trending lists. It's praised for its framework-free purity, making it ideal for education, research, and prototyping. Discussions thrive on Linux.do and GitHub, with sponsors like AIHubMix and 302.AI providing LLM credits.
Future plans include data-driven prediction models using time-series and graph neural networks, evolving it into a universal analytics engine.
Note: For learning/research only; comply with laws on crawling and data use. Licensed under GPL-2.0.
