Ask questions about current events and get AI-powered answers with sources
A RAG system built on top of a custom news aggregation pipeline. Articles are scraped hourly from multiple sources, summarized by an LLM, and stored with vector embeddings for semantic search.
Pipeline: Apache Airflow orchestrates hourly scraping from AP News, CNN, Fox, and Reuters. Each article is processed through Llama 3.1 for summarization, entity extraction (tickers, companies, sectors), and sentiment scoring.
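In rough terms, the per-article enrichment step looks like this. This is a minimal sketch using the `ollama` Python client; the prompt wording, JSON field names, and use of Ollama's JSON mode are illustrative, not the exact production code:

```python
import json

import ollama

PROMPT = """Summarize the article in 2-3 sentences, then extract entities and
sentiment. Respond as JSON with keys: summary, tickers, companies, sectors,
and sentiment (a float from -1.0 to 1.0).

Article:
{text}"""


def enrich_article(text: str) -> dict:
    # Ollama's JSON mode constrains the model to emit parseable JSON,
    # so the summary, entities, and sentiment come back in one call.
    response = ollama.chat(
        model="llama3.1:8b",
        messages=[{"role": "user", "content": PROMPT.format(text=text)}],
        format="json",
    )
    return json.loads(response["message"]["content"])
```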
Search: Your question is embedded using nomic-embed-text:v1.5 and compared against article summaries using pgvector similarity search. Relevant articles are retrieved and used to generate a contextual answer with citations.
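A simplified sketch of that retrieval step, assuming psycopg 3 and an `articles` table with an `embedding vector(768)` column (nomic-embed-text's dimensionality); the table and column names here are illustrative:

```python
import ollama
import psycopg


def search_articles(question: str, k: int = 5) -> list[tuple]:
    # Embed the question with the same model used for article summaries.
    embedding = ollama.embed(
        model="nomic-embed-text:v1.5", input=question
    ).embeddings[0]
    with psycopg.connect("dbname=news") as conn:
        # pgvector's <=> operator is cosine distance (smaller = closer);
        # str() on a Python list produces the '[x, y, ...]' literal that
        # pgvector accepts for the ::vector cast.
        return conn.execute(
            """
            SELECT title, summary
            FROM articles
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (str(embedding), k),
        ).fetchall()
```

The retrieved summaries are then passed to the LLM as context to generate the cited answer.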
Stack: PostgreSQL + pgvector, Apache Airflow 3.x, Ollama (Llama 3.1:8b), FastAPI.
Infrastructure: The backend spans two machines in my apartment: a dedicated server and my personal desktop. The server's GPU is too old to support a modern CUDA runtime, so Ollama falls back to the CPU there and runs models inefficiently. All LLM work (summarization, embedding, and entity extraction) is therefore offloaded to the desktop, which runs an NVIDIA RTX 2080 Super.
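In practice the offload just means pointing the Ollama client at the desktop instead of localhost. A minimal sketch, where the LAN address is a placeholder:

```python
from ollama import Client

# Send all model calls to the desktop's Ollama instance (RTX 2080 Super)
# rather than the local, CPU-bound one on the server.
gpu = Client(host="http://192.168.1.50:11434")

response = gpu.chat(
    model="llama3.1:8b",
    messages=[{"role": "user", "content": "Summarize this article: ..."}],
)
print(response["message"]["content"])
```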