Overview
This project developed a chatbot that provided information and support for individuals affected by breast cancer, addressing the critical need for accessible and reliable resources. The chatbot leveraged advanced NLP techniques and real-time data extraction from reputable medical websites.
Introduction
Breast cancer is a significant health concern: in 2020 there were 2.3 million new cases and 685,000 deaths globally. In the US alone, there were over 200,000 new cases annually. The project tackled issues such as the lack of accessible resources, patients overwhelmed by information, healthcare disparities, and financial and geographic barriers.
Role of Chatbot
The chatbot:
- Provided information about breast cancer risk factors, symptoms, and prevention.
- Answered user queries related to breast health.
- Supported patients throughout their healthcare journey.
Design Modules
1. **Data Extraction**
- Used Selenium, Beautiful Soup, and Scrapy for real-time data extraction from websites like Medscape, Medline, American Cancer Society, and CDC.
- Extracted data in a structured format: Question | Answer | Patterns | Tags | Source.
2. **Chatbot Model Design**
- Implemented intent classification using transformer models such as BERT, RoBERTa, and ALBERT to interpret user queries and return human-like responses.
3. **Front End and Backend Design**
- Developed a user-friendly chat interface using Streamlit.
- Provided a text input box, message display area, scrolling functionality, and UI elements for interaction.
4. **Integration and Deployment**
- Deployed the project using Streamlit to create and share the web app easily.
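The structured format named under Data Extraction (Question | Answer | Patterns | Tags | Source) can be illustrated with a minimal sketch. It is not the project's actual code: the `QARecord` class, the `records_to_csv` and `training_pairs` helpers, the single-tag-per-record layout, and the `;` separator for patterns are all assumptions made for illustration, and the sample record in the usage note is placeholder text rather than scraped data.

```python
import csv
import io
from dataclasses import dataclass, field

# Column order taken from the format described above.
FIELDS = ["Question", "Answer", "Patterns", "Tags", "Source"]


@dataclass
class QARecord:
    """One extracted entry (hypothetical layout, one tag per record)."""
    question: str
    answer: str
    patterns: list[str] = field(default_factory=list)  # example user phrasings
    tag: str = ""      # intent label used later for classification
    source: str = ""   # site the entry was extracted from


def records_to_csv(records: list[QARecord]) -> str:
    """Serialize records as pipe-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
    writer.writerow(FIELDS)
    for r in records:
        # Patterns are joined with ';' (an assumption) to fit one column.
        writer.writerow([r.question, r.answer, ";".join(r.patterns), r.tag, r.source])
    return buffer.getvalue()


def training_pairs(records: list[QARecord]) -> list[tuple[str, str]]:
    """Flatten records into (pattern, tag) pairs for an intent classifier."""
    return [(pattern, r.tag) for r in records for pattern in r.patterns]
```

For example, `records_to_csv([QARecord("What are common symptoms?", "(placeholder answer)", ["symptoms", "warning signs"], "symptoms", "CDC")])` yields a header line followed by one pipe-delimited row, and `training_pairs` on the same list yields two (pattern, tag) pairs that share the `symptoms` tag.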
Intent Classification
Intent classification was crucial for understanding and accurately responding to user queries. By employing advanced techniques such as transformers, the chatbot could interpret the user’s intent and generate appropriate responses.
- Transformers: BERT, RoBERTa, and ALBERT are built on the transformer architecture, which uses self-attention to weigh the importance of each word in a sentence relative to the others.
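To make the self-attention idea concrete, here is a toy sketch of scaled dot-product attention in plain Python. It is a simplification under stated assumptions: the token vectors are made up, queries, keys, and values are the raw vectors themselves, and there is a single head. BERT-style models instead learn separate projection matrices and run many heads in parallel. The function names are hypothetical.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors: list[list[float]]):
    """Scaled dot-product self-attention with Q = K = V = vectors.

    Returns (weights, outputs): weights[i][j] is how much token i
    attends to token j; outputs[i] is the weighted mix of all vectors.
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    weights = [
        softmax([dot(query, key) / scale for key in vectors])
        for query in vectors
    ]
    outputs = [
        [sum(w * v[j] for w, v in zip(row, vectors)) for j in range(dim)]
        for row in weights
    ]
    return weights, outputs
```

With the made-up input `[[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]`, the first token gives equal weight to itself and the identical third token, and less weight to the second, which is the "weighing of importance" described above.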

Conclusion
This project developed a chatbot that answered breast cancer-related questions using advanced NLP techniques. The team had a clear plan and defined roles to ensure effective project completion. Future enhancements could include implementing a Named Entity Recognition system and extending the chatbot to cover other diseases.