science_live.pipeline.question_processor
Science Live Pipeline: Question Processing
First step of the pipeline that parses and preprocesses natural language questions.
Responsibilities:
- Clean and normalize input text
- Classify question type (what, who, where, etc.)
- Extract key phrases and potential entities
- Assess intent confidence
Author: Science Live Team
Version: 1.0.0
Module Contents
Classes
QuestionProcessor | Parse and preprocess natural language questions.
Functions
is_valid_question | Check if a question is valid for processing.
preprocess_question_batch | Preprocess a batch of questions.
Data
API
- science_live.pipeline.question_processor.__all__ = ['QuestionProcessor']
- class science_live.pipeline.question_processor.QuestionProcessor(config: Dict[str, Any] = None)
Bases: science_live.pipeline.common.PipelineStep
Parse and preprocess natural language questions.
This is the first step in the pipeline that takes raw natural language questions and prepares them for entity extraction and further processing.
Features:
Question type classification
Text cleaning and normalization
Key phrase extraction
Potential entity identification
Intent confidence assessment
- _initialize_patterns() -> Dict[str, List[str]]
Initialize question type classification patterns.
- async process(question: str, context: science_live.pipeline.common.ProcessingContext) -> science_live.pipeline.common.ProcessedQuestion
Process a natural language question.
Args:
    question: Raw natural language question.
    context: Processing context with user info and preferences.
Returns:
    ProcessedQuestion with classified and preprocessed information.
Raises:
    ValueError: If the question is empty or invalid.
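Because `process` is a coroutine, it has to be awaited from an event loop. The sketch below illustrates only that calling pattern and the documented `ValueError` contract. `StubProcessor` and its dictionary result are hypothetical stand-ins, since this page does not show how `ProcessingContext` is constructed or which fields `ProcessedQuestion` carries.

```python
import asyncio
from typing import Any, Dict, Optional


class StubProcessor:
    """Hypothetical stand-in that mirrors the documented `process` signature."""

    async def process(self, question: str, context: Optional[Any] = None) -> Dict[str, Any]:
        # Mirrors the documented contract: empty input raises ValueError.
        if not question or not question.strip():
            raise ValueError("Question is empty or invalid")
        # The real method returns a ProcessedQuestion; a dict is used here
        # only because that class's fields are not shown in this reference.
        return {"original": question, "cleaned": " ".join(question.split())}


async def main() -> Dict[str, Any]:
    processor = StubProcessor()
    try:
        await processor.process("   ")
    except ValueError as exc:
        print(f"rejected: {exc}")
    return await processor.process("  Who   wrote this paper? ")


result = asyncio.run(main())
print(result["cleaned"])
```

With the real class, the same `try`/`except ValueError` wrapper around `await processor.process(question, context)` should apply, assuming a `ProcessingContext` instance is available.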
- _classify_question_type(question: str) -> Tuple[science_live.pipeline.common.QuestionType, float]
Classify the type of question and assess confidence.
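One plausible way to produce a `(type, confidence)` pair from a pattern table is sketched below. This is not the module's actual implementation: the `QuestionType` members, the regular expressions, and the scoring rule (fraction of a type's patterns that match) are all assumptions made for illustration.

```python
import re
from enum import Enum
from typing import Dict, List, Tuple


class QuestionType(Enum):
    """Hypothetical subset; the real enum lives in science_live.pipeline.common."""
    WHAT = "what"
    WHO = "who"
    WHERE = "where"
    GENERAL = "general"


# Assumed pattern table, shaped like the Dict[str, List[str]] that
# _initialize_patterns is documented to return.
PATTERNS: Dict[str, List[str]] = {
    "what": [r"^what\b", r"\bwhat (is|are)\b"],
    "who": [r"^who\b", r"\bwho (is|are|wrote)\b"],
    "where": [r"^where\b", r"\bwhere (is|are)\b"],
}


def classify_question_type(question: str) -> Tuple[QuestionType, float]:
    """Return the best-matching type and the fraction of its patterns that hit."""
    text = question.strip().lower()
    best_type, best_score = QuestionType.GENERAL, 0.0
    for name, patterns in PATTERNS.items():
        hits = sum(1 for p in patterns if re.search(p, text))
        score = hits / len(patterns)
        if score > best_score:
            best_type, best_score = QuestionType(name), score
    return best_type, best_score


print(classify_question_type("Who wrote this paper?"))
```

Questions that match no pattern fall back to the assumed `GENERAL` type with a confidence of 0.0.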
- _identify_potential_entities(question: str) -> List[str]
Identify potential entities in the question.
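The heuristics this method uses are not described here. As a rough sketch of what surface-level entity spotting can look like, the hypothetical function below collects quoted phrases and runs of capitalized words; both rules are assumptions, not the library's documented behavior.

```python
import re
from typing import List


def identify_potential_entities(question: str) -> List[str]:
    """Collect quoted phrases and capitalized word runs as entity candidates."""
    candidates: List[str] = []
    # Quoted phrases are taken verbatim.
    candidates += re.findall(r'"([^"]+)"', question)
    # Runs of capitalized words, skipping the first word of the sentence,
    # which is capitalized for grammatical reasons.
    body = question.split(" ", 1)[1] if " " in question else ""
    candidates += re.findall(r"\b[A-Z][\w-]*(?:\s+[A-Z][\w-]*)*", body)
    # De-duplicate while preserving order.
    return list(dict.fromkeys(candidates))


print(identify_potential_entities('What did Marie Curie publish about "radioactive decay"?'))
```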
- get_question_complexity(processed_question: science_live.pipeline.common.ProcessedQuestion) -> int
Assess question complexity on a 1-5 scale.
Args:
    processed_question: The processed question to analyze.
Returns:
    Complexity score from 1 (simple) to 5 (very complex).
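The signals that feed the score are not listed on this page. The sketch below shows one way to keep a heuristic score inside the documented 1-5 range; the two inputs (word count and entity count) and their weights are hypothetical.

```python
def question_complexity(word_count: int, entity_count: int) -> int:
    """Map two assumed signals onto the documented 1-5 scale."""
    score = 1
    score += word_count // 10   # one point per ten words (assumed weight)
    score += entity_count // 2  # one point per two entities (assumed weight)
    # Clamp so the result never leaves the 1-5 range.
    return max(1, min(5, score))


print(question_complexity(word_count=25, entity_count=3))
```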
- suggest_improvements(processed_question: science_live.pipeline.common.ProcessedQuestion) -> List[str]
Suggest improvements to make the question more processable.
Args:
    processed_question: The processed question to analyze.
Returns:
    List of suggestion strings.
- science_live.pipeline.question_processor.is_valid_question(question: str) -> bool
Check if a question is valid for processing.
- science_live.pipeline.question_processor.preprocess_question_batch(questions: List[str]) -> List[str]
Preprocess a batch of questions.
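Neither helper's rules are spelled out in this reference. The sketch below shows how the two documented signatures could fit together, under the assumptions that "valid" means a string with at least three non-whitespace-trimmed characters and that batch preprocessing drops invalid items and collapses whitespace. The `_sketch` suffixes mark these as hypothetical stand-ins for the real functions.

```python
from typing import List


def is_valid_question_sketch(question: str) -> bool:
    """Assumed rule: a string with at least three characters after trimming."""
    return isinstance(question, str) and len(question.strip()) >= 3


def preprocess_question_batch_sketch(questions: List[str]) -> List[str]:
    """Assumed behavior: drop invalid items, collapse whitespace in the rest."""
    return [" ".join(q.split()) for q in questions if is_valid_question_sketch(q)]


batch = preprocess_question_batch_sketch(["  What   is DNA? ", "", "  ", "Who?"])
print(batch)
```

Whether the real `preprocess_question_batch` filters invalid questions or preserves the input length is not stated here, so callers that rely on index alignment should check the source.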
- science_live.pipeline.question_processor.__version__ = '1.0.0'
- science_live.pipeline.question_processor.__author__ = 'Science Live Team'
- science_live.pipeline.question_processor.__description__ = 'Question processing step for Science Live pipeline'