Poker Vision and GTO Analysis Tool — Developer Specification

Objective

Develop an offline application that analyzes recorded poker footage, screenshots, or manually entered hand data to detect the game state and generate GTO-based recommendations for post-session review and training. The system is intended for study and hand analysis only. It must not provide live in-game assistance.

Scope

The application should:
• Detect hole cards, community cards, player positions, pot size, bet sizing, and approximate stack sizes from recorded media.
• Reconstruct the full hand state in a structured format.
• Pass the reconstructed state into a GTO analysis module or solver-based recommendation engine.
• Display recommendations, alternative lines, and relevant metrics in a simple user interface.
• Include testing, performance optimization, and documentation for long-term maintainability.

Recommended Technologies

• Python for rapid prototyping and backend development
• OpenCV for image and video processing
• PyTorch for model training and inference
• YOLO or a similar object detection model for card and chip recognition
• NumPy / Pandas for data processing
• PySide6 or Tkinter for the desktop interface
• Optional solver integration if precomputed strategy outputs are available

Development Plan

1. Define requirements and architecture

Before implementation, define the exact project goals, constraints, and module boundaries. The developer should specify:
• supported input types
• poker format(s)
• expected accuracy targets
• acceptable processing time
• target operating system(s)
• internal module interfaces

The system should be split into the following components:
• media input and frame extraction
• visual detection
• state reconstruction
• strategy analysis
• user interface
• testing and logging
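The internal module interfaces listed above are worth pinning down as code before any model work starts. A minimal sketch of the shared types and the contracts between the first two components — all names here are illustrative, not prescribed by this specification:

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple

@dataclass
class Frame:
    """A single extracted frame with provenance for debugging."""
    frame_id: int
    timestamp_ms: int
    image_path: str  # path to the normalized image on disk

@dataclass
class Detection:
    """One detected object: a card, a chip stack, or a UI region."""
    label: str                       # e.g. "As", "chip_stack", "pot_display"
    confidence: float                # model confidence in [0, 1]
    bbox: Tuple[int, int, int, int]  # x, y, width, height in frame pixels

class MediaInput(Protocol):
    """Contract for the media input and frame extraction component."""
    def extract_frames(self, path: str, interval_ms: int) -> List[Frame]: ...

class Detector(Protocol):
    """Contract for the visual detection component."""
    def detect(self, frame: Frame) -> List[Detection]: ...
```

Using `Protocol` rather than inheritance keeps the components swappable: a mock `Detector` can drive interface and reconstruction tests before any model is trained.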
2. Prepare and annotate datasets

Create or collect annotated datasets for:
• hole cards
• board cards
• chips or stack indicators
• table regions
• action areas
• pot and numeric displays

Tasks include:
• defining annotation rules
• labeling training images consistently
• creating train, validation, and test splits
• ensuring examples cover different lighting, resolutions, and layouts
• versioning the dataset for reproducibility

3. Build the media input module

Develop the media input layer to handle:
• recorded video files
• screenshots
• screen captures
• manually entered hand states for testing

This module should:
• extract frames at configurable intervals
• support cropping and region-of-interest selection
• normalize images before inference
• assign timestamps and frame identifiers for debugging

4. Implement card and stack detection

Train and deploy a visual detection model to identify:
• card ranks and suits
• community cards
• chip stacks or stack indicators
• relevant UI areas and seat locations

Tasks:
• train an object detection model such as YOLO
• add post-processing to reject invalid detections
• estimate stack sizes using visual chip counts or numerical overlays
• improve robustness against blur, occlusion, and inconsistent lighting

5. Reconstruct the poker state

Convert the detection outputs into a structured hand-state representation. The output schema should include:
• hero cards
• board cards
• seat positions
• stack sizes
• pot size
• current street
• action history
• bet sizes
• effective stack
• frame or timestamp reference

This layer should also:
• combine detections across multiple frames
• resolve conflicting information
• flag impossible game states
• support manual correction in the interface when confidence is low

6. Integrate the GTO analysis module

Once the hand state is reconstructed, feed it into a strategy engine.
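One possible shape for that engine, assuming the precomputed-solver-outputs option is chosen: a lookup table keyed on the situation. The key format (position | street | board bucket | hand | line) and all names below are assumptions for illustration; the real bucketing must match however the solver outputs were generated.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Recommendation:
    action: str                            # e.g. "bet_33", "check", "fold"
    frequency: float                       # how often the solver takes this line
    ev: float                              # EV taken from the precomputed output
    alternatives: List[Tuple[str, float]]  # other (action, frequency) pairs

class LookupStrategyEngine:
    """Serves recommendations from precomputed solver outputs.

    The table maps a situation key to {action: {"freq": ..., "ev": ...}}.
    """

    def __init__(self, table: dict):
        self.table = table

    @staticmethod
    def key(position: str, street: str, board_bucket: str,
            hand: str, line: str) -> str:
        return "|".join([position, street, board_bucket, hand, line])

    def recommend(self, position: str, street: str, board_bucket: str,
                  hand: str, line: str) -> Optional[Recommendation]:
        entry = self.table.get(
            self.key(position, street, board_bucket, hand, line))
        if entry is None:
            return None  # unmapped spot: fall back to manual review
        # Rank actions by solver frequency; the most frequent one leads.
        ranked = sorted(entry.items(), key=lambda kv: kv[1]["freq"], reverse=True)
        best_action, best = ranked[0]
        return Recommendation(
            action=best_action,
            frequency=best["freq"],
            ev=best["ev"],
            alternatives=[(a, v["freq"]) for a, v in ranked[1:]],
        )
```

Returning `None` for unmapped spots keeps the fallback decision (manual review, or a policy-approximation model) explicit at the call site instead of hiding it in the engine.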
Input fields should include:
• position
• stack depth
• action sequence
• board texture
• hand combination
• betting context

The analysis module should return:
• recommended action
• strategy frequency
• EV or decision quality metrics
• optional alternative actions for comparison

This module may use:
• precomputed solver outputs
• lookup tables
• a trained policy approximation model
• a hybrid rules-plus-model architecture

7. Build the user interface

Create a simple interface that allows the user to:
• upload media
• inspect detected cards and stacks
• review reconstructed hand states
• correct detection mistakes manually
• view solver recommendations clearly

The UI should prioritize:
• readability
• fast correction workflows
• traceability between frames and detected states
• easy debugging for developers

8. Perform end-to-end testing

After the interface is functional, run full pipeline tests. Validation should include:
• detection accuracy under different lighting and camera angles
• state reconstruction consistency
• reliability of the strategy output
• stability across different input sources
• acceptable processing time for offline review

9. Optimize performance

Once the pipeline works correctly, optimize:
• frame extraction speed
• inference time
• memory usage
• model size
• error recovery logic

The goal is to make the application stable, efficient, and usable on standard hardware.

10. Ensure continuous integration between modules

All modules must exchange data using clearly defined formats and interfaces. Add integration tests to ensure that changes in one component do not silently break another.

11. Run stress tests in different environments

Test the application on:
• different operating systems
• different screen resolutions
• different video qualities
• different table layouts
• longer sessions with many hands

This will improve robustness and reduce environment-specific failures.
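The "clearly defined formats" contract from step 10 can be enforced mechanically in the integration test suite. A sketch, assuming the modules exchange the hand state as a JSON object with an agreed field list (the field names here are illustrative, not the spec's final schema):

```python
import json

# Agreed field list for the shared hand-state payload (illustrative).
REQUIRED_FIELDS = {"hero_cards", "board_cards", "positions", "stacks",
                   "pot", "street", "action_history"}

def check_hand_state_payload(payload: str):
    """Validate a serialized hand state against the agreed interface.

    Returns a list of problems; an empty list means the payload conforms."""
    try:
        state = json.loads(payload)
    except ValueError as exc:
        return ["not valid JSON: %s" % exc]
    if not isinstance(state, dict):
        return ["payload must be a JSON object"]
    problems = []
    missing = REQUIRED_FIELDS - state.keys()
    if missing:
        problems.append("missing fields: %s" % sorted(missing))
    extra = set(state.keys()) - REQUIRED_FIELDS
    if extra:
        problems.append("unexpected fields: %s" % sorted(extra))
    return problems
```

Run a check like this on every payload crossing a module boundary in the integration tests, so a renamed or dropped field fails loudly instead of silently producing wrong recommendations.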
12. Plan for long-term model maintenance

The system should support future retraining and improvement. This includes:
• collecting new annotated examples
• retraining detection models on updated layouts
• maintaining versioned datasets and models
• documenting retraining procedures
• tracking model performance over time

Documentation Requirements

Every module must be documented so another developer can understand and maintain the project. Documentation should include:
• module purpose
• inputs and outputs
• dependencies
• known limitations
• test procedures
• setup instructions
• deployment notes

Delivery Checklist

1. Requirements definition and technology selection
2. Dataset preparation and annotation
3. Media capture and input pipeline
4. Card and stack detection model
5. State reconstruction layer
6. GTO analysis integration
7. User interface
8. Accuracy and latency testing
9. Performance optimization
10. Cross-module integration checks
11. Stress testing across environments
12. Model maintenance and retraining plan
13. Full technical documentation and handoff
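The versioned model and dataset tracking called for in step 12 can start as something very small before adopting a full experiment tracker. A sketch of an append-only JSON-lines registry — the file format, field names, and helper names are assumptions for illustration:

```python
import json
import time

def register_model(registry_path, name, version, dataset_version, metrics):
    """Append a model release record to a JSON-lines registry file."""
    record = {
        "name": name,                  # e.g. "card_detector"
        "version": version,            # model version string
        "dataset_version": dataset_version,  # links model to training data
        "metrics": metrics,            # e.g. {"card_mAP": 0.97}
        "registered_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
    }
    with open(registry_path, "a") as fh:
        fh.write(json.dumps(record) + "\n")
    return record

def latest_model(registry_path, name):
    """Return the most recently registered record for a model name."""
    latest = None
    with open(registry_path) as fh:
        for line in fh:
            rec = json.loads(line)
            if rec["name"] == name:
                latest = rec  # later lines overwrite earlier ones
    return latest
```

Because every record carries its `dataset_version`, a performance regression on a new table layout can be traced back to the exact training data that produced the deployed model.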