Beta Phase: Core Development

Timeline: November 8, 2025 - January 30, 2026 (12 weeks)
Target: End of January release in affiliate gyms

Overview

The Beta phase will transition from proof-of-concept to production-ready system, implementing the complete computer vision pipeline with enhanced accuracy and robustness. This phase will introduce gender classification, configurable evaluation profiles, and prepare for deployment in multiple affiliate gym environments. The expanded team will deliver a feature-complete system ready for controlled testing with HYROX staff and select affiliates. We'll shift focus from proving feasibility to building a reliable, scalable system that can operate in real-world gym conditions.

Key Activities

The Beta phase will expand development efforts across multiple workstreams to deliver a production-ready system. These activities will transform the Alpha proof-of-concept into a robust, scalable system suitable for real-world deployment in affiliate gyms.

Planning & Release Strategy

Scope Definition: We will finalize the feature set for the Beta release based on Alpha phase learnings and stakeholder feedback. We'll prioritize features that provide maximum value while maintaining technical feasibility within the timeline, conducting detailed requirements sessions with HYROX stakeholders to understand operational needs and judge workflows. Clear acceptance criteria will be defined for each feature to ensure alignment between development efforts and business expectations.

Production Hardware Selection: We will complete evaluation and procurement of production-grade cameras and edge computing hardware. Building on Alpha phase testing, we'll finalize specifications for cameras that balance cost, performance, and reliability for long-term deployment, and place purchase orders for pilot equipment with lead times factored into the project schedule. Vendor relationships will be established to ensure ongoing support and volume pricing for future phases.

Gym Selection: We will identify and prepare 3-5 affiliate gyms for pilot deployment, with a strong preference for Chicago-area locations to facilitate coordination of field tests and installations. We'll select locations with varying conditions to test system robustness. Selection criteria include network infrastructure quality, gym layout diversity, and staff technical capability for basic troubleshooting. Each selected gym will receive a site survey to document mounting locations, network access points, and potential challenges. Agreements will be established with gym owners covering equipment installation, data collection, and operational procedures.

Success Metrics Definition: Clear pilot success measurement criteria and KPIs will be established for gym deployments. Metrics will include technical performance (latency, accuracy), operational efficiency (setup time, support tickets), and user satisfaction (judge feedback, athlete acceptance). Baseline measurements will be established during initial deployments to track improvement over time. Regular reporting mechanisms will be implemented to communicate progress to stakeholders.

Feedback Process: We will establish a structured approach for collecting and incorporating user feedback from pilots. This will include regular check-ins with gym staff, automated error reporting from the system, and structured surveys for judges and athletes. A triage process will prioritize feedback based on impact and feasibility, ensuring critical issues are addressed quickly. All feedback will be tracked in a central system with clear resolution paths and communication back to users.

Computer Vision Enhancement

Advanced Positioning: We will harden squat/stand position detection for varied and challenging environments, including crowded gyms and inconsistent lighting. The system will be tested with multiple athletes in frame, partial occlusions, and the challenging camera angles common in real deployments, while advanced algorithms handle edge cases such as loose clothing or unusual body proportions. Performance will be validated across diverse populations to ensure fairness and consistency.

Gender Classification: The team will implement ML-based gender detection for applying appropriate squat depth standards per competition rules. The classification system will be trained on diverse datasets to minimize bias and ensure accurate classification across all demographics. Manual override capabilities will be implemented for cases where automatic classification is uncertain or incorrect. Privacy-preserving techniques will ensure no biometric data is stored while maintaining classification accuracy.

Multi-Athlete Tracking: We will develop robust algorithms to handle multiple athletes in frame with individual tracking and assessment. Each athlete will be assigned a unique tracking ID that persists throughout their workout, even if they temporarily leave the frame. The system will correctly associate each squat with the appropriate athlete and maintain separate rep counts and assessments. Spatial segmentation techniques will prevent confusion when athletes are in close proximity.
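To illustrate the association logic, the sketch below shows a minimal centroid-based tracker that keeps per-athlete IDs alive for a grace period when someone briefly leaves the frame. The Track structure, distance threshold, and grace period are illustrative assumptions, not the production algorithm.

```python
# Minimal centroid-based tracker sketch: assigns each detected athlete to the
# nearest existing track so reps stay associated with the right person.
# Names, thresholds, and the Track structure are illustrative assumptions.
from dataclasses import dataclass
from itertools import count
import math

_track_ids = count(1)

@dataclass
class Track:
    track_id: int
    centroid: tuple          # (x, y) of the athlete's hip midpoint
    rep_count: int = 0
    missed_frames: int = 0   # frames since last matched detection

def update_tracks(tracks, detections, max_dist=150, max_missed=90):
    """Match detections (list of (x, y) centroids) to existing tracks."""
    unmatched = list(detections)
    for track in tracks:
        if not unmatched:
            track.missed_frames += 1
            continue
        # Greedy nearest-neighbour association.
        nearest = min(unmatched, key=lambda d: math.dist(d, track.centroid))
        if math.dist(nearest, track.centroid) <= max_dist:
            track.centroid = nearest
            track.missed_frames = 0
            unmatched.remove(nearest)
        else:
            track.missed_frames += 1
    # New athletes entering the frame get fresh IDs.
    tracks.extend(Track(next(_track_ids), d) for d in unmatched)
    # Drop tracks only after a grace period, so brief exits keep their ID.
    return [t for t in tracks if t.missed_frames <= max_missed]
```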

Occlusion Handling: The system will implement robust algorithms for partial visibility scenarios common in crowded gym environments. Pose estimation will continue functioning even when some body parts are temporarily occluded by equipment or other athletes. Confidence scores will appropriately reflect uncertainty in occluded scenarios while avoiding false negatives for valid squats. Recovery mechanisms will quickly re-establish tracking when full visibility returns.
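A minimal sketch of the confidence-gating idea, assuming a pose estimator that returns per-joint confidences; the joint names and the 0.3 cutoff are placeholders:

```python
# Confidence-gated keypoint use during partial occlusion: low-confidence
# joints are ignored, and no depth decision is issued until the hip/knee
# joints are visible again, rather than calling a false no-rep.
MIN_KEYPOINT_CONF = 0.3
REQUIRED_JOINTS = {"left_hip", "right_hip", "left_knee", "right_knee"}

def usable_keypoints(pose: dict) -> dict:
    """pose maps joint name -> (x, y, confidence)."""
    return {name: (x, y) for name, (x, y, c) in pose.items()
            if c >= MIN_KEYPOINT_CONF}

def can_assess(pose: dict) -> bool:
    """Hold the assessment while key joints are occluded; resume when visible."""
    return REQUIRED_JOINTS.issubset(usable_keypoints(pose))
```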

Portal Development

Session Logs Export: We will build automatic session logging with export in JSON and CSV formats for analysis and record-keeping. Each session will capture comprehensive data including rep counts, timing, depth measurements, and confidence scores for every squat attempt. Logs will be structured to support both human review and automated analysis, with clear schemas documented. Export functionality will support integration with existing HYROX data systems and third-party analytics tools.
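For illustration, a possible per-rep record and exporter might look like the sketch below; the field names and schema are assumptions and will be superseded by the documented Beta schema.

```python
# Sketch of a per-rep session log record and its JSON/CSV export.
# Field names are illustrative placeholders, not the final schema.
import csv
import json
from dataclasses import dataclass, asdict

@dataclass
class RepRecord:
    session_id: str
    athlete_id: int
    rep_number: int
    timestamp_ms: int
    hip_depth_cm: float      # measured depth at the bottom of the squat
    decision: str            # "good_rep" or "no_rep"
    confidence: float        # 0.0 - 1.0 model confidence

def export_session(reps, json_path, csv_path):
    rows = [asdict(r) for r in reps]
    with open(json_path, "w") as f:
        json.dump(rows, f, indent=2)
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
```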

Evaluation Profiles: The system will support configurable thresholds, rep count rules, and decision tolerances for different competition levels. Profiles can be adjusted for different age groups, competition categories, or training versus competition scenarios. Changes to profiles will take effect immediately without system restart, allowing real-time adjustments during events. A profile management interface will allow authorized users to create, modify, and activate different configurations.
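A minimal sketch of what a profile might contain, assuming threshold names and values that are placeholders rather than official HYROX rules:

```python
# Illustrative evaluation-profile structure; field names and values are
# placeholders for the configurable thresholds, rep rules, and tolerances.
from dataclasses import dataclass

@dataclass(frozen=True)
class EvaluationProfile:
    name: str
    knee_angle_threshold_deg: float  # angle that counts as sufficient depth
    required_reps: int
    decision_tolerance_deg: float    # margin allowed before a no-rep is called

PROFILES = {
    "open_male": EvaluationProfile("open_male", 90.0, 100, 2.0),
    "open_female": EvaluationProfile("open_female", 90.0, 100, 2.0),
    "training": EvaluationProfile("training", 95.0, 50, 5.0),
}

def activate_profile(name: str) -> EvaluationProfile:
    """Switching profiles is an in-memory swap, so no restart is needed."""
    try:
        return PROFILES[name]
    except KeyError:
        raise ValueError(f"Unknown profile: {name}") from None
```

Because profiles are plain data rather than code, activating a different profile is an in-memory swap, which is what allows changes to take effect during an event without a system restart.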

Gender Override Controls: We will implement manual override capabilities for gender classification when automatic detection is uncertain. The override interface will be simple and quick to use, designed for minimal disruption to judge workflow during competitions. Override decisions will be logged for audit purposes and to improve the automatic classification system. Access controls will ensure only authorized personnel can make override decisions.

Administrator Dashboard: A comprehensive admin interface will be developed for system configuration, monitoring, and management. The dashboard will provide real-time visibility into all active stations, system health metrics, and any alerts requiring attention. Configuration changes can be pushed to multiple stations simultaneously with validation to prevent errors. Historical data and trends will help administrators identify patterns and optimize system performance.

Real-time Monitoring: We will build live monitoring capabilities for multiple stations with centralized visibility and control. Event coordinators can view all stations on a single screen with status indicators for each target zone. Automated alerts will notify staff of any issues requiring intervention such as camera disconnection or processing delays. Performance metrics will be aggregated across stations to identify systemic issues versus isolated problems.

Hardware Deployment

Equipment Sourcing: We will source and procure off-the-shelf equipment for each pilot site based on Beta specifications, including industrial cameras and NVIDIA Jetson devices. Orders will be placed with sufficient lead time to account for shipping and any potential delays in availability. Quality control processes will verify all equipment meets specifications before deployment to pilot sites. Spare equipment will be procured to handle failures and enable rapid replacement during pilots.

Hardware Standardization: The team will determine the optimal number of cameras and computing units per station for production deployment. Standardization will balance coverage quality with cost, settling on configurations that provide reliable detection without unnecessary redundancy. Documentation will clearly specify mounting positions, camera angles, and optimal distances for consistent setup across venues. Power and networking requirements will be standardized to simplify venue preparation.

Deployment Kits: We will create standardized hardware packages that ensure consistent deployment across all pilot locations. Each kit will include all necessary equipment, cables, mounting hardware, and tools needed for installation. Clear labeling and organization will minimize setup time and reduce the chance of installation errors. Protective cases will ensure equipment arrives undamaged and can be safely stored between events.

Remote Management: The system will implement remote monitoring and configuration capabilities for all deployed hardware using Azure IoT Edge or similar platforms. Administrators can check device status, update firmware, and modify configurations without physical access to equipment. Diagnostic tools will enable remote troubleshooting to resolve issues without on-site visits when possible. Secure tunnels will protect remote access while maintaining system security.

Integration & Testing

Camera-to-CV Pipeline: We will establish a robust connection between cameras and CV processing units with automatic recovery from disconnections. The pipeline will handle multiple video formats and automatically adapt to different camera capabilities. Buffering strategies will smooth out network jitter while maintaining low latency for real-time processing. Comprehensive error handling will log issues for debugging while maintaining system operation.
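As an illustration of the recovery behaviour, a capture loop might look like the sketch below, assuming an OpenCV (cv2) RTSP source; the URL handling and back-off delay are placeholders, and logging and metrics are omitted for brevity.

```python
# Capture loop with automatic reconnection: when the stream drops, the loop
# releases the handle, waits briefly, and re-opens the source.
import time
import cv2

def frames(rtsp_url: str, reconnect_delay_s: float = 2.0):
    """Yield frames indefinitely, re-opening the stream whenever it drops."""
    while True:
        cap = cv2.VideoCapture(rtsp_url)
        if not cap.isOpened():
            time.sleep(reconnect_delay_s)
            continue
        while True:
            ok, frame = cap.read()
            if not ok:            # camera disconnected or stream stalled
                break
            yield frame
        cap.release()
        time.sleep(reconnect_delay_s)  # brief pause before reconnecting
```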

Portal Deployment: The system will enable both local and remote portal access with secure authentication and role-based permissions, potentially leveraging WebRTC for real-time video streaming and WebSocket for low-latency communication. Local access will provide low-latency interaction for judges while remote access enables administration and monitoring. SSL certificates and encrypted communications will protect data in transit between components. Load balancing will ensure responsive performance even with multiple simultaneous users.

Target System Integration: We will connect CV output to existing HYROX scoring and judging systems through well-defined APIs. Integration will be thoroughly tested to ensure data formats are compatible and timing requirements are met. Fallback mechanisms will handle API failures gracefully without disrupting the judging process. Documentation will enable HYROX technical staff to maintain and modify integrations as needed.
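A hedged sketch of the fallback idea, assuming a hypothetical scoring endpoint and payload; the real HYROX API contract will be defined during integration.

```python
# Push a decision to the scoring API with retries; if the API is unreachable,
# queue the decision locally so the judged result is delayed, never lost.
# The endpoint URL and payload shape are illustrative assumptions.
import time
import requests

PENDING = []  # decisions held locally while the scoring API is unreachable

def push_decision(decision: dict, url: str = "https://scoring.example/api/v1/reps") -> bool:
    for attempt in range(3):
        try:
            resp = requests.post(url, json=decision, timeout=2)
            resp.raise_for_status()
            return True
        except requests.RequestException:
            time.sleep(0.5 * (attempt + 1))  # simple linear back-off
    PENDING.append(decision)
    return False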

End-to-End Testing: Comprehensive testing will validate the complete workflow from athlete movement to score recording. Test scenarios will cover normal operations, edge cases, and failure conditions to ensure robust operation. Performance testing will verify the system maintains required latency under maximum load conditions. User acceptance testing with HYROX staff will validate the system meets operational requirements.

Deliverables

Technical Deliverables

Production-Quality Pipeline: A complete computer vision pipeline processing 60fps video with production-level reliability and performance. Note that as a Beta phase deliverable, this represents a stable candidate architecture that may require refinements based on pilot feedback, distinct from a fully production-ready system. The pipeline will be optimized for edge deployment with efficient resource utilization and thermal management. Comprehensive logging and monitoring will provide visibility into pipeline performance and health. Documentation will detail the pipeline architecture, data flow, and optimization techniques employed.

ML-Based Gender Detection Module: An accurate gender classification system achieving 95% accuracy, developed using TensorFlow or PyTorch frameworks with privacy-preserving implementation. The system will handle edge cases gracefully with confidence scores indicating when manual review may be needed. No biometric data will be stored, with classification results used only for applying appropriate competition standards. The classification model will be regularly updated based on performance monitoring and feedback.
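A minimal PyTorch inference sketch showing how a confidence threshold could route uncertain cases to manual review; the model, input shape, label order, and 0.90 threshold are illustrative assumptions, and nothing about the input image is persisted.

```python
# Confidence-gated classification: below the review threshold, the result is
# flagged for the judge's manual override path instead of being applied.
import torch
import torch.nn.functional as F

REVIEW_THRESHOLD = 0.90

def classify(model, crop: torch.Tensor) -> dict:
    """crop: preprocessed athlete image tensor of shape (1, 3, H, W)."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(crop), dim=1)[0]
    confidence, idx = torch.max(probs, dim=0)
    label = ["female", "male"][idx.item()]
    return {
        "label": label,
        "confidence": confidence.item(),
        "needs_manual_review": confidence.item() < REVIEW_THRESHOLD,
    }
```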

Multi-Athlete Tracking: Robust tracking supporting 3-5 simultaneous athletes with persistent identification throughout their workout. The system will maintain accurate rep counts for each athlete even in crowded, dynamic environments. Visual feedback will clearly indicate which athlete is being tracked and assessed at any moment. Performance will degrade gracefully when the maximum number of athletes is exceeded.

Configurable Evaluation: A flexible evaluation system with adjustable thresholds supporting different competition rules and categories. Configuration changes will be validated to prevent invalid settings that could compromise competition integrity. Multiple profiles can be stored and quickly activated for different event types or training scenarios. All configuration changes will be logged for audit and troubleshooting purposes.

Software Deliverables

Complete Web Portal: Full-featured admin and judge interfaces supporting all required workflows and use cases. The portal will be responsive and work across different devices including tablets used by judges during events. Real-time updates will ensure all users see current information without manual refresh. Comprehensive help documentation will be integrated directly into the interface.

API Implementation: Production-ready APIs for third-party integrations with comprehensive documentation and examples. APIs will follow RESTful principles with clear versioning to support future evolution. Rate limiting and authentication will protect against abuse while ensuring legitimate usage is not impacted. Client libraries in common languages will accelerate integration development.
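As a sketch of the intended style (versioned routes, typed responses), an endpoint might look like the following FastAPI example; the route, fields, and framework choice are assumptions, not the final API contract.

```python
# Versioned, typed REST endpoint sketch; the StationStatus fields are
# placeholders for whatever the production API ultimately exposes.
from typing import Optional
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Squat Judging API", version="1.0")

class StationStatus(BaseModel):
    station_id: str
    online: bool
    active_athletes: int
    last_decision: Optional[str] = None

@app.get("/api/v1/stations/{station_id}", response_model=StationStatus)
def get_station(station_id: str) -> StationStatus:
    # In production this would query the live station registry.
    return StationStatus(station_id=station_id, online=True,
                         active_athletes=2, last_decision="good_rep")
```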

Automated Testing Suite: Comprehensive test coverage achieving 80% code coverage with automated execution in CI/CD pipeline. Tests will include unit tests for individual components, integration tests for component interactions, and end-to-end tests for complete workflows. Performance tests will validate latency and throughput requirements are maintained. Test results will be automatically reported with clear indication of any failures.
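Tests will follow standard pytest conventions; the example below uses a hypothetical judge_rep helper to show the intended shape of parameterized unit tests.

```python
# Parameterized unit test sketch for the evaluation logic. The judge_rep
# function and its 90-degree threshold are stand-ins for the real module.
import pytest

def judge_rep(knee_angle_deg: float, threshold_deg: float = 90.0) -> str:
    return "good_rep" if knee_angle_deg <= threshold_deg else "no_rep"

@pytest.mark.parametrize("angle,expected", [
    (85.0, "good_rep"),    # clearly at depth
    (90.0, "good_rep"),    # exactly at threshold counts
    (100.0, "no_rep"),     # shallow squat
])
def test_judge_rep(angle, expected):
    assert judge_rep(angle) == expected
```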

Deployment Automation: Scripts and tools for automated deployment reducing setup time and potential for human error. Deployment packages will be self-contained with all dependencies included or automatically resolved. Rollback capabilities will enable quick recovery if issues are discovered after deployment. Configuration management will ensure consistency across all deployed instances.

Hardware Deliverables

Standardized Hardware Kits: Complete specifications for production hardware with validated performance and reliability. Each kit will be documented with detailed parts lists, assembly instructions, and configuration procedures. Cost optimization will be achieved through bulk purchasing agreements and standardized components. Quality assurance procedures will ensure all kits meet specifications before deployment.

Installation Guides: Step-by-step installation guides with photos and diagrams for consistent setup across venues. Guides will cover multiple mounting scenarios to handle different venue configurations. Troubleshooting sections will address common installation issues with clear resolution steps. Video tutorials will supplement written guides for complex procedures.

Network Architecture: Documented network requirements and configuration for reliable multi-station deployments. Architecture will support both wired and wireless connectivity with appropriate security measures. Bandwidth requirements will be clearly specified to ensure venues can support the system. Network diagnostic tools will help identify and resolve connectivity issues.

Remote Management Tools: Complete tools for remote monitoring, configuration, and troubleshooting of deployed systems. Dashboards will provide at-a-glance status of all deployed equipment with drill-down for details. Automated alerts will notify administrators of issues requiring attention before they impact operations. Remote access will be secured with multi-factor authentication and audit logging.

Documentation

System Administration Guide: Draft guide for system administrators covering operational procedures, with final version relying heavily on automation through Ansible and infrastructure-as-code practices. The guide will include day-to-day operations, troubleshooting procedures, and performance optimization techniques. Security best practices will be documented to ensure system integrity is maintained. Regular maintenance procedures will be outlined to ensure long-term reliability.

Judge Training Materials: Draft training package for judges developed as a shared responsibility between our team, HYROX, and selected pilot judges, including written materials, videos, and hands-on exercises. Materials will cover system operation, common scenarios, and troubleshooting basic issues. Quick reference cards will provide easy access to key information during events. Training effectiveness will be validated through feedback and performance monitoring.

Technical API Documentation: Detailed API documentation with examples enabling third-party integrations and custom development. Documentation will include authentication procedures, endpoint descriptions, and example requests/responses. Rate limits and best practices will be clearly communicated to prevent integration issues. A developer portal will provide interactive API testing capabilities.

Troubleshooting Procedures: Step-by-step procedures for diagnosing and resolving common issues. Procedures will be organized by symptom to enable quick problem identification. Each procedure will include expected resolution time to set appropriate expectations. Escalation paths will be defined for issues requiring vendor or development team support.

Pilot Program

Affiliate Gym Deployments: Successfully deployed systems at 3-5 affiliate gyms with different characteristics and challenges. Each deployment will validate installation procedures and identify venue-specific considerations. Pilot gyms will provide diverse testing environments including different layouts, lighting, and network conditions. Lessons learned from each deployment will be incorporated into procedures and documentation.

Performance Metrics: Comprehensive metrics from real-world usage validating system performance and reliability. Metrics will cover technical performance (latency, accuracy), operational efficiency (setup time, support needs), and user satisfaction. Data will be collected continuously and analyzed to identify trends and improvement opportunities. Regular reports will communicate pilot progress to stakeholders.

User Feedback Compilation: Structured feedback from judges, athletes, and gym staff identifying strengths and improvement areas. Feedback will be collected through surveys, interviews, and observation of system usage. Common themes will be identified and prioritized for addressing in subsequent phases. Positive feedback will be documented for marketing and training purposes.

Improvement Recommendations: Detailed recommendations for Gamma phase based on pilot learnings and feedback. Recommendations will be prioritized based on impact and feasibility within project constraints. Technical improvements will be balanced with operational and user experience enhancements. Clear success criteria will be defined for each recommended improvement.

Staffing Summary

The Beta phase expands the team to eight roles over 12 weeks with a weekly cost of $62,550 and a total phase investment of $750,600.

The Technical Lead continues providing architectural guidance at 100% allocation for $11,000 weekly, ensuring consistent system design as complexity increases.

The Computer Vision AI Lead drives core algorithm development at full allocation for $10,000 per week, focusing on the advanced positioning and multi-athlete tracking capabilities needed for real-world deployment.

The Machine Learning Engineer moves to full-time allocation at $9,000 weekly for intensive model training and optimization work, particularly around gender classification systems.

The Edge Computing Engineer joins at 100% allocation for $8,500 per week to handle performance optimization as processing demands increase significantly.

The Platform Engineer establishes deployment infrastructure at 50% allocation for $4,250 weekly, building the foundation for scalable operations.

The new Full-Stack Engineer builds the complete portal and API layer at 100% allocation for $7,500 per week, creating the user interfaces that will support production operations.

The QA Engineer ensures quality across all components at 50% allocation for $3,500 weekly, establishing testing protocols that will scale to production.

The Project Manager coordinates the expanded team at full allocation for $8,800 per week, managing the increased complexity of pilot deployments and stakeholder coordination. This team composition balances development velocity with quality assurance to deliver a production-ready system.

| Role | Responsibility | Allocation | Weekly Rate | Phase Cost |
| --- | --- | --- | --- | --- |
| Technical Lead | Production Architecture & Guidance | 100% | $11,000 | $132,000 |
| Computer Vision AI Lead | Advanced Algorithm Development | 100% | $10,000 | $120,000 |
| Machine Learning Engineer | Gender Classification & Training | 100% | $9,000 | $108,000 |
| Edge Computing Engineer | Performance Optimization | 100% | $8,500 | $102,000 |
| Platform Engineer | Deployment Infrastructure | 50% | $8,500 | $51,000 |
| Full-Stack Engineer | Portal & API Development | 100% | $7,500 | $90,000 |
| QA Engineer | Testing Strategy & Validation | 50% | $7,000 | $42,000 |
| Project Manager | Pilot Coordination & Planning | 100% | $8,800 | $105,600 |
| Total | | | | $750,600 |

Success Criteria

Performance Metrics

Latency Achievement: The system must consistently achieve sub-200ms end-to-end latency at the 90th percentile (p90) in production environments, with p99 targets to be established in the Gamma phase. This covers the complete pipeline from video frame capture to decision display on the judge interface. Performance must be maintained even under full load with multiple athletes and all features enabled. Network variability and other real-world factors must be accounted for in measurements.
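For reference, the p90 figure would be computed over capture-to-display measurements roughly as in the sketch below; the sample values are illustrative only.

```python
# 90th-percentile latency over capture-to-display samples (milliseconds).
# Real measurements would come from pipeline logs, not a hard-coded list.
def p90(latencies_ms):
    ordered = sorted(latencies_ms)
    idx = max(0, int(round(0.9 * len(ordered))) - 1)  # nearest-rank method
    return ordered[idx]

samples = [120, 135, 150, 160, 170, 180, 185, 190, 195, 210]
print(p90(samples))  # 195 ms, within the sub-200 ms target
```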

Accuracy Validation: We must achieve 90% correct squat assessment accuracy when compared to expert human judges. The system should have false positive rates below 5% and false negative rates below 10% to ensure fair competition. Accuracy must be consistent across different demographics, body types, and clothing choices. Edge cases should be handled gracefully with appropriate confidence scoring.
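These targets translate directly into standard confusion-matrix rates computed from judge-versus-system comparisons; a minimal sketch of the computation, with placeholder counts, follows.

```python
# Accuracy, false positive rate, and false negative rate from judge-vs-system
# comparisons; the example counts are placeholders, not pilot results.
def rates(tp: int, fp: int, tn: int, fn: int):
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # target: at least 90%
    false_positive_rate = fp / (fp + tn)        # target: below 5%
    false_negative_rate = fn / (fn + tp)        # target: below 10%
    return accuracy, false_positive_rate, false_negative_rate

print(rates(tp=430, fp=20, tn=520, fn=30))  # (0.95, ~0.037, ~0.065)
```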

Reliability Target: The system must maintain 99% uptime during pilot testing periods with automatic recovery from failures. Any downtime should be limited to non-critical components with core judging functionality preserved. Mean time to recovery for any failures should be less than 5 minutes. Comprehensive logging must enable rapid diagnosis and resolution of any issues.

Scalability Proof: We must successfully support 5-10 simultaneous stations per venue without performance degradation. The system architecture must demonstrate ability to scale to full event requirements in later phases. Resource utilization should remain below 70% even at maximum load to ensure headroom exists. Load testing must validate performance under 2x expected capacity.

User Acceptance: We need positive feedback from 80% of pilot users including judges, athletes, and gym staff. The system must integrate smoothly into existing workflows without significant disruption. Training time for new users should be less than 30 minutes for basic operation. Support ticket volume should decrease over time as users become familiar with the system.

Risks & Mitigations

Technical Risks

Gender Classification Accuracy: The classification system may exhibit bias or errors in edge cases affecting competition fairness, while also navigating GDPR and privacy regulations around gender data processing. Certain demographics or presentation styles might be consistently misclassified by the ML model. To mitigate, we'll implement manual override capabilities accessible through simple judge interface. We'll also continuously collect training data to improve model accuracy and reduce bias over time.

Multi-Athlete Confusion: The system may incorrectly associate squats with wrong athletes in crowded scenarios. Tracking IDs might be lost or swapped when athletes cross paths or leave frame temporarily. We'll implement robust spatial segmentation and tracking algorithms with visual confirmation of athlete assignment. Clear visual indicators will show judges which athlete is being tracked and allow manual correction if needed.

Deployment Risks

Affiliate Gym Cooperation: Selected gyms may be reluctant to participate or unable to support pilot requirements. Technical limitations, staff availability, or member concerns could impact pilot execution. We'll provide incentives including free equipment and priority support during pilot period. We'll also maintain a backup list of potential pilot sites in case primary selections fall through.

Network Infrastructure: Gym networks may not support system bandwidth and latency requirements. Shared WiFi with members, poor cellular coverage, or restrictive firewalls could impact system operation. We'll design for cellular backup and implement efficient compression to minimize bandwidth needs. We'll also provide detailed network requirement documentation for IT staff at pilot sites.

Scale Risks

Hardware Procurement: Supply chain delays could impact availability of specialized cameras or computing equipment. Global chip shortages or shipping delays might affect project timeline. We'll place orders early with buffer stock to handle potential delays or failures. We'll also identify alternative suppliers and equipment options that meet minimum requirements.

Support Burden: High support requirements during pilot phase could overwhelm available resources. Multiple simultaneous deployments might create competing priorities for limited support staff. We'll create comprehensive self-service documentation and remote diagnostic capabilities. We'll also establish clear escalation procedures and potentially augment support team if needed.