
Alpha Phase: Foundation & Proof

Timeline: September 1 - November 7, 2025 (10 weeks)
Target: Early November release in the Chicago Lab

Overview

The Alpha phase will establish the technical foundation through a working session approach, focusing on proving core capabilities and validating our technical approach. This phase will deliver a functional prototype demonstrating real-time squat detection with basic accuracy, operating in a controlled Chicago lab environment. We will validate that computer vision-based squat tracking is technically feasible and can meet HYROX's performance requirements. This proof-of-concept phase will determine whether to proceed with full development or pivot to alternative approaches.

Key Activities

The Alpha phase will encompass five major activity streams that establish the foundation for all subsequent development. These activities will validate technical feasibility while building the infrastructure and processes needed for full-scale development.

Planning & Discovery

Technical Discovery will involve conducting a thorough analysis of HYROX's current technology stack to understand integration requirements and constraints. We'll examine existing scoring systems, network infrastructure, and operational workflows at events while identifying critical integration points and documenting API requirements. Our technical team will compile a comprehensive list of technical questions that need resolution before proceeding to Beta phase. This discovery process will ensure we build a system that seamlessly integrates with HYROX's existing ecosystem rather than requiring wholesale changes to current operations.

Program Setup will establish a complete development environment including source control repositories and collaboration tools. We'll defer CI/CD pipeline implementation until the Platform Engineer joins in the Beta phase, but will configure secure access controls for all team members and stakeholders. We'll ensure proper separation between development, staging, and production environments while setting up project management tools, communication channels, and documentation wikis to facilitate efficient collaboration across the distributed team. This infrastructure investment early in the project will pay dividends throughout all subsequent phases.

Computer Vision Development

Squat Depth Detection will involve developing and validating core algorithms for detecting squat depth using advanced pose estimation techniques, with RTMPose as our baseline model due to its superior real-time performance. The initial implementation will focus on binary detection (valid squat vs. no squat) with clear visual feedback showing joint positions and angles. Our computer vision team will test multiple pose estimation models, including MediaPipe, OpenPose, and RTMPose, and will explore ONNX for model interoperability, to determine which provides the best balance of accuracy and performance. This foundational work will establish the technical approach that will be refined throughout subsequent phases.
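As an illustration of the binary depth check described above, a minimal sketch assuming 2D keypoints in image coordinates (y grows downward); the 100° knee-angle threshold and the joint names are placeholders, not validated values:

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by points a-b-c, each an (x, y) tuple."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def is_valid_squat(hip, knee, ankle, max_knee_angle=100.0):
    """Binary depth check: knee angle at or below threshold AND hip at/below knee.

    Image coordinates: y grows downward, so "hip below knee" means hip_y >= knee_y.
    """
    angle = joint_angle(hip, knee, ankle)
    return angle <= max_knee_angle and hip[1] >= knee[1]
```

A deep position such as hip (0.5, 0.62), knee (0.4, 0.6), ankle (0.42, 0.9) passes; a standing pose with the joints vertically aligned does not.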

Live Video Processing will implement near real-time tracking overlays that display pose estimation confidence scores and joint positions directly on the video feed. These overlays will show skeleton tracking, depth measurements, and confidence indicators that help validate the system's decision-making process. The implementation will commit to processing at a minimum of 15 fps with visual feedback updated in near real-time. This capability will demonstrate to stakeholders that the system can process live video streams effectively while providing transparent insight into how decisions are made.
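To make the 15 fps floor measurable during development, a small rolling-rate helper could be used; `FpsMeter` and its window size are hypothetical names and values, not part of the planned system:

```python
from collections import deque

class FpsMeter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30):
        self.stamps = deque(maxlen=window)

    def tick(self, t):
        """Record a frame timestamp in seconds; return the current fps estimate."""
        self.stamps.append(t)
        if len(self.stamps) < 2:
            return 0.0
        span = self.stamps[-1] - self.stamps[0]
        # N stamps cover N-1 inter-frame intervals.
        return (len(self.stamps) - 1) / span if span > 0 else 0.0
```

Feeding it timestamps spaced 50 ms apart converges on 20 fps, comfortably above the floor; the overlay could flag any window that dips below 15.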

Pre-recorded Testing will evaluate system performance using a curated dataset of reference videos and pose images captured in controlled environments. This testing will include videos with varying lighting conditions, different camera angles, and athletes of different body types to ensure robust detection. The team will build a comprehensive test suite that can be reused throughout development for regression testing. These controlled tests will establish baseline performance metrics that we'll improve upon in subsequent phases.

Confidence Visualization will display confidence thresholds and accuracy metrics in real-time overlays, making the system's decision process transparent. Visual indicators will show when the system has high confidence in its assessment versus when human review might be needed, including heat maps showing which body parts are being tracked most reliably and graphs displaying confidence scores over time. This transparency will build trust with stakeholders and help identify areas needing improvement.
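The high-confidence versus needs-human-review distinction could be prototyped as a simple threshold map over per-joint confidences; the 0.8/0.5 cutoffs and status labels below are illustrative only:

```python
def review_status(joint_conf, high=0.8, low=0.5):
    """Map per-joint confidences to an overlay status.

    joint_conf: dict of joint name -> confidence in [0, 1].
    Returns "auto" when every tracked joint clears the high threshold,
    "review" when the weakest joint falls between low and high,
    and "reject" when any joint drops below low.
    """
    worst = min(joint_conf.values())
    if worst >= high:
        return "auto"
    if worst >= low:
        return "review"
    return "reject"
```

The overlay could then color the skeleton green/amber/red from this status, making the decision boundary visible to stakeholders.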

Portal Development

Video Management will build recording and playback capabilities that allow for training data collection and post-event review. The system will automatically record all processed video streams with metadata including timestamps, detection results, and confidence scores while providing a web-based interface for easy access to recorded sessions. This infrastructure will be essential for continuous improvement of the detection algorithms and provide valuable data for troubleshooting.

Reporting Dashboard will display key metrics including detection accuracy, processing times, and system resource utilization. Without a Full-Stack Engineer in the Alpha phase, this will be a basic implementation that will be enhanced in subsequent phases. Real-time dashboards will show system health, current processing load, and any error conditions that require attention while historical reports track performance trends over time. These reports will provide stakeholders with clear visibility into system performance and reliability.

Logging Infrastructure will implement basic logging that captures essential system events for debugging, with configurable verbosity levels for development versus production use. Comprehensive logging infrastructure, including structured formats, consistent schemas for automated analysis, and alerting on error conditions, will be deferred until the Platform Engineer joins in the Beta phase. This infrastructure will be critical for troubleshooting issues and understanding system behavior in different scenarios.
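A minimal Alpha-phase setup along these lines, using Python's standard `logging` module with a verbosity switch; the logger name and record format are placeholders:

```python
import logging

def make_logger(verbose=False):
    """Basic Alpha-phase logger: plain-text records with timestamps and levels.

    Structured (e.g. JSON) formatting and automated alerting are deferred
    to the Beta phase per the plan above.
    """
    logger = logging.getLogger("squat.alpha")  # placeholder name
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```

The `verbose` flag gives the two verbosity levels the plan calls for without any extra infrastructure.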

Manual Annotation Tools will include scaffolding for annotation interfaces to manually tag squat attempts. We plan to integrate external tools like V7 Darwin for establishing ground truth in the Beta phase, noting the per-user licensing costs for professional annotation platforms. Expert reviewers will be able to mark squat depth, identify error cases, and provide ground truth labels for algorithm training, while the annotation interface will support efficient workflows including keyboard shortcuts and batch processing. This human-in-the-loop approach will ensure continuous improvement of detection accuracy based on real-world data.

Hardware Setup

Camera Testing & Validation: We will evaluate multiple camera options to determine optimal specifications for field of view, frame rate, and low-light performance, accounting for industrial camera lead times in our procurement schedule. Initial testing will include consumer iPhones and Android devices to assess whether mobile solutions can meet our accuracy and latency requirements, though we expect dedicated cameras will be necessary. The Alpha phase will use temporary mounting systems (tripods, monopods, GorillaPods) for flexibility, with participants advised to handle equipment carefully as setups won't yet be ruggedized. Testing will include consumer webcams, industrial cameras, and specialized computer vision cameras to find the best cost-performance balance. Each camera will be tested for image quality, latency, reliability, and compatibility with our processing pipeline. The evaluation will produce a detailed comparison matrix with recommendations for production deployment.

Computing Platform: The team will initially rely on consumer laptops such as the Lenovo Legion series for development and testing, potentially utilizing Google Colab for cloud-based prototyping, then transition to NVIDIA Jetson or similar edge computing devices to validate production deployment capabilities. We'll benchmark different hardware configurations to ensure they can handle the required processing load while maintaining low latency. This will include testing thermal performance under sustained load and evaluating power consumption in typical deployment scenarios. The selected platform must provide headroom for additional features planned in later phases.

Lab Configuration: We will deploy a single-station setup in the Chicago lab using off-the-shelf equipment with temporary mounting systems to prove the concept. This will include mounting cameras at optimal angles, setting up proper lighting, and creating a controlled testing environment. The lab setup will replicate key aspects of a real gym environment while maintaining controlled conditions for consistent testing. This configuration will serve as the reference implementation for future deployments.

Integration

Camera-to-CV Connection: We will establish a reliable video pipeline from cameras to computer vision processing with minimal latency. This will include implementing efficient video capture, buffering strategies, and error handling for dropped frames or connection issues. The pipeline will support multiple video formats and resolutions while maintaining consistent performance. Robust error recovery ensures the system continues operating even when individual components experience temporary issues.
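The buffering and dropped-frame handling described above might be sketched as a bounded queue that evicts the oldest frame and counts drops; the capacity and class name are assumptions for illustration:

```python
from collections import deque

class FrameBuffer:
    """Bounded buffer between capture and CV processing.

    Drops the oldest frame when full so the consumer always sees the
    freshest video, trading counted (and loggable) drops for bounded
    latency instead of unbounded queue growth.
    """

    def __init__(self, capacity=4):
        self.frames = deque(maxlen=capacity)
        self.dropped = 0

    def push(self, frame):
        if len(self.frames) == self.frames.maxlen:
            self.dropped += 1  # oldest frame is evicted by deque(maxlen=...)
        self.frames.append(frame)

    def pop(self):
        """Return the oldest buffered frame, or None when the buffer is empty."""
        return self.frames.popleft() if self.frames else None
```

The `dropped` counter feeds directly into the error-handling and monitoring story: a rising drop count signals the pipeline is falling behind.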

Local Portal Deployment: The web portal will run locally on the same machine as the CV processing to minimize latency during testing. Without a Full-Stack Engineer in Alpha phase, this will be a minimal implementation focused on core functionality. This configuration eliminates network delays and allows for rapid iteration during development. The local deployment will include all essential features while maintaining the architecture needed for future remote access. This approach validates the complete system workflow from video capture through result display.

Deliverables

Technical Deliverables

Working Prototype will be a functional system processing live video streams with basic squat detection capabilities. Given the exploratory nature of the Alpha phase and potential unknown challenges, these represent goals rather than firm commitments, and some compromises may be necessary to meet the timeline. The prototype will demonstrate end-to-end functionality from camera input to result display, validating our technical approach while integrating all core components. This prototype will serve as the foundation for all subsequent development phases.

Squat Detection Algorithm will achieve basic detection with 80% accuracy in controlled conditions and clear documentation of limitations. The algorithm will correctly identify squat depth for standard movements without occlusion or unusual positioning while we thoroughly document performance metrics including false positive and false negative rates. This baseline algorithm will be progressively refined throughout the project.
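The documented false positive and false negative rates can be computed mechanically from confusion counts on a labeled test set; a small helper sketch (function name is illustrative):

```python
def detection_metrics(tp, fp, tn, fn):
    """Accuracy, false positive rate, and false negative rate from counts
    of true/false positives and negatives on a labeled dataset."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }
```

For example, 80 correct detections, 5 false no-reps-called-valid, 10 correct rejections, and 5 missed squats out of 100 attempts yields 90% accuracy with the two error rates reported separately, as the deliverable requires.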

Pose Estimation Pipeline will be a complete pipeline demonstrating less than 500ms latency from frame capture to decision output. The pipeline will include all necessary preprocessing, inference, and postprocessing steps required for squat detection while our architecture documentation details data flow and processing stages. This pipeline architecture will scale to support additional features and higher performance requirements.
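Per-stage latency accounting against the 500 ms budget could look like the following sketch; the stage names and `StageTimer` class are illustrative, not a committed design:

```python
import time
from contextlib import contextmanager

class StageTimer:
    """Accumulates per-stage wall-clock time for one frame so the end-to-end
    figure can be checked against the 500 ms Alpha latency budget."""

    def __init__(self):
        self.stages = {}

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.stages[name] = (time.perf_counter() - start) * 1000.0  # ms

    def total_ms(self):
        return sum(self.stages.values())

    def within_budget(self, budget_ms=500.0):
        return self.total_ms() <= budget_ms
```

Wrapping the preprocessing, inference, and postprocessing steps in `with timer.stage(...)` blocks makes it obvious which stage to optimize first when the total creeps toward the budget.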

Technical Architecture Document: Comprehensive system design detailing all components, interfaces, and deployment architecture. The document will include architectural diagrams, component specifications, API definitions, and technology choices with justifications. Security considerations, scalability plans, and integration points will be thoroughly documented. This blueprint guides all subsequent development and ensures team alignment on technical approach.

Software Deliverables

Core Computer Vision Module: Functional pose estimation system with basic accuracy and real-time processing capabilities. The module will be modular and well-documented, allowing for easy enhancement and optimization in later phases. Unit tests will cover critical functionality with continuous integration ensuring code quality. This foundation module will be extended with additional capabilities throughout the project.

Local Web Portal: Basic interface providing video viewing, system configuration, and reporting capabilities. This will be a minimal implementation given the absence of a Full-Stack Engineer in the Alpha phase. The portal will use modern web technologies ensuring responsive performance and cross-browser compatibility. Authentication and authorization frameworks will be implemented even in this early phase to ensure security from the start. This portal will evolve into the full-featured production interface through iterative enhancement.

API Specifications: Draft API documentation for integration with judge interfaces and external systems, created on a best-efforts basis by the Technical Lead. The team will evaluate REST, GraphQL, and gRPC to determine the most appropriate protocol, considering integration patterns used in sports timing systems for HYROX's needs. API versioning strategy will be established to support future evolution without breaking existing integrations. These specifications enable third-party developers to build complementary tools and integrations.
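Since the protocol choice is still open, a draft of what a rep-decision payload might look like, with every field name a placeholder pending API discovery with HYROX:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class RepDecision:
    """Hypothetical response body for a draft endpoint such as a judge
    interface fetching one rep's result; all fields are placeholders."""
    station_id: str
    rep_number: int
    valid: bool
    confidence: float
    latency_ms: int
    api_version: str = "v1-draft"  # explicit versioning from the start

    def to_json(self):
        return json.dumps(asdict(self))
```

Carrying `api_version` in every payload from the first draft is one concrete way to honor the versioning strategy described above without breaking later integrations.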

Development Environment: Basic development and testing environments with manual deployment processes. Docker containers will ensure consistent environments across all team members and deployment targets. Automated CI/CD pipelines, with integrated automated testing frameworks to catch issues early, will be implemented when the Platform Engineer joins in the Beta phase. This infrastructure accelerates development velocity and improves code quality throughout the project.

Documentation

Feasibility Report will provide detailed analysis validating our technical approach with evidence from prototype testing. The report will include performance benchmarks, accuracy assessments, and identified technical risks with mitigation strategies while comparing the proposed approach against alternatives including human-only judging. This report will provide stakeholders with the confidence to proceed with full development.

Risk Assessment: Comprehensive evaluation of technical, operational, and business risks with mitigation strategies. Each risk will be scored for probability and impact with specific mitigation actions and responsible parties identified. The assessment will be updated throughout the project as new risks emerge or existing risks are resolved. This proactive risk management approach prevents surprises and ensures project success.

Hardware Recommendations: Detailed specifications for production deployment including cameras, computing hardware, and networking equipment. Recommendations will include multiple options at different price points with clear trade-offs documented. Vendor relationships and pricing negotiations will be initiated to ensure equipment availability for later phases. These recommendations guide procurement decisions and budget planning.

Integration Plan: Step-by-step plan for integrating with HYROX's existing systems and workflows. The plan will identify all integration points, data flows, and required modifications to existing systems. Timeline and resource requirements for integration work will be clearly defined with dependencies identified. This roadmap ensures smooth integration without disrupting current operations.

Demonstration

Live Demo will be a working demonstration in Chicago lab showing real-time squat detection with multiple test subjects. The demo will showcase system capabilities including accurate detection, real-time feedback, and basic reporting features across multiple scenarios including ideal conditions and challenging cases. This hands-on demonstration will build stakeholder confidence and generate excitement for the project.

Video Recordings: Comprehensive recordings of successful detection scenarios for stakeholder review and training purposes. Videos will include various athletes, movements, and conditions to demonstrate system versatility and limitations. Each recording will be annotated with system outputs, confidence scores, and explanations of decision-making. These materials support stakeholder communication and future training efforts.

Performance Metrics Dashboard: Basic real-time display of system performance including latency, accuracy, and resource utilization. This initial version will provide essential metrics with more sophisticated visualizations planned for later phases. The dashboard will provide clear visualization of key metrics with historical trends and alerting for anomalies. Drill-down capabilities will allow investigation of specific events or time periods for detailed analysis. This transparency ensures stakeholders understand system capabilities and limitations.

Stakeholder Presentation: Comprehensive presentation with technical details, business case, and go/no-go recommendation for proceeding to Beta phase. The presentation will include live demonstrations, recorded examples, and detailed project plans for subsequent phases. Cost-benefit analysis and ROI projections will support the business case for continued investment. This presentation serves as the critical decision point for project continuation.

Staffing Summary

The Alpha phase requires a focused team of five key roles working over 10 weeks with a weekly cost of $34,150 and a total phase investment of $341,500.

The Technical Lead provides overall technical vision and architecture at 100% allocation contributing $11,000 per week to ensure cohesive system design.

The Computer Vision AI Lead drives the core pose estimation development at 100% allocation for $10,000 weekly, focusing on the foundational algorithms that will power squat detection.

The Machine Learning Engineer supports model development and optimization at 50% allocation for $4,500 per week, balancing this project with other initiatives while providing essential ML expertise.

The Edge Computing Engineer ensures the system will meet performance requirements on target hardware through 50% allocation at $4,250 weekly, optimizing for the sub-500ms latency targets.

The Project Manager coordinates team efforts and stakeholder communication at 50% allocation for $4,400 per week, maintaining alignment across all work streams. This lean team structure maximizes efficiency during the proof-of-concept phase while ensuring all critical expertise remains available for rapid problem resolution.

Role                      | Responsibility                     | Allocation | Weekly Rate | Phase Cost
Technical Lead            | Architecture & Technical Vision    | 100%       | $11,000     | $110,000
Computer Vision AI Lead   | Pose Estimation Development        | 100%       | $10,000     | $100,000
Machine Learning Engineer | Model Training & Optimization      | 50%        | $9,000      | $45,000
Edge Computing Engineer   | Hardware Performance Optimization  | 50%        | $8,500      | $42,500
Project Manager           | Coordination & Stakeholder Mgmt    | 50%        | $8,800      | $44,000

Success Criteria

Technical Validation

Pose Estimation Reliability: The system must prove that pose estimation works reliably in a gym environment with standard lighting and camera positions. We need to demonstrate consistent detection of major body joints with confidence scores above 0.8 for at least 90% of frames. The system should handle different body types and clothing without significant degradation in accuracy. This validation confirms the fundamental technical approach is sound and worth continued investment.
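The "confidence above 0.8 for at least 90% of frames" criterion reduces to a simple fraction over per-frame confidence scores; a sketch, with the thresholds taken from the criterion above:

```python
def reliable_fraction(frame_confidences, threshold=0.8):
    """Fraction of frames whose confidence score clears the threshold.

    The Alpha validation target is a result of at least 0.9 at
    threshold 0.8 on the lab test footage.
    """
    if not frame_confidences:
        return 0.0
    hits = sum(1 for c in frame_confidences if c >= threshold)
    return hits / len(frame_confidences)
```

Running this over each recorded lab session gives a per-session pass/fail against the 90% bar.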

Performance Baseline: We must achieve sub-500ms detection latency in the prototype running on selected edge hardware. This will include the complete pipeline from frame capture through pose estimation to squat decision and result display. The system should maintain this performance while processing video without dropping frames or degrading accuracy. This baseline performance gives confidence in a path toward the sub-200ms production target through optimization using TensorRT, Intel OpenVINO, and other acceleration techniques, leveraging the ONNX format for cross-platform deployment.

Accuracy Target: The prototype must correctly identify the majority of squats in controlled conditions with clear camera views, maintaining a documented set of known root causes for false negatives and false positives. The system will incorporate a target zone detection signal to indicate when athletes reach the required depth, helping qualify the detection window and improve accuracy. False positives should be below 10% to avoid frustrating athletes with incorrect no-reps. The system should provide confidence scores that correlate well with actual accuracy, enabling reliable threshold setting. This accuracy level validates that the approach can meet production requirements with further refinement.
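The target zone detection signal could be prototyped as a hysteresis state machine that credits a rep only on a full depth-then-stand cycle, which suppresses flicker-driven double counts (a source of false positives); the thresholds below are illustrative normalized hip heights, not validated values:

```python
class RepCounter:
    """Counts valid squats with hysteresis over a target depth zone.

    A rep is credited only after the hip enters the depth zone and then
    returns above the standing threshold, so a single deep squat with a
    jittery signal cannot register twice.
    """

    def __init__(self, depth_thresh=0.7, stand_thresh=0.4):
        self.depth_thresh = depth_thresh
        self.stand_thresh = stand_thresh
        self.in_zone = False
        self.reps = 0

    def update(self, hip_y):
        """hip_y: normalized hip height; larger means lower (image coords)."""
        if not self.in_zone and hip_y >= self.depth_thresh:
            self.in_zone = True    # athlete reached required depth
        elif self.in_zone and hip_y <= self.stand_thresh:
            self.in_zone = False   # full return to standing: credit the rep
            self.reps += 1
        return self.reps
```

Small oscillations around the depth threshold while in the zone do not produce extra counts, which is exactly the behavior needed to keep false positives below the 10% bar.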

Integration Proof: We must successfully process live camera feeds and display results in real-time. Web portal development will be minimal unless the Technical Lead has bandwidth, with robust implementation deferred until a Full-Stack Engineer joins. The complete workflow from athlete movement to judge display must function reliably without manual intervention. All components must work together seamlessly with proper error handling and recovery mechanisms. This integration validates the overall system architecture and component interactions.

Stakeholder Buy-in: We need to secure approval from key stakeholders to proceed to Beta phase based on demonstrated capabilities. Technical leadership must be confident the approach will meet performance and accuracy requirements. Business stakeholders must see clear value proposition and ROI potential from the investment. This buy-in ensures continued support and resources for subsequent development phases.

Risks & Mitigations

Technology Risks

Pose Estimation Accuracy: Current algorithms may not work reliably with available computing resources or in real gym conditions. The technology might struggle with occlusion, unusual body positions, or rapid movements common in competition. To mitigate this, the Computer Vision AI Lead will test multiple pose estimation models in parallel, including both off-the-shelf and custom approaches. We'll also design the system architecture to support algorithm upgrades without major rework, ensuring we can adopt improvements as they become available.

Hardware Limitations: Selected edge computing platforms may not provide sufficient processing power for real-time requirements. Thermal throttling, memory constraints, or I/O bottlenecks could prevent achieving latency targets. The Edge Computing Engineer will benchmark multiple hardware platforms early in the phase, including NVIDIA Jetson, Intel Neural Compute Stick, and Apple Silicon options. We'll also design the software to support distributed processing across multiple devices if single-device performance proves insufficient.

Integration Risks

HYROX System Compatibility: Unknown integration requirements with existing HYROX scoring and event management systems. Legacy systems might have undocumented APIs or architectural constraints that complicate integration. The Technical Lead will conduct early API discovery sessions with HYROX technical staff to identify all integration points and requirements. We'll also design our system with flexible integration adapters that can accommodate various API styles and protocols.

Network Infrastructure: Venue connectivity may be inadequate for system requirements, particularly in older facilities. Network congestion during events with thousands of participants could impact system reliability. We'll design the system for offline operation with periodic synchronization to handle unreliable networks. The architecture will also support multiple network paths including cellular backup to ensure connectivity.

Timeline Risks

Scope Creep: Stakeholders may request additional features or capabilities beyond the planned Alpha scope. Feature requests could delay core development and prevent achieving phase objectives within the timeline. The Project Manager will establish clear phase gates with documented scope and formal change control processes. Any scope additions will be evaluated for impact and deferred to appropriate phases to maintain the timeline.

Technical Blockers: Unforeseen technical challenges could emerge that require fundamental architecture changes. Complex problems might require expertise not available on the current team. We'll implement time-boxed technical spikes to explore risky areas early with defined fallback approaches if primary solutions prove unviable. The team will also maintain relationships with external experts who can provide consultation if needed.