🔍 Selecting a machine learning model for a real-world project involves breaking down the problem, identifying potential models, and choosing the best solution.
💡 In a scenario where a web app needs to detect intruders without sending images to the cloud, running TensorFlow.js client-side in the browser is a suitable option.
📷 To detect intruders, the system should be able to identify humans in video frames, excluding cats, dogs, or other animals.
🔎 There are various pre-trained models that can be used to detect a person in an image, including face and hand models.
🔊 Sound can also be used to detect talking, even if the person is not visible.
🤔 Different model types, such as image classification, object detection, body segmentation, pose estimation, face landmark detection, and hand pose estimation, each have pros and cons that should be considered.
⏱️ The inference speed of the model, measured in milliseconds or frames per second, is important for real-time applications.
🔍 When selecting an ML model, consider both its inference time and the amount of memory it uses.
💻 Knowing your end users' expected working environment is important in deciding which models are suitable.
📱 Consider the end user's device and internet connectivity when choosing a machine learning model.
💡 You can benchmark the ML model yourself by creating a simple website and recording execution time and file size.
⏱️ To calculate execution time, record a timestamp before and after executing the model and calculate the difference.
💾 To check file size, open Chrome DevTools and inspect the Network tab to see the assets being loaded and their sizes.
📊 The memory usage of ML models should be considered before selecting one.
⏱️ The frames per second (FPS) performance of ML models is important for real-time applications.
💾 File size may be less of an issue if the ML model is loaded once and the device has fast Wi-Fi.
🔍 Further research is required to explore faster body segmentation models.
💡 The high RAM usage of the face landmark and hand pose models should be taken into account when targeting older mobile devices.
🔍 It is necessary to try different machine learning models to find the most suitable one for the intended use case.
🤖 Compared to image classification, object detection provides more information while remaining fast and using less RAM.
👥 Pose estimation can detect a maximum of six people at a time, so it may not scale well to larger groups.
🤔 Choosing between pose estimation and object detection models depends on the need for speed and granularity.
🔎 Determining the best model requires investigation and research based on customer needs and the environment.
📷 The next section will guide you in creating a smart camera project using the selected object detection model.
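
💻 As a rough sketch of the benchmarking steps described above (record a timestamp before and after running the model, take the difference, and relate milliseconds to frames per second), something like the following could be used. The names `benchmark`, `msToFps`, `runModel`, `runs`, and `warmupRuns` are hypothetical and chosen for illustration only; `runModel` stands in for whatever function runs a single inference with the model you are evaluating. Discarding a few warm-up runs is an assumption, not something the steps above require.

```javascript
// Convert an average execution time in milliseconds to frames per second.
function msToFps(ms) {
  return 1000 / ms;
}

// Hypothetical helper: times `runModel` (any sync or async function that
// performs one inference) and reports the average latency and matching FPS.
async function benchmark(runModel, runs = 20, warmupRuns = 3) {
  // Assumption: the first few calls may be slower than steady state,
  // so they are executed but not measured.
  for (let i = 0; i < warmupRuns; i++) {
    await runModel();
  }

  const timingsMs = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now(); // timestamp before execution
    await runModel();
    timingsMs.push(performance.now() - start); // difference = execution time
  }

  const avgMs = timingsMs.reduce((sum, t) => sum + t, 0) / runs;
  return { avgMs, fps: msToFps(avgMs) };
}
```

In a test page this might be called as `benchmark(() => model.detect(videoElement)).then(console.log)`, where `model` and `videoElement` are placeholders for your own loaded model and video source. Averaging over several runs is a design choice to smooth out one-off spikes; a single before-and-after measurement also works for a quick check.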