Machine Learning for Leaders: 5 Essentials to Know

March 31, 2021

The Expanding Accessibility of Machine Learning

Machine learning (ML) is in a period of significant advancement. Sonali Sambhus, who leads developer and ML platform efforts at Square, characterizes this as “the democratization of ML.” This rapid evolution has established ML as a crucial element of business and a catalyst for growth.

Challenges for Leaders New to ML

However, the swift changes within the ML field can be daunting for engineering and team leaders lacking a formal ML education. It’s common to encounter highly capable and assured leaders who find it difficult to engage in productive discussions about ML, despite potentially overseeing teams that develop it.

Extensive Experience in Machine Learning

My own experience spans over twenty years in the ML domain, including contributions to Apple’s development of a large-scale online app and music platform. At Reddit, as the senior director of engineering focused on safety, I leveraged ML to address and mitigate harmful online behaviors.

Insights from Leading ML Professionals

For this article, I consulted with a group of prominent ML leaders. These included Sambhus, Lior Gavish, co-founder of Monte Carlo, and Yotam Hadass, VP of engineering at Electric.ai. Their collective insights have been synthesized into five actionable and readily implementable lessons.

Key Takeaways and Practical Guidance

The following sections will detail these lessons, offering a practical framework for leaders seeking to understand and effectively navigate the world of machine learning. These are designed to be easily understood and applied, regardless of prior ML experience.

1. Strategies for Machine Learning Recruitment

Successfully recruiting for machine learning positions presents unique hurdles for organizations.

A primary difficulty lies in clearly distinguishing machine learning roles from related positions like data analysts, data engineers, and data scientists, due to significant overlap in job descriptions.

Determining the appropriate level of experience is also a common challenge. Truly substantial experience in deploying machine learning solutions to production is relatively rare; many resumes may claim ML experience that actually centers on rule-based systems rather than genuine machine learning models.

When building an ML team, prioritize hiring experienced professionals whenever feasible. However, also explore the potential of internal training programs to address talent gaps. Upskilling existing software engineers into data or ML engineering roles, or investing in ML education for promising candidates, can be highly effective.

An alternative approach to navigating these recruitment obstacles is to define roles based on these core areas:

  • Product: Seek individuals possessing both technical curiosity and a robust understanding of business and product development. This skillset often outweighs the need for expertise in complex modeling techniques.
  • Data: Prioritize candidates capable of model selection, feature engineering, data modeling and vectorization, and thorough results analysis.
  • Platform/Infrastructure: Focus on individuals who can assess, integrate, and construct platforms to enhance the efficiency of data and engineering teams, including ETL processes, data warehousing, and CI/CD pipelines for ML.

Remember the value of ongoing training; a motivated engineer with the right aptitude can evolve into a valuable ML specialist.

Maintaining connections with industry experts and academic researchers is another valuable strategy for keeping your team informed about the latest advancements in machine learning. High-quality bootcamps can also provide effective upskilling opportunities.

2. Organizational Design for Machine Learning Teams

Determining the optimal organizational structure for an ML team is a crucial decision. It affects both operational efficiency and the reliability of business outcomes, and should be aligned with the company’s current growth stage and overall size.

Early-Stage Companies (Under 25 Employees)

For organizations in their initial phases, establishing a centralized, shared ML team is generally the most effective and expedient approach. This facilitates the development of essential infrastructure and ensures organizational preparedness.

In these early stages, the ML team should ideally comprise 10% to 20% of the total engineering workforce.

Mid-Stage Companies (25-500 Employees)

As companies mature into the mid-stage, a shift towards vertically integrated teams is recommended. This model offers a significant advantage in achieving a thorough understanding of the problems being addressed, as noted by Gavish.

Vertical integration also fosters sustained concentration and effective prioritization. This is particularly important as ML projects at this stage often require extended timelines and involve a higher degree of uncertainty.

Mature Companies (500+ Employees)

Organizations with over 500 employees should consider establishing a dedicated ML platform and infrastructure team. Square, for instance, with over 2,500 engineers and 100+ data scientists and ML engineers, maintains a separate team of over 15 engineers focused on platform and infrastructure.

ML teams within mature companies are typically aligned with specific business units – such as chatbot development or risk and fraud detection – rather than being organized around particular technologies.

It’s important to acknowledge that the appropriate team size is contingent upon the centrality of ML to the company’s products and services.

Ultimately, the ideal structure is not fixed and should be revisited as the organization evolves.

3. Machine Learning Pipeline Implementation

The process of deploying and sustaining machine learning (ML) pipelines shares significant similarities with standard software deployment and maintenance procedures.

However, specialized ML expertise is crucial for tasks such as model construction, optimization, evaluation, validation, and version control, alongside continuous performance monitoring.

Essential Pipeline Stages

Successfully establishing, deploying, and maintaining an ML pipeline involves several key steps:

  • Clearly define the business challenge and assess the suitability of an ML-based solution.
  • Thoroughly prepare and refine the datasets used for training and evaluation.
  • Distinguish between issues originating from the data itself and limitations within the model's design.
  • Implement robust testing, debugging, and versioning practices for all models.
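The steps above can be sketched with a common open-source stack. This is a minimal illustration, not a prescription: the dataset, preprocessing, and model choice here are all assumptions made for the example.

```python
# Minimal ML pipeline sketch, assuming scikit-learn.
# Dataset and model choice are placeholders for illustration.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# 1. Frame the business problem: here, binary classification with labeled data.
X, y = load_breast_cancer(return_X_y=True)

# 2. Prepare separate training and evaluation sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# 3. Bundle preprocessing and the model so the exact same transforms
#    are applied at training time and at inference time.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X_train, y_train)

# 4. Evaluate on held-out data before any deployment decision.
print(f"holdout accuracy: {accuracy_score(y_test, pipeline.predict(X_test)):.3f}")
```

Bundling preprocessing with the model in one versioned artifact is what makes testing and rollback tractable later in the pipeline.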

Leveraging pre-built software solutions can substantially lower costs and reduce reliance on highly specialized ML engineers.

However, caution should be exercised to avoid inadvertently constructing a complex and unmanageable system lacking clear organization.

Available Tools and Platforms

Although the field is still evolving, platforms like Databricks, AWS SageMaker, Tecton, and Cortex offer substantial time and resource savings.

A diverse range of platforms and libraries are currently available, including TensorFlow, PyTorch, Keras, Scikit-learn, Pandas, and NLTK, each offering unique capabilities.

Selecting the appropriate tools depends on the specific requirements of the ML pipeline and the expertise of the development team.

4. Measuring and Assessing Machine Learning Models

A central difficulty with Machine Learning (ML) lies in ensuring consistent reliability. Determining whether a model is functioning as expected prior to deployment is crucial. Equally important is ongoing monitoring of performance within a production environment and the ability to effectively resolve any arising problems.

The approach to these challenges mirrors established software engineering practices, emphasizing the need for comprehensive observability.

Consistent monitoring and tracking of application performance are essential. Emmanuel Ameisen’s “Building Machine Learning Powered Applications” is a recommended resource for gaining a deeper understanding of these processes, as suggested by Hadass.

A model demonstrating superior performance compared to a baseline – a scenario without ML implementation – and exhibiting both stability and security, represents a viable candidate for production deployment. Prioritizing iterative development over striving for immediate perfection is a sound strategy.
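The baseline comparison can be made concrete by scoring a candidate model against a no-ML baseline on the same held-out data. In this sketch (dataset and models are illustrative assumptions), the baseline simply predicts the most frequent class:

```python
# Sketch: compare a candidate model against a no-ML baseline,
# assuming scikit-learn; dataset and models are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Baseline: always predict the most frequent class (no ML involved).
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
candidate = RandomForestClassifier(random_state=0).fit(X_train, y_train)

baseline_f1 = f1_score(y_test, baseline.predict(X_test))
candidate_f1 = f1_score(y_test, candidate.predict(X_test))

# Treat the candidate as deployable only if it clearly beats the baseline.
print(f"baseline F1: {baseline_f1:.3f}, candidate F1: {candidate_f1:.3f}")
```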

Implementing models behind a feature flag provides a safety net, allowing for rapid deactivation should unforeseen issues emerge. Furthermore, the capacity to operate multiple model versions concurrently through A/B testing in a production setting significantly boosts confidence in new models.
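A feature flag around a new model can be as simple as a guarded dispatch with a traffic percentage. The flag store, model functions, and field names below are hypothetical; a real system would read flags from a config service or flag platform:

```python
import random

# Hypothetical in-memory flag store; in production this would be a
# config service or feature-flag platform, not a dict.
FLAGS = {"use_new_model": True, "new_model_traffic": 0.10}

def old_model(features):
    return 0.2  # placeholder score from the incumbent model

def new_model(features):
    return 0.8  # placeholder score from the candidate model

def predict(features, rng=random.random):
    """Route a fraction of traffic to the new model. Flipping
    use_new_model to False deactivates it immediately, and the
    split enables A/B comparison of the two versions in production."""
    if FLAGS["use_new_model"] and rng() < FLAGS["new_model_traffic"]:
        return {"model": "new", "score": new_model(features)}
    return {"model": "old", "score": old_model(features)}
```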

Running versions side by side also raises overall reliability.

The quality of the dataset is paramount. It must be carefully constructed to accurately mirror real-world production conditions. Establishing a system for tracing predictions back to historical datasets and comparing them with those generated by prior model iterations is highly beneficial.
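Tracing predictions back to their inputs and model version can start with structured logging of every prediction. The record schema here is an assumption for illustration; real systems typically write such records to a data warehouse or event stream:

```python
import datetime
import json

def log_prediction(log, model_version, features, prediction):
    """Append a structured record so any prediction can later be traced
    to its inputs and compared against other model versions."""
    log.append({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model_version": model_version,
        "features": features,
        "prediction": prediction,
    })

prediction_log = []
log_prediction(prediction_log, "v1.3.0", {"amount": 42.0}, {"fraud": False})

# Later: replay logged features through a newer model version and
# diff its predictions against the historical record.
print(json.dumps(prediction_log[-1], indent=2))
```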

Robust metrics and evaluation are necessary to differentiate between effective and ineffective models, addressing key concerns such as:

  • The value delivered to the end user.
  • The safeguarding of sensitive data.
  • The consistent operational stability of the model.
  • The real-world applicability of predictions and recommendations.
  • The capacity to provide clear explanations for the model’s decision-making process.
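The explainability concern in particular can be probed with standard tooling. This sketch assumes scikit-learn’s permutation importance, a model-agnostic way to see which features drive predictions; the dataset and model are illustrative:

```python
# Sketch: model-agnostic explainability via permutation importance,
# assuming scikit-learn. Dataset and model are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# Rank features by how much shuffling each one degrades accuracy.
result = permutation_importance(model, data.data, data.target,
                                n_repeats=5, random_state=0)
top = sorted(zip(data.feature_names, result.importances_mean),
             key=lambda t: -t[1])[:3]
for name, score in top:
    print(f"{name}: {score:.3f}")
```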

5. Common Challenges to Avoid

Certain challenges in machine learning implementation, while appearing obvious, warrant careful consideration and review. Addressing these points can significantly assist your team in making informed decisions during crucial phases of a project.

Avoid the following:

  • Utilizing machine learning for tasks that are not well-suited, such as processes involving a simple, linear progression of actions.
  • Anticipating immediate outcomes. Achieving substantial results with machine learning requires sustained effort and iterative refinement.
  • Prioritizing model performance metrics to the detriment of evaluating overall product success.
  • Underestimating the expenses associated with necessary tooling and infrastructure, which can impede engineering progress.

Over the past ten years, machine learning has become a powerful tool for accelerating technological advancement. Its importance in driving automation, as well as enhancing profitability and growth, is undeniable.

Consequently, it is essential for leaders to develop a strong understanding of machine learning and remain current with its rapid evolution.

Effective integration of machine learning teams within an organization begins with identifying the ideal candidate profile and structuring the team to maximize efficiency and concentration. Team structure is paramount.

Leadership should prioritize guiding teams to develop complete, end-to-end models, incorporating robust observability and monitoring capabilities prior to deployment. Model evaluation should center on product outcomes, rather than solely on model-specific metrics. Proactive monitoring and engagement with external experts can help mitigate common issues arising under pressure and ensure the team remains informed about the latest innovations.

#machine learning  #AI  #artificial intelligence  #leadership  #non-technical  #business